
1. Introduction
In Spring Batch, the CompositeItemReader is a tool for combining multiple ItemReader instances into a single reader. This is particularly useful when we need to read data from multiple sources or in a specific sequence. For example, we might want to read records from a database and a file within the same step, or process data from two different tables in a specific order.
The CompositeItemReader simplifies handling multiple readers in a batch job, ensuring efficient and flexible data processing. In this tutorial, we’ll go through the implementation of a CompositeItemReader in Spring Batch and look at examples and test cases to validate its behavior.
2. Understanding the CompositeItemReader
The CompositeItemReader works by delegating the reading process to a list of ItemReader instances. It reads items from each reader in the order they’re defined, ensuring that data is processed sequentially.
This is especially useful in scenarios like:
- Reading from multiple databases or tables
- Combining data from files and databases
- Processing data from different sources in a specific sequence
Additionally, the CompositeItemReader is part of the org.springframework.batch.item.support package, and it was introduced in Spring Batch 5.2.0.
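To see the delegation behavior in isolation before building a full job, here’s a minimal, self-contained sketch. The listReader() helper is purely illustrative and not part of the framework; it relies on the fact that ItemStream’s open(), update(), and close() methods are default no-ops in Spring Batch 5, so only read() needs implementing:
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemStreamReader;
import org.springframework.batch.item.support.CompositeItemReader;

public class CompositeReaderDemo {

    // Tiny in-memory reader used only for this demonstration
    static ItemStreamReader<String> listReader(List<String> items) {
        Iterator<String> iterator = items.iterator();
        return new ItemStreamReader<String>() {
            @Override
            public String read() {
                return iterator.hasNext() ? iterator.next() : null;
            }
        };
    }

    public static void main(String[] args) throws Exception {
        CompositeItemReader<String> composite = new CompositeItemReader<>(
          Arrays.asList(listReader(Arrays.asList("a", "b")), listReader(Arrays.asList("c"))));
        composite.open(new ExecutionContext());

        // The first delegate is drained before the second is consulted: prints a, b, c
        for (String item = composite.read(); item != null; item = composite.read()) {
            System.out.println(item);
        }
    }
}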
3. Implementing a CompositeItemReader
Let’s walk through an example where we read data from two different sources: a flat file and a database. The goal is to combine product data from both sources into a single stream for batch processing. Some products live in the flat file, while others reside in the database, so combining the two readers ensures all available records are processed together.
3.1. Creating the Product Class
Before we set up the readers, we need a Product class that represents the structure of the data being processed. This class encapsulates details about a product, such as its ID, name, stock availability, and price. We’ll use this model while reading from both the CSV file and the database, ensuring consistency in data handling.
The Product class serves as a data transfer object (DTO) between our readers and the batch job:
public class Product {

    private Long productId;
    private String productName;
    private Integer stock;
    private BigDecimal price;

    public Product(Long productId, String productName, Integer stock, BigDecimal price) {
        this.productId = productId;
        this.productName = productName;
        this.stock = stock;
        this.price = price;
    }

    // Getters and Setters
}
The Product class represents each record that will be processed by our batch job. Now that our data model is ready, we’ll create individual ItemReader components for the CSV file and the database.
3.2. Flat File Reader for Product Data
The first reader fetches data from a CSV file using FlatFileItemReader. We configure it to read a delimited file (products.csv) and map its fields to the Product class:
@Bean
public FlatFileItemReader<Product> fileReader() {
    return new FlatFileItemReaderBuilder<Product>()
      .name("fileReader")
      .resource(new ClassPathResource("products.csv"))
      .delimited()
      .names("productId", "productName", "stock", "price")
      .linesToSkip(1)
      .targetType(Product.class)
      .build();
}
Here, the delimited() method configures the reader to split each line on a delimiter (a comma by default). The names() method defines the column names matching the attributes of the Product class, while targetType(Product.class) maps the fields to the class attributes. Finally, linesToSkip(1) skips the header row of the file.
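For reference, a products.csv matching this configuration contains a header row followed by delimited records, like the sample we’ll reuse in our tests later on:
productId,productName,stock,price
101,Apple,50,1.99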
3.3. Database Reader for Product Data
Next, we define a JdbcCursorItemReader to retrieve product data from a database table named products. This reader executes an SQL query to fetch product details and maps them to our Product class.
Below is the implementation of the database reader:
@Bean
public JdbcCursorItemReader<Product> dbReader(DataSource dataSource) {
    return new JdbcCursorItemReaderBuilder<Product>()
      .name("dbReader")
      .dataSource(dataSource)
      .sql("SELECT productid, productname, stock, price FROM products")
      .rowMapper((rs, rowNum) -> new Product(
        rs.getLong("productid"),
        rs.getString("productname"),
        rs.getInt("stock"),
        rs.getBigDecimal("price")))
      .build();
}
The JdbcCursorItemReader reads product records from the database one row at a time using a cursor, making it efficient for batch processing. The rowMapper() function maps each column from the result set to the corresponding field in the Product class.
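The query assumes a products table whose columns match the Product fields. A minimal schema sketch (the column types here are our assumption, e.g. for an H2 test database) could look like this:
CREATE TABLE products (
    productid   BIGINT PRIMARY KEY,
    productname VARCHAR(255),
    stock       INT,
    price       DECIMAL(10, 2)
);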
4. Combining Readers Using CompositeItemReader
Now that both our CSV and database readers are configured to read product data, we can integrate them using CompositeItemReader:
@Bean
public CompositeItemReader<Product> compositeReader(DataSource dataSource) {
    return new CompositeItemReader<>(Arrays.asList(fileReader(), dbReader(dataSource)));
}
By configuring our CompositeItemReader, we can sequentially process data from multiple sources. Note that its constructor expects delegates of type ItemStreamReader, an interface that both FlatFileItemReader and JdbcCursorItemReader implement.
Initially, the FlatFileItemReader reads product records from the CSV file, parsing each row into a Product object. Once all rows from the file have been processed, the JdbcCursorItemReader takes over and starts fetching product data directly from the database.
5. Configuring the Batch Job
Once we’ve defined our readers for both the CSV file and the database, the next step is to configure the batch job itself. In Spring Batch, a job consists of multiple steps, where each step handles a specific part of the processing pipeline:
@Bean
public Job productJob(JobRepository jobRepository, Step step) {
    return new JobBuilder("productJob", jobRepository)
      .start(step)
      .build();
}

@Bean
public Step step(JobRepository jobRepository, PlatformTransactionManager transactionManager,
  ItemReader<Product> compositeReader, ItemWriter<Product> productWriter) {
    return new StepBuilder("productStep", jobRepository)
      .<Product, Product>chunk(10, transactionManager)
      .reader(compositeReader)
      .writer(productWriter)
      .build();
}
In this case, our job contains a single step that reads product data, processes it in chunks of 10, and writes it to the desired output.
The productJob bean defines the batch job itself and starts execution with the productStep, which handles the product data processing.
With this setup, our batch job reads product data from both sources through the CompositeItemReader and writes each chunk using the productWriter bean, giving us a smooth and efficient batch processing pipeline.
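We haven’t defined the productWriter bean yet; here’s a minimal sketch, assuming we simply log each product (a real job would typically write to a file, database, or message queue):
@Bean
public ItemWriter<Product> productWriter() {
    // Placeholder writer: logs each product in the chunk
    return chunk -> chunk.forEach(product ->
      System.out.println("Writing product: " + product.getProductName()));
}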
6. Running the Batch Job
Now that we’ve configured the readers and the job, the next step is to run the batch job and observe the behavior of CompositeItemReader. We’ll run the job within a Spring Boot application to see how it processes data from both the CSV file and the database.
To trigger the batch job programmatically, we use the JobLauncher, which allows us to launch the job and monitor its progress:
@Bean
public CommandLineRunner runJob(JobLauncher jobLauncher, Job productJob) {
    return args -> {
        try {
            jobLauncher.run(productJob, new JobParametersBuilder()
              .addLong("time", System.currentTimeMillis())
              .toJobParameters());
        } catch (Exception e) {
            // handle the exception, e.g. log it and fail fast
        }
    };
}
In this example, we create a CommandLineRunner bean to run the job when the application starts, invoking the productJob through the injected JobLauncher. We also add a timestamp as a JobParameter so that each launch creates a new job instance; without unique parameters, Spring Batch would refuse to re-run an already completed instance. Alternatively, Spring Boot can launch configured jobs automatically at startup unless spring.batch.job.enabled is set to false.
7. Testing the CompositeItemReader
To ensure the CompositeItemReader works as expected, we’ll verify that it reads products correctly from both the CSV and database sources.
7.1. Preparing Test Data
We’ll first prepare a CSV file containing product data, which serves as the input for CompositeItemReader:
productId,productName,stock,price
101,Apple,50,1.99
Then, we also insert a record into the products table:
@BeforeEach
public void setUp() {
    jdbcTemplate.update("INSERT INTO products (productid, productname, stock, price) VALUES (?, ?, ?, ?)",
      102, "Banana", 30, 1.49);
}
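For context, we assume these tests run in a Spring Boot test class with the composite reader and a JdbcTemplate injected; the exact skeleton below is an assumption and may differ in your setup:
@SpringBootTest
class CompositeItemReaderIntegrationTest {

    @Autowired
    private JdbcTemplate jdbcTemplate;

    @Autowired
    private CompositeItemReader<Product> compositeReader;

    // the setUp() method above and the test methods below are members of this class
}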
7.2. Testing the Sequential Reading Order
Now, we’ll test CompositeItemReader to verify that it processes products in the correct order, reading from both the CSV and the database sources. In this test, we read a product from the CSV file followed by the database and assert that the values match our expectations:
@Test
public void givenTwoReaders_whenRead_thenProcessProductsInOrder() throws Exception {
    StepExecution stepExecution = new StepExecution(
      "testStep",
      new JobExecution(1L, new JobParameters()),
      1L);
    ExecutionContext executionContext = stepExecution.getExecutionContext();
    compositeReader.open(executionContext);

    Product product1 = compositeReader.read();
    assertNotNull(product1);
    assertEquals(101, product1.getProductId());
    assertEquals("Apple", product1.getProductName());

    Product product2 = compositeReader.read();
    assertNotNull(product2);
    assertEquals(102, product2.getProductId());
    assertEquals("Banana", product2.getProductName());
}
7.3. Testing With One Reader Returning Null Results
In this section, we test the behavior of CompositeItemReader when multiple readers are used and one of the readers returns null. This is important to ensure that CompositeItemReader skips over any readers that return no data and continues to the next reader until it finds valid data:
@Test
public void givenMultipleReader_whenOneReaderReturnNull_thenProcessDataFromNextReader() throws Exception {
    ItemStreamReader<Product> emptyReader = mock(ItemStreamReader.class);
    when(emptyReader.read()).thenReturn(null);

    ItemStreamReader<Product> validReader = mock(ItemStreamReader.class);
    when(validReader.read()).thenReturn(new Product(103L, "Cherry", 20, BigDecimal.valueOf(2.99)), null);

    CompositeItemReader<Product> compositeReader = new CompositeItemReader<>(
      Arrays.asList(emptyReader, validReader));

    Product product = compositeReader.read();
    assertNotNull(product);
    assertEquals(103, product.getProductId());
    assertEquals("Cherry", product.getProductName());
}
8. Conclusion
In this article, we learned how to implement and test a CompositeItemReader in Spring Batch. By chaining readers together, we can process data from files, databases, or other sources in a single, well-defined sequence.
As always, the source code is available over on GitHub.