Logo

dev-resources.site

for different kinds of informations.

How to Insert Records in Batch with Spring Data

Published at
2/12/2023
Categories
spring
springdata
springjpa
batchinsert
Author
czetsuya
Author
8 person written this
czetsuya
open
How to Insert Records in Batch with Spring Data

1. Introduction

When working with a REST API, we often deal with a single entity to create, update, delete, list, and get. But there are also times when we need to manage entities in batches, and this mostly happens in the back office. For example, in an ecommerce platform, promotions are governed by uploading a CSV file in the backend. And this CSV file contains a list of records that needs to be processed.

Let's see how we can process a list of promotions.

2. Solutions for Saving a List of Promotions

For the succeeding examples, let's assume that we have a BaseEntity class that is being extended by PromotionEntity.

2.1 Reading, Processing, and Saving One Promotion at a Time

Not the ideal solution, but I still see this approach in production.

BaseEntity

@Getter
@Setter
@SuperBuilder
@NoArgsConstructor
@AllArgsConstructor
@MappedSuperclass
@JsonInclude(JsonInclude.Include.NON_NULL)
public abstract class BaseEntity implements Serializable {

  @Serial
  private static final long serialVersionUID = 3986494663579679129L;

  public static final int NB_PRECISION = 23;
  public static final int NB_SCALE = 12;

  @Id
  @GeneratedValue(strategy = GenerationType.IDENTITY)
  @Column(name = "id")
  private Long id;

  @Version
  @Column(name = "version")
  private Integer version;
}
Enter fullscreen mode Exit fullscreen mode

And in the service layer, we process the record one by one:

@Transactional(rollbackOn = {SQLException.class})
public void createPromotions(List<Promotion> promotions) {
promotionRepository.saveAll(promotions.stream()
        .map(service2EntityMapper::asPromotionEntity)
        .toList());
        }
Enter fullscreen mode Exit fullscreen mode
  • Line 1 - it's always a good idea to wrap these kinds of operations (create/delete/update) in a transaction
  • Line 3 - we stream the list of promotions
  • Line 4 - we convert each promotion POJO to an entity
  • Line 5 - we collect the promotion entities
  • Line 3 - we save a list of promotion entities
  • Note that the saveAll(List) method does not necessarily save records in batch, as it only calls save(E) underneath.

We need the bare minimum configuration for this setup for the data source.

spring:
  application:
    name: czetsuyatech
  datasource:
    url: jdbc:h2:mem:testdb
    driverClassName: org.h2.Driver
    username: sa
    password:
Enter fullscreen mode Exit fullscreen mode

Summary

  • It's always a good idea to wrap the create/update/delete operations in a transaction
  • Use stream when dealing with lists of entities
  • Use DTOs and POJOs when passing data from one layer to another
  • For table id, use GenerationType.IDENTITY, when dealing with simple requirements
  • I would use this approach when I need to process each record in a different transaction. In that case, I need to annotate this method with Transaction.NEVER and call another service that would actually do the data insertion. This means it's possible to only save a fraction of the list due to errors.

2.2 Reading, Processing, and Saving Promotions in Batch

This is the ideal solution, as it efficiently saves a list of records. This is what I will do when dealing with batch inserts.

BaseEntity

@Getter
@Setter
@SuperBuilder
@NoArgsConstructor
@AllArgsConstructor
@MappedSuperclass
@JsonInclude(JsonInclude.Include.NON_NULL)
public abstract class BaseEntity implements Serializable {

  @Serial
  private static final long serialVersionUID = 3986494663579679129L;

  @Id
  @GeneratedValue(generator = "ID_GENERATOR", strategy = GenerationType.AUTO)
  @Column(name = "id")
  private Long id;

  @Version
  @Column(name = "version")
  private Integer version;
}
Enter fullscreen mode Exit fullscreen mode
  • Line 14 - we are using an id generator that we will define in the promotion entity
  • Line 20 - we added a version field annotated with @version for optimistic locking

PromotionEntity

@Getter
@Setter
@SuperBuilder
@NoArgsConstructor
@AllArgsConstructor
@Entity
@Table(name = "promotion")
@GenericGenerator(name = "ID_GENERATOR",
    strategy = "org.hibernate.id.enhanced.SequenceStyleGenerator",
    parameters = {@Parameter(name = "sequence_name", value = "promotion_seq")}
)
public class PromotionEntity extends BaseEntity {

  @Column(name = "code")
  private String code;

  // ...
}
Enter fullscreen mode Exit fullscreen mode
  • Line 12 - we extend the BaseEntity
  • Line 8 - we create a generic sequence generator that will map to the ID_GENERATOR defined in the BaseEntity

Let's check our revised service layer. I would use a CompletableFuture that I can chain if needed.

@Transactional
public CompletableFuture<Void> createPromotions(Promotions promotions) {

log.debug("Creating promotions");

return CompletableFuture.runAsync(() -> promotionRepository.saveAll(
    promotions.getValues().stream()
        .map(service2EntityMapper::asPromotionEntity)
        .toList())
);
}
Enter fullscreen mode Exit fullscreen mode
  • Line 1 - annotate the method with Transaction so that it can revert when an error occurred
  • Line 2 - our service method that returns a CompletableFuture
  • Line 6 - here, we wrap our saveAll method inside an async block
  • Line 7 - we stream to the list of promotions
  • Line 8 - we convert each of the POJO to an entity
  • Line 9 - we collect the list of entities
  • Line 6 - we save all the entities, again it doesn't necessarily mean that we're saving in batch

To enable batch saving, we must make changes to our configuration file.

application.xml

spring:
  application:
    name: czetsuyatech.com
  datasource:
    url: jdbc:h2:mem:testdb
    driverClassName: org.h2.Driver
    username: sa
    password:
  jpa:
    database-platform: org.hibernate.dialect.H2Dialect
    properties:
      hibernate:
        jdbc:
          batch_size: 5
        generate_statistics: true
    show-sql: false
  flyway:
    enabled: true
    locations: classpath:db/migrations
    baseline-on-migrate: true
    user: sa
    password: sa
  h2:
    console:
      enabled: true
      path: /db
Enter fullscreen mode Exit fullscreen mode
  • Line 4 - our h2 data source
  • Line 14 - you need to define the batch size. Along with sequenced id, it enables batch saving
  • Line 15 - useful when checking that you have successfully enabled batch
  • Line 17 - is all about the flyway configuration

Summary

  • It's always a good idea to wrap the create/update/delete operations in a transaction
  • Use stream when dealing with lists of entities
  • Use DTOs and POJOs when passing data from one layer to another
  • Use a custom sequence generator
  • Define batch_size hibernate property in application XML
  • I would use this approach to speed up the processing of records. The downside is that each batch is managed in one transaction, which means failing one record in that batch means the whole set will not be saved.

3. Challenge

Using the above approach, try to save records in different sizes: 100, 1000, 10000, and 1000000. You should be able to see a noticeable improvement in inserts.

springdata Article's
30 articles in total
Favicon
Handling Asynchronous Execution with Transactions in Spring: A Common Pitfall and How to Solve It
Favicon
Streamlined Java Data Querying
Favicon
Jakarta Persistence API (JPA) example application: Northwind sample database
Favicon
Spring Data JPA: About Specification interface
Favicon
What is implementation class of org.springframework.data.jpa.repository.JpaRepository interface?
Favicon
Spring Data JPA demonstration
Favicon
Spring Data & MongoDB — Pagination+Sorting w/ Aggregation and Lookup
Favicon
Demystifying Spring Data and Spring Data JPA: A Question-Based Guide: Simplify Your Data Access Layer
Favicon
How to Insert Records in Batch with Spring Data
Favicon
Um olhar sobre PagingAndSortingRepository
Favicon
Spring Data JPA Interview Questions and Answers
Favicon
Quick Tip: Kotlin, Spring Data, and Optionals
Favicon
Safely Evolving Database with Liquibase, Spring Data, and Spring Boot
Favicon
How to use Spring Data DynamoDB example | Spring Boot
Favicon
How to start SpringBoot ElasticSearch using Spring Data
Favicon
Apache Solr – How to start Spring Data Solr with SpringBoot
Favicon
Spring Boot + Angular 6 example | Spring Data + REST + MongoDb CRUD example
Favicon
SpringData Reactive Cassandra Repositories | SpringBoot
Favicon
Spring Data Rest – How to create a RestAPIs with JPA Many-to-Many relational entities
Favicon
SpringData DynamoDB CRUD Example | SpringBoot
Favicon
Spring Data Rest – How to create a RestAPIs with JPA One-to-Many relational entities | SpringBoot + MySql + HAL Browser
Favicon
How to access MySQL database with Spring Data Rest application
Favicon
Kotlin Spring Boot + Angular 6 CRUD + PostgreSQL example | Spring Data JPA + REST APIs example
Favicon
Spring Data Redis Messaging – PubSub example | Spring Boot
Favicon
Spring Data Redis CRUD Operations example with Spring Boot
Favicon
Your own custom Spring Data repository
Favicon
SpringData MongoDB GridFsTemplate to save, retrieve, delete binary files (Image, Text files)
Favicon
How to use SpringData MongoRepository to interact with MongoDB
Favicon
How to start Spring Data Cassandra with SpringBoot
Favicon
Angular + Spring WebFlux + Spring Data Reactive Cassandra example

Featured ones: