Split a Huge CSV File into Multiple Smaller CSV Files #eg69
Problem description & analysis
Below is the CSV file sample.csv:
v2aowqhugt,q640lwdtat,8cqw2gtm0g,ybdncfeue8,3tzwyiouft,…
f0ewv2v00z,x2ck96ngmd,9htr2874n5,fx430s8wqy,tw40yn3t0j,…
p2h6fphwco,kldbn6rbzt,8okyllngxz,a8k9slqfms,bqz5fb7cm9,…
st63tcbfv8,2n862vqzww,2equ0ydeet,0x5tidunc6,npis28avpj,…
bn1u58s39a,mg7064jlrb,edyj3t4s95,zvuf9n29ai,1m0yn8uh0n,…
…
The file is too large to be loaded into memory in full. At most 100000 rows fit into the available memory at a time, so we need to split the file into multiple smaller CSV files of 100000 rows each, as shown below:
sample1.csv 100000 rows
sample2.csv 100000 rows
…
sample[n].csv at most 100000 rows
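The arithmetic behind this layout can be sketched in a few lines of Python: every file but the last holds a full 100000 rows, and the last holds the remainder. This is only an illustration. The function name plan_chunks and the 250000-row total in the usage note are hypothetical and do not come from the original article.

```python
def plan_chunks(total_rows, rows_per_file=100_000, prefix="sample"):
    """Return (file name, row count) pairs for a split of total_rows rows.

    Hypothetical helper: it only computes the layout and touches no files.
    """
    if total_rows < 0 or rows_per_file <= 0:
        raise ValueError("total_rows must be >= 0 and rows_per_file > 0")
    # Integer ceiling division gives the number of output files.
    n_files = (total_rows + rows_per_file - 1) // rows_per_file
    plan = []
    for i in range(1, n_files + 1):
        remaining = total_rows - (i - 1) * rows_per_file
        # Every file is full except possibly the last one.
        plan.append((f"{prefix}{i}.csv", min(rows_per_file, remaining)))
    return plan
```

For an assumed input of 250000 rows, plan_chunks(250_000) yields sample1.csv and sample2.csv with 100000 rows each and sample3.csv with the remaining 50000.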
Solution
Write the script p1.dfx below in esProc:
Explanation
A1 Create a cursor for the original CSV file.
A2 Loop through A1’s cursor, reading in 100000 rows at a time.
B2 Export A2’s rows to sample[n].csv. #A2 is the loop counter, which starts from 1.
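For readers who want to compare the three steps above with a general-purpose language, here is a minimal Python sketch of the same idea: open the file as a stream, pull at most 100000 rows per iteration, and write each batch to a numbered file. It is not the esProc script. The function name split_csv, the out_dir parameter, and the UTF-8 encoding are assumptions made for illustration, and the sketch assumes the file has no header row, as in the sample shown earlier.

```python
import csv
from itertools import islice
from pathlib import Path


def split_csv(src, out_dir, rows_per_file=100_000, prefix="sample"):
    """Stream src and write consecutive batches to prefix1.csv, prefix2.csv, ...

    Hypothetical helper; assumes a headerless, UTF-8 encoded CSV file.
    Returns the list of files written, in order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with open(src, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)  # plays the role of the cursor in A1
        n = 0
        while True:
            # Read at most rows_per_file rows, so only one batch is in memory.
            chunk = list(islice(reader, rows_per_file))
            if not chunk:
                break
            n += 1  # loop counter starting from 1, like #A2
            target = out_dir / f"{prefix}{n}.csv"
            with open(target, "w", newline="", encoding="utf-8") as out:
                csv.writer(out).writerows(chunk)
            written.append(target)
    return written
```

A call such as split_csv("sample.csv", "out") would write out/sample1.csv, out/sample2.csv, and so on. Only one batch of rows is held in memory at a time, which is the point of the cursor-based loop.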
Read How to Call an SPL Script in Java to learn how to integrate the script code into a Java program.