Split a Huge CSV File into Multiple Smaller CSV Files #eg69

Published at: 10/31/2024
Categories: csv, sql, programming, esproc
Author: esproc_spl

Problem description & analysis

Below is the CSV file sample.csv:

v2aowqhugt,q640lwdtat,8cqw2gtm0g,ybdncfeue8,3tzwyiouft,…

f0ewv2v00z,x2ck96ngmd,9htr2874n5,fx430s8wqy,tw40yn3t0j,…

p2h6fphwco,kldbn6rbzt,8okyllngxz,a8k9slqfms,bqz5fb7cm9,…

st63tcbfv8,2n862vqzww,2equ0ydeet,0x5tidunc6,npis28avpj,…

bn1u58s39a,mg7064jlrb,edyj3t4s95,zvuf9n29ai,1m0yn8uh0n,…

…

The file is too large to be loaded into memory in full; at most 100000 rows fit into the available memory at a time. So we need to split the file into multiple smaller CSV files of 100000 rows each, as shown below:

sample1.csv  100000 rows

sample2.csv  100000 rows

…

sample[n].csv  less than or equal to 100000 rows

Solution

Write the script p1.dfx below in esProc:

Explanation

A1  Create a cursor for the original CSV file.

A2  Loop through A1’s cursor, reading in 100000 rows at a time.

B2  Export A2’s rows to sample[n].csv. #A2 is the loop number, which starts from 1.
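
For readers without esProc at hand, the same three steps can be sketched in Python. This is not the p1.dfx script, only an illustrative equivalent: the function name split_csv and its parameters are hypothetical, and it assumes the file is UTF-8 encoded and has no header row, as the sample above suggests.

```python
import csv
from itertools import islice
from pathlib import Path


def split_csv(source, rows_per_file=100_000, out_dir=None, stem="sample"):
    """Split `source` into stem1.csv, stem2.csv, ... with at most
    `rows_per_file` rows each, and return the list of paths written.

    Hypothetical helper: assumes UTF-8 input without a header row.
    Only one chunk of rows is held in memory at a time.
    """
    source = Path(source)
    out_dir = Path(out_dir) if out_dir is not None else source.parent
    written = []
    with source.open(newline="", encoding="utf-8") as src:
        reader = csv.reader(src)  # streams rows lazily (the role of A1's cursor)
        index = 1                 # file counter starts from 1 (the role of #A2)
        while True:
            # Read the next chunk of up to rows_per_file rows (the role of A2).
            chunk = list(islice(reader, rows_per_file))
            if not chunk:
                break
            # Write the chunk to its own numbered file (the role of B2).
            target = out_dir / f"{stem}{index}.csv"
            with target.open("w", newline="", encoding="utf-8") as dst:
                csv.writer(dst).writerows(chunk)
            written.append(target)
            index += 1
    return written
```

Calling split_csv("sample.csv") would write sample1.csv, sample2.csv, and so on next to the input file. The sketch parses rows with the csv module rather than counting raw lines, so a quoted field that contains a line break still counts as one row.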

Read "How to Call an SPL Script in Java" to learn how to integrate the script code into a Java program.
