A Guide to Parsing CSV Files in Go
Among all the programming languages available, Go, or Golang, has grown steadily in popularity over the past few years. In 2023, Go re-entered the top 10 of the most popular languages, according to InfoWorld. The language has gained traction because of its simplicity, its efficiency and its ability to compile directly to machine code, which brings speed benefits.
As someone new to Go, I wanted to dive deeper into the language and explore its potential. The best way to learn something new is by doing, so I started a project that not only hones my Go skills but also solves a common task: CSV parsing and data manipulation. In this blog post I illustrate how to parse a CSV file containing rows of articles, filter those articles on a property, and write the result to a new CSV file. The accompanying source code for this project can be found on GitHub.
Prerequisites
To get started with the project, you will need the following:
- Go version 1.22.1 or later
- IDE of choice (e.g. Visual Studio Code)
1. Initialize the project
The first step in creating our Go project is to set up a new directory called demo-go-csv-parser and navigate into it.
mkdir demo-go-csv-parser
cd demo-go-csv-parser
The next step is to initialize a Go module called demo-go-csv-parser:
go mod init demo-go-csv-parser
A go.mod file will be created inside your directory. This module file is used to organize and manage dependencies, similar to the package.json of the Node.js ecosystem.
2. Install the CSV dependency
The dependency that we're going to use is called gocsv. This package provides a variety of built-in functions for parsing CSVs using Go structs. To include the dependency in your project, run the following command:
go get github.com/gocarina/gocsv
3. Write the code
It's time to dive into the coding aspect of the project to get a taste of how the Go programming language works. To maintain clarity, we're going to break down the coding process into the following sections:
- Create main file
- CSV file setup
- Read file
- Filter articles
- Write file
By decomposing the whole project into bite-size chunks, we can tackle each part with more attention.
### Create main file
In the root folder of the demo-go-csv-parser project, create a file called main.go. Populate the file with the following:
package main

import (
	"github.com/gocarina/gocsv"
	"os"
)
The first line indicates the package name for the file. Every Go file needs to start with a package declaration. The second part of the file consists of the import block with two dependencies:
- gocsv: the external package that we've previously installed
- os: a standard library package that we'll use to open and create the CSV files.
CSV file
For this project, we're going to use a sample CSV file that you can find at this GitHub link. Feel free to download and include this sample file in your project. Most CSV files are structured and have a header row that names each column. In Go, we can map each CSV row into a custom data structure called a struct. Each field of the struct corresponds to one CSV column.
In our CSV file, the first two columns are named Title and URL. Mapping these two into a Go struct called Article would look like this:
type Article struct {
	Title string
	URL   string
}
The gocsv dependency supports the usage of tags to indicate what the column name is in the CSV file. This is a handy feature in cases where you would have spaces in the column name or if the column name deviates from the actual field name we would like to use in the Go struct.
Considering all the columns of our CSV file, we can add each of them with a csv tag to the final Article struct, which should look like this:
type Article struct {
	Title           string `csv:"Title"`
	URL             string `csv:"URL"`
	DocumentTags    string `csv:"Document tags"`
	SavedDate       string `csv:"Saved date"`
	ReadingProgress string `csv:"Reading progress"`
	Location        string `csv:"Location"`
	Seen            string `csv:"Seen"`
}
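To make the role of the tags more concrete, here is a small sketch that prints the column name attached to each field. It uses only the standard library's reflect package and a trimmed-down copy of the struct; csvColumns is a hypothetical helper written for this illustration, and nothing here claims to show how gocsv itself is implemented.

```go
package main

import (
	"fmt"
	"reflect"
)

// Article is a trimmed-down copy of the struct from the text.
type Article struct {
	Title        string `csv:"Title"`
	DocumentTags string `csv:"Document tags"`
	SavedDate    string `csv:"Saved date"`
}

// csvColumns is a hypothetical helper for this sketch: it returns the
// value of the csv tag of every field of a struct, in declaration order.
func csvColumns(v any) []string {
	t := reflect.TypeOf(v)
	cols := make([]string, 0, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		cols = append(cols, t.Field(i).Tag.Get("csv"))
	}
	return cols
}

func main() {
	// Print the CSV column name that each struct field is tagged with.
	for _, col := range csvColumns(Article{}) {
		fmt.Println(col)
	}
}
```

The field DocumentTags carries the tag "Document tags", which is how a column name containing a space can still be tied to a valid Go identifier.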
Read file
Now we get to the crux of the project. We need to be able to read a CSV file called example.csv located in the root directory. To achieve this, we're going to write a separate ReadCsv() function:
func ReadCsv() []*Article {
	// Try to open the example.csv file in read-write mode.
	csvFile, csvFileError := os.OpenFile("example.csv", os.O_RDWR, os.ModePerm)
	// If an error occurs during os.OpenFile, panic and halt execution.
	if csvFileError != nil {
		panic(csvFileError)
	}
	// Ensure the file is closed once the function returns.
	defer csvFile.Close()

	var articles []*Article
	// Parse the CSV data into the articles slice. If an error occurs, panic.
	if unmarshalError := gocsv.UnmarshalFile(csvFile, &articles); unmarshalError != nil {
		panic(unmarshalError)
	}
	return articles
}
The function ReadCsv can be broken down into the following parts:
- The function returns a slice of pointers to elements of type Article.
- We use os.OpenFile() to open example.csv in read-write mode (os.O_RDWR); read-only access would also be enough here, since the file is never written. The third argument, os.ModePerm, sets the permission bits for a newly created file. Because os.O_CREATE is not passed, no file is created and the value goes unused. os.OpenFile() returns both a file handle (csvFile) and an error (csvFileError).
- Immediately in the next step, we check for errors. nil is Go's equivalent of null. If csvFileError is not nil, we panic, which halts execution and reports the error.
- defer csvFile.Close() makes sure that the opened csvFile is always closed, regardless of when the function returns. This is best practice for file resource management.
- With the file open and error handling in place, we proceed to parse the CSV content. gocsv.UnmarshalFile() is given the file handle and a pointer to the articles slice. It reads the CSV rows and populates the slice with Article instances.
- If the parsing of csvFile completes without errors, the articles slice is returned.
### Filter articles
After successfully parsing the CSV file and storing its contents in the articles slice, the next step is to filter this slice. We only want to retain the articles whose location is set to inbox. We're going to create a function called GetInboxArticles to achieve this:
func GetInboxArticles(articles []*Article) []*Article {
	// Initialize an empty slice to store inbox articles
	var inboxArticles []*Article
	// Iterate through each article in the provided slice.
	for _, article := range articles {
		// Check if the article's Location is equal to inbox
		if article.Location == "inbox" {
			// If the article's location is inbox, add it to the inboxArticles slice
			inboxArticles = append(inboxArticles, article)
		}
	}
	return inboxArticles
}
Let's closely examine this function:
- This function accepts a slice of pointers to the Article struct and returns a slice of the same type.
- We create an empty slice called inboxArticles that will store the articles that meet the inbox criteria.
- We create a for loop that iterates through each element of the articles slice. If the Location field of the article is equal to inbox, we append this element to the inboxArticles slice.
- After the loop has finished, we return the slice inboxArticles.
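If you later need to filter on other properties, the same loop can be generalized. The sketch below is one possible approach, not part of the original project: Filter is a hypothetical generic helper (generics require Go 1.18 or later) that takes the condition as a function, and the sample articles are made up.

```go
package main

import "fmt"

// Article is trimmed to two fields to keep the sketch short.
type Article struct {
	Title    string
	Location string
}

// Filter is a hypothetical generic helper: it returns the items for
// which keep reports true, preserving their original order.
func Filter[T any](items []T, keep func(T) bool) []T {
	var kept []T
	for _, item := range items {
		if keep(item) {
			kept = append(kept, item)
		}
	}
	return kept
}

func main() {
	// Made-up sample data.
	articles := []*Article{
		{Title: "Go basics", Location: "inbox"},
		{Title: "Old post", Location: "archive"},
		{Title: "CSV tips", Location: "inbox"},
	}

	// The inbox condition from GetInboxArticles, passed in as a function.
	inbox := Filter(articles, func(a *Article) bool {
		return a.Location == "inbox"
	})
	for _, a := range inbox {
		fmt.Println(a.Title)
	}
}
```

For a single fixed condition, the dedicated GetInboxArticles function is arguably easier to read; the generic version starts to pay off once several filters share the same loop.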
Write file
Now that we've extracted the inbox articles, we want to persist this data into a new CSV file. Writing contents to a CSV file is similar to reading them, as in the previous steps. We're going to create a function WriteCsv that looks like this:
func WriteCsv(articles []*Article) {
	// Open result.csv for writing; create it if it doesn't exist, or overwrite it if it already exists.
	resultFile, resultFileError := os.OpenFile("result.csv", os.O_WRONLY|os.O_CREATE|os.O_TRUNC, os.ModePerm)
	// Check for errors when opening or creating the file. If there's an error, panic.
	if resultFileError != nil {
		panic(resultFileError)
	}
	defer resultFile.Close()

	// Marshal the articles into the CSV format and write them to the result.csv file
	if marshalFileError := gocsv.MarshalFile(&articles, resultFile); marshalFileError != nil {
		panic(marshalFileError)
	}
}
Let's go through WriteCsv step by step:
- We create a function WriteCsv that accepts a slice of Article pointers as its input argument.
- We use os.OpenFile() to create or open the file result.csv. The flags passed into the function ensure that the file is write-only (os.O_WRONLY), is created if it doesn't exist (os.O_CREATE), and is truncated, and therefore overwritten, if it already exists (os.O_TRUNC). Because the file may be created this time, os.ModePerm (0777, before the umask is applied) sets its permissions.
- After trying to open the file result.csv, we check whether the variable resultFileError contains an error. If it does, we halt execution with panic.
- For good I/O hygiene, we use defer to ensure that resultFile is closed whenever the function exits.
- Finally, we write the contents of articles to resultFile with gocsv.MarshalFile(). The MarshalFile function expects two arguments: a pointer to the slice, and the CSV file to which it should write the contents. If an error occurs during the marshaling process, the function panics.
### Putting it all together
We've written three helper functions: one for reading a CSV file, one for filtering articles and one for writing to a CSV file. We're going to combine all three in a main function like this:
func main() {
	articles := ReadCsv()
	filteredArticles := GetInboxArticles(articles)
	WriteCsv(filteredArticles)
}
Run 🚀
With all the Go code in place, it's time to run it! This can be done with the following command:
go run main.go
If done correctly, your project will have a new file named result.csv. Congratulations, you have just run your first Go project! 🎉
Takeaway
For an everyday task like processing a CSV file, Go's simplicity, efficiency and easy-to-learn syntax shine. That makes it easy for new learners to pick up this powerful language and its rich ecosystem of packages and utilities. Keeping an eye on new tools and languages like Go can expand your skill set and give you a different vantage point for thinking about and solving problems. Of course, the best tool for the job depends on your project's requirements. Perhaps you will consider Go for your next software project. Happy coding! 🧑‍💻