Parsing CSV Files with Golang


I love coming across problems that require me to learn something new. I have written a few posts about Go (or Golang) such as Using Recursion on Golang to Solve a Problem and Unit Testing Golang on App Engine. Go is one of my favorite utility languages.

Earlier this year, my wife decided to start her own hair salon and do booth rental (Pretty Hare Salon). We set up Square for appointment scheduling and credit card payments. It has nice reporting features but no forecasting. Forecasting helps us decide on pricing and when to increase or decrease marketing and advertising.

Parameters of the Problem

To work around this lack of forecasting, I found an export feature that exports appointments as CSV. Unfortunately, the export does not list the price of each appointment. At the time, I implemented something in Perl, but I later decided to rewrite it in Go. The application takes two CSV files (1 – appointments and 2 – pricing) and calculates weekly totals from them.

Helper Function

Error checking and reporting requirements are fairly minimal for this, so I use a basic helper function.

func CheckErr(err error) {
	if err != nil {
		log.Fatal(err)
	}
}

Loading CSVs

To load each CSV, I opted to use os.Open(), which takes the file path. I pass the paths in through command-line arguments using os.Args. os.Open returns a pointer to a file (*os.File) along with an error.

args := os.Args
appointmentsFile := args[1]
pricesFiles := args[2]
af, err := os.Open(appointmentsFile)
CheckErr(err)
pf, err := os.Open(pricesFiles)
CheckErr(err)
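Indexing os.Args directly panics when fewer than two paths are given on the command line. Here is a minimal sketch of one way to guard against that. The parseArgs and run helpers are hypothetical names I am using for illustration; they are not part of the original program.

```go
package main

import (
	"errors"
	"log"
	"os"
)

// parseArgs is a hypothetical helper. It expects the program name
// followed by exactly two paths: appointments first, prices second.
func parseArgs(args []string) (string, string, error) {
	if len(args) != 3 {
		return "", "", errors.New("usage: <program> <appointments.csv> <prices.csv>")
	}
	return args[1], args[2], nil
}

// run opens both files and closes them again when it returns.
func run(args []string) error {
	appointmentsFile, pricesFile, err := parseArgs(args)
	if err != nil {
		return err
	}
	af, err := os.Open(appointmentsFile)
	if err != nil {
		return err
	}
	defer af.Close()
	pf, err := os.Open(pricesFile)
	if err != nil {
		return err
	}
	defer pf.Close()
	// The CSV readers would be created from af and pf here.
	return nil
}

func main() {
	if err := run(os.Args); err != nil {
		log.Fatal(err)
	}
}
```

Returning errors from run and calling log.Fatal only in main means the deferred Close calls always get a chance to execute.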

I then wrap the files returned by these calls with csv.NewReader:

appointments := csv.NewReader(af)
prices := csv.NewReader(pf)

NewReader() expects an io.Reader, and *os.File implements io.Reader, so we are good.

A reader's ReadAll() method returns a two-dimensional slice ([][]string), with the first dimension being the record (normally one line of the file) and the second dimension being the field.

For the price records, I read the file in its entirety. It is not very long.
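Because NewReader() only needs an io.Reader, the same code works with in-memory data, which is handy for trying things out without a file. The sketch below uses strings.NewReader in place of the *os.File. The helper names and the sample row are made up for illustration.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"log"
	"strings"
)

// firstRecord reads a single record from any io.Reader that yields CSV text.
func firstRecord(r io.Reader) ([]string, error) {
	return csv.NewReader(r).Read()
}

// firstRecordFromString wraps firstRecord for in-memory data.
func firstRecordFromString(data string) ([]string, error) {
	return firstRecord(strings.NewReader(data))
}

func main() {
	// Made-up sample row; strings.NewReader stands in for the *os.File.
	rec, err := firstRecordFromString("Haircut,2019-06-01,accepted\n")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(len(rec), rec[2]) // prints "3 accepted"
}
```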

priceRecords, err = prices.ReadAll()
CheckErr(err)
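To make the shape of the ReadAll() result concrete, here is a small sketch that parses a made-up two-line price list and indexes it as records[line][field]. The loadAll helper and the sample data are assumptions for illustration; the real pricing file has more columns.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"log"
	"strings"
)

// loadAll parses every record in data into a [][]string.
func loadAll(data string) ([][]string, error) {
	return csv.NewReader(strings.NewReader(data)).ReadAll()
}

func main() {
	// Hypothetical price list, much shorter than the real file.
	records, err := loadAll("Haircut,$45.00\nColor,$90.00\n")
	if err != nil {
		log.Fatal(err)
	}
	// First index is the line, second index is the field.
	fmt.Println(len(records), records[1][0], records[1][1]) // prints "2 Color $90.00"
}
```

By default, the reader expects every record to have the same number of fields as the first one, so a ragged file produces an error instead of silently short rows.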

Iterating the CSV Line by Line

The appointments file is the full history, and we do not need all of the data in it. It can get very large over time, so it is best not to load it into memory completely. We also have logic to ignore anything older than two weeks, since this is a forecast and we do not care about historical data.

for {
	appointment, err := appointments.Read()
	if err == io.EOF {
		break
	}
	// Stop on any other read or parse error
	CheckErr(err)
	// Business logic in here
}
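The two-week cutoff mentioned earlier could live in the business-logic part of that loop. The sketch below is only an illustration: the date column and its format in the export are not shown in this post, so the dateLayout constant and the withinTwoWeeks helper are assumptions.

```go
package main

import (
	"fmt"
	"log"
	"time"
)

// dateLayout is a hypothetical format for the appointment date field.
// The real export may use a different one.
const dateLayout = "2006-01-02"

// withinTwoWeeks reports whether date falls no more than 14 days
// before today. Both arguments use dateLayout.
func withinTwoWeeks(date, today string) (bool, error) {
	d, err := time.Parse(dateLayout, date)
	if err != nil {
		return false, err
	}
	now, err := time.Parse(dateLayout, today)
	if err != nil {
		return false, err
	}
	cutoff := now.AddDate(0, 0, -14)
	return !d.Before(cutoff), nil
}

func main() {
	today := time.Now().Format(dateLayout)
	recent, err := withinTwoWeeks("2019-06-01", today)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(recent)
}
```

Future appointments pass the check as well, which is what a forecast needs.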

The appointment slice returned by Read() is one-dimensional, with one element per field. Something similar to the following lives inside the for loop shown previously.

if appointment[2] == "accepted" {
	// More business logic here
}
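Putting the loop and the status check together, a self-contained sketch might look like the following. It counts accepted appointments while holding only one record in memory at a time. The countAccepted helper and the sample rows are made up for illustration; only the status position (index 2) comes from the snippet above.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"log"
	"strings"
)

// statusColumn matches the index used in the article's snippet.
const statusColumn = 2

// countAccepted streams records from r one at a time and counts
// those whose status field is "accepted".
func countAccepted(r io.Reader) (int, error) {
	reader := csv.NewReader(r)
	count := 0
	for {
		record, err := reader.Read()
		if err == io.EOF {
			return count, nil
		}
		if err != nil {
			return count, err
		}
		if len(record) > statusColumn && record[statusColumn] == "accepted" {
			count++
		}
	}
}

// countAcceptedInString wraps countAccepted for in-memory data.
func countAcceptedInString(data string) (int, error) {
	return countAccepted(strings.NewReader(data))
}

func main() {
	// Made-up rows; the real export has more columns.
	sample := "Haircut,2019-06-01,accepted\n" +
		"Color,2019-06-02,cancelled\n" +
		"Haircut,2019-06-03,accepted\n"
	n, err := countAcceptedInString(sample)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(n) // prints "2"
}
```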

Looking Up Data

The pricing information, which we fully loaded into the two-dimensional slice, can be searched as follows:

for _, price := range priceRecords {
	if price[0] == service {
		return strconv.ParseFloat(price[8][1:], 64)
	}
}
return 0, errors.New("Could not find service - " + service)

The “service” variable holds the name of the service we are looking up. In each price record, position 0 is the service name and position 8 is the price.

The price field (8) starts with a dollar sign, so I used string slicing to drop the first character before parsing the float:

price[8][1:]

The pricing file is highly controlled so I have very little error checking here.
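If the pricing file were less controlled, or if lookups became frequent, one alternative is to index the records by service name once and trim the dollar sign defensively. This is a sketch of that idea, not what my program does. The parsePrice and buildPriceTable helpers are hypothetical, and the sketch assumes the file has no header row and at least nine fields per record.

```go
package main

import (
	"fmt"
	"log"
	"strconv"
	"strings"
)

// Column positions taken from the lookup snippet above.
const (
	serviceColumn = 0
	priceColumn   = 8
)

// parsePrice converts a field such as "$45.00" to a float64.
// TrimPrefix leaves the field untouched when there is no "$".
func parsePrice(field string) (float64, error) {
	return strconv.ParseFloat(strings.TrimPrefix(field, "$"), 64)
}

// buildPriceTable indexes price records by service name.
// Assumption: no header row, since a header would fail to parse.
func buildPriceTable(records [][]string) (map[string]float64, error) {
	table := make(map[string]float64, len(records))
	for _, rec := range records {
		if len(rec) <= priceColumn {
			return nil, fmt.Errorf("price record has %d fields, want at least %d", len(rec), priceColumn+1)
		}
		price, err := parsePrice(rec[priceColumn])
		if err != nil {
			return nil, fmt.Errorf("service %q: %w", rec[serviceColumn], err)
		}
		table[rec[serviceColumn]] = price
	}
	return table, nil
}

func main() {
	// Made-up record with nine fields; only columns 0 and 8 matter here.
	records := [][]string{{"Haircut", "", "", "", "", "", "", "", "$45.00"}}
	table, err := buildPriceTable(records)
	if err != nil {
		log.Fatal(err)
	}
	price, ok := table["Haircut"]
	fmt.Println(price, ok) // prints "45 true"
}
```

With the table built, each lookup is a single map access, and the ok value takes the place of the "could not find service" error.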

Final Words

In this article we discussed opening and iterating through CSV files using two methods. One involves fully loading the file into memory and the other involves iterating line by line. At this point you should have a good idea of which method to use and when.