Jacob Swanner Development Blog

NimbleCSV: Parsing into Elixir Maps

NimbleCSV is a great option for parsing CSV files in Elixir. By design, it only gives you the data for each row as a list; there’s no option for it to generate a map (using the headers as keys) for each row. When processing data from CSV files, I much prefer maps to lists, because then the order of the columns in the file doesn’t matter. Luckily, we can use Stream.transform/3 to generate a map for each row of data:

alias NimbleCSV.RFC4180, as: CSV

"path/to/file.csv"
|> File.stream!()
|> CSV.parse_stream(skip_headers: false)
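|> Stream.transform(nil, fn
  # First row: the accumulator is still nil, so this is the headers row.
  # Emit nothing and keep the headers as the new accumulator.
  headers, nil -> {[], headers}
  # Every other row: pair each header with its value and emit one map.
  row, headers -> {[Enum.zip(headers, row) |> Map.new()], headers}
end)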
|> Enum.to_list()

And that’s it! Quite a bit is happening in a few lines of code. For the initial accumulator, we need a value that can never be mistaken for a list of headers, so nil or some other atom is a great choice here. We match on that initial accumulator in the first clause of the function passed to Stream.transform/3 to know we’re on the headers row; we emit nothing for it and use the headers as the accumulator from then on. For the rest of the rows, since the order of the headers corresponds to the order of the values in each row, we zip the headers and row values together into a list of {key, value} tuples and then turn that list into a map. The result is an Enumerable of maps that we can use for processing the data.
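The same steps can be sketched without NimbleCSV at all, using a plain list of rows in place of the parser output. The module name RowMaps and the function from_rows/1 are hypothetical, introduced here only for illustration:

```elixir
# Hypothetical helper module; the names are illustrative, not part of NimbleCSV.
defmodule RowMaps do
  # Takes any enumerable of rows (lists of strings) whose first element is
  # the headers row, and lazily emits one map per remaining row.
  def from_rows(rows) do
    Stream.transform(rows, nil, fn
      # Accumulator is still nil: this is the headers row, emit nothing.
      headers, nil -> {[], headers}
      # Any later row: pair each header with its value and build a map.
      row, headers -> {[headers |> Enum.zip(row) |> Map.new()], headers}
    end)
  end
end

# The zip step on its own produces {key, value} tuples:
Enum.zip(["name", "age"], ["Alex", "21"])
# [{"name", "Alex"}, {"age", "21"}]

rows = [["name", "age"], ["Alex", "21"], ["Billie", "8"]]

rows |> RowMaps.from_rows() |> Enum.to_list()
# [%{"age" => "21", "name" => "Alex"}, %{"age" => "8", "name" => "Billie"}]
```

One thing to keep in mind: Enum.zip/2 stops at the shorter of its two inputs, so a row with fewer values than there are headers produces a map that is simply missing the trailing keys.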

To see it in action, I’ll use a String instead of a File:

alias NimbleCSV.RFC4180, as: CSV

"""
name,age
Alex,21
Billie,8
Charlie,32
"""
|> CSV.parse_string(skip_headers: false)
|> Stream.transform(nil, fn
  headers, nil -> {[], headers}
  row, headers -> {[Enum.zip(headers, row) |> Map.new()], headers}
end)
|> Enum.to_list()
# [
#   %{"age" => "21", "name" => "Alex"},
#   %{"age" => "8", "name" => "Billie"},
#   %{"age" => "32", "name" => "Charlie"}
# ]
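The point about column order can be checked with a small sketch. It uses plain lists of rows rather than NimbleCSV, and an eager head/tail split instead of Stream.transform/3, purely to keep the example short; the to_maps function is a hypothetical helper:

```elixir
# Hypothetical helper: split off the headers row eagerly, then build one
# map per remaining row.
to_maps = fn [headers | data] ->
  Enum.map(data, fn row -> headers |> Enum.zip(row) |> Map.new() end)
end

# The same record, with the columns in a different order in each "file".
name_first = to_maps.([["name", "age"], ["Alex", "21"]])
age_first = to_maps.([["age", "name"], ["21", "Alex"]])

name_first == age_first
# true
```

With lists, the two rows would be ["Alex", "21"] and ["21", "Alex"], and any code indexing into them would have to know which file it came from.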

A similar approach can be taken with Enum.flat_map_reduce/3, if you really want to:

alias NimbleCSV.RFC4180, as: CSV

"""
name,age
Alex,21
Billie,8
Charlie,32
"""
|> CSV.parse_string(skip_headers: false)
|> Enum.flat_map_reduce(nil, fn
  headers, nil -> {[], headers}
  row, headers -> {[Enum.zip(headers, row) |> Map.new()], headers}
end)
|> elem(0)
# [
#   %{"age" => "21", "name" => "Alex"},
#   %{"age" => "8", "name" => "Billie"},
#   %{"age" => "32", "name" => "Charlie"}
# ]

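Enum.flat_map_reduce/3 returns a {list, accumulator} tuple, so the call to Kernel.elem/2 at the end is there to grab the resulting list and ignore the accumulator. Unlike the Stream.transform/3 version, this one is eager: every row is processed before anything is returned.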
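To make the shape of that return value concrete, here is a minimal sketch that destructures the tuple instead of calling elem/2. It uses a plain list of rows in place of parsed CSV, and the variable names are only illustrative:

```elixir
rows = [["name", "age"], ["Alex", "21"]]

# flat_map_reduce returns both the emitted elements and the final accumulator.
{maps, final_acc} =
  Enum.flat_map_reduce(rows, nil, fn
    headers, nil -> {[], headers}
    row, headers -> {[headers |> Enum.zip(row) |> Map.new()], headers}
  end)

maps
# [%{"age" => "21", "name" => "Alex"}]

final_acc
# ["name", "age"]
```

Pattern matching on the tuple does the same job as elem(0), and it also makes the headers available afterwards if you need them.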