Dealing With .csv File Data By Name
Posted: August 23, 2012
I am grateful when I get a moment of productive programming; I wonder when the boom is going to be lowered whenever more than a day of productive programming comes my way; and I check my pulse if there is more than a week of it. That happened recently, when I needed to rework a Clojure program.
Recently, my Clojure and Python programming has been taking a decidedly more functional turn. That is, I write code that relies less on constructs carried over from C or Java. In this case, that meant moving from row positional dependence to referencing data by column name.
My work deals in those universal data transfer standards, .csv and XML. The reworked Clojure program in question is an application that compares two insurance reports. The information is in .csv format, and each report comes complete with its own first row of column headers.
My original approach to this problem was positional: after using clojure-csv to create a sequence of vectors, I extracted data from each report row using nth and column indexes defined as defs.
The code looked confusing and was hard to follow, so I decided to create a sequence of maps instead, using the saved column headers transformed into map keys.
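To make the positional style concrete, here is a minimal sketch of what it might have looked like. The column names, index values, and sample rows are all hypothetical, since the post does not show the real report layout; the rows stand in for the sequence of vectors of strings that the CSV parser is assumed to return.

```clojure
;; Hypothetical column positions, defined as defs.
(def policy-number-idx 0)
(def insured-name-idx 1)
(def premium-idx 2)

;; Stand-in for parsed CSV output: header row first, then data rows.
(def sample-rows
  [["Policy Number" "Insured Name" "Premium"]
   ["A-100" "Smith" "250.00"]
   ["A-101" "Jones" "310.50"]])

(defn row-premium
  "Pulls the premium out of a row by its position."
  [row]
  (nth row premium-idx))

;; Skip the header row, then extract by index.
(def premiums (map row-premium (rest sample-rows)))
```

If a column is added or moved in the report, every index def has to be revisited, which is the fragility that referencing data by name avoids.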
;; cstr is assumed to be an alias for clojure.string.
(require '[clojure.string :as cstr])

(defn gen-map-keys
  "Takes a sequence, and turns it into viable keys for a map.
  We opt to change spaces ' ' to dashes '-'."
  [in-seq]
  (map (fn [element]
         (keyword (cstr/replace element " " "-")))
       (map #(cstr/trim %1) in-seq)))
Then each column header is zipped with a .csv row:
(defn gen-mapped-sos
  "Takes a sequence, from a sequence of sequences, and column keys,
  zipmaps them, and returns the map. The column keys must be sanitized,
  and already run through the key making process, before being submitted."
  [col-keys row]
  (zipmap col-keys row))
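As a rough sketch of how the two functions might be combined, the example below treats the first row as headers and maps every remaining row. The helper name rows->maps and the sample data are hypothetical, and the two functions are restated compactly only so the sketch runs on its own.

```clojure
(require '[clojure.string :as cstr])

;; Compact restatements of the two functions from the post.
(defn gen-map-keys [in-seq]
  (map #(keyword (cstr/replace (cstr/trim %) " " "-")) in-seq))

(defn gen-mapped-sos [col-keys row]
  (zipmap col-keys row))

;; Stand-in for parsed CSV output; note the stray spaces in the headers.
(def sample-rows
  [[" Policy Number" "Insured Name " "Premium"]
   ["A-100" "Smith" "250.00"]
   ["A-101" "Jones" "310.50"]])

(defn rows->maps
  "Hypothetical helper: builds keys from the header row once,
  then zips them with each data row."
  [rows]
  (let [col-keys (gen-map-keys (first rows))]
    (map #(gen-mapped-sos col-keys %) (rest rows))))

(def report (rows->maps sample-rows))

;; Data can now be read by name rather than by position.
(def first-premium (:Premium (first report)))
```

Computing the keys once outside the map call keeps the header cleanup from being repeated for every row.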
By doing things this way, the functions became more generic and easier to understand. This might have been because data comparison was now taking place by map key, because the software was reworked, or a little of both, but the results were good.
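The post does not show the comparison code itself, so the following is only one way a comparison by map key could look. The function names index-by and mismatches, the key names, and the sample data are all assumptions made for illustration.

```clojure
;; Two hypothetical reports, already converted to sequences of maps.
(def report-a
  [{:Policy-Number "A-100" :Premium "250.00"}
   {:Policy-Number "A-101" :Premium "310.50"}])

(def report-b
  [{:Policy-Number "A-100" :Premium "250.00"}
   {:Policy-Number "A-101" :Premium "315.00"}])

(defn index-by
  "Builds a lookup map from (k row) to row."
  [k rows]
  (into {} (map (juxt k identity) rows)))

(defn mismatches
  "Returns the id-key values whose cmp-key value differs between
  the two reports. Rows missing from rows-b are skipped."
  [id-key cmp-key rows-a rows-b]
  (let [b-by-id (index-by id-key rows-b)]
    (for [row-a rows-a
          :let [row-b (get b-by-id (id-key row-a))]
          :when (and row-b
                     (not= (cmp-key row-a) (cmp-key row-b)))]
      (id-key row-a))))

(def changed (mismatches :Policy-Number :Premium report-a report-b))
```

Because the keys are passed in as arguments, the same function can compare any column in any pair of reports, which is the kind of generality that positional indexes made awkward.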