Obtaining, Scrubbing, and Exploring Data at the Command Line
Yesterday I went to a meetup called "Obtaining, Scrubbing, and Exploring Data at the Command Line".
It covered a mix of old and new things: the standard Unix command-line tools (cat, awk, grep, sed, less, head, tail, etc.) and a set of newer tools, including csvkit (a bunch of small command-line tools for CSV), jq (a tool for handling JSON on the command line), xml2json, json2csv, and others.
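To give a feel for the obtain, scrub, explore flow using only the standard tools, here is a minimal sketch. It is not taken from the talk: the sample data and the function names `sample_csv` and `rows_per_city` are made up for illustration, and in practice the "obtain" step would read a real file or download instead of an inline sample.

```shell
# Obtain: a hypothetical inline CSV stands in for a real file or download.
sample_csv() {
  cat <<'EOF'
name,city,age
Alice,Amsterdam,34
Bob,Berlin,27
Carol,Amsterdam,45
Dave,Berlin,
EOF
}

# Scrub and explore: count the complete rows per city.
rows_per_city() {
  sample_csv |
    tail -n +2 |      # scrub: drop the header line
    grep -v ',$' |    # scrub: drop rows whose last field (age) is empty
    awk -F',' '{ n[$2]++ } END { for (c in n) printf "%s,%d\n", c, n[c] }' |
    sort              # explore: one "city,count" line per city, sorted
}

rows_per_city
```

With the sample above this prints `Amsterdam,2` and `Berlin,1`, because Dave's row has no age and is dropped. This naive comma splitting breaks on quoted fields, which is the kind of case a CSV-aware tool such as csvkit is meant for.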
The speaker also showed some of his own tools, ranging from something big, like being able to call R from the command line, to small bash functions that use http://explainshell.com to print out what a shell command does.
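I don't have the speaker's function, but a rough sketch of the idea could look like the following. It only builds a link to open in a browser and does not fetch or print the explanation. The `/explain?cmd=` query format is my assumption about how the site is addressed, and the names `urlencode` and `explain_url` are hypothetical.

```shell
# Percent-encode every byte of the argument. This is verbose, but it needs
# no special cases for spaces, pipes, quotes, and so on.
urlencode() {
  printf '%s' "$1" | od -An -tx1 -v | tr -d ' \n' | sed 's/../%&/g'
}

# Build an explainshell link for a command line.
# Assumption: the site accepts the command in a "cmd" query parameter.
explain_url() {
  printf 'http://explainshell.com/explain?cmd=%s\n' "$(urlencode "$1")"
}

explain_url 'ls -l'
```

For `ls -l` this prints `http://explainshell.com/explain?cmd=%6c%73%20%2d%6c`.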
He talked about creating "data science toolkits" and gave some examples of packaging and creating environments that can be easily moved around. Here is his example.
Here are the slides from the talk.