Eric Radman : a Journal

Exploring JSON on the Command Line

With Your Favorite Scripting Language

JSON has become the standard data interchange format of our time, and like XML it is not easily interpreted at the command line. For this reason I have often opened up a Ruby or Python interpreter and used the REPL to manually explore a structure in order to create a one-liner capable of parsing it:

>>> import json
>>> j = json.loads(open('github-tag.json').read())
>>> j.keys()
[u'forced', u'compare', u'pusher', u'sender', u'repository', u'created',
u'deleted', u'commits', u'after', u'head_commit', u'ref', u'base_ref',
u'before']

If you want to browse around in your favorite editor instead, you can get Python to pretty-print it:

$ python -mjson.tool < github-tag.json > github-tag-pretty.json
$ vim github-tag-pretty.json

If you know the path of keys and indexes that you want to pull out of a JSON file, your favorite scripting language has a way of extracting the data in one line:

$ ruby22 -r json -e 'puts JSON.parse(ARGF.read)["pusher"]' github-tag.json
{"email"=>"ericshane@eradman.com", "name"=>"eradman"}
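The same idea can be sketched in Python as a small helper that follows a list of keys and indexes. The name extract and the sample document are hypothetical, made up only to illustrate the lookup; the sample is merely shaped like a fragment of a webhook payload.

```python
import json
from functools import reduce

def extract(doc, path):
    """Follow a sequence of keys (for objects) and integer indexes (for arrays)."""
    return reduce(lambda node, step: node[step], path, doc)

# Hypothetical sample data, not the real github-tag.json
sample = json.loads('{"pusher": {"name": "eradman"}, "commits": [{"id": "abc"}]}')

print(extract(sample, ["pusher", "name"]))
print(extract(sample, ["commits", 0, "id"]))
```

A missing key or an out-of-range index raises KeyError or IndexError, which is usually what you want while exploring an unfamiliar structure.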

sed for JSON data

Another tool, jq, was created expressly for the purpose of slicing and transforming JSON structures. Like sed, jq comes with its own domain-specific language, which provides innumerable ways of producing surprising results.

Flattened JSON

None of the aforementioned methods provide an easy way of exploring the layout of a complex JSON file. One solution is to separate formatting from parsing. flattenjs does this by formatting its output using the PostgreSQL path selection syntax, which easily navigates hashes and lists:

  SELECT my_json#>>'{push,changes,0,new,target,message}' FROM ... ;

flattenjs transforms an input to this structure, which can be pasted into a PostgreSQL query directly or processed with awk and grep:

$ flattenjs < github-tag.json | grep tag0001
{compare} https://github.com/eradman/hook-test/compare/tag0001
{ref} refs/tags/tag0001
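To make the format concrete, here is a minimal Python sketch of the flattening idea. It is not the flattenjs implementation: the function names, the sample document, and the choice to render non-string scalars as JSON are all assumptions made for illustration.

```python
import json

def flatten(node, path=()):
    """Yield (path, value) pairs for every leaf of a parsed JSON document."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from flatten(child, path + (str(key),))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from flatten(child, path + (str(index),))
    else:
        yield path, node

def format_line(path, value):
    # Assumption: strings are printed as-is, other scalars as JSON (true, null, 1.5)
    text = value if isinstance(value, str) else json.dumps(value)
    return "{%s} %s" % (",".join(path), text)

# Hypothetical sample data, not the real github-tag.json
sample = json.loads(
    '{"ref": "refs/tags/tag0001",'
    ' "head_commit": {"id": "1882603"},'
    ' "commits": [{"distinct": true}]}'
)

for path, value in flatten(sample):
    print(format_line(path, value))
```

Each leaf becomes one line, with object keys and array indexes joined by commas inside the braces. Empty objects and arrays produce no output in this sketch.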

Because of the line-oriented syntax, pulling values out is trivial, and we can use our favorite command line tool to filter by key and select a value:

$ flattenjs < tests/github-tag.json | awk '/{head_commit,id}/ { print $NF }'
188260313028f69fb427ba31de714a924b16f951
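One thing to keep in mind is that $NF in awk prints only the last whitespace-separated field, so a value containing spaces would be cut short. A rough Python equivalent that keeps the whole value might look like the following; the select helper and the message line are hypothetical, and input lines are assumed to have their trailing newline already stripped.

```python
def select(lines, path):
    """Return the value of every flattened line whose path matches exactly."""
    prefix = "{" + path + "} "
    return [line[len(prefix):] for line in lines if line.startswith(prefix)]

flattened = [
    "{ref} refs/tags/tag0001",
    "{head_commit,id} 188260313028f69fb427ba31de714a924b16f951",
    # Hypothetical line, added to show a value that contains spaces
    "{head_commit,message} a hypothetical message with spaces",
]

print(select(flattened, "head_commit,id"))
print(select(flattened, "head_commit,message"))
```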

Last updated on June 03, 2016