iJSON: Parse Streams of JSON in Python
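JSON is a great alternative to XML, but it has a weakness: very large documents. Most JSON parsers, including Python's standard json module, read an entire document into memory before returning anything, so they cannot process data as a stream.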
Luckily, Python now has a solution: Ivan Sagalaev developed iJSON, a library for SAX-style (event-driven) parsing of JSON.
iJSON lets you consume the incoming data stream as a standard Python iterator, so objects are handed to you as they are parsed.
Simple example, filtering objects out of a stream:
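from urllib.request import urlopen

from ijson import items

f = urlopen('http://.../')
# Yield each element of the earth.europe array as soon as it is parsed
objects = items(f, 'earth.europe.item')
cities = (o for o in objects if o['type'] == 'city')
for city in cities:
    do_something_with(city)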
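Because `items` returns an ordinary iterator, the usual lazy tools apply to it. The sketch below uses a plain generator in place of the `objects` iterator to show that taking the first few cities stops reading as soon as enough have been found. The `fake_objects` generator, its sample records, and the `first_cities` helper are all made up for illustration and are not part of ijson.

```python
from itertools import islice

consumed = []  # records which objects were actually pulled from the source


def fake_objects():
    """Hypothetical stand-in for items(f, 'earth.europe.item')."""
    sample = [
        {'name': 'Paris', 'type': 'city'},
        {'name': 'Thames', 'type': 'river'},
        {'name': 'Berlin', 'type': 'city'},
        {'name': 'Alps', 'type': 'mountains'},
        {'name': 'Rome', 'type': 'city'},
    ]
    for obj in sample:
        consumed.append(obj['name'])
        yield obj


def first_cities(objects, limit):
    """Return at most `limit` city objects, reading no further than needed."""
    cities = (o for o in objects if o['type'] == 'city')
    return list(islice(cities, limit))


result = first_cities(fake_objects(), 2)
print([c['name'] for c in result])
print(consumed)  # only the objects up to the second city were read
```

With a real stream you would presumably pass `items(f, 'earth.europe.item')` instead of `fake_objects()`, and the rest of the response would not need to be parsed.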
Big-data example, converting the event stream to XML on the fly:
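import sys
from urllib.request import urlopen

from ijson import parse

f = urlopen('http://.../')
stream = sys.stdout  # assumption: any writable text stream works here
parser = parse(f)
continent = None  # set once the first key under "earth" is seen
stream.write('<geo>')
for prefix, event, value in parser:
    if (prefix, event) == ('earth', 'map_key'):
        # A key directly under "earth" names a continent: open its tag
        stream.write('<%s>' % value)
        continent = value
    elif prefix.endswith('.name'):
        stream.write('<object name="%s"/>' % value)
    elif (prefix, event) == ('earth.%s' % continent, 'end_map'):
        # The continent's object has ended: close its tag
        stream.write('</%s>' % continent)
stream.write('</geo>')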
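To see what the loop above does without a network connection, here is a minimal sketch that wraps the same logic in a function and feeds it hand-written `(prefix, event, value)` triples. The `events_to_xml` name and the sample document are hypothetical, and the triples are written by hand in the shape the loop expects rather than captured from a real parse. `continent` starts as `None` so the final comparison is safe before any continent key has been seen.

```python
import io


def events_to_xml(events, stream):
    """Write a small XML outline from (prefix, event, value) triples."""
    continent = None  # no continent seen yet
    stream.write('<geo>')
    for prefix, event, value in events:
        if (prefix, event) == ('earth', 'map_key'):
            stream.write('<%s>' % value)
            continent = value
        elif prefix.endswith('.name'):
            stream.write('<object name="%s"/>' % value)
        elif (prefix, event) == ('earth.%s' % continent, 'end_map'):
            stream.write('</%s>' % continent)
    stream.write('</geo>')


# Hand-written events for the hypothetical document
# {"earth": {"europe": {"paris": {"name": "Paris"}}}}
sample_events = [
    ('', 'start_map', None),
    ('', 'map_key', 'earth'),
    ('earth', 'start_map', None),
    ('earth', 'map_key', 'europe'),
    ('earth.europe', 'start_map', None),
    ('earth.europe', 'map_key', 'paris'),
    ('earth.europe.paris', 'start_map', None),
    ('earth.europe.paris', 'map_key', 'name'),
    ('earth.europe.paris.name', 'string', 'Paris'),
    ('earth.europe.paris', 'end_map', None),
    ('earth.europe', 'end_map', None),
    ('earth', 'end_map', None),
    ('', 'end_map', None),
]

out = io.StringIO()
events_to_xml(sample_events, out)
print(out.getvalue())
```

Since the function accepts any iterable of triples, the iterator returned by `parse(f)` could be passed in place of `sample_events`. Note that the closing tag is only written if each continent is a JSON object. If the continents are arrays, the matching event would be `end_array` instead.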
To get started:
$ pip install ijson