[Solved] Converting a massive JSON file to CSV


A 48 MB JSON file is pretty small. You should be able to load the data into memory using something like this:

import json

# Read the whole file and parse it into Python objects
with open('data.json') as data_file:
    data = json.load(data_file)

Depending on how you wrote the JSON file, data may be a list containing many dictionaries. Try running:

type(data)

If the type is a list, then iterate over each element and inspect it. For instance:

for row in data:
    print(type(row))
    # print(row.keys())  # uncomment to inspect the keys of each dict

If row is a dict instance, inspect its keys and, within the loop, build up what each row of the CSV should contain. To write the output you can use pandas, the csv module, or just open a file and write comma-separated lines yourself.

So maybe something like:

import json

with open('data.json') as data_file:
    data = json.load(data_file)

with open('some_file.txt', 'w') as f:
    for row in data:
        # Pull out the fields you care about (adjust the keys to match your data)
        user = row['username']
        text = row['tweet_text']
        created = row['timestamp']
        # Join the fields with commas and terminate each line with a newline
        f.write(",".join([user, text, created]) + "\n")

You may still run into issues with Unicode characters, commas within your data, etc., but this is a general guide.
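If commas or quotes inside the data are a concern, the csv module handles the quoting for you. Here is a minimal sketch along the same lines; the field names are the hypothetical keys from the example above, so change them to fit your data:

import csv
import json

with open('data.json') as data_file:
    data = json.load(data_file)

# Hypothetical keys from the example above; adjust to match your data
fieldnames = ['username', 'tweet_text', 'timestamp']

with open('some_file.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction='ignore')
    writer.writeheader()
    # DictWriter quotes any field that contains a comma or quote character
    for row in data:
        writer.writerow(row)

And if every dict has the same keys, pandas should be able to do the whole thing in two lines: build a DataFrame with pd.DataFrame(data), then call .to_csv('some_file.csv', index=False) on it.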
