Skip to content Skip to sidebar Skip to footer

Write Json Format Using Pandas Series And Dataframe

I'm working with csvfiles. My goal is to write a json format with csvfile information. Especifically, I want to get a similar format as miserables.json Example: {'source': 'Napoleo

Solution 1:

Consider removing the Series() around the scalar value, country. By doing so and then upsizing the dictionaries of series into a dataframe, you force NaN (later converted to null in json) into the series to match the lengths of other series. You can see this by printing out the dfForce dataframe:

from pandas import Series
from pandas import DataFrame

country = 'Germany'    
sourceTemp = ['Mexico', 'USA', 'Argentina']
value = [1, 2, 3]

forceData = {'source': Series(country),
             'target': Series(sourceTemp),
             'value': Series(value)}
dfForce = DataFrame(forceData)

#     source     target  value# 0  Germany     Mexico      1# 1      NaN        USA      2# 2      NaN  Argentina      3

To resolve, simply keep country as scalar in dictionary of series:

forceData = {'source': country,
             'target': Series(sourceTemp),
             'value': Series(value)}
dfForce = DataFrame(forceData)

#     source     target  value# 0  Germany     Mexico      1# 1  Germany        USA      2# 2  Germany  Argentina      3

By the way, you do not need a dataframe object to output to json. Simply use a list of dictionaries. Consider the following using an Ordered Dictionary collection (to maintain the order of keys). In this way the growing list dumps into a text file without appending which would render an invalid json as opposite facing adjacent square brackets ...][... are not allowed.

from collections import OrderedDict
...

data = []

for element in newcountries:
    bills = csvdata['target'][csvdata['country'] == element]
    frquency = Counter(bills)

    for k,v in frquency.items():
        inner = OrderedDict()
        inner['source']  = element
        inner['target'] = k
        inner['value'] = int(v)

        data.append(inner)

newData = json.dumps(data, indent=4)

with open('data.json', 'w') as savetxt:
    savetxt.write(newData)

Post a Comment for "Write Json Format Using Pandas Series And Dataframe"