I convert AWS JSON data to CSV like this

For projects using AWS, there are quite a few times when you want to document a list of AMIs.

In such a case, there are times when you want to get the data in JSON with the AWS official CLI, convert it to CSV, and then compile it into a document. Tabular data, which is common in documents, is very incompatible with JSON, which is a problem.

In such a case, do this. For example, suppose you can get the JSON of the AMI list like this.

`images.json`


{
    "Images": [
        {
            "VirtualizationType": "hvm", 
            "Name": "hogehoge_1", 
            "Hypervisor": "xen", 
            "SriovNetSupport": "simple", 
            "ImageId": "ami-99999991", 
            "State": "available", 
            "BlockDeviceMappings": [
                {
                    "DeviceName": "/dev/xvda", 
                    "Ebs": {
                        "DeleteOnTermination": true, 
                        "SnapshotId": "snap-9999999d", 
                        "VolumeSize": 100, 
                        "VolumeType": "standard", 
                        "Encrypted": false
                    }
                }
            ], 
            "Architecture": "x86_64", 
            "ImageLocation": "999999999993/hogehoge1", 
            "RootDeviceType": "ebs", 
            "OwnerId": "999999999999", 
            "RootDeviceName": "/dev/xvda", 
            "CreationDate": "2014-12-17T06:35:39.000Z", 
            "Public": false, 
            "ImageType": "machine", 
            "Description": null
        }, 
        {
            "VirtualizationType": "hvm", 
            "Name": "hogehoge_2", 
            "Hypervisor": "xen", 
            "SriovNetSupport": "simple", 
            "ImageId": "ami-99999991", 
            "State": "available", 
            "BlockDeviceMappings": [
                {
                    "DeviceName": "/dev/xvda", 
                    "Ebs": {
                        "DeleteOnTermination": true, 
                        "SnapshotId": "snap-9999999d", 
                        "VolumeSize": 100, 
                        "VolumeType": "standard", 
                        "Encrypted": false
                    }
                }
            ], 
            "Architecture": "x86_64", 
            "ImageLocation": "999999999993/hogehoge1", 
            "RootDeviceType": "ebs", 
            "OwnerId": "999999999999", 
            "RootDeviceName": "/dev/xvda", 
            "CreationDate": "2014-12-17T06:35:39.000Z", 
            "Public": false, 
            "ImageType": "machine", 
            "Description": null
        }, 
    ]
}

Change this json to CSV using python. I made it improvised, so I think there are various rushes, but I'm sorry. I'm really sorry.

`sample.py`


# coding:UTF-8
import json
import codecs

#Variable definition. Kitto
sourcefilename = "images.json"
outfilename = "outfile.csv"
targetDataName = "Images"
#Function definition
def list2str(srclist,startStr,endStr):
    #If it's not a list, don't convert it.
    if not (isinstance(srclist,list)):
        return srclist
    #Change list to string
    resultStr = ""
    resultStr += startStr
    for item in srclist:
        tmp = ""
        if isinstance(item,str):
            tmp += item +","
        elif isinstance(item,list):
            tmp += list2str(item,"","") + ","
        else:
            tmp += str(item) + ","
        resultStr += tmp
    resultStr += endStr
    return resultStr

#File reading
sourceFile = open(sourcefilename,"r")
sourceData = json.load(sourceFile)
sourceFile.close()
#Get the desired data from json data
targetData = sourceData.get(targetDataName)

#Get header
headerSet = set()
for row in targetData:
    colList = row.keys()
    for colName in colList:
        headerSet.add(colName)
headerstr = ""
for headerName in headerSet:
    headerstr += headerName + ","
#Header acquisition completed

#Data acquisition
datalist = list()
for row in targetData:
    rowstr = ""
    for colName in headerSet:
        if isinstance(row.get(colName),list):
            rowstr += list2str(row.get(colName),"\"","\"") + ","
        elif isinstance(row.get(colName),bool):
            tmpStr = str(row.get(colName))
            rowstr += tmpStr + ","
        elif isinstance(row.get(colName),unicode):
            rowstr += row.get(colName) + ","
        elif isinstance(row.get(colName),type(None)):
            rowstr += "None,"
    datalist.append(rowstr)

#Open the burn file
outfile = codecs.open(outfilename,"w", "shift_jis")
outfile.write(headerstr + "\n")
for rowstr in datalist:
    outfile.write(rowstr + "\n")
outfile.close()

Now you can create CSV data. Once converted to CSV data, it can be opened in Excel and edited in various ways.

I didn't know Python before, so I used to make tools in Java, but Python is very convenient. The AWS CLI requires Python, so there's no need to add new Python. I think I'll study Python in earnest.

I hope you find it useful.