How to make a dictionary with a hierarchical structure.

Can it really be implemented in 5 seconds?

A junior colleague asked me, "I want to create a dictionary with a hierarchical structure in Python." My first thought was, of course, "That can be done in 5 seconds." In the end, though, it took me about 30 minutes to come up with an answer and explain it. I will probably have forgotten how to do it the next time someone asks me to implement it, so I am keeping it here as a memo.

To begin with, my junior's question was not simply how to make a two-level or three-level nested dictionary. He had input like the following.

`example.csv`


A1,B1,C1,3
A1,B1,C2,1
A1,B1,C3,5
A1,B2,C1,4
A1,B2,C2,3
A1,B2,C3,1
A1,B3,C1,3
A1,B3,C2,2
A1,B3,C3,5
A2,B1,C1,3
A2,B1,C2,5
A2,B1,C3,3
A2,B2,C1,2
A2,B2,C2,1
A2,B2,C3,3
A2,B3,C1,4
A2,B3,C2,4
A2,B3,C3,5

He wanted to automatically create the following hierarchical dictionary.

{'A1': {'B1': {'C1': '3',
               'C2': '1',
               'C3': '5'},
        'B2': {'C1': '4',
               'C2': '3',
               'C3': '1'},
        'B3': {'C1': '3',
               'C2': '2',
               'C3': '5'}},
 'A2': {'B1': {'C1': '3',
               'C2': '5',
               'C3': '3'},
        'B2': {'C1': '2',
               'C2': '1',
               'C3': '3'},
        'B3': {'C1': '4',
               'C2': '4',
               'C3': '5'}}}

You can't do it in five seconds

So this turned out to be harder than I expected. The first thing I came up with was using defaultdict.

import collections
hoge = collections.defaultdict(lambda: collections.defaultdict(lambda: collections.defaultdict(int)))

A lambda expression like this gives you a triple-nested dict. However, this approach is not ideal. First, you need to know the number of levels in advance, and since the number of levels may differ from element to element, it is not very general. Furthermore, a defaultdict whose default factory is a lambda expression is difficult to pickle. So I came up with a different method, the make_tree_dict function.
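The two limitations of the defaultdict one-liner can be checked with a short sketch. The rows here are made-up values in the same shape as `example.csv`, and the names `depth_error` and `pickled_ok` are only for illustration.

```python
import collections
import pickle

# Three-level defaultdict, same shape as the one-liner above.
hoge = collections.defaultdict(
    lambda: collections.defaultdict(lambda: collections.defaultdict(int))
)

# Hypothetical rows: three keys followed by a value.
rows = [["A1", "B1", "C1", "3"], ["A1", "B1", "C2", "1"]]
for a, b, c, value in rows:
    hoge[a][b][c] = int(value)

print(hoge["A1"]["B1"]["C1"])  # 3

# Limitation 1: the depth is fixed. The innermost level holds ints,
# so a row with a fourth key cannot be stored.
try:
    hoge["A1"]["B1"]["C3"]["D1"] = 5
    depth_error = False
except TypeError:
    depth_error = True

# Limitation 2: the lambda default factory cannot be pickled.
try:
    pickle.dumps(hoge)
    pickled_ok = True
except (pickle.PicklingError, AttributeError, TypeError):
    pickled_ok = False

print(depth_error, pickled_ok)  # True False
```

The make_tree_dict script avoids both problems because it only ever creates plain dicts.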

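import pprint


def make_tree_dict(inputs):
    """Build a nested dict from rows of the form [key1, ..., keyN, value]."""
    tree_dict = {}
    for ainput in inputs:
        # pre_dict always refers to the dict at the current depth of tree_dict.
        pre_dict = tree_dict
        for j, key in enumerate(ainput):
            if j == len(ainput) - 2:
                # The second-to-last element is the final key,
                # and the last element is its value.
                pre_dict[key] = ainput[-1]
                break
            if key not in pre_dict:
                pre_dict[key] = {}
            # Descend one level. This rebinds the name pre_dict only;
            # the dicts already stored inside tree_dict stay in place.
            pre_dict = pre_dict[key]
    return tree_dict


if __name__ == "__main__":
    pp = pprint.PrettyPrinter(width=10, compact=True)
    inputs = []
    with open("example.csv") as f:
        for line in f:
            inputs.append(line.rstrip().split(","))
    hoge = make_tree_dict(inputs)
    pp.pprint(hoge)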

Running the program above gives you the hierarchical dict shown above. It is a strange-looking program, because the contents of tree_dict get updated even though nothing is ever assigned to tree_dict directly after it is created, but it works. I meant to write up an explanation, but I don't have time, so not this time...
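As a rough hint in the meantime, here is a minimal sketch of what seems to be going on. In Python, assigning a dict to another name does not copy it, so `pre_dict` and `tree_dict` start out as two names for the same object, and each later `pre_dict = pre_dict[key]` just moves the name one level deeper into that same structure. The keys and value below are taken from the first row of `example.csv`.

```python
tree_dict = {}

# Both names now refer to the same dict object; nothing is copied.
pre_dict = tree_dict
print(pre_dict is tree_dict)  # True

# Mutating through pre_dict is visible through tree_dict.
pre_dict["A1"] = {}
print(tree_dict)  # {'A1': {}}

# Rebinding pre_dict moves it one level down; tree_dict is untouched
# by the rebinding itself and still refers to the outermost dict.
pre_dict = pre_dict["A1"]
pre_dict["B1"] = {}
pre_dict = pre_dict["B1"]
pre_dict["C1"] = "3"

print(tree_dict)  # {'A1': {'B1': {'C1': '3'}}}
```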

Incidentally, the script above also works on input where the number of levels differs from row to row, as shown below.

`example2.csv`


A1,B1,C1,1
A1,B1,C2,D1,3
A1,B1,C3,5
A1,B2,C1,D1,5
A1,B2,C2,2
A1,B2,C3,5
A1,B3,C1,2
A1,B3,C2,D1,4
A1,B3,C2,D2,10
A1,B3,C3,2
A2,B1,C1,4
A2,B1,C2,D1,5
A2,B1,C3,5
A2,B2,C1,D1,6
A2,B2,C2,3
A2,B2,C3,D1,8
A2,B3,C1,2
A2,B3,C2,5
A2,B3,C3,4

For this input, you get a dict like this:

`example2_output`


{'A1': {'B1': {'C1': '1',
               'C2': {'D1': '3'},
               'C3': '5'},
        'B2': {'C1': {'D1': '5'},
               'C2': '2',
               'C3': '5'},
        'B3': {'C1': '2',
               'C2': {'D1': '4',
                      'D2': '10'},
               'C3': '2'}},
 'A2': {'B1': {'C1': '4',
               'C2': {'D1': '5'},
               'C3': '5'},
        'B2': {'C1': {'D1': '6'},
               'C2': '3',
               'C3': {'D1': '8'}},
        'B3': {'C1': '2',
               'C2': '5',
               'C3': '4'}}}
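The variable-depth behavior can also be tried without a file. The sketch below is not the script from this memo but a compact variant built on `dict.setdefault`; the function name `make_tree_dict_compact` and the `SAMPLE` string (a few rows copied from `example2.csv`) are assumptions made for illustration. Unlike the original, it requires every row to have at least one key and one value.

```python
import csv
import io

# A few rows copied from example2.csv; the depth differs from row to row.
SAMPLE = """\
A1,B1,C1,1
A1,B1,C2,D1,3
A1,B3,C2,D1,4
A1,B3,C2,D2,10
"""


def make_tree_dict_compact(rows):
    """Hypothetical variant of make_tree_dict using dict.setdefault."""
    tree = {}
    for row in rows:
        # Split into intermediate keys, the final key, and the value.
        *keys, last_key, value = row
        node = tree
        for key in keys:
            # Return the existing child dict, or create and return a new one.
            node = node.setdefault(key, {})
        node[last_key] = value
    return tree


# Skip empty rows so a trailing blank line does not break the unpacking.
rows = [row for row in csv.reader(io.StringIO(SAMPLE)) if row]
tree = make_tree_dict_compact(rows)

print(tree["A1"]["B1"]["C1"])  # 1 (the value is still a string)
print(tree["A1"]["B3"]["C2"])  # {'D1': '4', 'D2': '10'}
```

As in the outputs above, the leaf values stay strings because nothing converts them; wrap `value` in `int()` if numbers are needed.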

Why does it work?

I will add this section if I have time...