- R is not an option. It is heavy, so I don't want to use it
- I'd like to be able to add parallel processing later ...
- So I will do it in Python
fp-growth
- I want to do some pattern mining for a small piece of research and aggregate the results, so I'm using it
- There is little information about it in Japanese
- So I went looking for a library
- This is what I found
First, install it:
pip install pyfpgrowth
Then start Python:
>>> import pyfpgrowth
>>> transactions = [[1, 2, 5],
...                 [2, 4],
...                 [2, 3],
...                 [1, 2, 4],
...                 [1, 3],
...                 [2, 3],
...                 [1, 3],
...                 [1, 2, 3, 5],
...                 [1, 2, 3]]
>>> patterns = pyfpgrowth.find_frequent_patterns(transactions, 2)
>>> print(patterns)
{(1, 2): 4, (1, 2, 3): 2, (1, 3): 4, (1,): 6, (2,): 7, (2, 4): 2, (1, 5): 2, (5,): 2, (2, 3): 4, (2, 5): 2, (4,): 2, (1, 2, 5): 2} 
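To check what the numbers in that dictionary mean, here is a minimal brute-force sketch that counts how many transactions contain each itemset and keeps those seen at least twice. It does not use pyfpgrowth or the FP-growth algorithm at all; `count_itemsets` is a hypothetical helper name of my own, and enumerating every combination is only practical for tiny inputs like this one.

```python
from collections import Counter
from itertools import combinations

transactions = [[1, 2, 5], [2, 4], [2, 3], [1, 2, 4], [1, 3],
                [2, 3], [1, 3], [1, 2, 3, 5], [1, 2, 3]]


def count_itemsets(transactions, min_support):
    """Brute force: count every itemset, keep those in >= min_support transactions."""
    counts = Counter()
    for transaction in transactions:
        # Sort so that each itemset is always represented by the same tuple.
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


frequent = count_itemsets(transactions, 2)
print(frequent[(1, 2)])     # 4, same as the value for (1, 2) above
print(frequent[(1, 2, 5)])  # 2, same as the value for (1, 2, 5) above
```

So each value is the number of transactions that contain every item in the key. One difference worth noting: the brute-force count also reports `(3,)` with a count of 6, which the output above does not list.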
That is roughly what the result looks like. However, the items inside each pattern are sorted, so I wonder whether there is any way to distinguish 1 → 2 from 2 → 1.
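One possible reading of "1 → 2 versus 2 → 1" is the direction of an association rule. Under that assumption, the support counts are already enough to tell the two apart, because the confidence of a rule divides by the count of its left-hand side. The sketch below uses only three counts copied from the output above; `confidence` is a hypothetical helper of my own and not part of pyfpgrowth.

```python
# Counts copied from the printed result: the pair and each single item.
patterns = {(1, 2): 4, (1,): 6, (2,): 7}


def confidence(patterns, antecedent, consequent):
    """Confidence of antecedent -> consequent = count(both) / count(antecedent)."""
    both = tuple(sorted(set(antecedent) | set(consequent)))
    left = tuple(sorted(antecedent))
    # float() keeps the division exact on Python 2 as well as Python 3.
    return patterns[both] / float(patterns[left])


print(confidence(patterns, (1,), (2,)))  # 4 / 6
print(confidence(patterns, (2,), (1,)))  # 4 / 7
```

The two directions give different values (4/6 and 4/7), so they can be told apart in this sense. If the question is instead about the order in which items occur inside a transaction, frequent itemsets do not carry that information, and this sketch does not help.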