How to try the friends-of-friends algorithm with pyfof

friends of friends What is the algorithm?

The algorithm is super simple, with only one parameter, and let it be the threshold value r.

  1. When there are N points in the space, pay attention to a certain point in it. The distance from that one point to the remaining N-1 points is calculated, and anything less than or equal to r is judged to be a friend.
  2. Next, pay attention to another point and perform the same calculation. If there is something in common with the friend you decided on earlier, add a friend. If there are no elements in common, a new family of friends will be created.
  3. Just repeat this.

Install pyfof

pyfof is a library that enables fast friend-friend clustering (Friends of Friends cluster finding) in python. Instead of simply implementing the friends-of-friends algorithm, it seems that the speedup was made possible by the method R * -tree. (I don't know the details).

Installation is

python


pip install pyfof

It was just OK (@ google colab, 2020.8.19)

Execution example

python


import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import pyfof

npts = 10000
ndim = 2
nptsperdim = int(npts/ndim)
data = np.vstack((np.random.normal(-1,0.2,(nptsperdim,ndim)),\
                  np.random.normal(1,0.2,(nptsperdim,ndim))))

groups = pyfof.friends_of_friends(data, 0.4)

colors = cm.rainbow(np.linspace(0, 1, len(groups)))
for g,c in zip(groups, colors):
    plt.scatter(data[g,0], data[g,1], color=c, s=3)

plt.show() 

Then

スクリーンショット 2020-08-19 14.23.44.png

It can be neatly divided into two classes.

Next, why not put another class in the middle?

python


npts = 10000
ndim = 2
nptsperdim = int(npts/ndim)
data = np.vstack((np.random.normal(-1,0.2,(nptsperdim,ndim)),\
                  np.random.normal(1,0.2,(nptsperdim,ndim)),\
                  np.random.normal(0.,0.2,(nptsperdim,ndim))))

groups = pyfof.friends_of_friends(data, 0.4) # 0.If it is 4, it is too large and all are classified into the same class.

colors = cm.rainbow(np.linspace(0, 1, len(groups)))
for g,c in zip(groups, colors):
    plt.scatter(data[g,0], data[g,1], color=c, s=3)

plt.show()

Then, they all had the same color, that is, the same class.

スクリーンショット 2020-08-19 14.25.00.png

Let's change the range a little

python


npts = 10000
ndim = 2
nptsperdim = int(npts/ndim)
#Reduce Gaussian sigma.
data = np.vstack((np.random.normal(-1,0.1,(nptsperdim,ndim)),\
                  np.random.normal(1,0.1,(nptsperdim,ndim)),\
                  np.random.normal(0.,0.1,(nptsperdim,ndim))))

groups = pyfof.friends_of_friends(data, 0.2) # 0.2 and make the standard a little smaller. And the upper sigma was made smaller.

colors = cm.rainbow(np.linspace(0, 1, len(groups)))
for g,c in zip(groups, colors):
    plt.scatter(data[g,0], data[g,1], color=c, s=3)

plt.show()

Looking at it

スクリーンショット 2020-08-19 14.27.57.png

It was properly classified into 3 classes. This is because the Gaussian sigma was reduced and the criterion was reduced from 0.4 to 0.2.

The code can also be viewed at google colab.

Recommended Posts

How to try the friends-of-friends algorithm with pyfof
Try the Variational-Quantum-Eigensolver (VQE) algorithm with Blueqat
Try to solve the fizzbuzz problem with Keras
How to specify the NIC to scan with amazon-dash
Try to solve the traveling salesman problem with a genetic algorithm (Theory)
[Algorithm x Python] How to use the list
How to fix the initial population with a genetic algorithm using DEAP
How to Learn Kaldi with the JUST Corpus
Try to solve the traveling salesman problem with a genetic algorithm (Python code)
Try to solve the traveling salesman problem with a genetic algorithm (execution result)
Try to solve the programming challenge book with python3
How to delete the specified string with the sed command! !! !!
[Introduction to Python] How to iterate with the range function?
How to create a submenu with the [Blender] plugin
Try to visualize the room with Raspberry Pi, part 1
Try to solve the internship assignment problem with Python
The first algorithm to learn with Python: FizzBuzz problem
[Python] How to specify the download location with youtube-dl
Try to get the contents of Word with Golang
[Neo4J] ④ Try to handle the graph structure with Cypher
[Python] How to rewrite the table style with python-pptx [python-pptx]
Try to specify the axis with PyTorch's Softmax function
Try to factorial with recursion
How to use the generator
How to update with SQLAlchemy?
How to cast with Theano
How to Alter with SQLAlchemy?
How to separate strings with','
How to RDP with Fedora31
How to use the decorator
How to Delete with SQLAlchemy?
How to increase the axis
How to start the program
I tried to simulate how the infection spreads with Python
Try to play with the uprobe that supports Systemtap directly
How to return to the previous directory with the Bash cd command
Finding a solution to the N-Queen problem with a genetic algorithm (2)
How to manipulate the DOM in an iframe with Selenium
Try to automate the operation of network devices with Python
A story about how to deal with the CORS problem
How to get into the python development environment with Vagrant
Try to model a multimodal distribution using the EM algorithm
Try HeloWorld in your own language (with How to & code)
Try to decipher the garbled attachment file name with Python
Finding a solution to the N-Queen problem with a genetic algorithm (1)
[Introduction to Python] How to get data with the listdir function
Try to extract the features of the sensor data with CNN
How to calculate the autocorrelation coefficient
Python: How to use async with
How to use the zip function
How to use the optparse module
Try to operate Facebook with Python
How to use virtualenv with PowerShell
How to deal with imbalanced data
How to install python-pip with ubuntu20.04LTS
How to deal with imbalanced data
Try to profile with ONNX Runtime
How to get the Python version
Try to introduce the theme to Pelican
How to get started with Scrapy
How to get started with Python