What is the friends-of-friends algorithm?

The algorithm is very simple and has only one parameter: a distance threshold, which we will call r.

Suppose there are N points in the space. Pick one of them and calculate its distance to each of the remaining N-1 points; every point at a distance less than or equal to r is judged to be its friend.
Next, pick another point and perform the same calculation. If any of its friends already belong to a family of friends found earlier, the point and its friends join that family. If there is nothing in common, a new family of friends is created.
Just repeat this for every point.
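The steps above can be sketched in a few lines of NumPy. This is only a naive O(N^2) illustration, not how pyfof works internally; the function name `naive_friends_of_friends` and the union-find bookkeeping are my own choices for the example.

```python
import numpy as np


def naive_friends_of_friends(points, r):
    """Group points so that any two points within distance r share a group.

    Illustrative O(N^2) sketch, not pyfof's implementation.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    parent = list(range(n))  # union-find forest: every point starts alone

    def find(i):
        # Walk up to the root of i's group, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        # Distances from point i to every later point (earlier pairs are already done).
        dist = np.linalg.norm(points[i + 1:] - points[i], axis=1)
        for j in np.nonzero(dist <= r)[0] + i + 1:
            root_i, root_j = find(i), find(int(j))
            if root_i != root_j:
                parent[root_j] = root_i  # merge the two families of friends

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


demo = np.array([[0.0, 0.0], [0.3, 0.0], [0.6, 0.0], [5.0, 5.0]])
print(naive_friends_of_friends(demo, 0.4))  # [[0, 1, 2], [3]]
```

Points 0 and 2 are 0.6 apart, which is more than r, but they still end up together because point 1 is a friend of both. That is where the name "friends of friends" comes from.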

Install pyfof

pyfof is a library for fast friends-of-friends cluster finding in Python. Rather than implementing the algorithm naively, it apparently gets its speed from a spatial index called an R*-tree (I don't know the details).
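To give a feel for why a spatial index helps, here is a sketch that finds the same kind of groups with SciPy. Note the substitution: it uses a k-d tree (`scipy.spatial.cKDTree`), not an R*-tree, and it is not pyfof's code. The point is only that a tree can list the pairs within r without comparing every point to every other point. The function name `fof_with_kdtree` is made up for this example.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components
from scipy.spatial import cKDTree


def fof_with_kdtree(points, r):
    """Friends-of-friends groups via a k-d tree (a stand-in for an R*-tree)."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    tree = cKDTree(points)
    # All index pairs (i, j) with i < j whose distance is at most r.
    pairs = tree.query_pairs(r, output_type="ndarray")
    # Treat each pair as a graph edge; the groups are the connected components.
    adjacency = coo_matrix(
        (np.ones(len(pairs)), (pairs[:, 0], pairs[:, 1])), shape=(n, n)
    )
    n_groups, labels = connected_components(adjacency, directed=False)
    return [np.flatnonzero(labels == k).tolist() for k in range(n_groups)]


rng = np.random.default_rng(0)
blobs = np.vstack((rng.normal(-1, 0.2, (500, 2)), rng.normal(1, 0.2, (500, 2))))
groups = fof_with_kdtree(blobs, 0.4)
print(len(groups), sorted(len(g) for g in groups))
```

The pair search is the expensive part of the naive approach, so handing it to a tree is where most of the saving would come from.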

Installation is just:

```shell
pip install pyfof
```

This worked without any problems (on Google Colab, 2020-08-19).

Execution example

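```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import pyfof

npts = 10000
ndim = 2
nptsperdim = npts // ndim  # number of points drawn for each Gaussian blob
# Two 2-D Gaussian blobs centred at (-1, -1) and (1, 1), both with sigma = 0.2.
data = np.vstack((np.random.normal(-1, 0.2, (nptsperdim, ndim)),
                  np.random.normal(1, 0.2, (nptsperdim, ndim))))

# Threshold r = 0.4; each group is a list of row indices into data.
groups = pyfof.friends_of_friends(data, 0.4)

# Draw each group in its own colour.
colors = cm.rainbow(np.linspace(0, 1, len(groups)))
for g, c in zip(groups, colors):
    plt.scatter(data[g, 0], data[g, 1], color=c, s=3)

plt.show()
```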

Running this gives:

[Image: Screenshot 2020-08-19 14.23.44.png]

The points are neatly divided into two classes.
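Judging from how the loop indexes `data[g, 0]`, each entry of `groups` is a list of point indices. If you would rather have one label per point (for example to count group sizes), a small helper like the one below does the conversion. The helper `groups_to_labels` and the sample groups are hypothetical and only for illustration; nothing here calls pyfof.

```python
import numpy as np


def groups_to_labels(groups, n_points):
    """Turn a list of index lists into one integer label per point."""
    labels = np.full(n_points, -1, dtype=int)  # -1 marks points in no group
    for label, members in enumerate(groups):
        labels[members] = label
    return labels


# Made-up result for six points, in the same layout as `groups`.
example_groups = [[0, 2, 5], [1, 3], [4]]
labels = groups_to_labels(example_groups, 6)
print(labels)               # [0 1 0 1 2 0]
print(np.bincount(labels))  # group sizes: [3 2 1]
```

With such a label array, a single `plt.scatter(data[:, 0], data[:, 1], c=labels, cmap="rainbow", s=3)` call could replace the plotting loop.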

Next, let's put another class in the middle.

```python
# Continues from the previous cell (same imports).
npts = 10000
ndim = 2
nptsperdim = npts // ndim
# Three blobs now: centres at (-1, -1), (1, 1) and (0, 0), all with sigma = 0.2.
data = np.vstack((np.random.normal(-1, 0.2, (nptsperdim, ndim)),
                  np.random.normal(1, 0.2, (nptsperdim, ndim)),
                  np.random.normal(0., 0.2, (nptsperdim, ndim))))

# With r = 0.4 the threshold is too large: everything ends up in the same class.
groups = pyfof.friends_of_friends(data, 0.4)

colors = cm.rainbow(np.linspace(0, 1, len(groups)))
for g, c in zip(groups, colors):
    plt.scatter(data[g, 0], data[g, 1], color=c, s=3)

plt.show()
```

This time all the points came out in the same color, that is, in the same class.

[Image: Screenshot 2020-08-19 14.25.00.png]
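One way to see why everything merges: friendship is passed along chains, so two distant points share a class as soon as a string of intermediate points connects them in steps of at most r. In one dimension this is easy to check, because the groups are simply runs of sorted values whose neighbouring gaps are at most r. The sketch below uses that shortcut; `fof_1d` and the sample values are invented for the example and it returns the values themselves rather than indices.

```python
import numpy as np


def fof_1d(values, r):
    """Friends-of-friends in one dimension: split sorted values at gaps larger than r."""
    values = np.sort(np.asarray(values, dtype=float))
    if values.size == 0:
        return []
    # A gap larger than r to the previous value starts a new group.
    breaks = np.flatnonzero(np.diff(values) > r) + 1
    return [chunk.tolist() for chunk in np.split(values, breaks)]


separated = [-1.0, -0.9, 0.0, 0.1, 1.0, 1.1]
bridged = separated + [-0.6, -0.3, 0.4, 0.7]  # stray points filling the gaps

print(len(fof_1d(separated, 0.4)))  # 3
print(len(fof_1d(bridged, 0.4)))    # 1
```

A handful of stray points between the clusters is enough to glue all of them together, which is presumably what the tails of the wide Gaussians do in the plot.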

Let's change the scale a little.

```python
# Continues from the previous cell (same imports).
npts = 10000
ndim = 2
nptsperdim = npts // ndim
# Reduce the Gaussian sigma from 0.2 to 0.1.
data = np.vstack((np.random.normal(-1, 0.1, (nptsperdim, ndim)),
                  np.random.normal(1, 0.1, (nptsperdim, ndim)),
                  np.random.normal(0., 0.1, (nptsperdim, ndim))))

# Make the threshold a little smaller as well: r = 0.2 instead of 0.4.
groups = pyfof.friends_of_friends(data, 0.2)

colors = cm.rainbow(np.linspace(0, 1, len(groups)))
for g, c in zip(groups, colors):
    plt.scatter(data[g, 0], data[g, 1], color=c, s=3)

plt.show()
```

The result:

[Image: Screenshot 2020-08-19 14.27.57.png]

This time the points were properly classified into 3 classes. This is because the Gaussian sigma was reduced from 0.2 to 0.1 and the threshold was reduced from 0.4 to 0.2.
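A back-of-the-envelope check makes the difference between the two three-class runs plausible. It rests on an assumption of mine, not on anything from pyfof: for n points drawn from a 2-D Gaussian, the farthest point sits at roughly sigma * sqrt(2 ln n) from the centre (the radius beyond which about one point is expected). Subtracting that reach for both neighbours from the centre spacing gives a rough width for the empty band between them.

```python
import math

n_per_cluster = 5000
center_distance = math.sqrt(2.0)  # neighbouring centres, e.g. (0, 0) and (1, 1)


def empty_gap(sigma):
    """Rough width of the empty band between two neighbouring clusters."""
    # Heuristic reach: radius beyond which about one of the n points is expected.
    reach = sigma * math.sqrt(2.0 * math.log(n_per_cluster))
    return center_distance - 2.0 * reach


for sigma, r in [(0.2, 0.4), (0.1, 0.2)]:
    gap = empty_gap(sigma)
    verdict = "merge" if gap <= r else "stay separate"
    print(f"sigma={sigma}, r={r}: gap ~ {gap:.2f} -> clusters likely {verdict}")
```

With sigma = 0.2 the estimated gap is negative (about -0.24), meaning the tails already overlap, so any r links the clusters. With sigma = 0.1 the gap is about 0.59, well above r = 0.2. Since r is 2 sigma in both runs, what really changed is how narrow the clusters are compared with the fixed spacing of their centres.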

The code can also be viewed on Google Colab.

How to try the friends-of-friends algorithm with pyfof
