In order to draw in two dimensions, it is necessary to give appropriate coordinates to each vertex, but the graph has only information on the vertices and edges. How do you arrange the vertices? ??
This article describes the ** Fruchterman-Reingold algorithm **, which arranges graphs nicely. Python makes it easy to use with a library called networkx
. However, it's too easy and frustrating, so I'll follow the implementation of networkx
on GitHub to see how it works.
The flow of this article is as follows.
For those who are satisfied if it works, I will show an implementation example first. With Google colaboratory, networkx
is already installed, so you can try it immediately by copying.
import networkx as nx
import matplotlib.pyplot as plt
N = 500 #Number of vertices
K = 5
P = 0.05
G = nx.watts_strogatz_graph(N,K,P) #Appropriate graph
pos = nx.random_layout(G, seed=0)
#Draw graph
plt.figure(figsize=(15,15))
nx.draw_networkx(G, pos, with_labels=False, node_shape='.')
plt.axis("off")
plt.show()
Wow…. do not want to see…. It's unbearable to see in a random arrangement.
import networkx as nx
import matplotlib.pyplot as plt
N = 500 #Number of vertices
K = 5
P = 0.05
G = nx.watts_strogatz_graph(N,K,P)
pos = nx.spring_layout(G)
#Draw graph
plt.figure(figsize=(15,15))
nx.draw_networkx(G, pos, with_labels=False, node_shape='.')
plt.axis("off")
plt.show()
It's arranged nicely.
Think a little more strictly about "good placement". For example, suppose you now have the information for the graph $ G $ as follows.
There are innumerable ways to arrange this graph in two dimensions. (You can arrange them randomly as before.) However, if possible, I want to arrange them so that they are easy to see. Let's call an arrangement that meets the following conditions a "good arrangement".
In other words, ** points connected by edges should be placed close to each other, and points not connected by edges should be placed far away **. In the case of an adjacency matrix, the unconnected part is set to ʻinf`, so it seems that all points can be treated in the same way regardless of the presence or absence of edges.
If the vertices overlap, you can't see the sides, which is a problem.
I would like to find a graph layout that satisfies the above two conditions. The ** mechanical model (force-directed graph) ** makes this possible.
And in the previous networkx.spring_layout ()
, ** Fruchterman-Reingold algorithm ** is used. This time I will introduce this. Take a look at the gif below to get an idea of this technique.
From the random initial placement, it will gradually become cleaner! !! ([Borrowed from here] [Explanatory article on mechanical model])
Want to know the algorithm?(y/n)
y
Let's know.
2.Fruchterman-Reingold algorithm
In the mechanical model, the vertices are moved by applying force. The Fruchterman-Reingold algorithm gives you ** Attractive Force ** and ** Repulsive Force **.
The attractive force works on the vertices connected by the sides. Repulsion, on the other hand, works on vertices that are not connected by edges. This corresponds to the previous condition (1) "The connection relationship of the sides is reflected in the distance".
You can define power in any way, but for now let's use the definition proposed in the paper.
Attraction: $ f_a = \ dfrac {d ^ 2} {k} $ Repulsion: $ f_r =-\ dfrac {k ^ 2} {d} $
$ d $: Distance between vertices
It's like a spring formula, so it's intuitive. The graph of the relationship between force and distance is as follows.
From the graphs and formulas we can see that:
--The farther the attraction is, the stronger it works ――The closer the repulsive force is, the stronger it works. --Attractive force and repulsive force become equal at a certain distance $ k $
If the distance $ k $ at which the forces antagonize is set to an appropriate size, the above condition ➁ "Vertex does not overlap" can be satisfied.
Now let's look at the algorithm.
I'm getting an unfamiliar variable $ t $. Temperature $ t $ is a parameter that limits the amount of movement of vertices. In other words, the movement amount should be $ t $ or less with respect to the movement direction vector (let's call it $ v $) calculated in step 2.
We will reduce this temperature $ t $ appropriately. Then, although the amount of movement is large at the beginning, it will not move much at the end. (Guarantees convergence)
Also, in the paper, in step 3, the collision with the wall is treated as elastic reflection in order to place it on the screen. If you want to know more about this, please read [Original paper "Graph drawing by force-directed placement"] [Paper]. (Not considered in the networkx
implementation ...)
In this article, we'll look at how to implement networkx
without reinventing the wheel.
Let's learn by seeing the implementation of people. However, NetworkX 2.4 is the one at that time, and the parts that do not explain exception handling etc. are omitted.
Earlier in the article I used networkx.spring_layout ()
to draw the graph "good". If you take a look at GitHub,
spring_layout = fruchterman_reingold_layout
I'm just referring to it! Next, let's look at the reference destination fruchterman_reingold_layout
.
def fruchterman_reingold_layout(G,
k=None,
pos=None,
#abridgement
seed=None):
"""Long docstring
"""
import numpy as np
G, center = _process_params(G, center, dim)
"""
Various initialized parts (omitted)
"""
"""
From here on down is important
"""
try:
# Sparse matrix
if len(G) < 500: # sparse solver for large graphs
raise ValueError
A = nx.to_scipy_sparse_matrix(G, weight=weight, dtype='f')
if k is None and fixed is not None:
# We must adjust k by domain size for layouts not near 1x1
nnodes, _ = A.shape
k = dom_size / np.sqrt(nnodes)
pos = _sparse_fruchterman_reingold(A, k, pos_arr, fixed,
iterations, threshold,
dim, seed)
except ValueError:
A = nx.to_numpy_array(G, weight=weight)
if k is None and fixed is not None:
# We must adjust k by domain size for layouts not near 1x1
nnodes, _ = A.shape
k = dom_size / np.sqrt(nnodes)
pos = _fruchterman_reingold(A, k, pos_arr, fixed, iterations,
threshold, dim, seed)
if fixed is None and scale is not None:
pos = rescale_layout(pos, scale=scale) + center
pos = dict(zip(G, pos))
return pos
In the try-except
part, the functions to be called are divided according to the size of the graph. (Isn't the if statement useless ...? Please tell me.)
-Call _fruchterman_reingold ()
if $ vertices <500 $
--Call _sparse_fruchterman_reingold ()
for $ vertices \ geq 500 $
** If you have a lot of vertices, it looks like you are using a sparse solver. ** We haven't reached the algorithm part so far, so let's take a look at _fruchterman_reingold ()
.
def _fruchterman_reingold(A, k=None, pos=None, fixed=None,
iterations=50, threshold=1e-4, dim=2,
seed=None):
# Position nodes in adjacency matrix A using Fruchterman-Reingold
# Entry point for NetworkX graph is fruchterman_reingold_layout()
# Sparse version
import numpy as np
if pos is None:
# random initial positions
pos = np.asarray(seed.rand(nnodes, dim), dtype=A.dtype)
else:
# make sure positions are of same type as matrix
pos = pos.astype(A.dtype)
# optimal distance between nodes
if k is None:
k = np.sqrt(1.0 / nnodes)
# the initial "temperature" is about .1 of domain area (=1x1)
# this is the largest step allowed in the dynamics.
t = max(max(pos.T[0]) - min(pos.T[0]), max(pos.T[1]) - min(pos.T[1])) * 0.1
# simple cooling scheme.
# linearly step down by dt on each iteration so last iteration is size dt.
dt = t / float(iterations + 1)
displacement = np.zeros((dim, nnodes))
for iteration in range(iterations):
displacement *= 0
# loop over rows
for i in range(A.shape[0]):
if i in fixed:
continue
# difference between this row's node position and all others
delta = (pos[i] - pos).T
# distance between points
distance = np.sqrt((delta**2).sum(axis=0))
# enforce minimum distance of 0.01
distance = np.where(distance < 0.01, 0.01, distance)
# the adjacency matrix row
Ai = np.asarray(A.getrowview(i).toarray())
# displacement "force"
displacement[:, i] +=\
(delta * (k * k / distance**2 - Ai * distance / k)).sum(axis=1)
# update positions
length = np.sqrt((displacement**2).sum(axis=0))
length = np.where(length < 0.01, 0.1, length)
delta_pos = (displacement * t / length).T
pos += delta_pos
# cool temperature
t -= dt
err = np.linalg.norm(delta_pos) / nnodes
if err < threshold:
break
return pos
This is the implementation part of the algorithm. I will summarize how to handle attractive force and repulsive force.
-Optimal distance between cords:t-= dt
, but $ dt = \ dfrac {t} {\ text {iteration} + 1} $
--Direction vector: $ \ text {delta} $
--Attractive force: $ f_a = \ dfrac {k ^ 2} {\ text {distance} ^ 2} $
--Repulsive force: $ f_r = --A_i * \ dfrac {\ text {distance}} {k} $
The optimal distance formula is $ C = 1, \ text {area} = 1 $ when applied to the paper. Simple! The temperature $ t $ will drop significantly at first and then gradually as ʻiteration` progresses. $ \ Text {delta} $ is not exactly a direction vector, but I wrote it because I am adjusting the norm in the end.
What is very worrisome is the definition of power. ** Both attractive and repulsive forces are different from those in the paper. ** Since the paper is from 1991, is there an improved version after that? Another finding is that in the case of weighted graphs, the repulsive force should be multiplied by the value of the adjacency matrix.
When you try to implement it yourself, I think it is better to implement it with the same settings.
bonus I played with the source code a little to visualize the behavior of the algorithm. (I made 200 iterations by tweaking the parameters.) It's interesting that the pieces of disjointed dots gradually become like strings. I got something close to the reference site "[Algorithm for drawing graphs (networks) beautifully] Mechanical model commentary article" with almost no implementation by myself. You did it.
I explained ** Fruchterman-Reingold algorithm **, which is a graph drawing algorithm, to get a little behind the networkx
that can handle graphs easily. After learning the algorithm, it is important to implement it, but it is also learning to see the implementation of a strong person. We look forward to your questions and suggestions.
[Algorithm for drawing graphs (networks) neatly] Mechanical model commentary article : A really good article!
[Graph drawing by force-directed placement, 1992] [Paper] : Fruchterman and Reingold's paper
networkx GitHub : What I explained a little this time
[Mechanical model (wikipedia)] [Mechanical model_wiki] : I want to see other methods
[[Python] Summary of basic usage of NetworkX 2.0 Qiita] [networkx2 usage Qiita] : Easy-to-understand articles
Recommended Posts