A graph is a general-purpose data structure made up of vertices (nodes) and the edges that connect them. Familiar examples such as friend networks and links between web pages can be represented as graphs.
Python's NetworkX is currently a very useful tool for working with graphs. It is easy to write and easy to read, so if you want to work with graphs it is a good library to try first.
- [Python] Summary of basic usage of NetworkX 2.0
- Introduction to NetworkX in Python
However, NetworkX has the problem that it becomes very slow on large graphs.
Therefore, this article shows how to build a graph in C++, which runs much faster, while keeping the same file format that NetworkX uses when constructing a graph.
As prerequisite knowledge, it is assumed that you know how to write C++. NetworkX is mentioned in places, but there is no problem even if you are not familiar with it.
In NetworkX, a graph can be read from a file in which each line describes one edge, with the two vertices separated by a space.
The file is in this format.
facebook_combined.txt
0 1
0 2
0 3
0 4
0 5
0 6
︙
The format is (start node id) (end node id).
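NetworkX can load this kind of edge information in a single step, but C++ requires more work. The actual procedure is to open the file, split each line into its two ids, and store the result in a vector.
I will explain these steps in order.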
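// Inside main(int argc, char* argv[]); needs <fstream>, <string>, <vector> and using namespace std
string path = argv[1];          // path to the edge list file
int num_nodes = stoi(argv[2]);  // number of nodes, given on the command line
ifstream ifs(path);             // open the edge list for reading
// Adjacency list: nodes[i] holds the ids that node i points to
vector<vector<int>> nodes;
nodes = vector<vector<int>>(num_nodes);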
// Split input at every delimiter and return the pieces (needs <sstream>)
vector<string> split(const string& input, char delimiter) {
    istringstream stream(input);
    string field;
    vector<string> result;
    while (getline(stream, field, delimiter)) {
        result.push_back(field);
    }
    return result;
}
This is a string split function of the kind commonly written in C++. It takes the target string and the delimiter character as arguments, splits the string at each delimiter, and returns the pieces as a vector.
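As a quick check of how this function behaves on one line of the edge list, here is a minimal sketch. The helper name parse_edge is hypothetical and not part of the original program, and it assumes each line holds exactly two integer ids separated by a single space.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Same logic as the split function above, with the input taken by const reference.
std::vector<std::string> split(const std::string& input, char delimiter) {
    std::istringstream stream(input);
    std::string field;
    std::vector<std::string> result;
    while (std::getline(stream, field, delimiter)) {
        result.push_back(field);
    }
    return result;
}

// Hypothetical helper: turn a line such as "0 1" into the pair (0, 1).
std::pair<int, int> parse_edge(const std::string& line) {
    std::vector<std::string> fields = split(line, ' ');
    assert(fields.size() == 2);  // assumption: exactly two ids per line
    return {std::stoi(fields[0]), std::stoi(fields[1])};
}
```

Note that consecutive delimiters produce empty fields, so a line with two spaces between the ids would be split into three pieces rather than two.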
Assignment
string str;
int from, to;
while (getline(ifs, str)) {
    // Split the line at the space
    vector<string> strvec = split(str, ' ');
    from = stoi(strvec.at(0));  // start node id
    to = stoi(strvec.at(1));    // end node id
    nodes[from].push_back(to);
}
The getline function reads the file one line at a time, and the split function breaks each line at the space. The start node id and end node id obtained this way are then stored in the vector: the end node id is appended to the list belonging to the start node. With that, the structure of the graph is held in the vector.
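Since every line holds just two integers, the same adjacency list can also be built with stream extraction (operator>>) instead of getline plus split. The following is a minimal sketch of that alternative, not the method used in this article. The function name read_edge_list is made up for illustration, and skipping edges whose ids fall outside 0 to num_nodes - 1 is an assumption of this sketch.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical helper: read "from to" pairs from any input stream
// (a std::ifstream works the same way) into an adjacency list.
std::vector<std::vector<int>> read_edge_list(std::istream& in, int num_nodes) {
    assert(num_nodes >= 0);  // precondition of this sketch
    std::vector<std::vector<int>> nodes(num_nodes);
    int from, to;
    // operator>> skips spaces and newlines, so no manual splitting is needed.
    while (in >> from >> to) {
        // Assumption: ids outside [0, num_nodes) are ignored instead of stored.
        if (from < 0 || from >= num_nodes || to < 0 || to >= num_nodes) continue;
        nodes[from].push_back(to);
    }
    return nodes;
}

// Small usage example on an in-memory "file" with three edges.
std::vector<std::vector<int>> read_edge_list_demo() {
    std::istringstream in("0 1\n0 2\n1 2\n");
    return read_edge_list(in, 3);
}
```

In the demo, node 0 ends up with the neighbors 1 and 2, node 1 with the neighbor 2, and node 2 with an empty list, because edges are stored only in the start-to-end direction, as in the loop above.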
// Print each node followed by the ids it points to
for (int i = 0; i < num_nodes; i++) {
    cout << i << "->";
    for (size_t j = 0; j < nodes[i].size(); j++) {
        cout << nodes[i][j];
        if (j != nodes[i].size() - 1) cout << ",";
    }
    cout << endl;
}
1->48,53,54,73,88,92,119,126,133,194,236,280,299,315,322,346
2->20,115,116,149,226,312,326,333,343
3->9,25,26,67,72,85,122,142,170,188,200,228,274,280,283,323
4->78,152,181,195,218,273,275,306,328
︙
In this way, you can check the edges stored for each node. That completes the steps so far.
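The printing loop can also be wrapped in a function that writes to any output stream, which makes the result easier to check than reading the console. This is only a sketch: print_adjacency and adjacency_to_string are hypothetical names, and the layout follows the "id->n1,n2,..." form of the output above.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: write each node as "id->n1,n2,..." on its own line.
void print_adjacency(const std::vector<std::vector<int>>& nodes, std::ostream& out) {
    for (std::size_t i = 0; i < nodes.size(); i++) {
        out << i << "->";
        for (std::size_t j = 0; j < nodes[i].size(); j++) {
            if (j != 0) out << ",";  // comma before every element except the first
            out << nodes[i][j];
        }
        out << '\n';
    }
}

// Convenience wrapper returning the same text as a string.
std::string adjacency_to_string(const std::vector<std::vector<int>>& nodes) {
    std::ostringstream out;
    print_adjacency(nodes, out);
    return out.str();
}
```

Passing std::cout as the second argument of print_adjacency prints to the console in the same way as the loop in the article.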