I want to draw tree-structured data beautifully in Jupyter Notebook or a web application! That is my motivation. I use Python because it is convenient for analyzing the original data, and I wanted to iterate on the analysis and visualization interactively.
Python
There seem to be many ways to draw a tree structure in Python. The first that come to mind are graph libraries such as networkx or graphviz (https://graphviz.readthedocs.io/en/stable/), but since their main purpose is to analyze graph structure, I think elaborate rendering is difficult with them.
On the other hand, there is ETE Toolkit, a dedicated analysis and visualization tool for tree structures. Its model is represented in the Newick format, which seems to be a well-known representation of tree structures. This also looks pretty good, but considering compatibility with the web, I wanted to describe the model in something a little more general-purpose, such as JSON.
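To make the difference between the two representations concrete, here is a minimal sketch that converts a nested JSON-style dict into a Newick string. The function name to_newick and the sample tree are hypothetical, and the sketch assumes node names contain no characters that Newick would require quoting (commas, parentheses, spaces, and so on).

```python
def to_newick(node):
    """Return the Newick text for a nested {"name", "children"} dict.

    The trailing ';' that terminates a Newick tree is added by the caller.
    """
    children = node.get("children", [])
    if not children:
        # A leaf is written as just its name.
        return node["name"]
    # An inner node is written as (child1,child2,...)name.
    inner = ",".join(to_newick(child) for child in children)
    return "(" + inner + ")" + node["name"]


# Small illustrative tree (hypothetical data).
tree = {
    "name": "A",
    "children": [
        {"name": "B"},
        {"name": "C", "children": [{"name": "D"}, {"name": "E"}]},
    ],
}

newick = to_newick(tree) + ";"
print(newick)  # (B,(D,E)C)A;
```

The Newick form is compact, while the nested dict can be passed to json.dump as-is and read directly by Javascript libraries.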
Javascript
For drawing graph structures in Javascript, there is a good summary here (https://qiita.com/d0nchaaan/items/c47dc74cfefd92516e28). There are various options, but in terms of versatility and the amount of available information, I think D3.js is the best choice. It also includes a library for drawing trees.
So, I will try drawing a tree structure with D3.js in Jupyter Notebook. Jupyter Notebook is a good fit from the start, because cell magic (%%javascript) lets you execute Javascript and render HTML while still using the Python kernel.
There are two ways to do this: the Javascript cell magic or py_d3. The former goes through RequireJS, which is difficult to understand intuitively (see here). The latter seems to wrap the Javascript cell magic nicely. Here, I will explain drawing with py_d3.
Start by installing py_d3 with pip:
pip install py_d3
First, load the module as follows:
import py_d3
py_d3.load_ipython_extension(get_ipython())
The following sample is taken from the py_d3 home page. The first line, %%d3 5.12.0, specifies the version of d3.js. You can find out which version is available by typing %d3 version.
%%d3 5.12.0
<g></g>
<style>
element {
    height: 25px;
}
div.bar {
    display: inline-block;
    width: 20px;
    height: 75px;
    margin-right: 2px;
    background-color: teal;
}
</style>
<script>
var dataset = [ 5, 10, 13, 19, 21, 25, 22, 18, 15, 13,
                11, 12, 15, 20, 18, 17, 16, 18, 23, 25 ];
d3.select("g").selectAll("div")
    .data(dataset)
    .enter()
    .append("div")
    .attr("class", "bar")
    .style("height", function(d) {
        var barHeight = d * 5;
        return barHeight + "px";
    });
</script>
Running this cell draws a bar chart in the cell's output area.
By making full use of RequireJS, it is possible to pass objects and variables computed in Python into a Javascript function you define. However, this is not much different from writing the data out as JSON and reading it with d3.js, so unless the data is very large, it is easier to go through an intermediate file. For example, prepare a JSON file with a tree structure as shown below.
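import json

# Each node has a "name" and, unless it is a leaf, a list of "children".
data = {
    "name": "A",
    "children": [
        { "name": "B" },
        {
            "name": "C",
            "children": [{ "name": "D" }, { "name": "E" }, { "name": "F" }]
        },
        { "name": "G" },
        {
            "name": "H",
            "children": [{ "name": "I" }, { "name": "J" }]
        },
        { "name": "K" }
    ]
}

# The with block closes the file, so the JSON is fully written before d3.js reads it.
with open('test.json', 'w') as json_file:
    json.dump(data, json_file)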
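In practice the tree often comes out of the analysis step as a flat list of parent-child pairs rather than as a hand-written literal. The following is a minimal sketch, assuming such an edge list, of turning it into the same nested name/children shape. The function name build_tree and the sample edges are hypothetical and not part of the original workflow, and the sketch assumes the edges form a tree with no cycles.

```python
import json


def build_tree(edges, root):
    """Build a nested {"name", "children"} dict from (parent, child) pairs."""
    # Group the children under each parent, preserving input order.
    children_of = {}
    for parent, child in edges:
        children_of.setdefault(parent, []).append(child)

    def build(name):
        node = {"name": name}
        kids = children_of.get(name, [])
        if kids:
            # Leaves get no "children" key, matching the hand-written data.
            node["children"] = [build(kid) for kid in kids]
        return node

    return build(root)


# Hypothetical edge list standing in for the output of an analysis step.
edges = [("A", "B"), ("A", "C"), ("C", "D"), ("C", "E")]
nested = build_tree(edges, "A")
print(json.dumps(nested, indent=2))
```

The resulting dict can be written out with json.dump in the same way as the literal data.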
Use the tree layout of d3-hierarchy to draw the tree structure. After loading the JSON data and building a tree from it, all that remains is to adjust the SVG drawing. Please refer to the d3.js and SVG documentation for fine-tuning the drawing.
%%d3 5.12.0
<style>
.link {
fill: none;
stroke: #555;
stroke-opacity: 0.4;
stroke-width: 1.5px;
}
</style>
<svg width="800" height="600"></svg>
<script>
// Canvas size; keep in sync with the <svg> element above.
var width = 800;
var height = 600;
// Shift the drawing to the right to leave room for the root label.
var g = d3.select("svg").append("g")
    .attr("transform", "translate(80,0)");
d3.json("test.json")
    .then((data) => {
        // Build the hierarchy and let the tree layout assign x/y to each node.
        var root = d3.hierarchy(data);
        var tree = d3.tree().size([height, width - 160]);
        tree(root);
        // Draw one cubic Bezier curve from each non-root node to its parent.
        var link = g.selectAll(".link")
            .data(root.descendants().slice(1))
            .enter()
            .append("path")
            .attr("class", "link")
            .attr("d", (d) => {
                return "M" + d.y + "," + d.x +
                    "C" + (d.parent.y + 100) + "," + d.x +
                    " " + (d.parent.y + 100) + "," + d.parent.x +
                    " " + d.parent.y + "," + d.parent.x;
            });
        // x and y are swapped so that the tree grows from left to right.
        var node = g.selectAll(".node")
            .data(root.descendants())
            .enter()
            .append("g")
            .attr("class", "node")
            .attr("transform", function(d) { return "translate(" + d.y + "," + d.x + ")"; });
        node.append("circle")
            .attr("r", 8)
            .attr("fill", "#999");
        // Put labels to the left of inner nodes and to the right of leaves.
        node.append("text")
            .attr("dy", 3)
            .attr("x", function(d) { return d.children ? -12 : 12; })
            .style("text-anchor", function(d) { return d.children ? "end" : "start"; })
            .attr("font-size", "200%")
            .text(function(d) { return d.data.name; });
    })
    .catch((error) => {
        // Report load or parse failures instead of silently ignoring them.
        console.error(error);
    });
</script>
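Before adjusting the SVG size, it can help to check the shape of the tree on the Python side. The sketch below is a rough Python analogue of what root.descendants() and the depth property provide in d3-hierarchy. It is not the library's implementation, and the function name descendants and the returned dict keys are assumptions made for illustration. It uses the same tree as test.json.

```python
from collections import deque

# Same tree as the one written to test.json.
data = {
    "name": "A",
    "children": [
        {"name": "B"},
        {"name": "C", "children": [{"name": "D"}, {"name": "E"}, {"name": "F"}]},
        {"name": "G"},
        {"name": "H", "children": [{"name": "I"}, {"name": "J"}]},
        {"name": "K"},
    ],
}


def descendants(root):
    """Return the root and every node below it, visited breadth-first."""
    result = []
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        children = node.get("children", [])
        result.append({"name": node["name"], "depth": depth, "is_leaf": not children})
        for child in children:
            queue.append((child, depth + 1))
    return result


nodes = descendants(data)
leaf_count = sum(1 for n in nodes if n["is_leaf"])
max_depth = max(n["depth"] for n in nodes)
print(len(nodes), leaf_count, max_depth)  # 11 8 2
```

The leaf count gives a feel for how much height the layout needs, and the maximum depth for how much width, when choosing the values passed to size().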