I'm a beginner myself, so it feels a little presumptuous to call this "Introduction to the Graph Database Neo4j in Python for Beginners". This post walks through setting up an environment for operating Neo4j from Python and a sample for playing with the data.
Let's start with how to install Neo4j on Mac OS X. My environment is Yosemite 10.10.2, so if you hit errors caused by a different environment, I would be grateful if you let me know in the comments.
First of all, you need a JDK. The Java that ships with the Mac does not seem to work well, so install Oracle JDK 7 from http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html.
Next, go to http://neo4j.com/download/ and download Neo4j's Community Edition (the free version).
neo4j-community-2.2.0-unix-2.tar will be downloaded, so extract it and move into the extracted folder.
tar xvf neo4j-community-2.2.0-unix-2.tar
cd neo4j-community-2.2.0
Then start it.
./bin/neo4j start
Installation is very easy! Neo4j has a user interface that you can access from a browser, so let's try it. The address is http://localhost:7474/browser/. You will be asked to set a user ID and password on first startup, so choose suitable values.
There seem to be several Python libraries that can connect to Neo4j; here I decided to use Neo4j RestClient.
pip install neo4jrestclient
pip is great; installation really is that easy!
Let's try it out by following the Neo4j RestClient tutorial.
from neo4jrestclient.client import GraphDatabase
url = "http://<User ID>:<Password>@localhost:7474/db/data/"
gdb = GraphDatabase(url)
This connects to Neo4j.
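If the password contains characters such as @ or :, pasting it straight into the URL can confuse URL parsing. Here is a minimal sketch of a helper that percent-encodes the credentials before building the URL. The name build_neo4j_url and its default host and port are my own, chosen to match the URL above. I have not verified how neo4jrestclient treats percent-encoded credentials, so take that part as an assumption.

```python
from urllib.parse import quote


def build_neo4j_url(user, password, host="localhost", port=7474):
    """Build the REST endpoint URL with percent-encoded credentials.

    Hypothetical helper, not part of neo4jrestclient. The /db/data/ path
    is the one used in the connection example above.
    """
    return "http://%s:%s@%s:%d/db/data/" % (
        quote(user, safe=""),      # encode every reserved character
        quote(password, safe=""),
        host,
        port,
    )


# Placeholder credentials, for illustration only
url = build_neo4j_url("neo4j", "p@ss:word")
print(url)
```

The resulting string would be passed to GraphDatabase(url) in the same way as the hand-written URL.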
Next, add two nodes, Alice and Bob, who are the same age.
alice = gdb.nodes.create(name="Alice", age=30)
bob = gdb.nodes.create(name="Bob", age=30)
Bob has actually known Alice since 1980, but Alice only got to know Bob three years later. Adding these as relationships looks like this.
bob.relationships.create("Knows", alice, since=1980)
alice.relationships.create("Knows", bob, since=1983)
Let's display this. (From here on I assume Python is running in an IPython notebook.)
Run a query that returns all nodes together with the relationships between them. The key is to pass data_contents=True; the graph will not be displayed without it. (It took me a while to figure this out...)
gdb.query("MATCH (n)-[r]-(m) RETURN n, r, m", data_contents=True)
The graph is displayed!
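One detail of the query above: the pattern (n)-[r]-(m) has no arrow, so it ignores direction and matches each relationship once from each end. The sketch below imitates this in plain Python, without talking to Neo4j, to show why two relationships give four result rows. The triples and function names are made up for illustration.

```python
# Relationships as (start, type, end) triples, mirroring the two Knows
# relationships created above.
relationships = [
    ("Bob", "Knows", "Alice"),
    ("Alice", "Knows", "Bob"),
]


def match_undirected(rels):
    """Mimic MATCH (n)-[r]-(m): each relationship is matched from both ends."""
    rows = []
    for start, rel_type, end in rels:
        rel = (start, rel_type, end)
        rows.append((start, rel, end))  # n is the start node
        rows.append((end, rel, start))  # n is the end node
    return rows


def match_directed(rels):
    """Mimic MATCH (n)-[r]->(m): each relationship is matched exactly once."""
    return [(start, (start, rel_type, end), end) for start, rel_type, end in rels]


undirected_rows = match_undirected(relationships)
directed_rows = match_directed(relationships)
print(len(undirected_rows), len(directed_rows))  # 4 2
```

For drawing the graph the duplicate rows do no harm, but they matter if you count the rows.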
To see this in your browser, go to http://localhost:7474/browser/ and type the following query:
MATCH (n)-[r]-(m) RETURN n, r, m
Press Enter and the graph is displayed.
Now delete all the data created so far.
# Delete everything
gdb.query("MATCH (n) OPTIONAL MATCH (n)-[r]-() DELETE n,r", data_contents=True)
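Why OPTIONAL MATCH? A plain MATCH (n)-[r]-() only finds nodes that have at least one relationship, so isolated nodes would survive the delete. Relationships also have to be deleted along with their nodes, because Neo4j refuses to delete a node that still has relationships. The plain-Python sketch below imitates the difference between the two matches. Carol is a hypothetical node with no relationships, added only for this illustration.

```python
# Toy data: Carol is a hypothetical node with no relationships.
nodes = ["Alice", "Bob", "Carol"]
rels = [("Bob", "Alice"), ("Alice", "Bob")]  # (start, end) pairs


def nodes_from_match(nodes, rels):
    """Mimic MATCH (n)-[r]-(): keeps only nodes with at least one relationship."""
    return sorted({n for n in nodes for start, end in rels if n in (start, end)})


def nodes_from_optional_match(nodes, rels):
    """Mimic MATCH (n) OPTIONAL MATCH (n)-[r]-(): keeps every node."""
    return sorted(nodes)


print(nodes_from_match(nodes, rels))           # ['Alice', 'Bob']
print(nodes_from_optional_match(nodes, rels))  # ['Alice', 'Bob', 'Carol']
```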
Then add three nodes and their relationships.
# Add Person nodes
alice = gdb.nodes.create(name="Alice", age=30)
bob = gdb.nodes.create(name="Bob", age=30)
ken = gdb.nodes.create(name="Ken", age=35)
alice.labels.add("Person")
bob.labels.add("Person")
ken.labels.add("Person")
# Set up relationships
bob.relationships.create("Knows", alice, since=1980)
alice.relationships.create("Knows", bob, since=1983)
alice.relationships.create("Knows", ken, since=2015)
Here, each person is a node, and a relationship is a connection between two nodes, such as the fact that Alice knows Bob.
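Creating a node and then labeling it takes two calls per node, which gets repetitive. A small wrapper can combine them. This is only a sketch: create_labeled_node is a hypothetical name of mine, and it uses nothing beyond the two calls already shown (gdb.nodes.create and node.labels.add).

```python
def create_labeled_node(gdb, label, **properties):
    """Create a node and attach a label in one step.

    Hypothetical convenience wrapper. gdb is expected to be the
    GraphDatabase connection created earlier.
    """
    node = gdb.nodes.create(**properties)
    node.labels.add(label)
    return node


# Usage, assuming gdb is the connection from the earlier example:
# alice = create_labeled_node(gdb, "Person", name="Alice", age=30)
# ken = create_labeled_node(gdb, "Person", name="Ken", age=35)
```

Since the connection is passed in as an argument, the function can also be tried with a stand-in object when no server is running.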
Run the display query in the IPython notebook:
gdb.query("MATCH (n)-[r]-(m) RETURN n, r, m", data_contents=True)
A similar graph is displayed.
Next, let's give each person a blog and define two kinds of relationships between people and blogs, "Own" for a blog's owner and "Subscribe" for its subscribers, and then look at the graph.
# Add Blog nodes
cam_blog = gdb.nodes.create(name="Camera Blog")
comp_blog = gdb.nodes.create(name="Computer Blog")
trav_blog = gdb.nodes.create(name="Travel Blog")
gour_blog = gdb.nodes.create(name="Gourmet Blog")
cam_blog.labels.add("Blog")
comp_blog.labels.add("Blog")
trav_blog.labels.add("Blog")
gour_blog.labels.add("Blog")
# Add relationships
alice.relationships.create("Own", cam_blog)
bob.relationships.create("Own", comp_blog)
ken.relationships.create("Own", trav_blog)
alice.relationships.create("Subscribe", trav_blog)
alice.relationships.create("Subscribe", gour_blog)
bob.relationships.create("Subscribe", cam_blog)
ken.relationships.create("Subscribe", comp_blog)
If you open http://localhost:7474/browser/ in a browser again and run the following query,
MATCH (n)-[r]-(m) RETURN n, r, m
a graph like this is displayed. I think it's pretty easy to follow, which is why graph databases are said to make relationships easier to understand than relational databases do.
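To make the "relationships are easy to follow" point concrete, here is a two-hop question: whose blogs does a given person subscribe to? The sketch below answers it in plain Python over the same Own and Subscribe pairs created above. The Cypher string is my own guess at an equivalent query, not something from the tutorial, and it is not executed here.

```python
# The same relationships as above, written as (person, blog) pairs.
own = [
    ("Alice", "Camera Blog"),
    ("Bob", "Computer Blog"),
    ("Ken", "Travel Blog"),
]
subscribe = [
    ("Alice", "Travel Blog"),
    ("Alice", "Gourmet Blog"),
    ("Bob", "Camera Blog"),
    ("Ken", "Computer Blog"),
]

# Assumed equivalent Cypher, shown for comparison only.
CYPHER = (
    "MATCH (a:Person {name:'Alice'})-[:Subscribe]->(b:Blog)<-[:Own]-(o:Person) "
    "RETURN o.name, b.name"
)


def owners_of_subscribed_blogs(person):
    """Follow Subscribe from the person, then Own backwards to the owner."""
    blogs = {blog for p, blog in subscribe if p == person}
    return sorted((owner, blog) for owner, blog in own if blog in blogs)


print(owners_of_subscribed_blogs("Alice"))  # [('Ken', 'Travel Blog')]
```

Gourmet Blog does not appear in the result because nobody owns it in this data set.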
Here are some examples of basic Cypher syntax.
The following query selects the Person nodes Bob, Alice, and Ken, circled in red.
MATCH (n:Person) RETURN n
(* The red frame was added for clarity and does not appear on the Neo4j browser screen.)
This time the Blog nodes are selected.
MATCH (n:Blog) RETURN n
This Cypher query extracts only Bob.
MATCH (b:Person {name:'Bob'}) RETURN b
You can also select by relationship. The following query matches only Own relationships.
MATCH (p:Person)-[r:Own]->(b:Blog) RETURN p,r,b;
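To run label queries like these from Python for several labels, the query text has to be put together as a string, because Cypher does not accept a label as a query parameter. Here is a minimal sketch of a helper that checks the label before formatting it into the query. The name match_label_query and the conservative pattern are my own assumptions.

```python
import re

# Conservative pattern for label names: a letter or underscore followed by
# letters, digits, or underscores. This is stricter than Cypher itself.
_LABEL_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def match_label_query(label):
    """Return 'MATCH (n:<label>) RETURN n' after validating the label."""
    if not _LABEL_RE.match(label):
        raise ValueError("unsafe label: %r" % label)
    return "MATCH (n:%s) RETURN n" % label


print(match_label_query("Person"))  # MATCH (n:Person) RETURN n
print(match_label_query("Blog"))    # MATCH (n:Blog) RETURN n
```

The returned string can be passed to gdb.query(..., data_contents=True) like the other queries in this post.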
Finally, generate 100 Person nodes and 100 Blog nodes, connect them with 500 random Subscribe relationships, and display the resulting graph.
import numpy.random as rd

n = 100
person_list = []
blog_list = []
# Create 100 Person nodes and 100 Blog nodes
for i in range(n):
    p = gdb.nodes.create(name="person_%d" % i)
    p.labels.add("Person")
    b = gdb.nodes.create(name="Blog_%d" % i)
    b.labels.add("Blog")
    person_list.append(p)
    blog_list.append(b)

# Give every blog a distinct, randomly chosen owner (Person -> Blog, as before)
r1 = list(range(len(person_list)))  # list() so that shuffle also works on Python 3
rd.shuffle(r1)
for i in range(len(blog_list)):
    person_list[r1[i]].relationships.create("Own", blog_list[i])

# Create 500 random Subscribe relationships
r2 = list(range(n)) * 5
rd.shuffle(r2)
r3 = list(range(n)) * 5
rd.shuffle(r3)
for i, j in zip(r2, r3):
    person_list[i].relationships.create("Subscribe", blog_list[j])
It's chaos!
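Where does the figure of 500 come from? The two index lists each repeat the numbers 0 to 99 five times, are shuffled independently, and are then zipped together. The sketch below checks the consequences without touching Neo4j. It uses the standard library's random module in place of numpy.random, and the fixed seed is only there to make the run repeatable.

```python
import random
from collections import Counter

n = 100
random.seed(0)  # fixed seed only so the sketch is repeatable

person_idx = list(range(n)) * 5  # each person index appears 5 times
blog_idx = list(range(n)) * 5    # each blog index appears 5 times
random.shuffle(person_idx)
random.shuffle(blog_idx)
pairs = list(zip(person_idx, blog_idx))

subs_per_person = Counter(i for i, _ in pairs)
subs_per_blog = Counter(j for _, j in pairs)

print(len(pairs))                     # 500
print(set(subs_per_person.values()))  # {5}
print(set(subs_per_blog.values()))    # {5}
```

So every person subscribes exactly five times and every blog has exactly five subscriptions. The same (person, blog) pair can come up more than once, though, which gives parallel Subscribe relationships between the same two nodes.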
Next, install ipython-cypher (iCypher).
pip install ipython-cypher
You can then draw the graph with networkx using the code below.
neo4j-networkx.py
%load_ext cypher
%matplotlib inline
import networkx as nx
import matplotlib.pyplot as plt
result = %cypher http://<User ID>:<Password>@localhost:7474/db/data/ MATCH (n)-[r]-(m) RETURN n,r,m;
node_map ={'Person':'#22FF99', 'Blog': '#6622FF' }
node_label={'Person':'Person', 'Blog': 'Blog' }
g = result.get_graph()
pos=nx.get_node_attributes(g,'pos')
plt.figure(figsize=(15,15))
nx.draw(g, node_color=[node_map[g.node[n]['labels'][0]] for n in g],node_size=80, width=0.5, edge_color='#999999')
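The color lookup in the last line assumes that every node has at least one label and that its first label is a key of node_map. If either is not true, it raises an error. Here is a minimal sketch of a more forgiving lookup. The function name color_for and the gray fallback color are my own additions.

```python
node_map = {"Person": "#22FF99", "Blog": "#6622FF"}
DEFAULT_COLOR = "#CCCCCC"  # hypothetical fallback for unknown or missing labels


def color_for(labels, mapping=node_map, default=DEFAULT_COLOR):
    """Return the color of the first label found in mapping, else the default."""
    for label in labels:
        if label in mapping:
            return mapping[label]
    return default


print(color_for(["Person"]))  # #22FF99
print(color_for([]))          # #CCCCCC

# Possible use in the drawing call, assuming node attributes are dicts
# with a 'labels' list as in the code above:
# nx.draw(g, node_color=[color_for(g.node[n].get('labels', [])) for n in g], ...)
```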