Running graph queries at explosive speed with Java + PGX

1. Introduction

-- I had a chance to work with property graphs, so I am writing this up as a memo.
-- The word "graph" tends to evoke academic mathematics such as graph theory, but using a graph database does not require anything that involved.
-- This article proceeds with a focus on graph databases.

2. What is a graph?

In a nutshell, a graph is a set of vertices (nodes) and edges (relationships). A graph represents entities as nodes, and the ways those entities relate to one another as edges. There are three data models for graphs.

Property graph

A property graph has the following four features.

-- A property graph contains nodes and relationships.
-- Nodes can hold properties (key/value pairs).
-- A relationship has a name and a direction, and always has a start node and an end node.
-- Relationships can also hold properties.

Example) Simple property graph

[Figure: simple property graph]
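The four features listed above can be sketched as a small in-memory data structure. This is only an illustration in plain Java, not how PGX or the database stores a graph; the class names, the "Alice"/"Bob" nodes, and the "since" property are all assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Minimal in-memory property graph; every name here is illustrative. */
final class PropertyGraphSketch {

    /** A node holds key/value properties. */
    static final class Node {
        final long id;
        final Map<String, Object> properties = new LinkedHashMap<>();

        Node(long id) {
            this.id = id;
        }
    }

    /** A relationship has a name, a direction (start -> end), and its own properties. */
    static final class Relationship {
        final String name;
        final Node start;
        final Node end;
        final Map<String, Object> properties = new LinkedHashMap<>();

        Relationship(String name, Node start, Node end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }
    }

    final List<Node> nodes = new ArrayList<>();
    final List<Relationship> relationships = new ArrayList<>();

    Node addNode(long id) {
        Node node = new Node(id);
        nodes.add(node);
        return node;
    }

    Relationship relate(Node start, String name, Node end) {
        Relationship relationship = new Relationship(name, start, end);
        relationships.add(relationship);
        return relationship;
    }

    /** Builds a two-node example graph from hypothetical data. */
    static PropertyGraphSketch example() {
        PropertyGraphSketch graph = new PropertyGraphSketch();
        Node alice = graph.addNode(1);
        alice.properties.put("name", "Alice");
        Node bob = graph.addNode(2);
        bob.properties.put("name", "Bob");
        Relationship knows = graph.relate(alice, "knows", bob);
        knows.properties.put("since", 2020);
        return graph;
    }

    public static void main(String[] args) {
        for (Relationship r : example().relationships) {
            System.out.println(r.start.properties.get("name")
                    + " -[" + r.name + " " + r.properties + "]-> "
                    + r.end.properties.get("name"));
        }
    }
}
```

Because a relationship stores exactly one start and one end node, the direction is implicit in the two fields.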

Hypergraph

A hypergraph is a general-purpose graph model in which a single relationship can connect any number of nodes. In the property graph above, a relationship connects exactly one start node to one end node, whereas in a hypergraph a relationship can have any number of nodes at either end.

Example) Hypergraph showing that Alice and Bob have 3 cars

[Figure: hypergraph of Alice, Bob, and their three cars]
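As a rough sketch of the difference, a hyperedge can be modeled as a relationship with a set of nodes on each side. The class name, the "owns" label, and the car identifiers below are assumptions for illustration; the example also assumes that Alice and Bob jointly own all three cars.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative hyperedge: one relationship connecting several nodes at once. */
final class HypergraphSketch {

    static final class HyperEdge {
        final String name;
        final Set<String> sources;
        final Set<String> targets;

        HyperEdge(String name, List<String> sources, List<String> targets) {
            this.name = name;
            this.sources = new LinkedHashSet<>(sources);
            this.targets = new LinkedHashSet<>(targets);
        }

        /** Number of one-to-one relationships a property graph would need for the same fact. */
        int equivalentBinaryEdges() {
            return sources.size() * targets.size();
        }
    }

    /** Alice and Bob own three cars (car names are hypothetical). */
    static HyperEdge example() {
        return new HyperEdge("owns",
                List.of("Alice", "Bob"),
                List.of("car1", "car2", "car3"));
    }

    public static void main(String[] args) {
        HyperEdge owns = example();
        System.out.println(owns.sources + " -[" + owns.name + "]-> " + owns.targets);
        System.out.println("Relationships needed in a property graph: "
                + owns.equivalentBinaryEdges());
    }
}
```

Under that joint-ownership assumption, one hyperedge replaces one relationship per owner/car pair.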

Triple

A triple is a data structure of the form "subject-predicate-object". Triples correspond to the RDF (Resource Description Framework) metadata model: RDF describes a resource as a set of triples.

Example) RDF triple for the natural-language statement "Ora Lassila is the creator of the resource http://www.w3.org/Home/Lassila"

--Sentence structure

Subject (resource): http://www.w3.org/Home/Lassila
Predicate (property): Creator
Object (property value): "Ora Lassila"

--RDF / XML representation

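<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:s="http://description.org/schema/">
  <rdf:Description rdf:about="http://www.w3.org/Home/Lassila">
    <s:Creator>Ora Lassila</s:Creator>
  </rdf:Description>
</rdf:RDF>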
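The same statement can also be held as a plain subject-predicate-object value. The sketch below is only an illustration of the triple shape in Java; the class and method names are assumptions, and it does not use any RDF library.

```java
/** Illustrative subject-predicate-object triple. */
final class TripleSketch {

    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        @Override
        public String toString() {
            return "(" + subject + ", " + predicate + ", " + object + ")";
        }
    }

    /** The statement from the example above. */
    static Triple lassilaExample() {
        return new Triple("http://www.w3.org/Home/Lassila", "Creator", "Ora Lassila");
    }

    public static void main(String[] args) {
        System.out.println(lassilaExample());
    }
}
```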

3. Environment (Parallel Graph AnalytiX)

PGX

This time we use a graph toolkit called Parallel Graph AnalytiX (commonly known as PGX). PGX includes a graph query language, a variety of analytical features, and support for machine learning. The figure below gives an overview of PGX.

[Figure: overview of PGX]

Oracle Cloud Infrastructure

Here is a simple cloud architecture for using PGX. To keep things simple, the environment is built in a public subnet.

[Figure: cloud architecture for PGX]

The components are as follows.

Client server

Graph server

DB server

-- Autonomous Database (19c); DBCS also works
-- Create a graph with the OPG_APIS.CREATE_PG() procedure


SQL> Execute OPG_APIS.CREATE_PG('Graph',4,8,'USERS');
SQL> select table_name from user_tables;
TABLE_NAME
--------------------
GRAPHGT$
GRAPHIT$
GRAPHSS$
GRAPHVT$
GRAPHGE$

SQL> desc GRAPHVT$
 Name               Null?         Type
 ------------------ ---------- --------------------- 
 VID                NOT NULL   NUMBER
 VL                            NVARCHAR2(3100)
 K                             NVARCHAR2(3100)
 T                             NUMBER(38)
 V                             NVARCHAR2(15000)
 VN                            NUMBER
 VT                            TIMESTAMP(6) WITH TIME ZONE
 SL                            NUMBER
 VTS                           DATE
 VTE                           DATE
 FE                            NVARCHAR2(4000)

4. Up to issuing a graph query (PGQL)

RDB data

Property graph to create

A Google search turns up plenty of sample data, but using it would be dull, so I will build my own graph, modest as it is. I simply insert rows directly into the GRAPHVT$ and GRAPHGE$ tables on the RDB side.

[Figure: property graph to create]

Node creation

insert into GRAPHVT$ (VID,VL,T,K,V) values (1,'person',1,'name','Sato');
insert into GRAPHVT$ (VID,VL,T,K,VN) values(1,'person',2,'age',40);
insert into GRAPHVT$ (VID,VL,T,K,V) values(2,'person',1,'name','Suzuki');
insert into GRAPHVT$ (VID,VL,T,K,VN) values(2,'person',2,'age',20);
insert into GRAPHVT$ (VID,VL,T,K,V) values(3,'person',1,'name','Yamamoto');
insert into GRAPHVT$ (VID,VL,T,K,VN) values(3,'person',2,'age',35);
insert into GRAPHVT$ (VID,VL,T,K,V) values(4,'person',1,'name','Tanaka');
insert into GRAPHVT$ (VID,VL,T,K,VN) values(4,'person',2,'age',25);
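Each insert above stores one property of one vertex, so a vertex with two properties takes two rows that share a VID. A rough sketch of how such rows fold back into one property map per vertex follows. It is plain Java for illustration only, not what PGX does when it loads the graph; the Row class and method names are assumptions, and the data mirrors the inserts above (string values in V, numeric values in VN).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative grouping of key/value rows into per-vertex property maps. */
final class VertexRowsSketch {

    /** One row of the vertex table: a single property of a single vertex. */
    static final class Row {
        final long vid;
        final String label;
        final String key;
        final String v;   // string value, or null
        final Number vn;  // numeric value, or null

        Row(long vid, String label, String key, String v, Number vn) {
            this.vid = vid;
            this.label = label;
            this.key = key;
            this.v = v;
            this.vn = vn;
        }
    }

    /** The eight rows inserted above. */
    static List<Row> rows() {
        return List.of(
                new Row(1, "person", "name", "Sato", null),
                new Row(1, "person", "age", null, 40),
                new Row(2, "person", "name", "Suzuki", null),
                new Row(2, "person", "age", null, 20),
                new Row(3, "person", "name", "Yamamoto", null),
                new Row(3, "person", "age", null, 35),
                new Row(4, "person", "name", "Tanaka", null),
                new Row(4, "person", "age", null, 25));
    }

    /** Groups property rows by vertex ID. */
    static Map<Long, Map<String, Object>> toVertices(List<Row> rows) {
        Map<Long, Map<String, Object>> vertices = new LinkedHashMap<>();
        for (Row row : rows) {
            Object value = (row.v != null) ? row.v : row.vn;
            vertices.computeIfAbsent(row.vid, id -> new LinkedHashMap<>())
                    .put(row.key, value);
        }
        return vertices;
    }

    public static void main(String[] args) {
        toVertices(rows()).forEach((vid, props) ->
                System.out.println("vertex " + vid + " " + props));
    }
}
```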

Edge creation

create sequence graph_eid_seq; 
alter sequence graph_eid_seq restart;
insert into GRAPHGE$ (EID,SVID,DVID,EL,K,T,VN) values(graph_eid_seq.nextval,1,2,'knows','weight',3,0.5);
insert into GRAPHGE$ (EID,SVID,DVID,EL,K,T,VN) values(graph_eid_seq.nextval,1,4,'knows','weight',3,0.5);
insert into GRAPHGE$ (EID,SVID,DVID,EL,K,T,VN) values(graph_eid_seq.nextval,4,2,'likes','weight',3,0.8);
insert into GRAPHGE$ (EID,SVID,DVID,EL,K,T,VN) values(graph_eid_seq.nextval,4,3,'knows','weight',3,0.7);
insert into GRAPHGE$ (EID,SVID,DVID,EL,K,T,VN) values(graph_eid_seq.nextval,3,1,'knows','weight',3,0.9);

Connecting with JShell

This section shows how to connect using JShell.


[oracle@cli bin] curl -X POST -H 'Content-Type: application/json' -d '{"username": "***", "password": "***"}' http://10.51.0.2:7007/auth/token
-> If the command is entered correctly, an access token is returned

[oracle@cli bin] ./oracle-graph-client-20.3.0/bin/opg-jshell --base_url http://10.51.0.2:7007

enter authentication token (press Enter for no token): <- Paste the token obtained with the curl command
For an introduction type: /help intro
Oracle Graph Client Shell 20.3.0
PGX server version: 20.1.1 type: SM
PGX server API version: 3.8.1
PGQL version: 1.3
Variables instance, session, and analyst ready to use.

opg-jshell> GraphConfig cfg = GraphConfigBuilder.forPropertyGraphRdbms()
.setName("Graph")
.addVertexProperty("name",PropertyType.STRING)
.addVertexProperty("age",PropertyType.INTEGER)
.addEdgeProperty("weight",PropertyType.FLOAT)
.setLoadVertexLabels(true)
.setLoadEdgeLabel(true).build(); <-Define the graph to handle

opg-jshell> PgxGraph graph = session.readGraphWithProperties(cfg); <- Load the graph from the RDB into memory
graph ==> PgxGraph[name=Graph,N=4,E=5,created=1596986537591]

opg-jshell> graph.queryPgql("SELECT count(v) FROM Graph MATCH (v)").print(10).close(); <-PGQL(1)
+----------+
| count(v) |
+----------+
| 4        |
+----------+

opg-jshell> 
opg-jshell> graph.queryPgql("SELECT id(n), label(n),n.name as name1,n.age as age1,label(e), e.weight, id(m),label(m),m.name as name2,m.age as age2 FROM MATCH (n) -[e]-> (m)").print(10).close();
<- PGQL(2)
+---------------------------------------------------------------------------------------------+
| id(n) | label(n) | name1    | age1 | label(e) | weight | id(m) | label(m) | name2    | age2 |
+---------------------------------------------------------------------------------------------+
| 3     | person   | Yamamoto | 35   | knows    | 0.9    | 1     | person   | Sato     | 40   |
| 4     | person   | Tanaka   | 25   | knows    | 0.7    | 3     | person   | Yamamoto | 35   |
| 4     | person   | Tanaka   | 25   | likes    | 0.8    | 2     | person   | Suzuki   | 20   |
| 1     | person   | Sato     | 40   | knows    | 0.5    | 4     | person   | Tanaka   | 25   |
| 1     | person   | Sato     | 40   | knows    | 0.5    | 2     | person   | Suzuki   | 20   |
+---------------------------------------------------------------------------------------------+
opg-jshell>
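To make the pattern in PGQL(2) concrete: MATCH (n) -[e]-> (m) yields one result row per edge, with n bound to the source vertex and m to the destination. The sketch below reproduces that idea over the sample data in plain Java. It only illustrates the semantics and is not how PGX evaluates queries; the class and method names are assumptions, and no particular row order is implied.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative evaluation of the pattern (n) -[e]-> (m) over the sample graph. */
final class EdgeMatchSketch {

    static final class Edge {
        final long src;
        final long dst;
        final String label;
        final double weight;

        Edge(long src, long dst, String label, double weight) {
            this.src = src;
            this.dst = dst;
            this.label = label;
            this.weight = weight;
        }
    }

    /** Vertex ID -> name, as inserted into the vertex table. */
    static Map<Long, String> names() {
        Map<Long, String> names = new LinkedHashMap<>();
        names.put(1L, "Sato");
        names.put(2L, "Suzuki");
        names.put(3L, "Yamamoto");
        names.put(4L, "Tanaka");
        return names;
    }

    /** The five edges inserted into the edge table. */
    static List<Edge> edges() {
        return List.of(
                new Edge(1, 2, "knows", 0.5),
                new Edge(1, 4, "knows", 0.5),
                new Edge(4, 2, "likes", 0.8),
                new Edge(4, 3, "knows", 0.7),
                new Edge(3, 1, "knows", 0.9));
    }

    /** One output line per edge: source name, edge label and weight, destination name. */
    static List<String> matchAll(Map<Long, String> names, List<Edge> edges) {
        List<String> rows = new ArrayList<>();
        for (Edge e : edges) {
            rows.add(names.get(e.src) + " -[" + e.label + " " + e.weight + "]-> "
                    + names.get(e.dst));
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println("vertices: " + names().size());
        matchAll(names(), edges()).forEach(System.out::println);
    }
}
```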

Connecting from Java

Next is the connection from Java. This is unnecessary if you only use PGX through JShell, but a Java program is more convenient when you do not want to run each step interactively.

TokenConnect.java


import oracle.pgx.api.*;
import oracle.pgx.config.*;
import oracle.pg.rdbms.*;
import oracle.pgx.common.types.*;
import java.util.function.Supplier;

public class TokenConnect {
    public static void main(String[] args) throws Exception {
        // Arguments: the PGX server base URL and the token obtained with curl.
        String baseUrl = args[0];
        String token = args[1];
        ServerInstance instance = Pgx.getInstance(baseUrl, token);
        try (PgxSession session = instance.createSession("my-session")) {
            // Describe the graph stored in the RDB: its name, properties, and labels to load.
            Supplier<GraphConfig> cfg = () -> GraphConfigBuilder.forPropertyGraphRdbms()
                .setName("Graph")
                .addVertexProperty("name", PropertyType.STRING)
                .addVertexProperty("age", PropertyType.INTEGER)
                .addEdgeProperty("weight", PropertyType.FLOAT)
                .setLoadVertexLabels(true)
                .setLoadEdgeLabel(true)
                .build();

            // Load the graph from the RDB into memory.
            PgxGraph graph = session.readGraphWithProperties(cfg.get());
            System.out.println("N = " + graph.getNumVertices() + " <-> E = " + graph.getNumEdges());

            // Same query as PGQL(2) in the JShell session; it prints the result table.
            graph.queryPgql("SELECT id(n), label(n), n.name as name1, n.age as age1, "
                + "label(e), e.weight, id(m), label(m), m.name as name2, m.age as age2 "
                + "FROM MATCH (n) -[e]-> (m)").print(10).close();
        }
    }
}

[oracle@cli oracle-graph-client-20.3.0] javac -cp 'lib/*' TokenConnect.java
warning: Supported source version 'RELEASE_8' from annotation processor 'org.apache.tinkerpop.gremlin.process.traversal.dsl.GremlinDslProcessor' less than -source '11'
1 warning

[oracle@cli oracle-graph-client-20.3.0] java -cp '.:conf:lib/*' TokenConnect *baseUrl *Token
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.google.inject.internal.cglib.core.$ReflectUtils$1 (file:/home/opc/oracle-graph-client-20.3.0/lib/guice-4.2.2.jar) to method java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain)
WARNING: Please consider reporting this to the maintainers of com.google.inject.internal.cglib.core.$ReflectUtils$1
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release

N = 4 <-> E = 5
+---------------------------------------------------------------------------------------------+
| id(n) | label(n) | name1    | age1 | label(e) | weight | id(m) | label(m) | name2    | age2 |
+---------------------------------------------------------------------------------------------+
| 3     | person   | Yamamoto | 35   | knows    | 0.9    | 1     | person   | Sato     | 40   |
| 4     | person   | Tanaka   | 25   | knows    | 0.7    | 3     | person   | Yamamoto | 35   |
| 4     | person   | Tanaka   | 25   | likes    | 0.8    | 2     | person   | Suzuki   | 20   |
| 1     | person   | Sato     | 40   | knows    | 0.5    | 4     | person   | Tanaka   | 25   |
| 1     | person   | Sato     | 40   | knows    | 0.5    | 2     | person   | Suzuki   | 20   |
+---------------------------------------------------------------------------------------------+
[oracle@cli oracle-graph-client-20.3.0]

The PGQL query result is returned correctly.
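TokenConnect expects a token that was fetched by hand with curl. As a possible follow-on, the same POST to /auth/token could be issued from Java with the JDK's built-in HTTP client (Java 11 or later). This is a minimal sketch under several assumptions: the class and method names are made up, the JSON escaping covers only backslashes and double quotes, and the response body is returned as-is because its format is not shown in this article.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Illustrative Java equivalent of the curl call to /auth/token. */
final class TokenRequestSketch {

    /** Escapes backslashes and double quotes only (a simplification). */
    static String jsonEscape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Builds the same JSON body that the curl command sends. */
    static String body(String username, String password) {
        return "{\"username\": \"" + jsonEscape(username)
                + "\", \"password\": \"" + jsonEscape(password) + "\"}";
    }

    /** Builds the POST request without sending it. */
    static HttpRequest tokenRequest(String baseUrl, String username, String password) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/auth/token"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body(username, password)))
                .build();
    }

    /** Sends the request and returns the raw response body without parsing it. */
    static String fetchRawResponse(String baseUrl, String username, String password)
            throws Exception {
        HttpResponse<String> response = HttpClient.newHttpClient().send(
                tokenRequest(baseUrl, username, password),
                HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.out.println("usage: TokenRequestSketch <baseUrl> <username> <password>");
            return;
        }
        System.out.println(fetchRawResponse(args[0], args[1], args[2]));
    }
}
```

Keeping the request construction separate from the network call makes it possible to inspect the URL, headers, and body without a running graph server.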

5. Summary

-- I ran queries against a graph I built myself, using PGX.
-- I confirmed that the servers can be split into three tiers and connected remotely.
-- I also confirmed that PGX can be used from both JShell and Java.

Reference

-- Installation procedure: PGX documentation