dev-resources.site
for different kinds of informations.
YugabyteDB as a Graph database with PuppyGraph
PostgreSQL can handle any workload without streaming data to a purpose-built database. This includes graph databases using PuppyGraph, which maps relational tables to a graph model.
When resilience, scalability, and elasticity are necessary, YugabyteDB provides a distributed alternative to PostgreSQL.
This article details the step-by-step tutorial to using PuppyGraph to query data in PostgreSQL, but using YugabyteDB instead of PostgreSQL.
Deployment
I Create a file docker-compose.yaml
with the following content:
services:
puppygraph:
image: puppygraph/puppygraph:stable
pull_policy: always
container_name: puppygraph
environment:
- PUPPYGRAPH_USERNAME=puppygraph
- PUPPYGRAPH_PASSWORD=puppygraph123
networks:
yugabytedb_net:
ports:
- "8081:8081"
- "8182:8182"
- "7687:7687"
yugabytedb:
image: yugabytedb/yugabyte:2024.1.2.0-b77
command: yugabyted start --enable_pg_parity_tech_preview --background=false
environment:
- YSQL_USER=puppygraph
- YSQL_PASSWORD=puppygraph123
networks:
yugabytedb_net:
ports:
- "5433:5433"
volumes:
- ./yugabyte-data:/root/var/data
networks:
yugabytedb_net:
name: puppy-yugabytedb
I run the following command to start YugabyteDB and PuppyGraph:
docker compose up -d
I can connect on http://localhost:8080/
Data Preparation
I run the following command to start ysqlsh
, the PostgreSQL shell, to access the database:
docker compose run -it -e PGPASSWORD=puppygraph123 -e PGUSER=puppygraph -e PGHOST=yugabytedb -e PGPORT=5433 -e PGDATABASE=yugabyte yugabytedb ysqlsh
I execute the following SQL statements in the shell to create tables and insert data:
create schema modern;
create table modern.person (id text, name text, age integer);
insert into modern.person values
('v1', 'marko', 29),
('v2', 'vadas', 27),
('v4', 'josh', 32),
('v6', 'peter', 35);
create table modern.software (id text, name text, lang text);
insert into modern.software values
('v3', 'lop', 'java'),
('v5', 'ripple', 'java');
create table modern.created (id text, from_id text, to_id text, weight double precision);
insert into modern.created values
('e9', 'v1', 'v3', 0.4),
('e10', 'v4', 'v5', 1.0),
('e11', 'v4', 'v3', 0.4),
('e12', 'v6', 'v3', 0.2);
create table modern.knows (id text, from_id text, to_id text, weight double precision);
insert into modern.knows values
('e7', 'v1', 'v2', 0.5),
('e8', 'v1', 'v4', 1.0);
analyze;
Modeling a Graph
I use the PuppyGraph schema file schema.json
with the following content:
{
"catalogs": [
{
"name": "postgres_data",
"type": "postgresql",
"jdbc": {
"username": "puppygraph",
"password": "puppygraph123",
"jdbcUri": "jdbc:postgresql://yugabytedb:5433/yugabyte",
"driverClass": "org.postgresql.Driver"
}
}
],
"vertices": [
{
"label": "person",
"mappedTableSource": {
"catalog": "postgres_data",
"schema": "modern",
"table": "person",
"metaFields": {
"id": "id"
}
},
"attributes": [
{
"type": "Int",
"name": "age"
},
{
"type": "String",
"name": "name"
}
]
},
{
"label": "software",
"mappedTableSource": {
"catalog": "postgres_data",
"schema": "modern",
"table": "software",
"metaFields": {
"id": "id"
}
},
"attributes": [
{
"type": "String",
"name": "lang"
},
{
"type": "String",
"name": "name"
}
]
}
],
"edges": [
{
"label": "knows",
"mappedTableSource": {
"catalog": "postgres_data",
"schema": "modern",
"table": "knows",
"metaFields": {
"id": "id",
"from": "from_id",
"to": "to_id"
}
},
"from": "person",
"to": "person",
"attributes": [
{
"type": "Double",
"name": "weight"
}
]
},
{
"label": "created",
"mappedTableSource": {
"catalog": "postgres_data",
"schema": "modern",
"table": "created",
"metaFields": {
"id": "id",
"from": "from_id",
"to": "to_id"
}
},
"from": "person",
"to": "software",
"attributes": [
{
"type": "Double",
"name": "weight"
}
]
}
]
}
I upload the schema by selecting the file schema.json in the
Querying the graph via CLI
I execute the following command to access the PuppyGraph Gremlin Console
docker compose exec -it puppygraph ./bin/console
I run the following queries in the console to query the Graph:
puppy-gremlin> g.V().has("name", "marko").valueMap()
Done! Elapsed time: 0.359s, rows: 1
==>map[age:29 name:marko]
puppy-gremlin> g.V().has("name", "marko").out("knows").out("created").valueMap()
Done! Elapsed time: 0.108s, rows: 2
==>map[lang:java name:lop]
==>map[lang:java name:ripple]
puppy-gremlin>
SQL Queries
I refresh the local cache to check from pg_stat_statements
that the queries are fast:
yugabyte=# select total_time/calls as time, calls, substr(query,1,80)
from pg_stat_statements order by 1
;
Scale-Out YugabyteDB
To add more nodes to the YugabyteDB cluster, you can add a service that starts YugabyteDB and join the first node. Add the following to docker-compose.yaml
:
more-nodes:
image: yugabytedb/yugabyte:2024.1.2.0-b77
command: yugabyted start --join yugabytedb --enable_pg_parity_tech_preview --background=false
networks:
yugabytedb_net:
deploy:
replicas: 0
Start a second node with:
docker compose up -d --scale more-nodes=1
Start a third node with:
docker compose up -d --scale more-nodes=2
With three nodes, the cluster is automatically set to Replication Factor 3 and becomes resilient to one node failure
Conclusion
This tutorial demonstrates using PuppyGraph to query YugabyteDB as a graph database, leveraging the PostgreSQL compatibility of YugabyteDB's distributed architecture. It follows the simple deployment steps from PuppyGraph documentation.
With PuppyGraph, you can directly query your existing data stores without needing ETL pipelines. PuppyGraph and YugabyteDB together provide a powerful solution for modern graph-based analytics.
Featured ones: