Review of Graph Databases

A walkthrough of the graph databases, query languages and managed solutions available at the time.

Arturas Smorgun

September 05, 2018

Transcript

  1. (review of graph databases) ( )— —>( )<— —( )

    Graph Recap -> Types of Graph Solutions -> Query Languages -> Graphs in AWS
  2. Review of Graph Databases, Arturas Smorgun, 2018 Domain: encyclopedia

    Vertices: • Author • Page
    Edges: • Page relates to other page • Author contributes to a page (authors, edits, reviews)
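
A minimal sketch of this domain as plain Python data, not tied to any graph database. The `Vertex` and `Graph` classes and the edge labels `authored` and `relates_to` are hypothetical names chosen for illustration; the author and page names are the sample data used in the deck's insert examples.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Vertex:
    kind: str  # "Author" or "Page"
    name: str


@dataclass
class Graph:
    # Each edge is stored as a (source, label, target) tuple.
    edges: list = field(default_factory=list)

    def add_edge(self, source: Vertex, label: str, target: Vertex) -> None:
        self.edges.append((source, label, target))

    def targets(self, source: Vertex, label: str) -> list:
        """Return every vertex reached from `source` over edges named `label`."""
        return [t for s, l, t in self.edges if s == source and l == label]


david = Vertex("Author", "David Smith")
city = Vertex("Page", "City")
owl = Vertex("Page", "Owl")
branch = Vertex("Page", "Branch")
some = Vertex("Page", "Some")

graph = Graph()
graph.add_edge(david, "authored", city)   # author contributes to a page
graph.add_edge(david, "authored", owl)
graph.add_edge(branch, "relates_to", some)  # page relates to another page

print([page.name for page in graph.targets(david, "authored")])
```

Storing edges as labelled tuples keeps "authors", "edits" and "reviews" as different labels on the same kind of edge, which is roughly how both RDF predicates and LPG relationship types treat them.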
  3. Graph Algorithms FTW!!

    • Breadth First Search • Depth First Search • Shortest Path • Minimum Spanning Tree • Maximum Flow • Connectivity • …
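
As an illustration of two items on this list, here is a small sketch of breadth-first search used to find a shortest path in an unweighted graph. The `PAGES` adjacency map is hypothetical sample data (page names borrowed from the deck, links invented for the example), and `shortest_path` is an illustrative helper, not part of any library.

```python
from collections import deque


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns the first (fewest-hop) path found, or None."""
    queue = deque([[start]])  # queue of partial paths
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in adjacency.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


# Hypothetical "relates to" links between pages (directed).
PAGES = {
    "City": ["Owl", "Branch"],
    "Owl": ["Some"],
    "Branch": ["Some"],
    "Some": [],
}

print(shortest_path(PAGES, "City", "Some"))
print(shortest_path(PAGES, "Some", "City"))
```

Because BFS explores the graph level by level, the first path that reaches the goal has the fewest edges; for weighted edges an algorithm such as Dijkstra's would be needed instead.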
  4. Types of Graph Solutions

    Graph Recap -> Types of Graph Solutions -> Query Languages -> Graphs in AWS
  5. Resource Description Framework (RDF)

    • W3C Recommendation • Open Source • since 1999 • Information Exchange • Semantic Web
  6. Labeled Property Graph (LPG)

    • Not sure who was first • Some open source • Some proprietary • Emphasis on querying and traversal
  7. RDF vs LPG

    RDF: • Vertex = Resource • Edge = Relationship • Qualify relationships: NO • Identify relationship: NO • Triple: subject, predicate, object • Data mobility: YES • Verifiability: YES
    LPG: • Vertex = Node • Edge = Relationship • Qualify relationships: YES • Identify relationship: YES • No strictly defined structure • Data mobility: Maybe? • Verifiability: No?
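
The difference between the two models can be sketched with plain Python structures. This follows the slide's framing only: a triple has no slot for an identifier or properties on the relationship, while an LPG relationship has both. The dictionary layout, the `objects` helper, the ids `a1`/`p1`/`r1` and the `year` property are all assumptions made for illustration, not any database's storage format.

```python
# RDF style: the graph is a set of (subject, predicate, object) triples.
triples = {
    ("author_1", "name", "David Smith"),
    ("author_1", "authored", "page_1"),
    ("page_1", "name", "City"),
}


def objects(graph, subject, predicate):
    """Return all objects of triples matching the given subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


# LPG style: nodes and relationships each carry labels/types and properties.
nodes = {
    "a1": {"label": "AUTHOR", "properties": {"name": "David Smith"}},
    "p1": {"label": "PAGE", "properties": {"name": "City"}},
}
relationships = {
    # The relationship has its own id ("r1") and can be qualified with
    # properties; "year" is a hypothetical property for this example.
    "r1": {"type": "AUTHORED", "start": "a1", "end": "p1",
           "properties": {"year": 2018}},
}

print(objects(triples, "author_1", "authored"))
print(relationships["r1"]["properties"])
```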
  8. Graph DB vs Relational DB

    Graph DB: • Good for recursive loose schema • Vertices for entities • Edges for relationships • Easy to add new entity types • Easy to add new relationships • Easy to traverse relationships
    Relational DB: • Good for flat strict schema • Tables for entities • Tables or FK for relationships • Easy to add new entity types • Easy to add new relationships • Hard to traverse relationships
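
To make the "hard to traverse relationships" point concrete, here is a sketch of a multi-hop traversal in a relational database, using SQLite from Python's standard library and a recursive common table expression. The table names `page` and `related_to` and the City -> Branch link are assumptions for the example; only Branch -> Some comes from the deck's sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Relationships live in a join table with two foreign keys.
CREATE TABLE related_to (
    from_id INTEGER REFERENCES page(id),
    to_id   INTEGER REFERENCES page(id)
);
INSERT INTO page VALUES (1, 'City'), (2, 'Owl'), (3, 'Branch'), (4, 'Some');
INSERT INTO related_to VALUES (1, 3), (3, 4);
""")

# Pages reachable from a start page within a maximum number of hops.
REACHABLE = """
WITH RECURSIVE reachable(id, depth) AS (
    SELECT to_id, 1 FROM related_to WHERE from_id = ?
    UNION
    SELECT r.to_id, reachable.depth + 1
    FROM related_to AS r
    JOIN reachable ON r.from_id = reachable.id
    WHERE reachable.depth < ?
)
SELECT page.name, MIN(reachable.depth)
FROM reachable
JOIN page ON page.id = reachable.id
GROUP BY page.name
ORDER BY MIN(reachable.depth), page.name
"""

rows = conn.execute(REACHABLE, (1, 3)).fetchall()
print(rows)
```

The query works, but the traversal logic has to be spelled out as a recursive join, whereas a graph query language expresses the same thing as a path pattern.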
  9. Graph DB vs Document DB

    Graph DB: • Loose schema with nested relations • Vertices for entities • Edges for relationships • Easy to add new entity types • Easy to add new relationships • Easy to traverse relationships
    Document DB: • Loose schema with duplications • Document with all relations • Document per use case • Easy to add new entity types • Hard to add new relationships • Hard to traverse relationships
  10. Query Languages

    Graph Recap -> Types of Graph Solutions -> Query Languages -> Graphs in AWS
  11. SPARQL

    • For RDF Graphs • W3C Recommendation • widely used • not XML, yay (though RDF documents can be XML)

    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name
    WHERE { ?person foaf:name ?name . }
  12. Cypher

    • For Labeled Property Graphs • created by Neo4j • open sourced and widely available now • "SQL of Graph Databases" (striving to be)

    // $value is a query parameter supplied by the caller
    MATCH (node1:Label1)-[rel]->(node2:Label2)
    WHERE node1.propertyA = $value
    RETURN node2.propertyA, node2.propertyB
  13. Gremlin

    • For Labeled Property Graphs • Apache Project • open sourced and used for many different databases and processors • very similar to Cypher, but I found the tooling and documentation poorer

    g.V().has("name","gremlin").
      out("knows").
      out("knows").
      values("name")
  14. GraphQL

    • NOT a graph DB query language • started by Facebook • aimed at API communication • very useful to query existing data

    {
      hero {
        name
      }
    }
  15. Insert in SPARQL (RDF XML) vs Cypher

    RDF/XML:
    <?xml version="1.0"?>
    <rdf:RDF xmlns="http://www.w3.org/2002/07/owl"
             xml:base="http://www.w3.org/2002/07/owl"
             xmlns:x="http://www.example.org/"
             xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
             xmlns:owl="http://www.w3.org/2002/07/owl#"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
             xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <rdf:Description rdf:about="author_1">
        <x:name>David Smith</x:name>
        <x:authored rdf:resource="page_1"></x:authored>
        <x:authored rdf:resource="page_2"></x:authored>
        <x:authored rdf:resource="page_4"></x:authored>
      </rdf:Description>
      <rdf:Description rdf:about="author_2">
        <x:name>Jack Jones</x:name>
        <x:authored rdf:resource="page_3"></x:authored>
        <x:edited rdf:resource="page_2"></x:edited>
        <x:reviewed rdf:resource="page_1"></x:reviewed>
      </rdf:Description>
      <rdf:Description rdf:about="author_3">
        <x:name>James James</x:name>
        <x:authored rdf:resource="page_4"></x:authored>
      </rdf:Description>
      <rdf:Description rdf:about="page_1">
        <x:name>City</x:name>
      </rdf:Description>
      <rdf:Description rdf:about="page_2">
        <x:name>Owl</x:name>
      </rdf:Description>
      <rdf:Description rdf:about="page_3">
        <x:name>Branch</x:name>
        <x:related_to rdf:resource="page_4"></x:related_to>
      </rdf:Description>
      <rdf:Description rdf:about="page_4">
        <x:name>Some</x:name>
      </rdf:Description>
    </rdf:RDF>

    Cypher:
    CREATE (a1:AUTHOR {name: "David Smith"})
    CREATE (a2:AUTHOR {name: "Jack Jones"})
    CREATE (a3:AUTHOR {name: "James James"})
    CREATE (p1:PAGE {name: "City"})
    CREATE (p2:PAGE {name: "Owl"})
    CREATE (p3:PAGE {name: "Branch"})
    CREATE (p4:PAGE {name: "Some"})
    CREATE (a1)-[:AUTHORED]->(p1)
    CREATE (a1)-[:AUTHORED]->(p2)
    CREATE (a1)-[:AUTHORED]->(p4)
    CREATE (a2)-[:EDITED]->(p2)
    CREATE (a2)-[:AUTHORED]->(p3)
    CREATE (a2)-[:REVIEWED]->(p1)
    CREATE (a3)-[:EDITED]->(p3)
    CREATE (p3)-[:RELATED_TO]->(p4)
  16. Insert in SPARQL (Turtle) vs Cypher

    SPARQL:
    PREFIX x: <http://example.com/>
    INSERT DATA {
      <http://example.com/author_1> x:name "David Smith" ;
        x:authored <http://example.com/page_1> ;
        x:authored <http://example.com/page_2> ;
        x:authored <http://example.com/page_4> .
      <http://example.com/author_2> x:name "Jack Jones" ;
        x:authored <http://example.com/page_3> ;
        x:edited <http://example.com/page_2> ;
        x:reviewed <http://example.com/page_1> .
      <http://example.com/author_3> x:name "James James" ;
        x:authored <http://example.com/page_4> .
      <http://example.com/page_1> x:name "City" .
      <http://example.com/page_2> x:name "Owl" .
      <http://example.com/page_3> x:name "Branch" ;
        x:related_to <http://example.com/page_4> .
      <http://example.com/page_4> x:name "Some" .
    }

    Cypher:
    CREATE (a1:AUTHOR {name: "David Smith"})
    CREATE (a2:AUTHOR {name: "Jack Jones"})
    CREATE (a3:AUTHOR {name: "James James"})
    CREATE (p1:PAGE {name: "City"})
    CREATE (p2:PAGE {name: "Owl"})
    CREATE (p3:PAGE {name: "Branch"})
    CREATE (p4:PAGE {name: "Some"})
    CREATE (a1)-[:AUTHORED]->(p1)
    CREATE (a1)-[:AUTHORED]->(p2)
    CREATE (a1)-[:AUTHORED]->(p4)
    CREATE (a2)-[:EDITED]->(p2)
    CREATE (a2)-[:AUTHORED]->(p3)
    CREATE (a2)-[:REVIEWED]->(p1)
    CREATE (a3)-[:EDITED]->(p3)
    CREATE (p3)-[:RELATED_TO]->(p4)
  17. Select in SPARQL vs Cypher

    SPARQL:
    # select all
    PREFIX x: <http://example.com/>
    SELECT DISTINCT ?g WHERE { GRAPH ?g { ?s ?p ?o } }

    # select ordered authors
    PREFIX x: <http://example.com/>
    SELECT ?s ?name ?o
    WHERE { GRAPH ?g { ?s x:authored ?o } . ?s x:name ?name . }
    ORDER BY ?name

    Cypher:
    // select all
    MATCH (n1)-[r]->(n2)
    RETURN r, n1, n2

    // select ordered authors
    MATCH (author)-[r:AUTHORED]->(page)
    RETURN author, r, page
    ORDER BY author.name
  18. Update in SPARQL vs Cypher

    SPARQL:
    PREFIX x: <http://example.com/>
    DELETE { <http://example.com/author_1> x:name ?o }
    INSERT { <http://example.com/author_1> x:name "John Smith" }
    WHERE  { <http://example.com/author_1> x:name ?o }

    Cypher:
    MATCH (a:AUTHOR {name: "John Smith"})
    SET a.name = "David Smith"
    RETURN a
  19. Select all related pages in SPARQL vs Cypher

    SPARQL:
    // not sure …

    Cypher:
    // update graph
    MATCH (p:PAGE {name: "Some"})
    CREATE (p5:PAGE {name: "Some More"})
    CREATE (p5)-[:RELATED_TO]->(p)

    // select
    MATCH (a:AUTHOR {name: "Jack Jones"})-[:AUTHORED]->(ap:PAGE),
          (ap)-[:RELATED_TO*1..3]-(other:PAGE)
    RETURN a, ap, other
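
What the variable-length pattern `-[:RELATED_TO*1..3]-` asks for can be approximated outside a database as a depth-limited, undirected breadth-first search. The sketch below is only an approximation: Cypher returns one row per matching path, while this hypothetical `related_within` helper returns each reachable page once, with its hop count. The edge list mirrors the deck's data after the "Some More" page is added.

```python
from collections import deque

# RELATED_TO edges from the deck, including the newly added "Some More" page.
RELATED_TO = [("Branch", "Some"), ("Some More", "Some")]


def related_within(edges, start, max_hops=3):
    """Pages reachable from `start` in 1..max_hops hops, ignoring edge direction."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    hops = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if hops[page] == max_hops:
            continue  # do not expand beyond the hop limit
        for other in sorted(neighbours.get(page, ())):
            if other not in hops:
                hops[other] = hops[page] + 1
                queue.append(other)

    del hops[start]  # the pattern requires at least one hop
    return hops


# "Branch" is the page authored by Jack Jones in the sample data.
print(related_within(RELATED_TO, "Branch"))
```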
  20. Ready solutions on AWS

    Graph Recap -> Types of Graph Solutions -> Query Languages -> Graphs in AWS
  21. Amazon Neptune vs Neo4j Enterprise

    Amazon Neptune: • AWS Native Service • Managed • LPG & RDF • Query languages: SPARQL, Gremlin (can do Cypher) • ACID: YES • Limitation: 64TB of data • Proprietary and vendor locked (but compatible) • No dev environment (use compatible services)
    Neo4j Enterprise: • AWS Marketplace Solution • Hosted (support can be bought) • LPG (RDF via community plugin) • Query language: Cypher • ACID: YES? • Limitation: 34.4B nodes • Open source and graph native • Has a dev environment (native or Docker)
  22. HA: Amazon Neptune vs Neo4j Enterprise

    Amazon Neptune: • Master-Slave • Automatic backups to S3 • Replicated across AZ • Failover in 30 seconds
    Neo4j Enterprise: • Causal Graph
  23. Internals: Amazon Neptune vs Neo4j Enterprise

    Amazon Neptune: • ¯\_(ツ)_/¯
    Neo4j Enterprise: • Graph Native
  24. Cost: Amazon Neptune vs Neo4j Enterprise

    Amazon Neptune: • Pay as you go, no upfront cost • $0.30 to $5.50 per hour (compute) • $0.10 per GB per month (storage) • $0.20 per 1 million requests • backups, replication, disaster recovery included
    Neo4j Enterprise: • No upfront cost (check license) • pay for compute resources used • pay for storage used • pay for network traffic • commercial support is very expensive (reportedly ~$200k), and you need devops to maintain it
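
A rough worked example of how the Neptune rates quoted on this slide combine into a monthly figure. The rates are the 2018 numbers from the slide (using the cheapest hourly price); the workload (730 instance hours, 100 GB stored, 50 million requests) is entirely hypothetical, and `neptune_monthly_cost` is an illustrative helper rather than an AWS pricing API.

```python
# Rates as quoted on the slide (2018); current pricing may differ.
NEPTUNE_RATES = {
    "instance_per_hour": 0.30,       # cheapest compute option on the slide
    "storage_per_gb_month": 0.10,
    "per_million_requests": 0.20,
}


def neptune_monthly_cost(hours, storage_gb, requests, rates=NEPTUNE_RATES):
    """Sum compute, storage and request charges for one month, in USD."""
    compute = hours * rates["instance_per_hour"]
    storage = storage_gb * rates["storage_per_gb_month"]
    io = (requests / 1_000_000) * rates["per_million_requests"]
    return round(compute + storage + io, 2)


# Hypothetical workload: one instance running all month, 100 GB, 50M requests.
estimate = neptune_monthly_cost(hours=730, storage_gb=100, requests=50_000_000)
print(f"${estimate:.2f} per month")
```

Under these assumptions compute dominates the bill (about $219 of roughly $239), so the instance size matters far more than storage or request volume at this scale.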
  25. (CYPHER)-[:IS]->(GREAT)

    Neo4j has awesome tools to start quickly:
    https://neo4j.com/download/
    https://neo4j.com/sandbox-v2/