Riak Use Cases - Dissecting The Solutions To Hard Problems

Presented on November 29, 2012, by Basho Hacker Chris Molozian. This is a great slide deck for anyone looking to get a better impression of Riak, where it fits in the database space, and how people are using it in production.

Basho Technologies

November 29, 2012

Transcript

  1. Riak Use Cases: Dissecting the Solutions to Hard Problems • NoSQL Roadshow - Amsterdam • 29 / 11 / 2012 • Friday, 30 November 12
  2. $ whoami • Name: Matthew Revell • Title: Community Manager • Company: Basho Technologies • Twitter: @matthewrevell
  3. $ whoami • Name: Matthew Revell • Title: Community Manager • Company: Basho Technologies • Twitter: @matthewrevell • $ ./presentation
  4. NoSQL • "departs from the relational model altogether; it should therefore have been called more appropriately 'NoREL'" ~ Carlo Strozzi
  5. NoSQL • "departs from the relational model altogether; it should therefore have been called more appropriately 'NoREL'" ~ Carlo Strozzi • Divided into a (growing) list of categories (the more exotic ones include Multivalue and Tuple stores) • All are “optimized” for record storage • Arguably the largest categories are:
  6. NoSQL • "departs from the relational model altogether; it should therefore have been called more appropriately 'NoREL'" ~ Carlo Strozzi • Divided into a (growing) list of categories (the more exotic ones include Multivalue and Tuple stores) • All are “optimized” for record storage • Arguably the largest categories are: Graph
  7. NoSQL • "departs from the relational model altogether; it should therefore have been called more appropriately 'NoREL'" ~ Carlo Strozzi • Divided into a (growing) list of categories (the more exotic ones include Multivalue and Tuple stores) • All are “optimized” for record storage • Arguably the largest categories are: Graph, Key-Value
  8. NoSQL • "departs from the relational model altogether; it should therefore have been called more appropriately 'NoREL'" ~ Carlo Strozzi • Divided into a (growing) list of categories (the more exotic ones include Multivalue and Tuple stores) • All are “optimized” for record storage • Arguably the largest categories are: Graph, Key-Value, Document
  9. Graph • Data is represented using: • Nodes - an entity of some kind (e.g. User)
  10. Graph • Data is represented using: • Nodes - an entity of some kind (e.g. User) • Edges - the relationship between nodes
  11. Graph • Data is represented using: • Nodes - an entity of some kind (e.g. User) • Edges - the relationship between nodes • Use when the important data is in the edges
  12. Graph • Data is represented using: • Nodes - an entity of some kind (e.g. User) • Edges - the relationship between nodes • Use when the important data is in the edges • [Diagram: three nodes and three edges]
  13. Document • Data is represented using: • Documents - A flexible collection of k/v pairs (fields) • Use when most queries are not by primary key
  14. Document • Data is represented using: • Documents - A flexible collection of k/v pairs (fields) • Use when most queries are not by primary key • Example document: UserID (key): test_user, Name: Chris, Job: Engineer
  15. Key-Value • Data is represented using: • Record - A key/value pair • Use when most queries are by primary key or you can denormalize the data problem to k/v pairs
  16. Key-Value • Data is represented using: • Record - A key/value pair • Use when most queries are by primary key or you can denormalize the data problem to k/v pairs • [Diagram: a namespace containing key/value pairs]
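The namespace/key/value picture on the slide above can be sketched as a dictionary of dictionaries. This is only an in-memory illustration of the data model, not Riak or any client library; the class name `KeyValueStore`, the `users` namespace and the embedded `game_states` list are all hypothetical.

```python
from collections import defaultdict
from typing import Any, Optional


class KeyValueStore:
    """Toy in-memory model: namespace -> key -> value (illustrative only)."""

    def __init__(self) -> None:
        self._namespaces: defaultdict = defaultdict(dict)

    def put(self, namespace: str, key: str, value: Any) -> None:
        self._namespaces[namespace][key] = value

    def get(self, namespace: str, key: str) -> Optional[Any]:
        # Every read is a primary-key lookup; there is no secondary query path.
        return self._namespaces[namespace].get(key)


store = KeyValueStore()
# Denormalized record: related data is embedded so that one key fetches it all.
store.put("users", "test_user", {
    "name": "Chris",
    "job": "Engineer",
    "game_states": [{"level": 3}, {"level": 4}],  # hypothetical embedded data
})
user = store.get("users", "test_user")
```

Embedding the related data in the value is what "denormalize to k/v pairs" amounts to here: the cost is duplicated data, the gain is a single lookup per request.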
  17. What is Riak? • Distributed key/value store + extras • Advanced query features
  18. What is Riak? • Distributed key/value store + extras • Advanced query features • Pre/Post commit hooks
  19. What is Riak? • Distributed key/value store + extras • Advanced query features • Pre/Post commit hooks • Multiple storage engines
  20. What is Riak? • Distributed key/value store + extras • Advanced query features • Pre/Post commit hooks • Multiple storage engines • Scales Linearly + Fault Tolerant
  21. What is Riak? • Distributed key/value store + extras • Advanced query features • Pre/Post commit hooks • Multiple storage engines • Scales Linearly + Fault Tolerant • Open Source (Apache 2.0)
  22. What is Riak? • Distributed key/value store + extras • Advanced query features • Pre/Post commit hooks • Multiple storage engines • Scales Linearly + Fault Tolerant • Open Source (Apache 2.0) • Written in Erlang/OTP
  23. Tunable Consistency • We haven’t solved CAP; no one has • With Riak, you tune the CAP values:
  24. Tunable Consistency • We haven’t solved CAP; no one has • With Riak, you tune the CAP values: • N: number of instances of your data
  25. Tunable Consistency • We haven’t solved CAP; no one has • With Riak, you tune the CAP values: • N: number of instances of your data • R: number of nodes Riak reads from
  26. Tunable Consistency • We haven’t solved CAP; no one has • With Riak, you tune the CAP values: • N: number of instances of your data • R: number of nodes Riak reads from • W: number of nodes Riak writes to, before optional further replication
  27. Tunable Consistency • We haven’t solved CAP; no one has • With Riak, you tune the CAP values: • N: number of instances of your data • R: number of nodes Riak reads from • W: number of nodes Riak writes to, before optional further replication • Per cluster, per bucket or per operation
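One way to reason about N, R and W is the usual quorum arithmetic for replicated stores: a read of R replicas is certain to touch at least one replica from the latest write of W replicas only when R + W > N. The slides do not state this rule, so treat the sketch below as general background rather than a description of Riak's behaviour under failures. The function names are hypothetical, and the brute-force check exists only to show that the inequality matches the overlap claim.

```python
from itertools import combinations


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """Arithmetic rule: any R replicas and any W replicas out of N must intersect."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


def quorums_overlap_brute_force(n: int, r: int, w: int) -> bool:
    """Check every possible read set against every possible write set."""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(replicas, r)
        for write_set in combinations(replicas, w)
    )


# With N=3: R=2/W=2 overlaps, R=1/W=1 favours latency and availability instead.
settings = [(3, 2, 2), (3, 1, 1), (3, 1, 3), (3, 3, 1)]
results = {s: quorums_overlap(*s) for s in settings}
```

Lower R and W values trade that overlap guarantee for faster, more available operations, which is the tuning the slide refers to.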
  28. Conflict Resolution (1) • Concurrent actors modifying the same data cause data divergence.
  29. Conflict Resolution (1) • Concurrent actors modifying the same data cause data divergence. • Riak provides two solutions to manage this:
  30. Conflict Resolution (1) • Concurrent actors modifying the same data cause data divergence. • Riak provides two solutions to manage this: • Last Write Wins: naive approach, but works for some use cases
  31. Conflict Resolution (1) • Concurrent actors modifying the same data cause data divergence. • Riak provides two solutions to manage this: • Last Write Wins: naive approach, but works for some use cases • Vector Clocks: retain “sibling” copies of data for merging
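To make the vector clock idea concrete, here is a minimal sketch of how two versions can be classified as "one descends from the other" or "concurrent" (the case in which siblings would be kept). It is a textbook formulation written for illustration, not Riak's internal implementation; the actor names `app1` and `app2` and all function names are assumptions.

```python
from typing import Dict

VClock = Dict[str, int]  # actor id -> number of updates seen from that actor


def increment(clock: VClock, actor: str) -> VClock:
    """Return a copy of the clock with the actor's counter advanced by one."""
    updated = dict(clock)
    updated[actor] = updated.get(actor, 0) + 1
    return updated


def descends(a: VClock, b: VClock) -> bool:
    """True if version a has seen every update that version b has seen."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def concurrent(a: VClock, b: VClock) -> bool:
    """Neither version descends from the other: both must be kept as siblings."""
    return not descends(a, b) and not descends(b, a)


base = increment({}, "app1")        # first write
left = increment(base, "app1")      # app1 updates what it read
right = increment(base, "app2")     # app2 updates the same base independently
```

Here `left` and `right` both descend from `base` but not from each other, so neither can safely overwrite the other; Last Write Wins would instead keep whichever write carries the later timestamp.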
  32. Conflict Resolution (2) • [Diagram: three Apps on VMs, a load balancer (LB), and a Riak Cluster of five nodes]
  33. Conflict Resolution (2) • Application layer timestamps, with siblings • [Diagram: three Apps on VMs, a load balancer (LB), and a Riak Cluster of five nodes]
  34. Conflict Resolution (3) • [Diagram: five Apps, a load balancer (LB), and a Riak Cluster of five nodes]
  35. Conflict Resolution (3) • Application layer business logic, with siblings • [Diagram: five Apps, a load balancer (LB), and a Riak Cluster of five nodes]
  36. Sibling Handling • "We don't ever do conflict resolution by picking a random sibling."
  37. Sibling Handling • "We don't ever do conflict resolution by picking a random sibling." • "For an array property, we often take the union of all values in all siblings. This works great for array properties that we only ever add to."
  38. Sibling Handling • "We don't ever do conflict resolution by picking a random sibling." • "For an array property, we often take the union of all values in all siblings. This works great for array properties that we only ever add to." • "We often take the maximum sibling value or the minimum sibling value, depending on the semantics of that attribute."
  39. Sibling Handling • "We don't ever do conflict resolution by picking a random sibling." • "For an array property, we often take the union of all values in all siblings. This works great for array properties that we only ever add to." • "We often take the maximum sibling value or the minimum sibling value, depending on the semantics of that attribute." ~ Myron Marston, SEOmoz
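The merge rules quoted above (union for add-only arrays, maximum or minimum for scalar attributes) can be sketched as a deterministic function over sibling values. The field names `tags`, `last_seen` and `first_seen` are invented for illustration and do not come from SEOmoz's code; siblings are modelled as plain dictionaries rather than objects from a Riak client.

```python
from typing import Any, Dict, List

Sibling = Dict[str, Any]


def merge_siblings(siblings: List[Sibling]) -> Sibling:
    """Resolve conflicting copies with per-attribute rules, never a random pick."""
    if not siblings:
        raise ValueError("at least one sibling is required")
    return {
        # Add-only array property: union of the values from every sibling.
        "tags": sorted(set().union(*(s["tags"] for s in siblings))),
        # Attribute where the latest observation is the meaningful one.
        "last_seen": max(s["last_seen"] for s in siblings),
        # Attribute where the earliest observation is the meaningful one.
        "first_seen": min(s["first_seen"] for s in siblings),
    }


siblings = [
    {"tags": ["a", "b"], "last_seen": 10, "first_seen": 3},
    {"tags": ["b", "c"], "last_seen": 12, "first_seen": 5},
]
merged = merge_siblings(siblings)
```

Because union, max and min give the same answer regardless of the order in which siblings arrive, every application instance that performs the merge converges on the same value.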
  40. Client Libraries • HTTP REST or optimised binary interface (Protocol Buffers, PB) • Official Basho supported: • Community: C#, C/C++, Haskell, Clojure, Scala, Go, PHP and many others
  41. Riak Use Cases • Reliability, flexibility, scalability • Session Data • Serving Advertising
  42. Riak Use Cases • Reliability, flexibility, scalability • Session Data • Serving Advertising • Log and Sensor Data
  43. Riak Use Cases • Reliability, flexibility, scalability • Session Data • Serving Advertising • Log and Sensor Data • Content Addressable Storage (CAS)
  44. Riak Use Cases • Reliability, flexibility, scalability • Session Data • Serving Advertising • Log and Sensor Data • Content Addressable Storage (CAS) • Private Cloud [S3 API] - Riak CS
  45. Riak Use Cases • Reliability, flexibility, scalability • Session Data • Serving Advertising • Log and Sensor Data • Content Addressable Storage (CAS) • Private Cloud [S3 API] - Riak CS • Wherever low latency increases revenue
  46. Rovio is an industry-changing entertainment media company based in Finland, and the creator of the globally successful Angry Birds franchise.
  47. Rovio is an industry-changing entertainment media company based in Finland, and the creator of the globally successful Angry Birds franchise. • Store Game Session data in Riak: a per-user collection of game “states”.
  48. Rovio is an industry-changing entertainment media company based in Finland, and the creator of the globally successful Angry Birds franchise. • Store Game Session data in Riak: a per-user collection of game “states”. • Synchronization of user’s data across gaming devices.
  49. Rovio is an industry-changing entertainment media company based in Finland, and the creator of the globally successful Angry Birds franchise. • Store Game Session data in Riak: a per-user collection of game “states”. • Synchronization of user’s data across gaming devices. • Buckets: Account - Keyed by user_id
  50. An Enterprise Social Network that brings together employees, content, conversations, and business data in a single location.
  51. An Enterprise Social Network that brings together employees, content, conversations, and business data in a single location. • Store “Notifications” in Riak: a per-user sorted set of events with calls to action.
  52. An Enterprise Social Network that brings together employees, content, conversations, and business data in a single location. • Store “Notifications” in Riak: a per-user sorted set of events with calls to action. • Data types consist of: Cursor, Item List, Items {id: 41626118990497, timestamp: 1300845012, category: “likes-message”, properties: {liker_id: 97238, [... etc]} }
  53. An Enterprise Social Network that brings together employees, content, conversations, and business data in a single location. • Store “Notifications” in Riak: a per-user sorted set of events with calls to action. • Data types consist of: Cursor, Item List, Items {id: 41626118990497, timestamp: 1300845012, category: “likes-message”, properties: {liker_id: 97238, [... etc]} } • Buckets: Cursor - Keyed by user_id + cursor_name; Stream - Keyed by user_id
  54. An Enterprise Social Network that brings together employees, content, conversations, and business data in a single location. • Store “Notifications” in Riak: a per-user sorted set of events with calls to action. • Data types consist of: Cursor, Item List, Items {id: 41626118990497, timestamp: 1300845012, category: “likes-message”, properties: {liker_id: 97238, [... etc]} } • Buckets: Cursor - Keyed by user_id + cursor_name; Stream - Keyed by user_id • SOA - known as “Streamie”
  55. SEOmoz is the world’s most popular provider of SEO software. Their easy to use tools and tutorials make search engine optimization accessible to everyone.
  56. SEOmoz is the world’s most popular provider of SEO software. Their easy to use tools and tutorials make search engine optimization accessible to everyone. • Ranking collections of web documents
  57. SEOmoz is the world’s most popular provider of SEO software. Their easy to use tools and tutorials make search engine optimization accessible to everyone. • Ranking collections of web documents • Data types consist of: Subscription(s), Ranking List, Ranking History, Recent Ranking Report... (etc)
  58. SEOmoz is the world’s most popular provider of SEO software. Their easy to use tools and tutorials make search engine optimization accessible to everyone. • Ranking collections of web documents • Data types consist of: Subscription(s), Ranking List, Ranking History, Recent Ranking Report... (etc) • Buckets: Ranking List - Keyed by engine+locale+keyword+URL_fragment; Subscription - Keyed by user_campaign
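Several of the bucket designs above use composite keys, such as engine+locale+keyword+URL_fragment. The slides do not say how the parts are actually encoded, so the following is only one possible scheme: each part is percent-encoded so that the `+` separator cannot collide with characters inside a part. The function names and sample values are hypothetical.

```python
from typing import List
from urllib.parse import quote, unquote

SEPARATOR = "+"


def composite_key(*parts: str) -> str:
    """Join key parts, escaping each so the separator stays unambiguous."""
    return SEPARATOR.join(quote(part, safe="") for part in parts)


def split_key(key: str) -> List[str]:
    """Recover the original parts from a key built by composite_key."""
    return [unquote(part) for part in key.split(SEPARATOR)]


# Hypothetical Ranking List key: engine, locale, keyword, URL fragment.
ranking_key = composite_key("google", "en-US", "riak use cases", "basho.com/riak")
parts = split_key(ranking_key)
```

Because the key is derived entirely from values the application already knows at query time, the record can be fetched with a single primary-key read and no secondary lookup.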
  59. Other Users • High Availability Environments (e.g. Health Care) • Content Addressable Storage as a Service (like a private Dropbox cloud)
  60. Other Users • High Availability Environments (e.g. Health Care) • Content Addressable Storage as a Service (like a private Dropbox cloud) • Oil/Gas Rig Environment Logging
  61. Other Users • High Availability Environments (e.g. Health Care) • Content Addressable Storage as a Service (like a private Dropbox cloud) • Oil/Gas Rig Environment Logging • Web Gaming Platforms
  62. Other Users • High Availability Environments (e.g. Health Care) • Content Addressable Storage as a Service (like a private Dropbox cloud) • Oil/Gas Rig Environment Logging • Web Gaming Platforms • Product Catalog (and other Retail use cases)
  63. With any Use Case Consider 3 Things • Query Patterns • Inter-connectivity of your data (how much can it be denormalised)
  64. With any Use Case Consider 3 Things • Query Patterns • Inter-connectivity of your data (how much can it be denormalised) • Polyglot solution (SOA + Database) (no single database fits every problem)
  65. With any Use Case Consider 3 Things • Query Patterns • Inter-connectivity of your data (how much can it be denormalised) • Polyglot solution (SOA + Database) (no single database fits every problem) • Understand your data access patterns and you’ll be able to choose the right database for you
  66. Want to know more? We will come and give a Riak tech talk at your organisation or group: bit.ly/RiakTechTalk