Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Riak CRDTs

Riak CRDTs

Gordon Guthrie talks on CRDTs May 28th 2015 Meetup.

Datageeks Paris

June 01, 2015
Tweet

More Decks by Datageeks Paris

Other Decks in Programming

Transcript

  1. What is it for? • Riak is a highly-available, low-latency

    KV store • it is an open-source implementation of the Dynamo paper from Amazon • Dynamo underpins the Amazon shopping cart
  2. Why does it matter? • availability matters • latency matters

    • Amazon found every 100ms of latency cost them 1% in sales • Google found an extra 0.5 seconds in search page generation time dropped traffic by 20%
  3. Mapped To Machines Physical Machine 1 Physical Machine 2 Physical

    Machine 3 Physical Machine 4 Physical Machine 5
  4. Machines Go Down Physical Machine 1 Physical Machine 2 Physical

    Machine 3 Physical Machine 4 Physical Machine 5
  5. And Come Back Physical Machine 1 Physical Machine 2 Physical

    Machine 3 Physical Machine 4 Physical Machine 5
  6. Partitions Happen • because Riak is a highly-available data store

    it will continue to store your data • but your data will not be consistent
  7. AP Physical Machine 1 Physical Machine 2 Physical Machine 3

    Physical Machine 4 Physical Machine 5 Partition Val 1 Val 2
  8. And Come Back Physical Machine 1 Physical Machine 2 Physical

    Machine 3 Physical Machine 4 Physical Machine 5
  9. Now you have siblings Physical Machine 1 Physical Machine 2

    Physical Machine 3 Physical Machine 4 Physical Machine 5 {Val 1, Val 2}
  10. Sibling Resolution • Developers hate siblings • Developers hate resolving

    siblings • Developers play Timestamp Roulette and go for Last Write Wins • but with intercontinental lag, NTP failure, clock skew and latency sometimes Timestamps kill your data
  11. Google’s View • “Designing applications to cope with concurrency anomalies

    in their data is very error-prone, time- consuming, and ultimately not worth the performance gains.” • “We have a lot of experience with eventual consistency systems at Google.” • “We find developers spend a significant fraction of their time building extremely complex and error-prone mechanisms to cope with eventual consistency”
  12. The Partition Cycle Consistent Available Available Partitioned Not Consistent Available

    (Eventually) Consistent Available partitioned unpartitioned Developer! Hell Developer! Heaven CRDTs
  13. Understanding CRDTs • Not going to go into a lot

    of detail • Based on the idea of Lamport Vector Clocks • A Lamport Vector clock allows an ‘actor’ in a distributed data system to say 2 things: • this is when I changed this data • and when I changed it I knew all the things every other actor had done upto these times • Its a vector clock because it has a separate clock for every actor
  14. Where Are The Actors? • Actors can be clients-side or

    server-side or both • Riak is implementing an architecture that allows users to use server-side CRDTs • to avoid the risk of ‘actor’ explosion
  15. Client Side Actors • Alice’s Laptop • Alice’s phone •

    Alice’s iPad • Alice’s daughter’s iPad (her battery is flat) • Bob’s new laptop • Bob’s old laptop • etc, etc
  16. How They Work • Lets take a simple example with

    two actions: • add an item to a shopping cart • empty the shopping cart • these actions don’t commute: • add item then clear gives an empty cart • empty cart then add item gives a cart with one item in it
  17. How They Work Consistent Available Available Partitioned Not Consistent Available

    (Eventually) Consistent Available partitioned unpartitioned CRDTs Clear Add! Item [{a, }] [{b, }] {[{a, }], [{b, }]} The ‘actor’ clearing the shopping cart didn’t know there was anything in the cart Therefore there should be something in the cart after merging the two updates
  18. How They Work Consistent Available Available Partitioned Not Consistent Available

    (Eventually) Consistent Available partitioned unpartitioned CRDTs Clear Add! Item [{a, }] [{a, },{b, }] {[{a, }], [{a, },{b, }]} The ‘actor’ clearing the shopping cart knew there was something in the cart Therefore there should be nothing in the cart after merging the two updates
  19. Scary Maths Stuff • Joined Semi-Lattices that have a Bottom

    and a Least Upper Bound! • that have defined properties involving Associativity, Commutativity, Idempotency • with a Merge Function
  20. Native Eventually Consistent Data Types • Flags • Registers •

    Counters • Sets • Maps • can contain Flags, Registers, Counter and Maps • the basis of composable data types
  21. Flags • Can have one of two values: • enabled

    • disabled • Operations: • enable • disable • Examples - use like Booleans • has a tweet been sent? • has the customer signed up to a pricing plan?
  22. Registers • Named Binaries that have contents which have a

    value • Operations: • store new value • Examples • store “My New Post” in the field “Blog Post Title”
  23. Counters • Contain a number which can be incremented or

    decremented • Operations • increment • decrement • Examples • the number of “likes” in Facebook • the number of Twitter followers
  24. Sets • Collections of unique binaries • Operations • add

    an element • remove an element • add a list of elements • remove a list of elements • Examples • shopping cart
  25. Maps • Maps are like hash maps • Operations •

    add a named Flag, Register, Counter, Set or Map • remove a named Flag, Register, Counter, Set or Map • pass through an op for an element in the Map • Example • the Map is used to compose the data model