Riak on Retail

Overview of Riak, the open source distributed database, for retail and eCommerce platforms and services. Covers use cases including shopping carts, product catalogs, and mobile apps; data modeling and querying; architecture and operations.

Basho Technologies

February 12, 2013

Transcript

  1. What’s in store? •  At a High Level •  For Developers •  Under the Hood •  When and Why •  Some Use Cases •  Commercial Extensions •  Latest Release and 1.3
  2. Riak •  Built on Amazon principles (Dynamo paper) •  Key/value data model •  With some extras: search, MapReduce, 2i, links, pre- and post-commit hooks, pluggable backends, HTTP and binary interfaces •  Written in Erlang with C/C++ •  Open source under Apache 2 License
  3. Riak’s Design Goals •  High Availability •  Low Latency •  Horizontal Scalability •  Fault Tolerance •  Ops Friendliness •  Predictability
  4. Retail / eCommerce Use Cases •  Shopping cart functionality •  Must be highly available •  High latency is perceived as unavailability •  Withstands node failure, network partition, datacenter failure •  Many of the same architectural principles that power Amazon’s shopping cart
  5. Retail / eCommerce Use Cases •  Product Catalog •  Tens of thousands of inventory items or more •  Content agnostic: images, video, text, JSON/XML/HTML documents •  Add and serve product data even under failure conditions •  Scale out without sharding
  6. Retail / eCommerce Use Cases •  API Platforms •  Expose data as a platform to internal and external clients, developers and partners/affiliates •  Flexible, schemaless design •  RESTful HTTP API, protocol buffers and many client libraries •  Throughput and capacity scale linearly with growth
  7. Retail / eCommerce Use Cases •  Mobile Applications •  Riak powers top consumer mobile apps including Bump and Voxer •  Fast, small object storage •  Designed for concurrency to meet mobile client request patterns
  8. Riak is a database that stores values against keys. Keys are grouped into higher-level namespaces called buckets.
  9. Riak doesn’t care what you store. It will accept any data type; things are stored on disk as binaries.
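To make the bucket/key/value model concrete, here is a minimal in-memory sketch. It is not a Riak client: the `TinyKV` class, the bucket names, and the sample product are all hypothetical, and the only point is that a value is opaque bytes addressed by a (bucket, key) pair.

```python
import json
from typing import Dict, Optional, Tuple


class TinyKV:
    """Hypothetical in-memory stand-in for a bucket/key store (not Riak itself)."""

    def __init__(self) -> None:
        # (bucket, key) -> (content type, raw bytes)
        self._data: Dict[Tuple[str, str], Tuple[str, bytes]] = {}

    def put(self, bucket: str, key: str, value: bytes,
            content_type: str = "application/octet-stream") -> None:
        # The store never inspects the value; it only keeps the bytes.
        if not isinstance(value, (bytes, bytearray)):
            raise TypeError("values are stored as binaries; encode them first")
        self._data[(bucket, key)] = (content_type, bytes(value))

    def get(self, bucket: str, key: str) -> Optional[Tuple[str, bytes]]:
        return self._data.get((bucket, key))

    def delete(self, bucket: str, key: str) -> None:
        self._data.pop((bucket, key), None)


store = TinyKV()

# A product document: the application chooses JSON, the store sees bytes.
product = {"sku": "sku-1001", "name": "Coffee Mug", "price_cents": 1299}
store.put("products", "sku-1001", json.dumps(product).encode("utf-8"),
          content_type="application/json")

# Any other content works the same way (placeholder bytes, not a real image).
store.put("images", "sku-1001-front", b"\x89PNG placeholder", content_type="image/png")

content_type, raw = store.get("products", "sku-1001")
loaded = json.loads(raw.decode("utf-8"))
```

The same key can exist in two buckets without conflict, because the bucket is part of the address.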
  10. Examples

    Type | Key | Value
    Item in Product Inventory | Product Name, SKU or ID | JSON, XML or Text, HTML doc
    Product Advertising | Campaign ID | Ad Content
    User Profile | Login, Email, UUID | User attributes (often, JSON doc)
    Image or Video Content | Content Name, ID or Integer | Image or video file format
    Session Information | User/Session ID | Session Data
  11. Querying •  GET/PUT/DELETE •  MapReduce: Filtering product info by tag, counting items, extracting links •  Full-Text Search: Searching product info or descriptions •  Secondary Indexes (2i): Tagging products with categories, promotion identifiers, etc.
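As a rough illustration of the key/value and 2i operations above, the sketch below only builds HTTP request descriptions and sends nothing. The URL paths and the `x-riak-index-<field>_bin` header follow Riak's HTTP API as commonly documented for the 1.x series; treat them as something to check against the documentation for your version. The host and port, the `products` bucket, and the `category` index field are assumptions made for the example.

```python
import json
from typing import Dict, NamedTuple, Optional
from urllib.parse import quote

BASE = "http://127.0.0.1:8098"  # assumption: a local node on the default HTTP port


class HttpCall(NamedTuple):
    """Plain description of a request; pass it to any HTTP client to send it."""
    method: str
    url: str
    headers: Dict[str, str]
    body: Optional[bytes]


def object_url(bucket: str, key: str) -> str:
    return f"{BASE}/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}"


def put_product(bucket: str, key: str, doc: dict, category: str) -> HttpCall:
    headers = {
        "Content-Type": "application/json",
        # Secondary index entry: tags the object so it can be found by category.
        "x-riak-index-category_bin": category,
    }
    return HttpCall("PUT", object_url(bucket, key), headers,
                    json.dumps(doc).encode("utf-8"))


def get_object(bucket: str, key: str) -> HttpCall:
    return HttpCall("GET", object_url(bucket, key), {}, None)


def delete_object(bucket: str, key: str) -> HttpCall:
    return HttpCall("DELETE", object_url(bucket, key), {}, None)


def query_by_category(bucket: str, category: str) -> HttpCall:
    # Exact-match 2i query; the response lists the matching keys.
    url = (f"{BASE}/buckets/{quote(bucket, safe='')}"
           f"/index/category_bin/{quote(category, safe='')}")
    return HttpCall("GET", url, {}, None)


put_call = put_product("products", "sku-1001", {"name": "Coffee Mug"}, "kitchen")
query_call = query_by_category("products", "kitchen")
```

Keeping the request construction separate from the transport makes the shape of each call easy to inspect; the official client libraries wrap the same operations.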
  12. Client Libraries •  Ruby, Node.js, Java, Python, Perl, OCaml, Erlang, PHP, C, Squeak, Smalltalk, Pharo, Clojure, Scala, Haskell, Lisp, Go, .NET, Play, and more (supported by either Basho or the community).
  13. It Hurts. •  Hot spots •  Unevenly spread data and request patterns •  Resharding is operationally intensive, often manual (Diagram: shards by key range: A - D, E - K, L - P, Q - T, U - Z)
  14. Don’t Shard. Riak’s Consistent Hashing •  Evenly spreads data around the cluster •  Automatically rebalances data when machines are added
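The two properties on this slide can be shown with a simplified hash ring. This is a sketch of the general technique, not Riak's actual partition-claim algorithm: the partition count of 64, the node names, and the greedy rebalancing in `add_node` are assumptions chosen to keep the example short.

```python
import hashlib
from collections import Counter
from typing import Dict, List

RING_BITS = 160       # size of a SHA-1 digest in bits
NUM_PARTITIONS = 64   # assumption: a fixed number of equal-sized ring partitions


def partition_for(bucket: str, key: str) -> int:
    """Hash bucket/key onto the ring and return the partition it falls in."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")          # 0 <= position < 2**160
    return (position * NUM_PARTITIONS) >> RING_BITS   # 0 <= result < 64


def initial_owners(nodes: List[str]) -> Dict[int, str]:
    """Deal partitions out to nodes in turn."""
    return {p: nodes[p % len(nodes)] for p in range(NUM_PARTITIONS)}


def add_node(owners: Dict[int, str], new_node: str) -> Dict[int, str]:
    """Give the new node a fair share by taking partitions from the fullest nodes."""
    owners = dict(owners)
    node_count = len(set(owners.values())) + 1
    fair_share = NUM_PARTITIONS // node_count
    for _ in range(fair_share):
        counts = Counter(n for n in owners.values() if n != new_node)
        donor = max(sorted(counts), key=counts.__getitem__)
        partition = next(p for p in sorted(owners) if owners[p] == donor)
        owners[partition] = new_node
    return owners


before = initial_owners(["node-a", "node-b", "node-c"])
after = add_node(before, "node-d")

# Only the partitions handed to the new node change owner.
moved = [p for p in before if before[p] != after[p]]
share = Counter(after.values())

# A key's partition never changes; only the partition's owner can.
owner_of_cart = after[partition_for("carts", "user-42")]
```

Because keys map to partitions and partitions map to nodes, adding a machine relocates only the partitions it takes over rather than rehashing every key, which is the contrast with manual resharding by key range.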
  15. When Might Riak Make Sense •  When you have enough data to require >1 physical machine (preferably >5) •  When availability is more important than consistency (think “critical data” on “big data”) •  When your data can be modeled as keys and values; don’t be afraid to denormalize
  16. •  Case study on Basho.com •  Millions of users •  Highly available, event-based shopping experience •  “Riak is one of those things that just works and doesn’t need our attention on a day-to-day basis, saving both time and money.”
  17. Ad Serving •  OpenX will serve ~4T ads in 2012 •  Started with CouchDB and Cassandra for various parts of infrastructure •  Now consolidating on Riak and Riak Core •  Video on Ricon2012.com
  18. Mobile Apps •  Bump – easy to share contact info, photos, other objects •  Picked Riak for operational ease of use •  “It does what it’s supposed to do; nodes can go down but Riak will still work. It’s great to be able to deal with node failures the next day instead of at 3am.”
  19. •  Copious – eCommerce marketplace •  Uses Riak to store all registered accounts and tokens for social media login •  Hundreds of thousands of keys
  20. Riak: Hybrid Solutions •  Riak with Postgres •  Riak with Elasticsearch •  Riak with Hadoop •  Secondary analytics clusters
  21. Try Us On… •  Amazon AMIs •  EngineYard beta (more details next week) •  Microsoft Azure VM Depot •  Riakon.com
  22. Use Cases •  Data locality to serve clients and partners at low latency anywhere in the world •  Failover to other sites in the event of data center failure •  Full sync and real-time sync, can be configured unidirectionally or bidirectionally
  23. Riak Cloud Storage •  Large object support •  S3-compatible API •  Multi-tenancy •  Reporting on usage