An Overview of the Facebook Cache (Yannick Gingras)

PyCon Canada

August 13, 2013

Transcript

  1. Memcache: Some numbers
    ▪ Thousands of servers
    ▪ > 1G ops/s
    ▪ > 1T items
    ▪ 98.1% hit rate in “wildcard”
    ▪ ~90% hit rate in “regional”
    ▪ < 50% hit rate in “pyk”
  2. Python for Memcache: mcconf
    ▪ Short deployment cycle
    ▪ Pool management: allocation, resizing
    ▪ Spare selection based on hardware requirements
    ▪ Template-based region bootstrapping
    ▪ Cluster maintenance and decommission
  3. Python for Memcache: Adaptive deployments with mcroll and mcpush
    ▪ Software upgrades
    ▪ Cold rolls / cache flushing
    ▪ Rates are adaptive based on health metrics
    ▪ Global parallelism logic
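The slide does not say how mcroll and mcpush adapt their rates. As one possible reading, the sketch below shrinks or grows the number of servers restarted per round depending on a single health metric. The function names, the use of hit rate as the metric, the thresholds, and the halve-on-trouble, add-one-when-healthy policy are all assumptions for illustration, not the tools' actual logic.

```python
def next_batch_size(current, hit_rate, min_hit_rate=0.90, max_batch=64):
    """Pick how many servers to restart in the next round (hypothetical policy)."""
    if hit_rate < min_hit_rate:
        # Health metric is below the threshold: back off quickly.
        return max(1, current // 2)
    # Health metric looks fine: ramp up gently, up to a cap.
    return min(max_batch, current + 1)


def plan_roll(hit_rates, start=4):
    """Replay a sequence of observed hit rates and return the batch sizes chosen."""
    sizes = []
    size = start
    for rate in hit_rates:
        size = next_batch_size(size, rate)
        sizes.append(size)
    return sizes


# Two healthy rounds, two unhealthy rounds, then recovery.
schedule = plan_roll([0.98, 0.97, 0.85, 0.80, 0.95])
```

A real rollout would likely combine several metrics and coordinate across clusters, which is presumably what "global parallelism logic" refers to.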
  4. TAO
    TAO is a two-level read-through, write-through cache. TAO is aware of graph semantics and supports structured queries.
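To make "two-level read-through, write-through" concrete, here is a minimal sketch. The class names (`BackingStore`, `CacheTier`), the dict-based storage, and the counters are hypothetical and exist only for illustration; the sketch ignores graph semantics, invalidation, and eviction, and is not TAO's implementation.

```python
class BackingStore:
    """Stand-in for the database: a dict plus read/write counters."""

    def __init__(self):
        self.data = {}
        self.reads = 0
        self.writes = 0

    def get(self, key):
        self.reads += 1
        return self.data.get(key)

    def put(self, key, value):
        self.writes += 1
        self.data[key] = value


class CacheTier:
    """One cache level that reads through and writes through to `parent`."""

    def __init__(self, parent):
        self.parent = parent
        self.items = {}

    def get(self, key):
        if key in self.items:
            return self.items[key]
        value = self.parent.get(key)  # read-through on a miss
        if value is not None:
            self.items[key] = value
        return value

    def put(self, key, value):
        self.parent.put(key, value)  # write-through: parent first
        self.items[key] = value      # then refresh the local copy


db = BackingStore()
second_level = CacheTier(db)            # sits in front of the database
first_level = CacheTier(second_level)   # sits in front of the second level

first_level.put("user:1", "alice")
first = first_level.get("user:1")

# A different first-level cache misses locally but hits the shared second level.
other_first_level = CacheTier(second_level)
second = other_first_level.get("user:1")
```

In this toy setup the write reaches the database exactly once, and the second read is served by the shared second level without touching the database.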
  5. Python for TAO
    Shard splitting / replication
    ▪ Extension of consistent hashing
    ▪ Based on client machine ID
    ▪ Wired with the invalidation pipeline
    Shard placement
    ▪ Two-level load distribution
    ▪ Hash table of hot shards mapped to cold servers
    ▪ Falls back to consistent hashing if shards are not placed
    ▪ Candidate shards and destinations are identified by Python services
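The shard placement bullets describe a lookup that checks a table of explicitly placed hot shards first and falls back to consistent hashing otherwise. A minimal sketch of that lookup order follows. The class names, server names, virtual-node count, and the use of MD5 for spreading keys are assumptions for illustration; the slide does not describe the real data structures.

```python
import bisect
import hashlib


def _hash(value):
    """Stable 64-bit hash of a string (MD5 is used only to spread keys)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest()[:16], 16)


class ConsistentHashRing:
    """Plain consistent hashing with virtual nodes."""

    def __init__(self, servers, vnodes=64):
        self._ring = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def lookup(self, shard_id):
        # First ring point clockwise from the shard's hash, wrapping around.
        idx = bisect.bisect(self._points, _hash(str(shard_id))) % len(self._ring)
        return self._ring[idx][1]


class ShardRouter:
    """Placement table first, consistent hashing as the fallback."""

    def __init__(self, ring, placements=None):
        self.ring = ring
        self.placements = dict(placements or {})  # hot shard -> chosen server

    def server_for(self, shard_id):
        placed = self.placements.get(shard_id)
        if placed is not None:
            return placed
        return self.ring.lookup(shard_id)


ring = ConsistentHashRing(["cache-a", "cache-b", "cache-c"])
# Shard 42 is "hot" and has been placed on a lightly loaded server.
router = ShardRouter(ring, placements={42: "cache-spare"})
```

Here `router.server_for(42)` returns the placed server, while any unplaced shard resolves to whatever the ring picks. In the talk's description, separate Python services decide which shards and destinations go into the table.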
  6. TAO – another story
    PHP Arrays
    “An array in PHP is actually an ordered map. A map is a type that associates values to keys.”
    – http://php.net/manual/en/language.types.array.php
    Python dictionaries
    “Unlike sequences, which are indexed by a range of numbers, dictionaries are indexed by keys, which can be any immutable type; strings and numbers can always be keys. […] It is best to think of a dictionary as an unordered set of key: value pairs, with the requirement that the keys are unique (within one dictionary).”
    – http://docs.python.org/2/tutorial/datastructures.html
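The two quotes contrast PHP's ordered arrays with the unordered dictionaries of Python 2. The slide does not spell out what went wrong in the story, so the sketch below only illustrates the ordering difference itself, using `collections.OrderedDict` as the order-aware type. The sample keys are made up. Note that a plain `dict` preserves insertion order since Python 3.7, but dict equality still ignores order.

```python
from collections import OrderedDict

# Key/value pairs in the order a PHP array would keep them.
pairs = [("zebra", 1), ("apple", 2), ("mango", 3)]

ordered = OrderedDict(pairs)
keys_in_order = list(ordered)

# OrderedDict equality is order-sensitive; plain dict equality is not.
same_as_reversed_ordered = ordered == OrderedDict(reversed(pairs))
same_as_reversed_plain = dict(pairs) == dict(reversed(pairs))
```

Code that round-trips PHP arrays through Python and cares about element order needs an explicitly ordered type or a list of pairs, rather than relying on dictionary behaviour.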
  7. More Python in the Cache infrastructure
    ▪ FBAR – auto-remediation engine
    ▪ Tupperware – job engine used for invalidation pipeline
    ▪ thrift – language-agnostic service layer, enables many Python clients (http://thrift.apache.org/)
    ▪ Dataswarm – Python frontend to our data warehouse
    ▪ fbdeploy – job supervisor and BitTorrent deployment
  8. Further reading
    ▪ Memcache public page: https://www.facebook.com/MemcacheAtFacebook
    ▪ Memcache paper: http://bit.ly/fb-memcache-paper
    ▪ TAO public note: http://bit.ly/tao-blog-post
    ▪ TAO paper: http://bit.ly/fb-tao-paper
  9. (c) 2009 Facebook, Inc. or its licensors. "Facebook" is a registered trademark of Facebook, Inc. All rights reserved. 1.0
  10. Memcache: Read-after-write semantics – remote markers
    Diagram components: Web Server, Memcache, Master DB, Replica DB
    1. Set remote marker
    2. Write to master
    3. Delete from memcache
    4. MySQL replication
    5. Delete remote marker
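The five numbered steps on this slide can be sketched as a small in-memory simulation. Everything here is hypothetical scaffolding: the `ReplicaRegionSim` class, the dicts standing in for the databases and memcache, and the explicit `replicate()` call that stands in for MySQL replication. The read rule (on a cache miss, query the master if a marker exists, otherwise the local replica) is not on the slide; it is taken from the description in the Memcache paper listed under further reading and should be treated as an assumption here.

```python
class ReplicaRegionSim:
    """Toy model of a web server in a replica region using remote markers."""

    def __init__(self):
        self.master_db = {}
        self.replica_db = {}
        self.memcache = {}
        self.markers = set()
        self.pending = []  # writes not yet replicated to the replica DB

    def write(self, key, value):
        self.markers.add(key)              # 1. set remote marker
        self.master_db[key] = value        # 2. write to master
        self.memcache.pop(key, None)       # 3. delete from memcache
        self.pending.append((key, value))  # queued for replication

    def replicate(self):
        for key, value in self.pending:
            self.replica_db[key] = value   # 4. MySQL replication
            self.markers.discard(key)      # 5. delete remote marker
        self.pending.clear()

    def read(self, key):
        if key in self.memcache:
            return self.memcache[key]
        if key in self.markers:
            # Replica may be stale: go to the master and do not cache.
            return self.master_db.get(key)
        value = self.replica_db.get(key)
        if value is not None:
            self.memcache[key] = value
        return value


site = ReplicaRegionSim()
site.master_db["k"] = "old"
site.replica_db["k"] = "old"

site.write("k", "new")
before_replication = site.read("k")   # marker present, so served by master
site.replicate()
after_replication = site.read("k")    # marker gone, so served by replica
```

Without the marker, the read issued before replication would fetch the stale value from the replica and repopulate memcache with it.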
  11. Memcache: Aggregated deletes
    • Reduce packet rate by 18x.
    Diagram labels: DB, Aqueduct, Memcache Routers, MC
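The packet-rate reduction comes from sending many deletes together instead of one per packet. The sketch below shows the general idea of grouping deletes by destination and packing them into batches. The function name, the destination names, and the batch size of 32 are assumptions for illustration and say nothing about how the real pipeline batches or about the 18x figure.

```python
from collections import defaultdict


def batch_deletes(deletes, max_keys_per_packet=32):
    """Group (destination, key) deletes into per-destination packets."""
    by_dest = defaultdict(list)
    for dest, key in deletes:
        by_dest[dest].append(key)

    packets = []
    for dest, keys in by_dest.items():
        # Split each destination's keys into chunks of at most max_keys_per_packet.
        for i in range(0, len(keys), max_keys_per_packet):
            packets.append((dest, keys[i:i + max_keys_per_packet]))
    return packets


# 100 individual deletes spread over two hypothetical routers.
deletes = [(f"router-{i % 2}", f"key:{i}") for i in range(100)]
packets = batch_deletes(deletes)
```

In this toy input, 100 single-key deletes collapse into 4 packets, two per destination.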
  12. Other TAO Topics
    ▪ TACO
    ▪ CCW via failover
    ▪ Two-level cache provides read-after-write semantics
    ▪ Two-level cache shields against thundering herd