$30 off During Our Annual Pro Sale. View Details »

Building Scalable Stateful Services

Building Scalable Stateful Services

Slides from Scaling Stateful Services Talk given at Nike Tech Talk 8/11/16

Caitie McCaffrey

August 12, 2016
Tweet

More Decks by Caitie McCaffrey

Other Decks in Technology

Transcript

  1. Building Scalable
    Stateful Services

    View Slide

  2. Caitie McCaffrey
    Distributed Systems Engineer
    @caitie
    caitiem.com

    View Slide

  3. Stateless Services
    Service

    View Slide

  4. Stateless Services
    Service

    View Slide

  5. Stateless Services
    Service
    Service

    View Slide

  6. Stateless Services
    Service
    Service

    View Slide

  7. Stateless Services
    Service
    Service

    View Slide

  8. Stateless Services
    Service Service
    Service

    View Slide

  9. Stateless Services
    Service Service
    Service

    View Slide

  10. Stateless Services
    Service Service
    Service

    View Slide

  11. Stateless Services
    Service Service
    Service
    Service

    View Slide

  12. Stateless Services
    Service Service
    Service
    Service

    View Slide

  13. Stateless Services
    Service Service
    Service
    Service

    View Slide

  14. Stateless Services
    Service Service
    Service
    Service Service

    View Slide

  15. Stateless Services
    Service Service
    Service
    Service Service

    View Slide

  16. Data Shipping Paradigm
    Service
    Service
    Service

    View Slide

  17. Data Shipping Paradigm
    Service
    Service
    Service

    View Slide

  18. Data Shipping Paradigm
    Service
    Service
    Service

    View Slide

  19. Data Shipping Paradigm
    Service
    Service
    Service

    View Slide

  20. Data Shipping Paradigm
    Service
    Service
    Service

    View Slide

  21. Data Shipping Paradigm
    Service
    Service
    Service

    View Slide

  22. Data Shipping Paradigm
    Service
    Service
    Service

    View Slide

  23. Data Shipping Paradigm
    Service
    Service
    Service

    View Slide

  24. Data Shipping Paradigm
    Service
    Service
    Service

    View Slide

  25. Data Shipping Paradigm
    Service
    Service
    Service

    View Slide

  26. Data Shipping Paradigm
    Service
    Service
    Service

    View Slide

  27. Data Shipping Paradigm
    Service
    Service
    Service

    View Slide

  28. Data Shipping Paradigm
    Service
    Service
    Service

    View Slide

  29. Data Shipping Paradigm
    Service
    Service
    Service

    View Slide

  30. Data Shipping Paradigm
    Service
    Service
    Service

    View Slide

  31. Service
    Service
    Service
    Service
    Service
    Service
    Service
    Front End
    Data Shipping Paradigm

    View Slide

  32. Service
    Service
    Service
    Service
    Service
    Service
    Service
    Front End
    Data Shipping Paradigm

    View Slide

  33. Service
    Service
    Service
    Service
    Service
    Service
    Service
    Front End
    Data Shipping Paradigm

    View Slide

  34. Service
    Service
    Service
    Service
    Service
    Service
    Service
    Front End
    Data Shipping Paradigm

    View Slide

  35. Service
    Service
    Service
    Service
    Service
    Service
    Service
    Front End
    Data Shipping Paradigm

    View Slide

  36. Service
    Service
    Service
    Service
    Service
    Service
    Service
    Front End
    Data Shipping Paradigm

    View Slide

  37. Service
    Service
    Service
    Service
    Service
    Service
    Service
    Front End
    Data Shipping Paradigm

    View Slide

  38. Service
    Service
    Service
    Service
    Service
    Service
    Service
    Front End
    Data Shipping Paradigm

    View Slide

  39. Overview
    Benefits
    Building
    Real World
    Caution

    View Slide

  40. Data Locality
    For Low Latency & Data Intensive Services

    View Slide

  41. Function Shipping Paradigm
    Service
    Service
    Service

    View Slide

  42. Function Shipping Paradigm
    Service
    Service
    Service

    View Slide

  43. Function Shipping Paradigm
    Service
    Service
    Service

    View Slide

  44. Function Shipping Paradigm
    Service
    Service
    Service

    View Slide

  45. Function Shipping Paradigm
    Service
    Service
    Service

    View Slide

  46. Function Shipping Paradigm
    Service
    Service
    Service

    View Slide

  47. Function Shipping Paradigm
    Service
    Service
    Service

    View Slide

  48. Function Shipping Paradigm
    Service
    Service
    Service

    View Slide

  49. Function Shipping Paradigm
    Service
    Service
    Service

    View Slide

  50. Function Shipping Paradigm
    Service
    Service
    Service

    View Slide

  51. Function Shipping Paradigm
    Service
    Service
    Service

    View Slide

  52. Service
    Service
    Service
    Service
    Service
    Service
    Service
    Front End
    Data Shipping Paradigm

    View Slide

  53. Service
    Service
    Service
    Service
    Service
    Service
    Service
    Front End
    Data Shipping Paradigm

    View Slide

  54. Service
    Service
    Service
    Service
    Service
    Service
    Service
    Front End
    Data Shipping Paradigm

    View Slide

  55. Just Put A
    Cache On It?

    View Slide

  56. Service
    Service
    Service
    Service
    Service
    Service
    Service
    Front End
    Cache

    View Slide

  57. Service
    Service
    Service
    Service
    Service
    Service
    Service
    Front End
    Cache

    View Slide

  58. Service
    Service
    Service
    Service
    Service
    Service
    Service
    Front End
    Cache

    View Slide

  59. Concurrency Issues
    Greater Operational Burden
    Problems with Caching

    View Slide

  60. Sticky Connections
    & Consistency
    Additional Available Consistency Models

    View Slide

  61. Sticky Connections

    View Slide

  62. Sticky Connections

    View Slide

  63. Linearizable
    Sequential
    Causal
    Pipelined Random
    Access Memory
    Read Your
    Write
    Monotonic
    Read
    Monotonic
    Write
    Write From
    Read
    Consistency
    Models
    CP Consistency
    AP Consistency

    View Slide

  64. Linearizable
    Sequential
    Causal
    Pipelined Random
    Access Memory
    Read Your
    Write
    Monotonic
    Read
    Monotonic
    Write
    Write From
    Read
    CP Consistency
    AP Consistency
    AP Consistency
    w/ Sticky Connections
    Consistency
    Models

    View Slide

  65. - Werner Vogel 2007
    “Whether or not read-your-write, session and monotonic
    consistency can be achieved depends in general on the
    "stickiness" of clients to the server that executes the
    distributed protocol for them… Using sessions, which are
    sticky, makes this explicit and provides an exposure level
    that clients can reason about.”

    View Slide

  66. Building Sticky Connections
    For Low Latency & Data Intensive Services

    View Slide

  67. Building Sticky Connections

    View Slide

  68. Building Sticky Connections

    View Slide

  69. Persistent Connections
    Load Balancing Problems
    No Stickiness Once
    Connection Breaks
    Problems

    View Slide

  70. Persistent Connections
    Load Balancing Problems
    No Stickiness Once
    Connection Breaks
    Problems

    View Slide

  71. Routing Logic
    • Cluster Membership
    • Work Distribution
    Problems to Solve

    View Slide

  72. Routing Logic
    • Cluster Membership
    • Work Distribution
    Problems to Solve

    View Slide

  73. Routing Logic
    • Cluster Membership
    • Work Distribution
    Problems to Solve

    View Slide

  74. Cluster Membership
    Dynamic
    Static

    View Slide

  75. Static Cluster Membership
    Machine Outage means
    Partial Service Outage
    Downtime to Add Capacity

    View Slide

  76. Static Cluster Membership
    Machine Outage means
    Partial Service Outage
    Downtime to Add Capacity

    View Slide

  77. Static Cluster Membership
    Machine Outage means
    Partial Service Outage
    Downtime to Add Capacity

    View Slide

  78. Dynamic Cluster Membership
    Fault Tolerance

    View Slide

  79. Dynamic Cluster Membership
    Fault Tolerance

    View Slide

  80. Dynamic Cluster Membership
    Fault Tolerance

    View Slide

  81. Dynamic Cluster Membership
    Fault Tolerance

    View Slide

  82. Dynamic Cluster Membership
    Fault Tolerance

    View Slide

  83. Dynamic Cluster Membership
    Gossip
    Protocols
    Consensus
    Systems
    Availability vs Consistency

    View Slide

  84. Consensus Systems
    for Consistency
    Paxos ETCD
    Zookeeper Raft

    View Slide

  85. Gossip Protocols
    for Availability
    • Assumes Non-Reliable Networks
    • Information Dissemination
    • Pairwise Communication

    View Slide

  86. Gossip Protocols
    for Availability
    • Assumes Non-Reliable Networks
    • Information Dissemination
    • Pairwise Communication

    View Slide

  87. Gossip Protocols
    for Availability
    • Assumes Non-Reliable Networks
    • Information Dissemination
    • Pairwise Communication

    View Slide

  88. Gossip Protocols
    for Availability
    • Assumes Non-Reliable Networks
    • Information Dissemination
    • Pairwise Communication

    View Slide

  89. Gossip Protocols
    for Availability
    • Assumes Non-Reliable Networks
    • Information Dissemination
    • Pairwise Communication

    View Slide

  90. Gossip Protocols
    for Availability
    • Assumes Non-Reliable Networks
    • Information Dissemination
    • Pairwise Communication

    View Slide

  91. Gossip Protocols
    for Availability
    • Assumes Non-Reliable Networks
    • Information Dissemination
    • Pairwise Communication

    View Slide

  92. Gossip Protocols
    for Availability
    • Assumes Non-Reliable Networks
    • Information Dissemination
    • Pairwise Communication

    View Slide

  93. Gossip Protocols
    for Availability
    • Assumes Non-Reliable Networks
    • Information Dissemination
    • Pairwise Communication

    View Slide

  94. Work Distribution
    Consistent
    Hashing
    Distributed
    Hash Tables
    Random
    Placement

    View Slide

  95. Random Placement
    Write
    Anywhere
    Read from
    Everywhere

    View Slide

  96. Random Placement
    Write
    Anywhere
    Read from
    Everywhere

    View Slide

  97. Random Placement
    Write
    Anywhere
    Read from
    Everywhere

    View Slide

  98. Random Placement
    Write
    Anywhere
    Read from
    Everywhere

    View Slide

  99. Random Placement
    Write
    Anywhere
    Read from
    Everywhere

    View Slide

  100. Random Placement
    Write
    Anywhere
    Read from
    Everywhere

    View Slide

  101. Random Placement
    Write
    Anywhere
    Read from
    Everywhere

    View Slide

  102. Consistent Hashing
    Deterministic
    Placement
    Node A
    Node B
    Node C
    NodeD
    Consistent Hashing & Random Trees:
    Distributed caching protocols for
    relieving hot spots on the World
    Wide Web

    View Slide

  103. Consistent Hashing
    Deterministic
    Placement
    Node A
    Node B
    NodeD
    Consistent Hashing & Random Trees:
    Distributed caching protocols for
    relieving hot spots on the World
    Wide Web

    View Slide

  104. Consistent Hashing
    Deterministic
    Placement
    Node A
    Node B
    NodeD
    Consistent Hashing & Random Trees:
    Distributed caching protocols for
    relieving hot spots on the World
    Wide Web

    View Slide

  105. Distributed Hash Table
    Non- Deterministic
    Placement
    Node B
    Node A
    Node C

    View Slide

  106. Distributed Hash Table
    Non- Deterministic
    Placement
    Node B
    Node A
    Node C

    View Slide

  107. Distributed Hash Table
    Non- Deterministic
    Placement
    Node B
    Node A
    Node C

    View Slide

  108. Distributed Hash Table
    Non- Deterministic
    Placement
    Node B
    Node A
    Node C

    View Slide

  109. Distributed Hash Table
    Non- Deterministic
    Placement
    Node B
    Node A
    Node C

    View Slide

  110. Stateful Services
    In the Real World

    View Slide

  111. Scuba is a fast, scalable,
    distributed, in-memory database
    built at Facebook. It is the workhorse
    behind code regression analysis &
    bug report, revenue, and
    performance debugging
    Fan-out request to all
    machines in the cluster
    Compose Results
    Return Results and
    Completeness

    View Slide

  112. Scuba is a fast, scalable,
    distributed, in-memory database
    built at Facebook. It is the workhorse
    behind code regression analysis &
    bug report, revenue, and
    performance debugging
    Fan-out request to all
    machines in the cluster
    Compose Results
    Return Results and
    Completeness

    View Slide

  113. Scuba is a fast, scalable,
    distributed, in-memory database
    built at Facebook. It is the workhorse
    behind code regression analysis &
    bug report, revenue, and
    performance debugging
    Fan-out request to all
    machines in the cluster
    Compose Results
    Return Results and
    Completeness

    View Slide

  114. Uber Ringpop is an open-
    source Node.js library that brings
    application-layer sharding to
    many of their dispatching
    platform services.
    Swim Gossip Protocol
    Consistent Hashing
    +

    View Slide

  115. Uber Ringpop is an open-
    source Node.js library that brings
    application-layer sharding to
    many of their dispatching
    platform services.
    Swim Gossip Protocol
    Consistent Hashing
    +

    View Slide

  116. Uber Ringpop is an open-
    source Node.js library that brings
    application-layer sharding to
    many of their dispatching
    platform services.
    Swim Gossip Protocol
    Consistent Hashing
    +

    View Slide

  117. Orleans Cluster
    Orleans is a runtime and
    Programming model for building
    distributed systems based on the
    Actor Model from the eXtreme
    Computing Group at MSR
    Gossip Protocol
    Consistent Hashing
    +
    +
    Distributed Hash Table
    Actor
    Actor
    Actor
    Actor
    Actor

    View Slide

  118. Orleans Cluster
    Orleans is a runtime and
    Programming model for building
    distributed systems based on the
    Actor Model from the eXtreme
    Computing Group at MSR
    Gossip Protocol
    Consistent Hashing
    +
    +
    Distributed Hash Table
    Actor
    Actor
    Actor
    Actor
    Actor

    View Slide

  119. Orleans Cluster
    Orleans is a runtime and
    Programming model for building
    distributed systems based on the
    Actor Model from the eXtreme
    Computing Group at MSR
    Gossip Protocol
    Consistent Hashing
    +
    +
    Distributed Hash Table
    Actor
    Actor
    Actor
    Actor
    Actor

    View Slide

  120. Orleans Cluster
    Orleans Distributed Hash Table

    View Slide

  121. Orleans Cluster
    Orleans Distributed Hash Table

    View Slide

  122. Orleans Cluster
    Orleans Distributed Hash Table

    View Slide

  123. Orleans Cluster
    Orleans Distributed Hash Table

    View Slide

  124. Orleans Cluster
    Orleans Distributed Hash Table

    View Slide

  125. Orleans Cluster
    Orleans Distributed Hash Table

    View Slide

  126. Orleans Cluster
    Orleans Distributed Hash Table

    View Slide

  127. Orleans Cluster
    Orleans Distributed Hash Table

    View Slide

  128. Caution
    Stateful Services Ahead

    View Slide

  129. Unbounded
    Data Structures
    Implicit Assumptions
    are the Killer of
    Distributed Systems

    View Slide

  130. Memory Management
    Get Ready to Make Friends with the Garbage Collector Profiler

    View Slide

  131. Reloading
    State
    • First Connection
    • Recovering From Crashes
    • Deploying New Code

    View Slide

  132. Fast Restarts at Facebook
    “Our Key Observation is that we can
    decouple the memory lifetime from the
    process lifetime. When we shutdown a
    server for a planned upgrade.”

    View Slide

  133. Make Assumptions Explicit
    for Reliable Distributed Systems

    View Slide

  134. Conclusion
    Data Locality & Available Consistency
    Cluster Membership & Work Distribution
    Successful Stateful Real World Systems
    Caution: Some New Challenges

    View Slide

  135. https://github.com/CaitieM20/ScalingStatefulServices
    @Caitie
    Questions

    View Slide

  136. View Slide