a “single” system; the databases are the “source of truth” • Data ownership by clients Data is “owned” by the clients that create the data and data exists as soon as it is created 2
a “single” system; the databases are the “source of truth” • Data ownership by clients Data is “owned” by the clients that create the data and data exists as soon as it is created • Database is an optimization, bottleneck Databases serve as a “convenience” that make it easier to write applications: think: shared memory registers: however, reduced availability 2
a “single” system; the databases are the “source of truth” • Data ownership by clients Data is “owned” by the clients that create the data and data exists as soon as it is created • Database is an optimization, bottleneck Databases serve as a “convenience” that make it easier to write applications: think: shared memory registers: however, reduced availability • The edge is the source of truth! We need models and abstractions that allow us to write correct applications that operate with distributed data where it is being generated: the edge 2
system that application will be deployed on • Guaranteed outcomes following certain rules Event interleaving, possible partial-orders, update visibility, when and where, etc. 4
system that application will be deployed on • Guaranteed outcomes following certain rules Event interleaving, possible partial-orders, update visibility, when and where, etc. • Required for building applications Otherwise, we may pick a system to deploy our application on where our application returns incorrect results 4
new updates are made to the [replicated, shared] object, eventually all accesses [to any replica] will return the last updated value.” - W. Vogels Rather weak model, but used by many large-scale distributed systems today…
systems wish to remain functioning under network partitions, systems must sacrifice one or the other • Consistency Guarantees on event order and event visibility 8
systems wish to remain functioning under network partitions, systems must sacrifice one or the other • Consistency Guarantees on event order and event visibility • Availability Ability for a system to keep servicing requests under network partitions and/or failures 8
for fault-tolerance to ensure system availability • Two concurrent requests… Tom and Chris attempt to reserve the last available seat on a plane • Two possible paths… If one replica of the system can not reach the other replica, we have two choices: 9
for fault-tolerance to ensure system availability • Two concurrent requests… Tom and Chris attempt to reserve the last available seat on a plane • Two possible paths… If one replica of the system can not reach the other replica, we have two choices: • [Favoring Consistency] Prevent booking Return an error to the user and prevent both bookings 9
for fault-tolerance to ensure system availability • Two concurrent requests… Tom and Chris attempt to reserve the last available seat on a plane • Two possible paths… If one replica of the system can not reach the other replica, we have two choices: • [Favoring Consistency] Prevent booking Return an error to the user and prevent both bookings • [Favoring Availability] Allow concurrent requests However, now the seat is double booked and we must have a “conflict resolution” function for returning the system to a consistent state 9
is periodically recorded • “Potentially outdated” Act of recording this information produces an artifact that is already outdated unless the system has quiesced 11
exchange messages asynchronously • Dropped or delayed Messages can either be dropped or delayed • Examples Letters via the postal service; Text messages; Telephone calls 12
system owns the primary copy of their information • Coordinates updates Members coordinate updates to information they are the primary site for • Information can be cached Information from other sites can be cached by other members in the system 13
system owns the primary copy of their information • Coordinates updates Members coordinate updates to information they are the primary site for • Information can be cached Information from other sites can be cached by other members in the system • Local or incomplete replica Use memory 13
system owns the primary copy of their information • Coordinates updates Members coordinate updates to information they are the primary site for • Information can be cached Information from other sites can be cached by other members in the system • Local or incomplete replica Use memory • Stale replica Outdated printed map 13
system owns the primary copy of their information • Coordinates updates Members coordinate updates to information they are the primary site for • Information can be cached Information from other sites can be cached by other members in the system • Local or incomplete replica Use memory • Stale replica Outdated printed map • Primary site Google Maps or the USGS, etc. 13
all members in the system with the data they create and are responsible for • Contract edges for subgraph Reduce several vertices in the graph to a single vertex: database for those entities 15
all members in the system with the data they create and are responsible for • Contract edges for subgraph Reduce several vertices in the graph to a single vertex: database for those entities • Geo-replicated, EC database Contracted edges per country, placing a replica in each country that served as the primary copy 15
all members in the system with the data they create and are responsible for • Contract edges for subgraph Reduce several vertices in the graph to a single vertex: database for those entities • Geo-replicated, EC database Contracted edges per country, placing a replica in each country that served as the primary copy • Wikipedia (for a given topic) Information about a given topic is stored here, written and coordinated by multiple authors 15
expensive, if the primary site is geographically distant [latency] or unavailable [availability] • Replication introduces challenges Replication can make maintaining consistency much more challenges if caching/replication is pervasive 16
data at a central location for processing won’t scale due to power and DC requirements • Today’s systems assume centralization Both programming models and applications used today assume centralization of data (ie. Spark, etc.) 17
route requests? When we need to retrieve a certain piece of data, how do we know where to route the request to? • How we do specify acceptable staleness? Do we need to route to the primary site or can we use a cache? Does that cache provide a value within an acceptable value of staleness? 20
route requests? When we need to retrieve a certain piece of data, how do we know where to route the request to? • How we do specify acceptable staleness? Do we need to route to the primary site or can we use a cache? Does that cache provide a value within an acceptable value of staleness? • How do we bound latency? How do we select an appropriate cache? How do we choose between a cache and a primary site given we have to match a latency bound? 20
route requests? When we need to retrieve a certain piece of data, how do we know where to route the request to? • How we do specify acceptable staleness? Do we need to route to the primary site or can we use a cache? Does that cache provide a value within an acceptable value of staleness? • How do we bound latency? How do we select an appropriate cache? How do we choose between a cache and a primary site given we have to match a latency bound? • How do we reason about staleness? Across multiple requests for the same object, how do we know which version is newer or older? 20
AP/EC systems Encapsulate AP replication concerns and exist in time and space • Merge to most “recent” result Conflict resolution and provenance information 21
AP/EC systems Encapsulate AP replication concerns and exist in time and space • Merge to most “recent” result Conflict resolution and provenance information • One example: CRDTs Conflict-free Replicated Data Types (Shapiro et al. 2011) 21
AP/EC systems Encapsulate AP replication concerns and exist in time and space • Merge to most “recent” result Conflict resolution and provenance information • One example: CRDTs Conflict-free Replicated Data Types (Shapiro et al. 2011) • Causality Capture causality for object mutations and can identify concurrent operations 21
AP/EC systems Encapsulate AP replication concerns and exist in time and space • Merge to most “recent” result Conflict resolution and provenance information • One example: CRDTs Conflict-free Replicated Data Types (Shapiro et al. 2011) • Causality Capture causality for object mutations and can identify concurrent operations • Concurrency Resolve concurrent operations using a bias (think: concurrent add(e) || remove(e) on the same set for same element) 21
Mergeable data structures form the core data abstraction for a programming model • Programming through composition of mergeable data structures Ensure the mergeability property holds through program transformations, data composition 22
Mergeable data structures form the core data abstraction for a programming model • Programming through composition of mergeable data structures Ensure the mergeability property holds through program transformations, data composition • One example: Lasp Lattice Processing (Meiklejohn, Van Roy 2015) 22
Mergeable data structures form the core data abstraction for a programming model • Programming through composition of mergeable data structures Ensure the mergeability property holds through program transformations, data composition • One example: Lasp Lattice Processing (Meiklejohn, Van Roy 2015) • Correct-by-construction Correct-by-construction distributed programs for infrastructure that provides weak guarantees 22
Mergeable data structures form the core data abstraction for a programming model • Programming through composition of mergeable data structures Ensure the mergeability property holds through program transformations, data composition • One example: Lasp Lattice Processing (Meiklejohn, Van Roy 2015) • Correct-by-construction Correct-by-construction distributed programs for infrastructure that provides weak guarantees • Result provenance Extends CRDT causality/concurrency tracking through to results of applications providing mergeable outcomes 22
shouldn’t be responsible for canonical data and data sharing, but rather serve as a location where particular code will run with clients data • Clients communicate other clients Exchange state for latency reduction, serve as the primary site for their information 23
shouldn’t be responsible for canonical data and data sharing, but rather serve as a location where particular code will run with clients data • Clients communicate other clients Exchange state for latency reduction, serve as the primary site for their information • Servers as business entities Necessary for latency reduction of large data sets, durability, location of “exactly-once” side-effects: ie. charge credit card 23
shouldn’t be responsible for canonical data and data sharing, but rather serve as a location where particular code will run with clients data • Clients communicate other clients Exchange state for latency reduction, serve as the primary site for their information • Servers as business entities Necessary for latency reduction of large data sets, durability, location of “exactly-once” side-effects: ie. charge credit card • One example: Skype Completely peer-to-peer for operation, but a central server is used for authentication and storage of users “address book.” 23
that causality drives interactions in the physical world: relativity, light cones, etc. • Causality in distributed systems Happens-before relationship (Lamport 1978) describes capturing causal relationships between entities in a distributed system 25
about cause/effect; eliminates storage and maintenance of redundant information • Negatives Expensive in storage and maintenance of causal message delivery channels; methods for reduction in state introduce false dependencies 26
consistency provide a better formalism for describing consistency in the world given we know causal relationships hold? • Data has decay Some information may no longer be important after a given period of time, and forgotten 27
consistency provide a better formalism for describing consistency in the world given we know causal relationships hold? • Data has decay Some information may no longer be important after a given period of time, and forgotten • Causality formalism needs explicit decay The formalism for describing causal consistency requires data be explicitly “decayed” through messages (or tombstones — keeping data forever to satisfy the causal relationship) 27
a bunch of rules about driving a car and passed a driver’s test obtaining a license, which I renewed several years later • What does causality imply? Causality captures that I would not have had the license if I had not learned the rules and passed the test 28
a bunch of rules about driving a car and passed a driver’s test obtaining a license, which I renewed several years later • What does causality imply? Causality captures that I would not have had the license if I had not learned the rules and passed the test • What does causal consistency imply? Does causal consistency imply that I can still recover the information that I originally used to pass the test? Does it imply that if I can recall my driver’s license number that I should be able to recall the rules that are implied by that license number? 28
canonical ride state on the device to bootstrap datacenters under failure: device is source of truth • Datacenter-focused designs are limiting Impractical from a storage, bandwidth, and power perspective 29
canonical ride state on the device to bootstrap datacenters under failure: device is source of truth • Datacenter-focused designs are limiting Impractical from a storage, bandwidth, and power perspective • Emerging countries have limited access Sneakernet, USB, Bluetooth still the pervasive model of communication 29
canonical ride state on the device to bootstrap datacenters under failure: device is source of truth • Datacenter-focused designs are limiting Impractical from a storage, bandwidth, and power perspective • Emerging countries have limited access Sneakernet, USB, Bluetooth still the pervasive model of communication • Peer-to-peer designs can provide higher-scale Grow to planetary scale, new programming models needed to embrace these network designs, new abstractions needed 29