Systems

Alexey Gotsman, IMDEA Software Institute, Spain
Hongseok Yang, University of Oxford, UK
Carla Ferreira, NOVA LINCS, DI, FCT, Universidade NOVA de Lisboa, Portugal
Mahsa Najafzadeh, Sorbonne Universités, Inria, UPMC Univ Paris 06, France
Marc Shapiro, Sorbonne Universités, Inria, UPMC Univ Paris 06, France

Abstract

Large-scale distributed systems often rely on replicated databases that allow a programmer to request different data consistency guarantees for different operations, and thereby control their performance. Using such databases is far from trivial: requesting stronger consistency in too many places may hurt performance, and requesting it in too few places may violate correctness. To help programmers in this task, we propose the first proof rule for establishing that a particular choice of consistency guarantees for various operations on a replicated database is enough to ensure the preservation of a given data integrity invariant. Our rule is modular: it allows reasoning about the behaviour of every operation separately under some assumption on the behaviour of other operations. This leads to simple reasoning, which we have automated in an SMT-based tool. We present a nontrivial proof of soundness of our rule and illustrate its use on several examples.

Categories and Subject Descriptors: D.2.4 [Software Engineering]: Software/Program Verification; F.3.1 [Logics and Meanings of Programs]: Specifying and Verifying and Reasoning about Programs

Keywords: Replication; causal consistency; integrity invariants

1. Introduction

To achieve availability and scalability, many modern distributed systems rely on replicated databases, which maintain multiple replicas of shared data. Clients can access the data at any of the replicas, and these replicas communicate changes to each other using message passing.
For example, large-scale Internet services use data replicas in geographically distinct locations, and applications for mobile devices keep replicas locally to support offline use. Ideally, we would like replicated databases to provide strong consistency, i.e., to behave as if a single centralised node handles all operations. However, achieving this ideal usually requires synchronisation among replicas, which slows down the database and even makes it unavailable if network connections between replicas fail [2, 24].

For this reason, modern replicated databases often eschew synchronisation completely; such databases are commonly dubbed eventually consistent [47]. In these databases, a replica performs an operation requested by a client locally without any synchronisation with other replicas and immediately returns to the client; the effect of the operation is propagated to the other replicas only eventually. This may lead to anomalies: behaviours deviating from strong consistency. One of them is illustrated in Figure 1(a). Here Alice makes a post while connected to a replica r1, and Bob, also connected to r1, sees the post and comments on it. After each of the two operations, r1 sends a message to the other replicas in the system with the update performed by the user. If the messages with the updates by Alice and Bob arrive at a replica r2 out of order, then Carol, connected to r2, may end up seeing Bob's comment, but not Alice's post it pertains to.

The consistency model of a replicated database restricts the anomalies that it exhibits. For example, the model of causal consistency [33], which we consider in this paper, disallows the anomaly in Figure 1(a), yet can be implemented without any synchronisation. The model ensures that all replicas in the system see causally dependent events, such as the posts by Alice and Bob, in the order in which they happened.
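The anomaly of Figure 1(a) can be sketched in a few lines of Python. All names here are invented for illustration: a replica naively applies updates in arrival order, so when the network reorders the two messages, the comment becomes visible before the post it depends on.

```python
# Hypothetical sketch of the Figure 1(a) anomaly: replica r2 applies
# updates in whatever order they arrive, with no causal ordering.

class Replica:
    def __init__(self, name):
        self.name = name
        self.posts = set()    # ids of posts visible at this replica
        self.comments = []    # (comment_text, parent_post_id) pairs

    def apply(self, update):
        kind, payload = update
        if kind == "post":
            self.posts.add(payload)
        else:  # kind == "comment"
            self.comments.append(payload)

    def dangling_comments(self):
        # Comments whose parent post is not (yet) visible here.
        return [c for c in self.comments if c[1] not in self.posts]

alice_post = ("post", "p1")
bob_comment = ("comment", ("Great post!", "p1"))

r2 = Replica("r2")
r2.apply(bob_comment)  # the network delivered the comment first
print(r2.dangling_comments())  # Carol sees a comment with no post
r2.apply(alice_post)
print(r2.dangling_comments())  # resolved once the post arrives: []
```

Causal consistency rules this out by delivering Alice's update before Bob's at every replica, since Bob's comment causally depends on the post.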
However, causal consistency allows different replicas to see causally independent events as occurring in different orders. This is illustrated in Figure 1(b), where Alice and Bob concurrently make posts at r1 and r2. Carol, connected to r3, initially sees Alice's post, but not Bob's, and Dave, connected to r4, sees Bob's post, but not Alice's. This outcome cannot be obtained by executing the operations in any total order and, hence, deviates from strong consistency.

Such anomalies related to the ordering of actions are often acceptable for applications. What is not acceptable is to violate crucial well-formedness properties of application data, called integrity invariants. Consistency models that do not require any synchronisation are often too weak to ensure these. For example, consider a toy banking application where the database stores the balance of a single account that clients can make deposits to and withdrawals from. In this case, an integrity invariant may require the account balance to be always non-negative. Consider the database compu-

This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive version was published in the following publication: POPL'16, January 20–22, 2016, St. Petersburg, FL, USA. ACM 978-1-4503-3549-2/16/01. http://dx.doi.org/10.1145/2837614.2837625

Timestamps in Message-Passing Systems That Preserve the Partial Ordering

Colin J. Fidge
Department of Computer Science, Australian National University, Canberra, ACT.

ABSTRACT

Timestamping is a common method of totally ordering events in concurrent programs. However, for applications requiring access to the global state, a total ordering is inappropriate. This paper presents algorithms for timestamping events in both synchronous and asynchronous message-passing programs that allow for access to the partial ordering inherent in a parallel system.
The algorithms do not change the communications graph or require a central timestamp issuing authority.

Keywords and phrases: concurrent programming, message-passing, timestamps, logical clocks

CR categories: D.1.3

INTRODUCTION

A fundamental problem in concurrent programming is determining the order in which events in different processes occurred. An obvious solution is to attach a number representing the current time to a permanent record of the execution of each event. This assumes that each process can access an accurate clock, but practical parallel systems, by their very nature, make it difficult to ensure consistency among the processes.

There are two solutions to this problem. Firstly, have a central process to issue timestamps, i.e. provide the system with a global clock. In practice this has the major disadvantage of needing communication links from all processes to the central clock. More acceptable are separate clocks in each process that are kept synchronised as much as necessary to ensure that the timestamps represent, at the very least, a possible ordering of events (in light of the vagaries of distributed scheduling).

Lamport (1978) describes just such a scheme of logical clocks that can be used to totally order events, without the need to introduce extra communication links. However, this only yields one of the many possible, and equally valid, event orderings defined by a particular distributed computation. For problems concerned with the global program state it is far more useful to have access to the entire partial ordering, which defines the set of consistent "slices" of the global state at any arbitrary moment in time. This paper presents an implementation of the partially ordered relation "happened before", which holds for two given events iff the first could causally affect the second in all possible interleavings of events.
This allows access to all possible global states for a particular distributed computation, rather than a single, arbitrarily selected ordering. Lamport's totally ordered relation is used as a starting point. The algorithm is first defined for the asynchronous case, and then extended to cater for concurrent programs using synchronous message-passing.
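As a rough Python sketch (all names invented here), the contrast between the two kinds of timestamps can be made concrete. Lamport's scheme keeps one counter per process and merges with max on receipt, yielding a single total order; timestamps of the kind developed in this paper, now commonly known as vector clocks, keep one counter per process per timestamp and recover the full "happened before" partial order, so concurrent events are identifiable as such.

```python
# Lamport clocks (the paper's starting point): one counter per
# process, producing a single arbitrary total order of events.
class Lamport:
    def __init__(self):
        self.c = 0
    def event(self):
        self.c += 1
        return self.c
    def receive(self, ts):
        self.c = max(self.c, ts) + 1
        return self.c

# Partial-order ("vector") timestamps: one counter per process in
# each timestamp, merged componentwise on receipt.
class Vector:
    def __init__(self, pid, n):
        self.pid, self.v = pid, [0] * n
    def event(self):
        self.v[self.pid] += 1
        return tuple(self.v)
    def receive(self, ts):
        self.v = [max(a, b) for a, b in zip(self.v, ts)]
        return self.event()

def happened_before(a, b):
    # True iff the first event could causally affect the second.
    return all(x <= y for x, y in zip(a, b)) and a != b

# Two processes: p0 sends to p1, while p1 also performs an
# independent (concurrent) local event.
p0, p1 = Vector(0, 2), Vector(1, 2)
e_send = p0.event()           # (1, 0), carried on the message
e_local = p1.event()          # (0, 1), concurrent with e_send
e_recv = p1.receive(e_send)   # (1, 2), causally after both

assert happened_before(e_send, e_recv)
assert not happened_before(e_send, e_local)  # incomparable:
assert not happened_before(e_local, e_send)  # concurrent events

# The same run under Lamport clocks: the concurrent events e_send
# and e_local get plain numbers, which cannot express concurrency.
l0, l1 = Lamport(), Lamport()
t_send = l0.event()           # 1
t_local = l1.event()          # 1
t_recv = l1.receive(t_send)   # max(1, 1) + 1 = 2
```

This is only an illustration of the asynchronous case under the assumptions above; the paper's own algorithms, including the synchronous extension, are developed in the sections that follow.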