Transactions: Myths, Surprises and Opportunities

Slides from a talk given at Strange Loop, 26 September 2015. https://thestrangeloop.com/2015/transactions-myths-surprises-and-opportunities.html

Abstract:

Back in the 1970s, the earliest databases had transactions. Then NoSQL abolished them. And now, perhaps, they are making a comeback... but reinvented.

The purpose of transactions is to make application code simpler, by reducing the amount of failure handling you need to do yourself. However, they have also gained a reputation for being slow and unscalable. With the traditional implementation of serializability (2-phase locking), that reputation was somewhat deserved.

In the last few years, there has been a resurgence of interest in transaction algorithms that perform well and scale well. This talk answers some of the biggest questions about the bright new landscape of transactions:

* What does ACID actually mean? What race conditions can you get with weak isolation (such as "read committed" and "repeatable read"), and how does this affect your application?
* What are the strongest guarantees we can achieve, while maintaining high availability and high performance at scale?
* How do the new generation of algorithms for distributed, highly-available transactions work?
* Linearizability, session guarantees, "consistency" and the much-misunderstood CAP theorem -- what's really going on here?
* When you move beyond a single database, e.g. doing stream processing, what are your options for maintaining transactional guarantees?

Martin Kleppmann

September 26, 2015

Transcript

  1. References (1/4)

    1. Atul Adya: “Weak Consistency: A Generalized Theory and Optimistic Implementations for Distributed Transactions,” PhD thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, March 1999. http://pmg.csail.mit.edu/papers/adya-phd.pdf
    2. Hagit Attiya, Faith Ellen, and Adam Morrison: “Limitations of Highly-Available Eventually-Consistent Data Stores,” at ACM Symposium on Principles of Distributed Computing (PODC), July 2015. http://www.cs.technion.ac.il/people/mad/online-publications/podc2015-replds.pdf
    3. Peter Bailis, Alan Fekete, Ali Ghodsi, Joseph M Hellerstein, and Ion Stoica: “HAT, not CAP: Towards Highly Available Transactions,” at 14th USENIX Workshop on Hot Topics in Operating Systems (HotOS), May 2013. http://www.bailis.org/papers/hat-hotos2013.pdf
    4. Peter Bailis, Ali Ghodsi, Joseph M Hellerstein, and Ion Stoica: “Bolt-on Causal Consistency,” at ACM International Conference on Management of Data (SIGMOD), June 2013. http://db.cs.berkeley.edu/papers/sigmod13-bolton.pdf
    5. Peter Bailis, Aaron Davidson, Alan Fekete, et al.: “Highly Available Transactions: Virtues and Limitations,” at 40th International Conference on Very Large Data Bases (VLDB), September 2014. http://www.bailis.org/papers/hat-vldb2014.pdf
    6. Hal Berenson, Philip A Bernstein, Jim N Gray, et al.: “A Critique of ANSI SQL Isolation Levels,” at ACM International Conference on Management of Data (SIGMOD), May 1995. http://research.microsoft.com/pubs/69541/tr-95-51.pdf
  2. References (2/4)

    7. Eric A Brewer: “CAP Twelve Years Later: How the “Rules” Have Changed,” IEEE Computer Magazine, volume 45, number 2, pages 23–29, February 2012. http://cs609.cs.ua.edu/CAP12.pdf
    8. Michael J Cahill, Uwe Röhm, and Alan Fekete: “Serializable Isolation for Snapshot Databases,” at ACM International Conference on Management of Data (SIGMOD), pages 729–738, June 2008. http://www.cs.nyu.edu/courses/fall12/CSCI-GA.2434-001/p729-cahill.pdf
    9. Donald D Chamberlin, Morton M Astrahan, Michael W Blasgen, et al.: “A History and Evaluation of System R,” Communications of the ACM, volume 24, number 10, pages 632–646, October 1981. http://diaswww.epfl.ch/courses/adms07/papers/p632-chamberlin.pdf
    10. Tushar Deepak Chandra and Sam Toueg: “Unreliable Failure Detectors for Reliable Distributed Systems,” Journal of the ACM, volume 43, number 2, pages 225–267, March 1996. http://courses.csail.mit.edu/6.852/08/papers/CT96-JACM.pdf
    11. Kapali P Eswaran, Jim N Gray, Raymond A Lorie, and Irving L Traiger: “The Notions of Consistency and Predicate Locks in a Database System,” Communications of the ACM, volume 19, number 11, pages 624–633, November 1976. http://paul.rutgers.edu/cs545/S02/papers/eswaran-transaction.pdf
    12. Hector Garcia-Molina and Kenneth Salem: “Sagas,” at ACM International Conference on Management of Data (SIGMOD), May 1987. http://www.cs.cornell.edu/andru/cs711/2002fa/reading/sagas.pdf
  3. References (3/4)

    13. Jim N Gray, Raymond A Lorie, Gianfranco R Putzolu, and Irving L Traiger: “Granularity of Locks and Degrees of Consistency in a Shared Data Base,” in Modelling in Data Base Management Systems: Proceedings of the IFIP Working Conference on Modelling in Data Base Management Systems, G.M. Nijssen, Editor. Elsevier/North Holland Publishing, pages 364–394, 1976. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.92.8248
    14. Rachid Guerraoui: “Revisiting the relationship between non-blocking atomic commitment and consensus,” at 9th International Workshop on Distributed Algorithms (WDAG), pages 87–100, September 1995. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.27.6456
    15. Theo Härder and Andreas Reuter: “Principles of Transaction-Oriented Database Recovery,” ACM Computing Surveys, volume 15, number 4, pages 287–317, December 1983. http://web.stanford.edu/class/cs340v/papers/recovery.pdf
    16. Pat Helland and Dave Campbell: “Building on Quicksand,” at 4th Biennial Conference on Innovative Data Systems Research (CIDR), January 2009. https://database.cs.wisc.edu/cidr/cidr2009/Paper_133.pdf
    17. Joseph M Hellerstein: “The Declarative Imperative: Experiences and Conjectures in Distributed Logic,” Technical Report, University of California at Berkeley, UCB/EECS-2010-90, June 2010. http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-90.pdf
    18. Martin Kleppmann: “Hermitage: Testing the ‘I’ in ACID,” 25 November 2014. http://martin.kleppmann.com/2014/11/25/hermitage-testing-the-i-in-acid.html
  4. References (4/4)

    19. Martin Kleppmann: “A Critique of the CAP Theorem,” Preprint arXiv:1509.05393 [cs.DC], Sep 2015. http://arxiv.org/abs/1509.05393
    20. Martin Kleppmann: Designing Data-Intensive Applications. O’Reilly Media, to appear. ISBN 1-4493-7332-1. http://dataintensive.net/
    21. Wyatt Lloyd, Michael J Freedman, Michael Kaminsky, and David G Andersen: “Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS,” at 23rd ACM Symposium on Operating Systems Principles (SOSP), pages 401–416, October 2011. https://www.cs.cmu.edu/~dga/papers/cops-sosp2011.pdf
    22. Dan R K Ports and Kevin Grittner: “Serializable Snapshot Isolation in PostgreSQL,” at 38th International Conference on Very Large Data Bases (VLDB), volume 5, number 12, pages 1850–1861, August 2012. http://drkp.net/papers/ssi-vldb12.pdf
    23. Michael Stonebraker, Samuel Madden, Daniel J Abadi, et al.: “The End of an Architectural Era (It’s Time for a Complete Rewrite),” at 33rd International Conference on Very Large Data Bases (VLDB), pages 1150–1160, September 2007. http://www.vldb.org/conf/2007/papers/industrial/p1150-stonebraker.pdf
    24. Marek Zawirski, Annette Bieniusa, Valter Balegas, et al.: “SwiftCloud: Fault-Tolerant Geo-Replication Integrated all the Way to the Client Machine,” INRIA Research Report 8347, August 2013. http://arxiv.org/abs/1310.3107