Heather Miller
September 19, 2014

Function Passing Style: Typed, Distributed Functional Programming

TL;DR: Asynchronously pass safe functions to distributed, stationary, immutable data in a stateless container. Plus, use lazy combinators to eliminate intermediate data structures.

The functional programming paradigm nicely fits problems in distributed programming. For example, moving computation to data can achieve multi-fold improvements in latency and throughput of big-data-style applications.

This talk presents a new paradigm of “Function-Passing” as an effective means of asynchronous and distributed programming. By bringing together recent advances in type systems research and new language features in Scala, this paradigm promotes new patterns of programming in distributed environments using distributable lambdas and types.

Transcript

  1. …been working on: language support for distributed system builders.

    1. Serialization. Happens mostly at compile time, so it’s performant. Type classes allow users to serialize to different formats (binary, JSON, etc.).
  2. …been working on: language support for distributed system builders.

    2. Spores. Portable (serializable) closures. Type constraints restrict what they capture.
  3. In this talk… A programming model.

    Builds on the basis of serializable functions to provide a substrate that distributed systems can be built upon.
  4. In this talk… A programming model.

    The result… the model greatly simplifies the design and implementation of mechanisms for:
    - Fault-tolerance
    - In-memory caching
    - Debugging (i.e., pushing types into more layers of the stack)
    IN A CLEAN & FUNCTIONAL WAY. (STATELESS!)
  5. Note: Currently a research project.

    Thus, all aspects of it are under development + publication in the works. (Thanks, )
  6. FUNDAMENTAL IDEA:

    Inversion of the actor model. Can be thought of as a dual to actors.
  7. FUNDAMENTAL IDEA:

    Inversion of the actor model. Can be thought of as a dual to actors. A DUAL WHICH NICELY COMPLEMENTS ACTORS!
  8. Function-passing…

    Stateless. Built on persistent data structures. Functions are exchanged through asynchronous messaging. Keep the data stationary. Of note: This is a model for programming with data and not a new model of concurrent processes like actors. Instead, we provide a new means of working with distributed data in a functional way.
  9. A Note on Distributed Computing. Jim Waldo, Geoff Wyant, Ann Wollrath, and Sam Kendall. Sun Microsystems Laboratories, 2550 Garcia Avenue, Mountain View, CA 94043

    1 Introduction. Much of the current work in distributed, object-oriented systems is based on the assumption that objects form a single ontological class. This class consists of all entities that can be fully described by the specification of the set of interfaces supported by the object and the semantics of the operations in those interfaces. The class includes objects that share a single address space, objects that are in separate address spaces on the same machine, and objects that are in separate address spaces on different machines (with, perhaps, different architectures). On the view that all
    1.1 Terminology. In what follows, we will talk about local and distributed computing. By local computing (local object invocation, etc.), we mean programs that are confined to a single address space. In contrast, we will use the term distributed computing (remote object invocation, etc.) to refer to programs that make calls to other address spaces, possibly on another machine. In the case of distributed computing, nothing is known about the recipient of the call (other than that it supports a particular interface). For example, the client of such a distributed object does not know the hard-
  10. A Note on Distributed Computing (Jim Waldo, Geoff Wyant, Ann Wollrath, and Sam Kendall)

    “Differences in latency, memory access, partial failure, and concurrency make merging of the computational models of local and distributed computing both unwise to attempt and unable to succeed.”
  11. A Note on Distributed Computing (Jim Waldo, Geoff Wyant, Ann Wollrath, and Sam Kendall)

    “Differences in latency, memory access, partial failure, and concurrency make merging of the computational models of local and distributed computing both unwise to attempt and unable to succeed.”
    “A better approach is to accept that there are irreconcilable differences between local and distributed computing, and to be conscious of those differences at all stages of the design and implementation of distributed applications. Rather than trying to merge local and remote objects, engineers need to be constantly reminded of the differences between the two, and know when it is appropriate to use each kind of object.”
  12. The Function-Passing Model (illustrated)

    Two concepts: 1. Stationary, immutable data. 2. Portable functions: move the functionality to the data.
  13. The Function-Passing Model (illustrated)

    Two concepts: 1. Stationary, immutable data: Silos. 2. Portable functions: move the functionality to the data.
  14. The Function-Passing Model (illustrated)

    Two concepts: 1. Stationary, immutable data: Silos (for lack of a better name). 2. Portable functions: move the functionality to the data.
  15. The Function-Passing Model (illustrated)

    Two concepts: 1. Stationary, immutable data: Silos (for lack of a better name). 2. Portable functions: move the functionality to the data: Spores.
  16. The Function-Passing Model (illustrated)

    Two concepts: 1. Stationary, immutable data: Silos (for lack of a better name). 2. Portable functions: move the functionality to the data: Spores.
  17. The Function-Passing Model (illustrated)

    Silos. WHAT ARE THEY? Silo[T], holding data of type T. SiloRef[T], with def apply and def send. (The workhorse.) The handle to a Silo.
  18. The Function-Passing Model (illustrated)

    Silos. WHAT ARE THEY? Silo[T], holding data of type T. SiloRef[T]: the handle to a Silo.
    def apply(s1: Spore, s2: Spore): SiloRef[T]
  19. The Function-Passing Model (illustrated)

    Silos. WHAT ARE THEY? Silo[T], holding data of type T. SiloRef[T]: the handle to a Silo.
    def apply(s1: Spore, s2: Spore): SiloRef[T]
    LAZY! Takes two spores: 1. framework logic (combinator), e.g. map; 2. user/application-provided argument function. Defers application of fn to silo, returns SiloRef with info for later materialization of silo.
  20. The Function-Passing Model (illustrated)

    Silos. WHAT ARE THEY? Silo[T], holding data of type T. SiloRef[T]: the handle to a Silo.
    def apply(s1: Spore, s2: Spore): SiloRef[T]
    def send(): Future[T]
  21. The Function-Passing Model (illustrated)

    Silos. WHAT ARE THEY? Silo[T], holding data of type T. SiloRef[T]: the handle to a Silo.
    def apply(s1: Spore, s2: Spore): SiloRef[T]
    def send(): Future[T]
    EAGER! Sends info for function application and silo materialization to remote node. Asynchronous/nonblocking data transfer to local machine (via Future).
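The lazy `apply` and eager `send` described on these slides can be sketched in plain Scala. This is only an in-process illustration, not the project's implementation: spores are replaced by ordinary functions, the typed `apply` signature, `fromData`, and `demo` are assumptions made for the example, and the "remote node" is just a `Future` on the local machine.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import java.util.concurrent.atomic.AtomicInteger

object SiloRefSketch {
  // Hypothetical handle to a silo: it holds a recipe for the data, not the data itself.
  final class SiloRef[T] private (materialize: () => T) {

    // LAZY: records the combinator (framework logic, e.g. map) and the user's
    // argument function. Nothing is evaluated until send() is called.
    def apply[A, B, U](combinator: (T, A => B) => U, fn: A => B): SiloRef[U] =
      new SiloRef[U](() => combinator(materialize(), fn))

    // EAGER: starts materialization asynchronously. The Future stands in for the
    // non-blocking transfer of the result back to the caller.
    def send(): Future[T] = Future(materialize())
  }

  object SiloRef {
    // Hypothetical constructor: the by-name argument is not evaluated here.
    def fromData[T](data: => T): SiloRef[T] = new SiloRef[T](() => data)
  }

  // Returns (loads before send, result, loads after send).
  def demo(): (Int, List[Int], Int) = {
    val loads = new AtomicInteger(0)
    val ref = SiloRef.fromData { loads.incrementAndGet(); List(1, 2, 3) }

    // Plain functions stand in for spores (an assumption of this sketch).
    val mapList: (List[Int], Int => Int) => List[Int] = (xs, f) => xs.map(f)
    val doubled = ref.apply[Int, Int, List[Int]](mapList, _ * 2)

    val loadsBeforeSend = loads.get() // still 0: apply only recorded the work
    val result = Await.result(doubled.send(), 5.seconds)
    (loadsBeforeSend, result, loads.get())
  }
}
```

Calling `SiloRefSketch.demo()` shows that the data is loaded only once `send()` runs; blocking with `Await` is used here purely to keep the example short.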
  22. The Function-Passing Model (illustrated)

    Two concepts: 1. Stationary, immutable data: Silos (for lack of a better name). 2. Portable functions: move the functionality to the data: Spores.
  23. What do spores look like? Basic usage:

    val s = spore {
      val h = helper
      (x: Int) => {
        val result = x + " " + h.toString
        println("The result is: " + result)
      }
    }

    THE BODY OF A SPORE CONSISTS OF 2 PARTS: 1. a sequence of local value (val) declarations only (the “spore header”), and 2. a closure.
    http://docs.scala-lang.org/sips/pending/spores.html
  24. Spore guarantees (vs. closures)

    1. All captured variables are declared in the spore header, or using capture.
    2. The initializers of captured variables are executed once, upon creation of the spore.
    3. References to captured variables do not change during the spore’s execution.
    http://docs.scala-lang.org/sips/pending/spores.html
  25. Spores & Closures

    Evaluation semantics: Remove the spore marker, and the code behaves as before.
    Spores & closures are related: You can write a full function literal and pass it to something that expects a spore. (Of course, only if the function literal satisfies the spore rules.)
    http://docs.scala-lang.org/sips/pending/spores.html
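The capture rules above can be illustrated without the spores library by writing a closure in the same shape by hand. This is only an analogy under stated assumptions: `Config`, `plainClosure`, and `sporeShaped` are hypothetical names, and nothing here is checked by the compiler the way a real `spore { ... }` block would be.

```scala
object SporeCaptureSketch {
  // Hypothetical mutable environment that a closure might capture.
  final class Config(var offset: Int)

  // Ordinary closure: reads config.offset every time it is called,
  // so later mutation of the environment changes its behaviour.
  def plainClosure(config: Config): Int => Int =
    (x: Int) => x + config.offset

  // Spore-shaped function, written by convention only:
  // a header of local vals followed by a closure that uses only those vals.
  def sporeShaped(config: Config): Int => Int = {
    val offset = config.offset // "header": initializer runs once, at creation
    (x: Int) => x + offset     // body: refers only to the header value and x
  }

  // Returns (result of the plain closure, result of the spore-shaped one).
  def demo(): (Int, Int) = {
    val config = new Config(10)
    val plain = plainClosure(config)
    val sporeLike = sporeShaped(config)
    config.offset = 100 // mutate the environment after both were created
    (plain(1), sporeLike(1))
  }
}
```

The plain closure sees the mutated value, while the spore-shaped one keeps the value fixed at creation time, which is the behaviour guarantees 2 and 3 describe.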
  26. The Function-Passing Model (illustrated)

    Spores. Benefits:
    - environment (captured variables) is declared explicitly, and fixed at spore creation time.
    - can statically ensure that everything captured is serializable.
  27. The Function-Passing Model (illustrated)

    Spores. Benefits:
    - environment (captured variables) is declared explicitly, and fixed at spore creation time.
    - can statically ensure that everything captured is serializable.
    Paper (ECOOP’14): Spores: A Type-Based Foundation for Closures in the Age of Concurrency and Distribution. Heather Miller, Philipp Haller¹, and Martin Odersky. EPFL and Typesafe, Inc.¹
    Abstract. Functional programming (FP) is regularly touted as the way forward for bringing parallel, concurrent, and distributed programming to the mainstream. The popularity of the rationale behind this viewpoint has even led to a number of object-oriented (OO) programming languages outside the Smalltalk tradition adopting functional features such as lambdas and thereby function closures. However, despite this established viewpoint of FP as an enabler, reliably distributing function closures over a network, or using them in concurrent environments nonetheless remains a challenge across FP and OO languages. This paper takes a step towards more principled distributed and concurrent programming by introducing a new closure-like abstraction and type system, called spores, that can guarantee closures to be serializable, thread-safe, or even have custom user-defined properties. Crucially, our system is based on the principle of encoding type information corresponding to captured variables in the type of a spore. We prove our type system sound, implement our approach for Scala, evaluate its practicality through a small empirical study, and show the power of these guarantees through a case analysis of real-world distributed and concurrent frameworks that this safe foundation for closures facilitates.
    Keywords: closures, functions, distributed programming, concurrent program…
    & A SIP: http://docs.scala-lang.org/sips/pending/spores.html
  28. The Function-Passing Model (illustrated)

    EXAMPLE: Distributed List with operations map and reduce. (This is what would be happening under the hood.) [Diagram: a SiloRef[List[Int]].]
  29. The Function-Passing Model (illustrated)

    EXAMPLE: Distributed List with operations map and reduce. [Diagram: a SiloRef[List[Int]]; (Spores).]
  30. The Function-Passing Model (illustrated)

    EXAMPLE: Distributed List with operations map and reduce. [Diagram: SiloRef[List[Int]], .apply.]
  31. The Function-Passing Model (illustrated)

    EXAMPLE: Distributed List with operations map and reduce. [Diagram: SiloRef[List[Int]], .apply, SiloRef[List[Int]].]
  32. The Function-Passing Model (illustrated)

    EXAMPLE: Distributed List with operations map and reduce. [Diagram: SiloRef[List[Int]], .apply, SiloRef[List[Int]], .apply.]
  33. The Function-Passing Model (illustrated)

    EXAMPLE: Distributed List with operations map and reduce. [Diagram: SiloRef[List[Int]], .apply with map f, SiloRef[List[Int]], .apply.]
  34. The Function-Passing Model (illustrated)

    EXAMPLE: Distributed List with operations map and reduce. [Diagram: SiloRef[List[Int]], .apply with map f, SiloRef[List[Int]], .apply with map (_*2), SiloRef[List[Int]], .apply with reduce (_+_), SiloRef[Int].]
  35. The Function-Passing Model (illustrated)

    EXAMPLE: Distributed List with operations map and reduce. [Diagram: as on slide 34, followed by .send().]
  36. The Function-Passing Model (illustrated)

    EXAMPLE: Distributed List with operations map and reduce. [Diagram: as on slide 35.]
  37. [Diagram: the chain from slide 35, with added labels Machine 1, Machine 2, Silo[List[Int]], List[Int].]
  38. [Diagram: as on slide 37.]
  39. [Diagram: as on slide 37, with an added λ.]
  40. [Diagram: as on slide 37, with added labels Silo[List[Int]] / List[Int] (twice more) and Silo[Int] / Int.]
  41. [Diagram: as on slide 40, with an added Int.]
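The map, map, reduce, send chain from the diagrams can be sketched as a small in-process simulation. Everything named here (`Node`, `ListRef`, `ResultRef`, `listRef`, `demo`) is hypothetical and stands in for the real SiloRef machinery; plain functions replace spores, and "Machine 2" is simulated with a `Future`. The sketch also shows one way lazy combinators can avoid intermediate lists, by fusing the two maps into a single per-element function.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object DistListSketch {
  // Simulated remote node ("Machine 2"): owns the stationary, immutable data.
  final class Node(data: List[Int]) {
    // Runs a shipped function next to the data; only the result travels back.
    def run[U](fn: List[Int] => U): Future[U] = Future(fn(data))
  }

  // Hypothetical handle on "Machine 1" to a list derived element by element.
  final class ListRef[A](node: Node, perElement: Int => A) {
    // LAZY: fuses the new function into the pending per-element function.
    def map[B](f: A => B): ListRef[B] =
      new ListRef[B](node, perElement.andThen(f))

    // LAZY: records the reduction as one function over the source list.
    // Note: reduce on an empty list would throw, as with standard collections.
    def reduce(op: (A, A) => A): ResultRef[A] =
      new ResultRef[A](node, xs => xs.iterator.map(perElement).reduce(op))
  }

  // Hypothetical handle to a single derived value.
  final class ResultRef[U](node: Node, fn: List[Int] => U) {
    // EAGER: ships the composed function to the node and returns a Future.
    def send(): Future[U] = node.run(fn)
  }

  def listRef(node: Node): ListRef[Int] = new ListRef[Int](node, (x: Int) => x)

  def demo(): Int = {
    val node = new Node(List(1, 2, 3))
    val f: Int => Int = _ + 1 // stands in for the "map f" step on the slides
    val total = listRef(node).map(f).map(_ * 2).reduce(_ + _)
    Await.result(total.send(), 5.seconds)
  }
}
```

In this sketch only the final `Int` comes back from the node, and the source list is traversed once, with no intermediate `List[Int]` built between the two maps.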
  42. Data in silos easily reconstructed:

    Silos and SiloRefs relate to each other by means of a persistent data structure. The persistent data structure is based on the chain of operations to derive the data of each silo. Thus, traversing the silo data structures yields the complete lineage of a silo. Since the lineage is composed of spores, it’s serializable. This means it can be persisted or transferred to other machines.
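The lineage idea can be sketched as a small persistent data structure: each node records either a source or a function applied to a parent, so replaying the chain rebuilds a silo's data. The names `Lineage`, `Source`, `Applied`, `rebuild`, and `depth` are assumptions for illustration, and plain functions stand in for spores, so this sketch does not demonstrate serialization.

```scala
object LineageSketch {
  // Hypothetical lineage of a silo: the chain of operations that derives its data.
  sealed trait Lineage[T] {
    def rebuild(): T // replay the chain to reconstruct the data
    def depth: Int   // number of operations applied since the source
  }

  // The origin of the data, e.g. loading it from storage.
  final case class Source[T](load: () => T) extends Lineage[T] {
    def rebuild(): T = load()
    def depth: Int = 0
  }

  // One recorded operation on top of a parent lineage. The parent is shared,
  // never mutated, which is what makes the structure persistent.
  final case class Applied[S, T](parent: Lineage[S], fn: S => T) extends Lineage[T] {
    def rebuild(): T = fn(parent.rebuild())
    def depth: Int = parent.depth + 1
  }

  // Returns (rebuilt value, number of replayed operations).
  def demo(): (Int, Int) = {
    val source: Lineage[List[Int]] = Source(() => List(1, 2, 3))
    val doubled: Lineage[List[Int]] = Applied(source, (xs: List[Int]) => xs.map(_ * 2))
    val total: Lineage[Int] = Applied(doubled, (xs: List[Int]) => xs.sum)

    // If the node holding the result were lost, the lineage alone is enough
    // to recompute it on another node.
    (total.rebuild(), total.depth)
  }
}
```

Because every step is a value, a fault-tolerance mechanism built on such a structure would only need to ship the lineage to a healthy node and call `rebuild()` there.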
  43. IN SUMMARY,

    All operations, including operations provided by system builders, are spores, so they are serializable! Data (Silos) is managed using a persistent data structure. Taken together: a lot simpler to build mechanisms for fault tolerance!
  44. Can’t I just… send spores within messages between actors/processes?

    Granted, that’s already quite powerful. :-) However, spores + silos additionally provide a lot of benefits that actors + spores alone do not provide out-of-the-box. Benefits:
    - deferred evaluation (laziness) enables optimizations to reduce intermediate results.
    - statelessness + lineages simplify the implementation of mechanisms for fault tolerance for certain applications (think dist. collections).