Building Distributed System with Akka

anildigital

April 17, 2018

Transcript

  1. Market forces leading to change
     Concurrent connections
     • “Internet of Things”, mobile devices
     Big data
     • Size of data is overwhelming our ability to manage it
     Response times
     • Real-time results (e.g. analytics) with sub-second latencies
  2. Physical factors leading to change
     • Expensive hardware → cheap hardware
     • A single machine → cluster of machines
     • A single core → multiple cores
     • Slow networks → fast networks
     • Small data snapshots → big data, streams of data
  3. “You have a distributed system when the crash of a computer you’ve never heard of stops you from getting any work done.”
     Leslie Lamport: A Guide to Building Dependable Distributed Systems
  4. “A collection of independent computers that appears to its users as one computer.”
     Tanenbaum and van Steen: Distributed Systems, Principles and Paradigms
  5. 8 Fallacies of Distributed Systems
     • The network is reliable.
     • Latency is zero.
     • Bandwidth is infinite.
     • The network is secure.
     • Topology doesn't change.
     • There is one administrator.
     • Transport cost is zero.
     • The network is homogeneous.
  6. Read Replication
     [Diagram: a Head Barista and Assistant Baristas, each holding copies of the same order, "Akira - Americano".]
  7. Sharding
     [Diagram: orders are split by customer name. Barista 1 covers "Akira to Chang" (A-C) and holds the "Akira - Americano" orders; Barista 2 covers "Jim to Lorenzo" (J-L) and holds the "Koichi - Cappuccino" orders.]
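The barista picture can be sketched as a range lookup: each shard owns a range of first letters, and an order is routed to whichever shard covers the customer's name. This is only an illustration in plain Scala; `RangeSharding`, `Shard` and `shardFor` are hypothetical names, and the two ranges are the ones shown on the slide.

```scala
object RangeSharding {
  // Hypothetical shard table: each barista owns an inclusive range of first letters.
  final case class Shard(name: String, from: Char, to: Char)

  // Ranges taken from the slide: A-C and J-L.
  val shards: List[Shard] = List(
    Shard("Barista 1", 'A', 'C'),
    Shard("Barista 2", 'J', 'L')
  )

  // Route an order to the shard whose range contains the customer's first letter.
  // Returns None when no shard covers that letter (or the name is empty).
  def shardFor(customer: String): Option[Shard] =
    customer.headOption.map(_.toUpper).flatMap { c =>
      shards.find(s => c >= s.from && c <= s.to)
    }
}
```

A name outside every range (for example one starting with "Z") finds no shard, which hints at the limited access patterns listed on the next slide.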
  8. Issues with Sharding
     • Limited data model
     • More complexity
     • Limited data access patterns
     • Only good for certain kinds of applications, e.g. SaaS apps
  9. Consistency formula: R + W > N
     • N = total number of replicas (e.g. 3)
     • W = number of replicas that must acknowledge an update
     • R = number of replicas that must agree on a read
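The formula can be checked with a few lines of Scala. When R + W > N, every read set overlaps every write set in at least one replica, so a read always reaches a replica that saw the latest acknowledged write. `Quorum.overlaps` is a hypothetical helper written for this illustration.

```scala
object Quorum {
  // n: total replicas, w: acknowledgements required per write,
  // r: replicas consulted per read.
  // Returns true when any read set must intersect any write set.
  def overlaps(n: Int, w: Int, r: Int): Boolean = {
    require(n > 0 && w >= 1 && w <= n && r >= 1 && r <= n, "w and r must be in 1..n")
    r + w > n
  }
}
```

With N = 3, the choice W = 2, R = 2 satisfies the formula, while W = 1, R = 1 does not: a read may then hit a replica that has not yet seen the update.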
  10. When to use consistent hashing?
      • Scale
      • Transactional data (business transactions, not ACID): data which changes a lot
      • Always available
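For readers who have not seen consistent hashing, here is a minimal sketch in plain Scala. Nodes are placed on a hash ring at several virtual positions, and a key belongs to the first node found clockwise from the key's own position. The names (`ConsistentHashing`, `Ring`, `nodeFor`), the use of MD5 and the default of 16 virtual nodes are all assumptions made for this example, not something the deck specifies.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets
import java.security.MessageDigest
import scala.collection.immutable.TreeMap

object ConsistentHashing {
  // Position on the ring: the first 4 bytes of the MD5 digest, read as an Int.
  private def position(key: String): Int = {
    val digest = MessageDigest.getInstance("MD5").digest(key.getBytes(StandardCharsets.UTF_8))
    ByteBuffer.wrap(digest).getInt()
  }

  // The ring maps positions to node names, kept sorted by position.
  final case class Ring(points: TreeMap[Int, String]) {
    // Place a node at several virtual positions to smooth the distribution.
    def add(node: String, virtualNodes: Int = 16): Ring =
      Ring((0 until virtualNodes).foldLeft(points) { (acc, i) =>
        acc.updated(position(s"$node#$i"), node)
      })

    // Drop every position owned by the node.
    def remove(node: String): Ring =
      Ring(points.filter { case (_, owner) => owner != node })

    // First node at or after the key's position, wrapping around to the start.
    def nodeFor(key: String): Option[String] =
      if (points.isEmpty) None
      else {
        val it = points.iteratorFrom(position(key))
        Some(if (it.hasNext) it.next()._2 else points.head._2)
      }
  }

  val empty: Ring = Ring(TreeMap.empty[Int, String])
}
```

The property that matters for scaling is that removing a node only moves the keys that node owned; keys on the other nodes stay where they were.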
  11. An ideal-world system has
      • Two paths
      • Components that can never fail
      • Accounting for every possible fault by providing a recovery action
  12. Java Concurrent Logs Processor
      [Diagram components: FileWatcher, LogFile, LogProcessor (several), Row, DbWriter, Database Connection.]
      • The database connection might break.
  13. Java Concurrent Logs Processor
      [Diagram: call stack on the FileWatcher thread: runnable.run(), logProcessor.process(file), dbWriter.write(row5), con.write(row5), which throws DBBrokenConnectionException.]
      • The exception moves up the stack on the thread.
      • We don’t have the connection details here to re-create dbWriter and retry.
      • dbWriter writes using the db connection.
      • The exception can happen from different threads: many log processors are called to process files from several threads.
  14. Difficult to recover
      • Exception moves up the stack
      • Requires making processed lines and connection info available
      • Breaks simple design
      • Violates best practices (encapsulation, DI, SRP)
  15. Actors
      • Higher-level abstraction for writing concurrent and distributed programs
      • Make state concurrently manageable
      • Communication with an actor is by sending messages
  16. Let it crash
      • Akka provides two separate flows:
        • Normal flow
        • Fault recovery flow
  17. Let it crash
      • The normal flow consists of actors that handle normal messages.
      • The recovery flow consists of actors that monitor the actors in the normal flow.
  18. Akka Concurrent Logs Processor
      [Diagram: LogProcessingSupervisor with FileWatcher, LogProcessor and DbWriter. DiskError: Stop; CorruptFileException: Resume; DbBrokenConnectionException: Restart; Escalate.]
      • Actors in the log-processing application do not concern themselves with fault recovery.
      • Supervisors can decide to escalate a problem to a higher level.
      • The LogProcessingSupervisor creates all the actors at startup and supervises them all.
  19. Akka Concurrent Logs Processor
      [Diagram: LogProcessingSupervisor, FileWatcher, LogProcessor, DbWriter. DiskError: Stop; CorruptFileException: Resume; DbBrokenConnectionException: Restart; DbNodeDownException: Stop.]
      • The LogProcessor also watches the dbWriter and replaces it once it is terminated due to a DbNodeDownException.
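The fault-to-action mapping on these two slides can be written down as a single decision function. The sketch below is plain Scala and models the four directives and the four exception types with hypothetical local definitions, so it runs without Akka on the classpath. In Akka itself the corresponding directives are `Resume`, `Restart`, `Stop` and `Escalate` from `akka.actor.SupervisorStrategy`, and a function like this one would be passed to a supervisor strategy such as `OneForOneStrategy`.

```scala
object SupervisionSketch {
  // Local stand-ins for Akka's supervision directives.
  sealed trait Directive
  case object Resume extends Directive
  case object Restart extends Directive
  case object Stop extends Directive
  case object Escalate extends Directive

  // Hypothetical exception types, named after the faults on the slides.
  final class DiskError(msg: String) extends Exception(msg)
  final class CorruptFileException(msg: String) extends Exception(msg)
  final class DbBrokenConnectionException(msg: String) extends Exception(msg)
  final class DbNodeDownException(msg: String) extends Exception(msg)

  // The recovery flow: map each fault to the action shown on the slides.
  // Anything unknown is escalated to the next supervisor up (an assumption).
  val decider: Throwable => Directive = {
    case _: DiskError                   => Stop
    case _: CorruptFileException        => Resume
    case _: DbBrokenConnectionException => Restart
    case _: DbNodeDownException         => Stop
    case _                              => Escalate
  }
}
```

Keeping this mapping in the supervisor is what lets the worker actors contain only normal-flow code.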
  20. Benefits of Let it Crash
      • Fault isolation
      • Structure
      • Redundancy
      • Replacement
      • Reboot
      • Component lifecycle
      • Suspend
      • Separation of concerns
  21. Akka Layers
      • Akka Core
      • Akka IO
      • Akka Remote (implements distribution)
      • Akka Cluster (mobility)
      • Akka Cluster Extensions (patterns: Singleton, Sharding, PubSub)
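  22. import akka.actor.{Actor, ActorSystem, Props}

      // An actor that prints every message it receives.
      class Worker extends Actor {
        def receive = {
          case x => println(x)
        }
      }

      val system = ActorSystem("ExampleActorSystem")
      val workerActorRef = system.actorOf(Props[Worker])
      // `!` (tell) sends the message asynchronously.
      workerActorRef ! "Hello World"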
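  23. // Local actor: created in this actor system.
      val workerActorRef = system.actorOf(Props[Worker])
      workerActorRef ! "Hello Conference"

      // Remote actor: looked up by its path in another actor system (from inside an actor).
      val remoteWorkerRef = context.actorSelection("akka.tcp://ExampleActorSystem@127.0.0.1:9005/user/WorkerActor")
      remoteWorkerRef ! "Hello World"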
  24. akka {
        actor {
          provider = "akka.remote.RemoteActorRefProvider"
        }
        remote {
          enabled-transports = ["akka.remote.netty.tcp"]
          netty.tcp {
            hostname = "127.0.0.1"
            port = 2552
          }
        }
      }
  25. Akka Cluster
      • Fault-tolerant membership service for Akka nodes, built on top of Akka Remoting
      • No single point of failure or bottleneck
      • Supports load balancing and failover
  26. Akka Cluster
      • Allows the number of nodes to grow or shrink dynamically
      • An actor can reside anywhere in the cluster, local or remote
  27. Akka Cluster
      [Diagram: a cluster of four nodes (node 1 to node 4), each showing "User" with actors a to e.]
      • This cluster is a ring of nodes.
      • Every node contains an actor system. The actor systems need to have the same name to be part of the same cluster.
      • A list of member nodes is maintained in the current cluster state.
      • The actor systems gossip to each other about this state.
  28. Akka Gossip Protocol
      • Decentralised, probabilistic, viral communication protocol with convergence
      • Each node holds the state of the cluster and tells its neighbours about it
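The idea of convergence can be shown with a toy simulation in plain Scala. Each node keeps a view of the cluster (member name to the highest version it has seen) and merges in the view of a neighbour. The names `GossipSketch`, `View`, `merge` and `round` are hypothetical, and the fixed ring of neighbours is a simplification chosen so the result is deterministic; real gossip protocols typically pick peers at random.

```scala
object GossipSketch {
  // A node's view of the cluster: member name -> highest version seen so far.
  type View = Map[String, Long]

  // Merging two views keeps the newest version known for every member.
  def merge(a: View, b: View): View =
    (a.keySet ++ b.keySet).map { member =>
      member -> math.max(a.getOrElse(member, 0L), b.getOrElse(member, 0L))
    }.toMap

  // One gossip round: every node receives the view of its predecessor
  // in a fixed ring and merges it into its own view.
  def round(views: Vector[View]): Vector[View] =
    views.indices.map { i =>
      val neighbour = views((i - 1 + views.size) % views.size)
      merge(views(i), neighbour)
    }.toVector
}
```

Starting from four nodes that each know only themselves, three rounds are enough in this simplified ring for every node to hold the same complete view.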
  29. Minimal setup cluster: 3 seeds, 2 masters, 3 workers
      [Diagram: Cluster with seed nodes (1, 2, 3), master nodes (4, 5) and worker nodes (6, 7, 8).]
  30. [Diagram: Cluster with seed nodes (1, 2). Node 3 (seed role) is joining.]
  31. [Diagram: Cluster with seed nodes (1, 2, 3). Joining nodes (4, 5), each configured with seed list (1, 2, 3).]
      • Node 2 responds fastest and handles the join of node 4 (worker role).
      • Node 3 responds fastest and handles the join of node 5 (master role).
  32. [Diagram: Cluster with seed nodes (2), master nodes (5) and worker nodes (4). Joining nodes (6, 7), both in the worker role with seed list (1, 2, 3). Seed nodes 1 and 3 leave.]
  33. Minimal setup cluster: 3 seeds, 2 masters, 3 workers
      [Diagram: Cluster with seed nodes (1, 2, 3), master nodes (4, 5) and worker nodes (6, 7, 8).]
  34. [Diagram: Cluster, leader: node 1. Node 1: Leaving; node 2: Up; node 3: Up. Seed node 1 leaves.]
  35. [Diagram: Cluster, leader: node 1. Node 1: Exiting, unreachable (the cluster node is shut down); node 2: Up; node 3: Up.]
  36. [Diagram: Cluster, leader: node 2. Node 1: Removed; node 2: Up; node 3: Up. Remaining seed nodes: 2 and 3.]
  37. [State diagram of cluster member states. States: Joining, Up, Leaving, Exiting, Unreachable, Down, Removed. Transition labels: Join, leader action, Down. Key: initial state, final state, state in transition.]
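The member lifecycle can be sketched as a small state machine in plain Scala. The transition table below is one reading of the diagram together with the leave sequence on the preceding slides (Leaving, then Exiting, then Removed); it is an interpretation, not Akka's implementation, and `MemberLifecycle`, `next` and `isValidPath` are hypothetical names. Recovery from Unreachable back to a reachable state is left out to keep the sketch short.

```scala
object MemberLifecycle {
  sealed trait Status
  case object Joining extends Status
  case object Up extends Status
  case object Leaving extends Status
  case object Exiting extends Status
  case object Unreachable extends Status
  case object Down extends Status
  case object Removed extends Status

  // Allowed next states for each state (an interpretation of the slide).
  def next(status: Status): Set[Status] = status match {
    case Joining     => Set(Up, Unreachable)      // leader action moves Joining to Up
    case Up          => Set(Leaving, Unreachable) // a node asks to leave
    case Leaving     => Set(Exiting, Unreachable) // leader action
    case Exiting     => Set(Removed, Unreachable) // leader action
    case Unreachable => Set(Down)                 // the node is marked down
    case Down        => Set(Removed)              // leader action
    case Removed     => Set.empty                 // final state
  }

  // A sequence of states is valid when every step is an allowed transition.
  def isValidPath(path: List[Status]): Boolean =
    path.zip(path.drop(1)).forall { case (from, to) => next(from).contains(to) }
}
```

The graceful path is Joining, Up, Leaving, Exiting, Removed; the failure path goes through Unreachable and Down before Removed.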
  38. Cluster Sharding
      • Distributes actors across several nodes
      • Interact with them by logical identifier (without knowing their physical location)
  39. Akka Cluster Sharding
      [Diagram labels: Node, Shard Region, Shard, entities (E), Shard Coordinator.]
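The two-step lookup behind "logical identifier without physical location" can be illustrated in plain Scala: an entity id is hashed to a shard id, and a shard is assigned to a node. In this sketch `ShardingSketch`, the shard count of 10 and the round-robin shard placement are assumptions made for illustration; in Akka the placement of shards on nodes is decided by the Shard Coordinator.

```scala
object ShardingSketch {
  // Assumed number of shards for this illustration.
  val numberOfShards: Int = 10

  // Step 1: entity id -> shard id, by hashing the logical identifier.
  def shardId(entityId: String): Int =
    math.abs(entityId.hashCode % numberOfShards)

  // Step 2: shard id -> node. A stand-in for the coordinator's decision:
  // shards are spread round-robin over the given (non-empty) node list.
  def nodeFor(shard: Int, nodes: Vector[String]): String = {
    require(nodes.nonEmpty, "at least one node is needed")
    nodes(shard % nodes.size)
  }

  // What a sender does: it only supplies the entity id.
  def locate(entityId: String, nodes: Vector[String]): String =
    nodeFor(shardId(entityId), nodes)
}
```

The sender never names a node; the same entity id always resolves to the same shard, and the shard can be moved between nodes without the sender changing.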
  40. Akka Persistence
      • Enables stateful actors to persist their internal state
      • State can be recovered when the actor is started or restarted
  41. Akka Persistence
      • Implemented using Event Sourcing
      • Only changes to an actor’s internal state are stored
      • The current state is never stored directly
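Event sourcing can be shown without Akka in a few lines of Scala. Only events are appended to a journal, and the current state is rebuilt by replaying them from the beginning. The account domain and the names `EventSourcingSketch`, `Journal` and `recover` are hypothetical and chosen only for this illustration; the in-memory journal stands in for a real persistent store.

```scala
object EventSourcingSketch {
  // Events describe changes to the state, never the state itself.
  sealed trait Event
  final case class Deposited(amount: Long) extends Event
  final case class Withdrawn(amount: Long) extends Event

  // The state, plus a pure function that applies one event to it.
  final case class Account(balance: Long) {
    def applyEvent(event: Event): Account = event match {
      case Deposited(amount) => copy(balance = balance + amount)
      case Withdrawn(amount) => copy(balance = balance - amount)
    }
  }

  // An append-only, in-memory journal.
  final class Journal {
    private var events = Vector.empty[Event]
    def persist(event: Event): Unit = events :+= event
    def replay: Vector[Event] = events
  }

  // Recovery: start from the empty state and replay every stored event in order.
  def recover(journal: Journal): Account =
    journal.replay.foldLeft(Account(0L))((account, event) => account.applyEvent(event))
}
```

After a restart the actor holds no state of its own; calling `recover` on the journal yields the same balance it had before.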
  42. Distributed Data
      • Useful for sharing data between nodes in an Akka Cluster
      • Based on Conflict-Free Replicated Data Types (CRDTs)
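A grow-only counter (G-Counter) is one of the simplest CRDTs and shows why replicas can be merged without conflicts. The sketch below is plain Scala with hypothetical names. Akka Distributed Data ships its own `GCounter` type, and this is not that implementation.

```scala
object GCounterSketch {
  // Each node increments only its own slot; the value is the sum of all slots.
  final case class GCounter(counts: Map[String, Long] = Map.empty) {
    def increment(node: String, by: Long = 1L): GCounter = {
      require(by >= 0, "a G-Counter can only grow")
      GCounter(counts.updated(node, counts.getOrElse(node, 0L) + by))
    }

    def value: Long = counts.values.sum

    // Merge takes the per-node maximum, which makes it commutative,
    // associative and idempotent: replicas converge in any merge order.
    def merge(other: GCounter): GCounter =
      GCounter((counts.keySet ++ other.counts.keySet).map { node =>
        node -> math.max(counts.getOrElse(node, 0L), other.counts.getOrElse(node, 0L))
      }.toMap)
  }
}
```

Two replicas that were updated independently can be merged in either order, or merged repeatedly, and still agree on the total.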
  43. References
      • Akka in Action - Raymond Roestenburg
      • https://speakerd.s3.amazonaws.com/presentations/9a6b0f62b9ee4dc8980e5ff590f1a6cf/sheridan_college.pdf
      • Distributed Systems in One Lesson - Tim Berglund, http://shop.oreilly.com/product/0636920039518.do
      • Learning Akka - Salma Khater
      • Hands on Introduction to Distributed Systems Concepts with Akka Clustering - David Russell
      • A tour of the (advanced) Akka features in 60 minutes - Johan Janssen
      • Go distributed (and scale out) with Actors and Akka Clustering