Kafka & Event-driven Architecture

Vinay Kumar
October 19, 2020

My talk at APACOUG (Asia Pacific Oracle User Group) 20.

Event-driven architecture in APIs and microservices is an important topic if you are developing modern applications on new technologies and platforms. This session explains what Kafka is, how it works, and how it can be used in an event-driven architecture, covering the basic concepts of publishers, subscribers, streams, and Connect. It covers developing functions in different programming languages and shows how they can share messages through Kafka. What options do we have in the Oracle stack, and which tools make event-driven architecture possible there? The speaker also explains Oracle Event Hub, OCI Streaming, and the Oracle AQ implementation.

Transcript

  1. About me

    • Oracle ACE
    • Global Integration Architect
    • Author of “Beginning Oracle WebCenter Portal 12c”
    • Blog: http://www.techartifact.com/blogs
    • https://www.linkedin.com/in/vinaykumar2/
  2. Agenda

    • Interaction style
    • Traditional SOA approach
    • Event-driven architecture
    • Events with microservices
    • Event streaming with Event Hub
    • Integration with legacy
    • Kafka
    • Kafka internals
    • Async APIs
  3. Communication Styles

    Type of interaction | Initiator | Participants
    Time-driven         | Time      | The specified system
    Request-driven      | Client    | Client & server
    Event-driven        | Event     | Open-ended
  4. Events

    • Anything that happened (or didn't happen).
    • A change in the state.
    • An event is always named in the past tense and is immutable.
    • A condition that triggers a notification.
    Examples: CustomerAddressChanged, InventoryUpdated, SalesOrderCreated, PurchaseOrderCreated
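
The two properties on this slide (past-tense naming and immutability) can be sketched with a frozen dataclass. This is only an illustration: the field names and values below are assumptions, not part of the talk.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen=True makes instances immutable
class CustomerAddressChanged:
    """Past-tense name: the event records something that already happened."""
    customer_id: str       # hypothetical field
    new_address: str       # hypothetical field
    occurred_at: datetime  # hypothetical field


event = CustomerAddressChanged(
    customer_id="c-42",
    new_address="1 Example Street",
    occurred_at=datetime(2020, 10, 19, tzinfo=timezone.utc),
)

try:
    event.new_address = "2 Other Street"  # attempting to rewrite history
    mutated = True
except FrozenInstanceError:
    mutated = False  # the assignment is rejected
```

A correction would be modelled as a new event (for example a second CustomerAddressChanged), never as an edit to an existing one.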
  5. Characteristics of Events

    • “Real-time” events as they happen at the producer
    • Push notifications
    • One-way “fire-and-forget”
    • Immediate action at the consumers
    • Informational (“someone logged in”), not commands (“audit this”)
  6. Typical EDA Architecture

    [Diagram: systems acting as event producers publish to an event bus (the event transport), which delivers to systems acting as event consumers.]
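
The producer / transport / consumer split can be sketched as a toy in-process event bus. This is not Kafka and not any particular product; the class and handler names are hypothetical, and delivery here is synchronous for simplicity.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class EventBus:
    """Toy in-process event transport: producers publish, consumers subscribe."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        # Fire-and-forget: the producer does not know who (if anyone) is listening.
        for handler in self._subscribers[event_type]:
            handler(payload)


bus = EventBus()
inventory_seen: List[dict] = []
reporting_seen: List[dict] = []

# Two independent consumers of the same event type.
bus.subscribe("SalesOrderCreated", inventory_seen.append)
bus.subscribe("SalesOrderCreated", reporting_seen.append)

bus.publish("SalesOrderCreated", {"order_id": 1})
bus.publish("InventoryUpdated", {"sku": "A"})  # no subscribers: nothing happens
```

Adding a third consumer needs no change to the producer, which is the loose coupling the architecture is after.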
  7. Benefits of EDA

    • Supports the business demands for better service (no batch, less waiting)
    • No point-to-point integrations (fire & forget)
    • Fault tolerance, scalability, versatility, and other benefits of loose coupling
    • Powerful real-time response and analytics
    • Greater operational efficiencies
  8. Monolithic

    [Diagram: Shop, Customer, Inventory, and Payment in one application. Each might be an individual bounded context in the new world.]
  9. Traditional Architecture (Point to Point)

    [Diagram: Shop, Customer, Marketing, Inventory, Payment, and Reporting connected point to point.]
  10. Traditional Architecture: ESB

    [Diagram: Shop, Customer, Marketing, Inventory, Payment, and Reporting connected through an Enterprise Service Bus.]
  11. Traditional Architecture: ESB

    Let's add some fraud check and a new version.
    [Diagram: the same systems on the Enterprise Service Bus, now with V1 and V2 versions and fraud components.]
  12. Let's add loyalty features

    [Diagram: the same systems, versions, and fraud components on the Enterprise Service Bus, plus a Loyalty component.]
  13. When we scale up and see the integration ripple effect

    [Diagram: the same landscape on the Enterprise Service Bus, with the fraud and Loyalty components repeated.]
  14. What's the problem?

    • SOA is all about dividing domain logic into separate systems and exposing it as services.
    • In some cases business logic will, by its very nature, be spread out over many systems or across various domains (cross-domain).
    • The result is domain pollution and bloat in the SOA systems.
  15. SOA and Domain events

    • Domain-driven design promotes exposing business logic as a service, with the focus on the domain and domain logic.
    • A domain event is something that happened that a domain expert cares about.
    • By exposing relevant domain events on a shared event bus, we can isolate cross-cutting functions into separate systems.
  16. Event Driven (Async) in Microservices

    [Diagram: Shop, Customer, and Payment microservices, each with its own API, logic, and DB, connected through an Event Hub and labelled as producers and consumers.]
  17. Event Streaming

    [Diagram: the same Shop, Customer, and Payment microservices around the Event Hub, plus event streaming data sources: mobile apps, social media, stocks, blockchain, location, IoT.]
  18. Microservice events and Streaming processing

    [Diagram: event sources (mobile apps, social media, stocks, blockchain, location, IoT) and a microservices cluster (API, services, state) send event streams to the Event Hub. A stream processing cluster (stream analytics, reference model, SQL) consumes the streams and delivers results to dashboards, BI tools, search/discover, and mobile & online apps.]
  19. Domain event and event sourcing

    • Domain event: in domain-driven design, domain events are described as something that happens in the domain and is important to domain experts.
      - A user has registered.
      - An order has been cancelled.
      - The payment has been received.
      Domain events are relevant both within a bounded context and across bounded contexts for implementing processes within the domain. They are best for communication between bounded contexts.
    • Event Sourcing: ensures that all changes to application state are stored as a sequence of events. It stores the events that lead to a specific state, and the state too.
      - MobileNumberProvided (MobileNumber)
      - VerificationCodeGenerated (VerificationCode)
      - MobileNumberValidated (no additional state)
      - UserDetailsProvided (FullName, Address, …)
      These events are sufficient to reconstruct the current state of the UserRegistration aggregate at any time. Event Sourcing is a persistence strategy, makes it easier to fix inconsistencies, and is local to a domain.
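
The claim that the stored events are sufficient to rebuild the UserRegistration aggregate can be sketched as a replay function. The event names follow the slide; the tuple representation, field names, and sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# An event is modelled as (name, data); the names follow the slide.
Event = Tuple[str, dict]


@dataclass
class UserRegistration:
    mobile_number: Optional[str] = None
    verification_code: Optional[str] = None
    mobile_validated: bool = False
    details: dict = field(default_factory=dict)

    def apply(self, event: Event) -> None:
        name, data = event
        if name == "MobileNumberProvided":
            self.mobile_number = data["mobile_number"]
        elif name == "VerificationCodeGenerated":
            self.verification_code = data["verification_code"]
        elif name == "MobileNumberValidated":
            self.mobile_validated = True  # carries no additional state
        elif name == "UserDetailsProvided":
            self.details = dict(data)
        else:
            raise ValueError(f"unknown event: {name}")


def replay(events: List[Event]) -> UserRegistration:
    """Rebuild state by applying every stored event in order."""
    state = UserRegistration()
    for event in events:
        state.apply(event)
    return state


event_store: List[Event] = [
    ("MobileNumberProvided", {"mobile_number": "+10000000000"}),
    ("VerificationCodeGenerated", {"verification_code": "1234"}),
    ("MobileNumberValidated", {}),
    ("UserDetailsProvided", {"full_name": "Jane Doe", "address": "1 Example Street"}),
]

current = replay(event_store)
halfway = replay(event_store[:2])  # state as of an earlier point in the history
```

Replaying a prefix of the event store gives the state at that earlier point, which is what makes inconsistencies easier to trace and fix.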
  20. Kafka Overview

    • Distributed publish-subscribe messaging system.
    • Designed for processing real-time activity stream data (logs, metrics, collections, social media streams, …).
    • Does not use the JMS API and standards.
    • Kafka maintains feeds of messages in topics.
    • Initially developed at LinkedIn, now an Apache project.
  21. Benefits of Kafka

    • Reliability. Kafka is distributed, partitioned, replicated, and fault tolerant. Kafka replicates data and is able to support multiple subscribers. Additionally, it automatically rebalances consumers in the event of failure.
    • Scalability. Kafka is a distributed system that scales quickly and easily without incurring any downtime.
    • Durability. Kafka uses a distributed commit log: messages are persisted to disk as fast as possible and replicated within the cluster, hence it is durable.
    • Performance. Kafka has high throughput for both publishing and subscribing to messages. It maintains stable performance even when dealing with many terabytes of stored messages.
  22. What is Kafka

    • Kafka is a messaging system that is designed to be fast, scalable, and durable.
    • A producer is an entity/application that publishes data to a Kafka cluster, which is made up of brokers.
    • A broker is responsible for receiving and storing the data when a producer publishes.
    • A consumer then consumes data from a broker at a specified offset, i.e. position.
    • A topic is a category/feed name under which records are stored and published. Topics have partitions, and order is guaranteed per partition.
    • All Kafka records are organized into topics. Producer applications write data to topics and consumer applications read from topics.
  23. Key Concepts of Kafka

    • A topic is divided into partitions.
    • Message order is guaranteed only inside a partition.
    • Consumer offsets are persisted by Kafka with a commit/auto-commit mechanism.
    • Consumers subscribe to topics.
    • Consumers with different group IDs each receive all messages of the topics they subscribe to. They consume the messages at their own speed.
    • Consumers sharing the same group ID are each assigned one (or several) partitions of the topics they subscribe to, and only receive messages from their partitions. So a constraint appears here: the number of partitions in a topic gives the maximum number of parallel consumers in a group.
    • The assignment of partitions to consumers can be automatic and performed by Kafka (through ZooKeeper in the older consumer; through a broker acting as group coordinator in current clients). If a consumer stops polling or is too slow, a process called “rebalancing” is performed and its partitions are reassigned to other consumers.
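
The partition-count constraint and the effect of a rebalance can be sketched with a simple round-robin assignment. This is an illustrative stand-in, not the code of Kafka's actual assignors; the function and consumer names are hypothetical.

```python
from typing import Dict, List


def assign_partitions(partitions: List[int], consumers: List[str]) -> Dict[str, List[int]]:
    """Spread partitions over the consumers of one group, round-robin."""
    assignment: Dict[str, List[int]] = {c: [] for c in consumers}
    if not consumers:
        return assignment
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment


partitions = [0, 1, 2]

# Two consumers in the group: one of them handles two partitions.
two = assign_partitions(partitions, ["c1", "c2"])

# Four consumers but only three partitions: the fourth consumer stays idle.
four = assign_partitions(partitions, ["c1", "c2", "c3", "c4"])

# "Rebalance": c2 stopped polling, so all partitions go to the remaining consumer.
after_failure = assign_partitions(partitions, ["c1"])
```

With three partitions, adding a fourth consumer to the group adds no parallelism, which is why the partition count is worth choosing with the expected consumer count in mind.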
  24. Key Concepts of Kafka - continued

    • Kafka normally divides a topic into multiple partitions.
    • Each partition is an ordered, immutable sequence of messages that is continually appended to.
    • A message in a partition is identified by a sequence number called the offset.
    • FIFO order is guaranteed only inside a partition.
    • When a topic is created, the number of partitions should be given.
    • The producer can choose which partition will get the message, or let Kafka decide based on a hash of the message key (recommended). So the message key is important and is used to ensure the message order.
    • Moreover, as a consumer is assigned one or several partitions, the key also “groups” messages to the same consumer.
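
Key-based partition selection can be sketched as "hash the key, take it modulo the partition count". Kafka's default partitioner uses a murmur2 hash of the key bytes; the CRC32 used below is only a stand-in chosen because it ships with the Python standard library, and the keys and values are made up.

```python
import zlib
from collections import defaultdict
from typing import DefaultDict, List, Tuple

NUM_PARTITIONS = 4  # fixed when the topic is created


def choose_partition(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # CRC32 is a stand-in hash; it is stable across runs, unlike Python's hash().
    return zlib.crc32(key.encode("utf-8")) % num_partitions


log: DefaultDict[int, List[Tuple[str, str]]] = defaultdict(list)

messages = [
    ("customer-1", "OrderCreated"),
    ("customer-2", "OrderCreated"),
    ("customer-1", "OrderPaid"),
    ("customer-1", "OrderShipped"),
]

for key, value in messages:
    log[choose_partition(key)].append((key, value))

# All events for customer-1 sit in one partition, in the order they were sent.
customer_1_partition = choose_partition("customer-1")
customer_1_events = [v for k, v in log[customer_1_partition] if k == "customer-1"]
```

Because the same key always maps to the same partition, choosing the customer ID as key keeps each customer's events in order, while events of different customers may be processed in parallel.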
  25. Log Anatomy

    • A data source writes messages to the log, and one or more consumers read from the log at the point in time they choose.
    • In the diagram, a data source is writing to the log and consumers A and B are reading from the log at different offsets.
  26. Record flow in Apache Kafka

    • We have a broker with three topics, where each topic has 8 partitions.
    • The producer sends a record to partition 1 in topic 1, and since the partition is empty the record ends up at offset 0.
  27. Record flow in Apache Kafka - continued

    • The next record added to partition 1 will end up at offset 1, the next record at offset 2, and so on.
    • This is a commit log: each record is appended to the log and there is no way to change the existing records in the log (immutable). This is also the same offset that the consumer uses to specify where to start reading.
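
The offset behaviour described on the last three slides can be sketched with an append-only list standing in for one partition. The class and method names are hypothetical; real Kafka persists the log on disk and exposes it through its client APIs.

```python
from typing import List


class PartitionLog:
    """Append-only log for a single partition."""

    def __init__(self) -> None:
        self._records: List[str] = []

    def append(self, record: str) -> int:
        """Append a record and return the offset it was written at."""
        self._records.append(record)
        return len(self._records) - 1

    def read_from(self, offset: int) -> List[str]:
        """Return every record at or after the given offset."""
        return list(self._records[offset:])


partition = PartitionLog()
offsets = [partition.append(r) for r in ["a", "b", "c"]]

# Consumers A and B track their own positions independently.
consumer_a = partition.read_from(0)
consumer_b = partition.read_from(2)
```

There is deliberately no update or delete method: existing records are never changed, and a consumer only needs to remember an offset to resume reading.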
  28. Apache Kafka Architecture

    [Diagram: Apache Kafka architecture.]
  29. Kafka - Partitions and Brokers

    • Each broker holds a number of partitions, and each of these partitions can be either a leader or a replica for a topic.
    • All writes and reads for a partition go through its leader, and the leader coordinates updating replicas with new data. If a leader fails, a replica takes over as the new leader.
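
Leader writes and failover can be sketched with a toy model of one partition. It is heavily simplified and rests on assumptions: replication here is synchronous and the first replica is always promoted, whereas real Kafka followers fetch from the leader and a new leader is elected from the in-sync replicas. Broker names are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Partition:
    leader: str
    replicas: List[str]  # follower brokers holding a copy
    data: Dict[str, List[str]] = field(default_factory=dict)

    def __post_init__(self) -> None:
        for broker in [self.leader, *self.replicas]:
            self.data.setdefault(broker, [])

    def write(self, record: str) -> None:
        # The write goes to the leader, which then updates every replica.
        self.data[self.leader].append(record)
        for broker in self.replicas:
            self.data[broker].append(record)

    def fail_leader(self) -> None:
        # Drop the failed leader and promote the first remaining replica.
        del self.data[self.leader]
        self.leader = self.replicas.pop(0)


p0 = Partition(leader="broker-1", replicas=["broker-2", "broker-3"])
p0.write("r0")
p0.fail_leader()  # broker-1 goes away
p0.write("r1")    # writes continue against the new leader
```

The record written before the failure is still available afterwards because a replica already held a copy of it.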
  30. Kafka - Producers writing to broker

    • Producers write to a single leader per partition. This load-balances production, since each write can be serviced by a separate broker and machine.
    • In the image, the producer is writing to partition 0 of the topic, and partition 0 replicates that write to the available replicas.
  31. Kafka Architecture - Topic Replication factor

    [Diagram: topic replication factor.]
  32. Flow of a record in Kafka

    • A producer is a process that can publish a message to a topic.
    • A consumer is a process that can subscribe to one or more topics and consume messages published to those topics.
    • A topic is the category or feed name to which messages are published.
    • A broker is a process running on a single machine.
    • A cluster is a group of brokers working together.
    • Broker management is done by ZooKeeper.
  33. What are the capabilities of an event hub?

    • Auto-scalable infrastructure
    • Multi-language support (SDKs)
    • Event streaming database
    • GUI-driven management and monitoring
    • Enterprise-grade security
    • Diagnostic logs
    • Data monitor
    • Global resilience
    • Disaster recovery
    • Connectors for legacy applications
    • Retention management
    • Flexible DevOps
    • …
  34. Alternatives to Event Hub or Kafka?

    • Traditional message broker (not really event driven)
  35. What about legacy apps?

    [Diagram: an existing app with its RDBMS feeds the Event Hub through change data capture; a new app connects to the Event Hub.]
  36. Change data capture tools

    • Attunity Replicate
    • Debezium (open source)
    • IBM IIDR
    • Oracle GoldenGate for Big Data
    • SQ Data
  37. Microservice events and Streaming processing and legacy application

    [Diagram: the streaming landscape from the earlier "Microservice events and Streaming processing" slide (event sources, microservices cluster, Event Hub, stream processing cluster, and result consumers), extended with legacy systems (Finance, Sales, Audit) whose state reaches the Event Hub through change data capture, and with Big Data.]
  38. Integrate Kafka with legacy

    • JDBC connector for Kafka Connect
    • Use a CDC (change data capture) tool that integrates with Kafka Connect.
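
For the JDBC option, a connector is usually registered by sending a JSON document with `name` and `config` to the Kafka Connect REST API. The sketch below only builds that document. The property names are those of Confluent's JDBC source connector as far as I know them and should be checked against the documentation for the version in use; the connector name, connection URL, table, and column are placeholders invented for this example.

```python
import json

connector = {
    "name": "legacy-orders-source",  # hypothetical connector name
    "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
        "connection.url": "jdbc:oracle:thin:@//db.example.com:1521/ORCL",  # placeholder
        "table.whitelist": "ORDERS",  # hypothetical table
        "mode": "incrementing",  # poll for rows with a higher ID than last seen
        "incrementing.column.name": "ORDER_ID",  # hypothetical column
        "topic.prefix": "legacy-",  # rows would land in topic "legacy-ORDERS"
        "tasks.max": "1",
    },
}

# JSON body that would be POSTed to the Connect worker's /connectors endpoint.
payload = json.dumps(connector, indent=2)
```

Polling by an incrementing column only sees inserts; capturing updates and deletes is where a log-based CDC tool is the better fit.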
  39. Kafka Connect

    • Kafka Connect is a tool for scalably and reliably streaming data between Apache Kafka and other data systems. It runs separately from the Kafka brokers.
  40. Async API - https://www.asyncapi.com/

    • How do we manage the event lifecycle?
    • We have API management platforms for APIs.
    Event lifecycle:
      - Design
      - Documentation
      - Code generation
      - Event management
      - Test
      - Monitoring
    An AsyncAPI document is a file that defines and annotates the different components of a specific event-driven API.
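
To make "an AsyncAPI document" concrete, here is a minimal sketch built as a Python dictionary and serialised to JSON. It assumes the AsyncAPI 2.0.0 layout, in which `asyncapi`, `info`, and `channels` are the required top-level fields; the channel name, message name, and payload fields are invented for illustration.

```python
import json

asyncapi_doc = {
    "asyncapi": "2.0.0",
    "info": {"title": "Customer events", "version": "1.0.0"},
    "channels": {
        "customer.address.changed": {  # hypothetical Kafka topic name
            # In AsyncAPI 2.x, "subscribe" describes messages that
            # clients can receive from this channel.
            "subscribe": {
                "message": {
                    "name": "CustomerAddressChanged",
                    "payload": {
                        "type": "object",
                        "properties": {
                            "customerId": {"type": "string"},
                            "newAddress": {"type": "string"},
                        },
                    },
                }
            }
        }
    },
}

# Very small sanity check, not a substitute for a real AsyncAPI validator.
REQUIRED_TOP_LEVEL = ("asyncapi", "info", "channels")
missing = [k for k in REQUIRED_TOP_LEVEL if k not in asyncapi_doc]

document_text = json.dumps(asyncapi_doc, indent=2)
```

A document like this is the input for the documentation and code generation steps of the event lifecycle listed on the slide.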
  41. OpenAPI - AsyncAPI comparison

    [Diagram: OpenAPI and AsyncAPI compared.]
  42. Summary

    • Make the split right: bounded contexts.
    • Events are the communication between bounded contexts.
    • Events can provide async communication between microservices.
    • Kafka is a great choice for messaging.
    • The event hub is key in the new enterprise integration world.
    • Use CDC for legacy integration.
    • Try AsyncAPI for event documentation.
  43. References

    • https://www.slideshare.net/gschmutz/building-event-driven-microservices-with-apache-kafka-208145957
    • https://www.slideshare.net/jeppec/soa-and-event-driven-architecture-soa-20?qid=604f3115-642b-48d4-b7ef-66ce11ab9b0b&v=&b=&from_search=65
    • https://docs.confluent.io/current/connect/index.html
    • https://data-flair.training/blogs/kafka-architecture/
    • https://martinfowler.com/bliki/BoundedContext.html
    • https://insidebigdata.com/2018/04/12/developing-deeper-understanding-apache-kafka-architecture/
    • https://www.asyncapi.com/