
Kim Kao
October 20, 2024

Build Real-Time Next Generation AI Applications with Kafka and Flink by Jeffrey Lam

Speaker : Jeffrey Lam @Confluent
Bio : Jeffrey Lam is a Staff Solutions Engineer at Confluent. He is a big data specialist with over 20 years of experience in pre-sales and enterprise architecture, having worked for global technology leaders such as Confluent, Splunk, ServiceNow, IBM, and Oracle.

In this event, we will explore how to build next-generation real-time AI applications using Kafka and Flink. Jeffrey Lam will share his expertise in this field and discuss how continuously enriched, trustworthy data streams let you quickly build, deploy, secure, and scale real-time AI applications as artificial intelligence adoption accelerates.

Participants will gain a deep understanding of how to use Kafka and Flink as a foundation for processing real-time and batch data. You will learn how these technologies can become a core competitive advantage for companies, and why they have been widely adopted in both commercial and open-source communities. You will also gain valuable insights into building scalable and secure real-time AI applications.


Transcript

  1. Building Real-time Next Generation AI Applications with Kafka and Flink

Jeffrey Lam, Staff Solutions Engineer, Confluent Inc.
  2. “Our latest research estimates that generative AI could add the

    equivalent of $2.6 trillion to $4.4 trillion annually across the 63 use cases we analyzed.” Source: Economic Potential of Generative AI, McKinsey
  3. Generative AI is a revolutionary tool… …and it’s only getting

    better. (Comparison images: July 2022 vs. July 2023.) Source: https://twitter.com/nickfloats/status/1676279157620199424?s=46&t=plcKoQYXnokFvxs3ieVg3Q
  4. Vectors

    man → [0.243, 0.765, …]   woman → [0.293, 0.774, …]
    Similar vectors plotted in space are placed near one another.
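    To make that concrete, here is a minimal sketch (toy vectors, not real model embeddings) of measuring closeness with cosine similarity:

        import numpy as np

        def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
            """Cosine of the angle between two vectors (1.0 = same direction)."""
            return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

        # Toy 3-dimensional "embeddings" echoing the slide's example.
        man = np.array([0.243, 0.765, 0.110])
        woman = np.array([0.293, 0.774, 0.105])
        car = np.array([0.910, 0.120, 0.651])

        print(cosine_similarity(man, woman))  # close to 1.0 -> near one another
        print(cosine_similarity(man, car))    # noticeably smaller -> less similar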
  5. Generative AI: the hottest topic in tech… …but what makes

    it different?
    - AI models that generate content (e.g., text, pictures) rather than make predictions.
    - Uses Foundation Models (e.g., LLMs) that are prohibitively expensive ($100M+) to train, on the order of 175B parameters.
    - Models are trained on public data that is a year or more old.
    - However, models are inherently reusable.
  6. Without contextualized, trusted, current data, LLMs can’t drive meaningful value

    Source: https://www.wired.com/story/air-canada-chatbot-refund-policy/
  7. How to create next-generation AI applications? Whatever your

    AI use case, the recency, quality, trustworthiness and instant applicability of data are as important as the models themselves. Company data, trusted data, real-time data.
  8. Without context, trustworthiness or real-time data applicability, LLMs can’t drive

    meaningful value
    User: What is the status of my flight to New York?
    Assistant: It is currently delayed by 2 hours and expected to depart at 5 pm GMT.
    User: Is there another flight available to the same city that will depart and arrive sooner? What are the seating options and cost?
    Assistant: The next available flight to New York with United departs later but will arrive faster than your current flight. The only available seats on this flight are first-class window seats and cost $1,500.
    Can your GenAI assistant remember data from an earlier conversation? What is the source of this information? Is it trustworthy? Is it fresh and accurate? How do you securely augment customer data with real-time data and process it on the fly to provide meaningful insights?
  9. Retrieval Augmented Generation

    Retrieval-augmented generation (RAG) is a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources. (Diagram: LLM app, microservices, databases, vector embeddings, vector database.) Users expect very low latency, as the application must traverse multiple data sources before communicating with the LLM.
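    A minimal sketch of the RAG request path; the embedding model, vector database, and LLM client below are stubs standing in for real services:

        from typing import List

        def embed(text: str) -> List[float]:
            # Stub: call your embedding model here.
            return [float(len(word)) for word in text.split()[:3]]

        def search_vector_db(query_vector: List[float], top_k: int = 3) -> List[str]:
            # Stub: nearest-neighbour lookup in your vector database.
            return ["Flight UA123 to New York is delayed by 2 hours."]

        def ask_llm(prompt: str) -> str:
            # Stub: send the assembled prompt to your LLM endpoint.
            return "Your flight is delayed by 2 hours."

        def answer(question: str) -> str:
            # 1. Embed the question, 2. retrieve relevant facts,
            # 3. assemble a grounded prompt, 4. call the LLM.
            context = "\n".join(search_vector_db(embed(question)))
            prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
            return ask_llm(prompt)

        print(answer("What is the status of my flight to New York?"))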
  10. The reality of today’s data integration strategy: a giant mess

    of monolithic point-to-point connections, with data fidelity and governance challenges.
  11. Traditional enterprise data architecture is an AI innovation bottleneck

    (Diagram: historic public data feeds the generative AI model; in-context learning and prompt-time assembly feed an intelligent business-specific co-pilot that handles user interaction. The link to the enterprise data architecture is the open question, marked "??".)
  12. A high level picture: build a real-time, contextual and trustworthy

    knowledge base for your AI applications. With Kafka and Flink you CONNECT, PROCESS, GOVERN, STREAM and SHARE, turning data systems into real-time AI apps: pricing, inventory, payments, personalization, fraud, supply chain, recommendations. From data mess, to data products, to instant value everywhere.
  13. AI enabled stream enrichment in Apache Flink

    INSERT INTO enriched_reviews
    SELECT id, review, invoke_gpt4(prompt, review) AS score
    FROM product_reviews;

    The prompt: “Score the following text on a scale of 1 to 5, where 1 is negative and 5 is positive, returning only the number”

    Reviews:
    1, “This was the worst decision ever”
    2, “Not bad. Could have been cheaper”
    3, “Amazing! Game Changer!”

    Output:
    1, “This was the worst decision ever”, 1
    2, “Not bad. Could have been cheaper”, 3
    3, “Amazing! Game Changer!”, 5
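    invoke_gpt4 is not a built-in Flink function; it would be a user-defined function. A minimal PyFlink sketch of registering such a UDF, with the model call stubbed out for illustration:

        from pyflink.table import EnvironmentSettings, TableEnvironment, DataTypes
        from pyflink.table.udf import udf

        @udf(result_type=DataTypes.INT())
        def invoke_gpt4(prompt: str, review: str) -> int:
            # Stub: a real job would call an LLM endpoint and parse the reply.
            return 3

        t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())
        t_env.create_temporary_function("invoke_gpt4", invoke_gpt4)
        # Once registered, the function is usable exactly as on the slide:
        # INSERT INTO enriched_reviews
        # SELECT id, review, invoke_gpt4(prompt, review) AS score FROM product_reviews;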
  14. Improving product descriptions using GenAI

    Description: CETAPHIL BODY MOISTURIZING CREAM FOR DRY TO VERY DRY SKIN: Instantly replenishes, intensely nourishes and soothes skin dryness for 48 hours. CLINICALLY PROVEN TO RESTORE SKIN'S MOISTURE BARRIER IN 1 WEEK: Binds water to the skin, preventing moisture loss to hydrate and protect skin from dryness. NEW AND IMPROVED INGREDIENT BLEND: Now with hydrating glycerin and skin-essential vitamins B5 (panthenol) and B3 (niacinamide). DEVELOPED FOR EVEN THE MOST SENSITIVE SKIN: The hypoallergenic, non-comedogenic formula is free of fragrances, parabens and sulfates. DERMATOLOGIST RECOMMENDED for sensitive skin.

    Prompt: Take the following product description and summarize the product in less than 10 words. Precede the summary with the word Summary. Choose one word from the following words to categorize the product: Cream, Skincare, Beauty, Healthcare. Precede the category with the word Category.

    Code:
    SELECT name AS Name,
           REGEXP_EXTRACT(a.response, '.*Category:(.*)', 1) AS Category,
           REGEXP_EXTRACT(a.response, 'Summary:(.*)', 1) AS Summary
    FROM (
      SELECT name, invoke_gpt4(prompt, description) AS response
      FROM products
    ) a;

    GPT response: Summary: Moisturizing cream for dry skin, restores moisture barrier. Category: Skincare.

    Output:
    | Name     | Category | Summary                                                     |
    | -------- | -------- | ----------------------------------------------------------- |
    | Cetaphil | Skincare | Moisturizing cream for dry skin, restores moisture barrier. |
  15. Two Apache Projects, Born a Few Years Apart

    (Chart: monthly unique users of Kafka and Flink, 2016-2022.) Kafka & Flink are 2 of the top 5 Apache projects, and they go hand in hand for event stream processing. >75% of the Fortune 500 estimated to be using Kafka. >100,000 orgs using Kafka. >41,000 Kafka meetup attendees. >750 Kafka Improvement Proposals. >12,000 Jiras for Apache Kafka.
  16. Apache Kafka - Publish & Subscribe Reimagined

    Kafka: a distributed commit log. Publish and subscribe to streams of records. Highly scalable, high throughput. Supports transactions. Persisted data. Reads are a single seek & scan; writes are append-only. 1. Publish stream events 2. Store 3. Process & consume. A minimal client-side sketch follows.
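    A minimal publish/subscribe sketch using the confluent-kafka Python client; it assumes a broker at localhost:9092 and a "readings" topic, both placeholders:

        from confluent_kafka import Producer, Consumer

        producer = Producer({"bootstrap.servers": "localhost:9092"})
        # Writes are append-only: each produce() appends a record to the log.
        producer.produce("readings", key="sensor-1", value="42.0")
        producer.flush()

        consumer = Consumer({
            "bootstrap.servers": "localhost:9092",
            "group.id": "demo-group",
            "auto.offset.reset": "earliest",  # start from the beginning of the log
        })
        consumer.subscribe(["readings"])
        # Reads are a sequential scan from the consumer's current offset.
        msg = consumer.poll(timeout=5.0)
        if msg is not None and msg.error() is None:
            print(msg.key(), msg.value())
        consumer.close()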
  17. Apache Kafka and Microservices

    Apache Flink: write microservices to process your data in real-time. Kafka Connect API: reliable and scalable integration of Kafka with other systems, no coding required. (Diagram: Orders and Customers topics flowing through Flink tables into microservices.)
  18. Filters

    INSERT INTO high_readings
    SELECT sensor, reading
    FROM readings
    WHERE reading > 41;

    ** The above query runs indefinitely, producing its results into a new topic, “high_readings”. (Pipeline: Kafka -> Flink -> Kafka.)
  19. Joins

    INSERT INTO enriched_readings
    SELECT r.sensor, r.reading, b.area, b.brand_name
    FROM readings r
    INNER JOIN brands b ON b.sensor = r.sensor;

    (Pipeline: Kafka -> Flink -> Kafka.) A sketch of running such a continuous query follows.
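    Both statements are continuous queries. Assuming the Kafka-backed tables have already been declared with CREATE TABLE ... WITH ('connector' = 'kafka', ...), a PyFlink sketch of submitting one looks like this:

        from pyflink.table import EnvironmentSettings, TableEnvironment

        t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

        # Assumed: 'readings', 'brands', and 'enriched_readings' are Kafka-backed
        # tables already registered in the catalog via CREATE TABLE DDL.
        t_env.execute_sql("""
            INSERT INTO enriched_readings
            SELECT r.sensor, r.reading, b.area, b.brand_name
            FROM readings r
            INNER JOIN brands b ON b.sensor = r.sensor
        """)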
  20. How to do it with Kafka & Flink

    Data sources: core processing systems, external data, unstructured data, systems of record, browser/mobile telemetry, DWH/data lake, SaaS apps, infrastructure. The streaming platform in between is event-driven, decoupled, immutable, real-time and performant, with robust security controls, available as a fully managed cloud-native service, with 120+ pre-built connectors, stream processing and stream governance. Destinations: vector databases, model building / fine-tuning, GenAI consumer & gateway, GenAI agents in SaaS. The payoff: create a real-time knowledge base; bring real-time context at query time; build governed, secured, and trusted AI; experiment, scale and innovate faster.
  21. LLM-enabled applications have four steps

    1. Data Augmentation: prepare data for a real-time knowledge base and contextualization in LLM queries.
    2. Inference: programmatically connect relevant information with each prompt.
    3. Workflows: parse natural language to synthesize necessary information and apply contextual reasoning on the fly.
    4. Post-Processing: validate outputs and enforce business logic to detect hallucinations and ensure trustworthy responses.
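    How the four steps chain together on a single request can be sketched with stub functions (all names here are illustrative, not a real Confluent or Flink API):

        def augment(event: dict) -> dict:
            # 1. Data Augmentation: enrich the raw event and index it for retrieval.
            return {**event, "indexed": True}

        def infer(question: str, facts: list) -> str:
            # 2. Inference: attach the retrieved facts to the prompt, call the LLM.
            return f"Answer grounded in {len(facts)} fact(s)"

        def run_workflow(question: str) -> list:
            # 3. Workflows: parse the question and gather the information it needs.
            return ["flight UA123 delayed 2 hours"]

        def post_process(answer: str) -> str:
            # 4. Post-Processing: validate the output before it reaches the user.
            return answer if answer else "escalate to a human agent"

        augment({"flight": "UA123", "status": "delayed"})  # keeps the knowledge base fresh
        facts = run_workflow("What is the status of my flight?")
        print(post_process(infer("What is the status of my flight?", facts)))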
  22. How Kafka + Flink Work: Data Augmentation

    Prepare data for a real-time knowledge base and contextualization in LLM queries. Kafka consumer groups and Flink feed Kafka sink connectors (or native integrations) into an operational data store for unstructured data and into a vector store, by way of a vector embedding service (an embedding API gateway in front of embedding model instances). Schema Registry holds the schema specs, versioned as data products in git and managed with Terraform.
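    One way the augmentation path could look in code, with the embedding call and the vector-store write stubbed out (real deployments would use a sink connector or native integration, as the slide shows):

        from confluent_kafka import Consumer

        def embed(text: str) -> list:
            # Stub: call the embedding model behind the embedding API gateway.
            return [0.0] * 768

        def upsert_vector(doc_id: str, vector: list, payload: str) -> None:
            # Stub: write to the vector store via its client library.
            print(f"indexed {doc_id} ({len(vector)} dims)")

        consumer = Consumer({
            "bootstrap.servers": "localhost:9092",
            "group.id": "vector-embedding-service",
            "auto.offset.reset": "earliest",
        })
        consumer.subscribe(["documents"])

        while True:
            msg = consumer.poll(timeout=1.0)
            if msg is None or msg.error():
                continue
            text = msg.value().decode("utf-8")
            upsert_vector(str(msg.offset()), embed(text), text)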
  23. How Kafka + Flink Work: Inference

    Programmatically connect relevant information with each prompt. The web application pulls context from the vector store and calls the LLM service (an LLM API gateway in front of LLM instances); Confluent Cloud consumer groups and Flink keep the data moving, with Schema Registry schema specs versioned as data products in git and managed with Terraform.
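    A sketch of prompt-time assembly, combining retrieved context with conversation history before calling the LLM gateway (all service calls are stubs):

        from typing import List

        def retrieve_context(question: str, top_k: int = 3) -> List[str]:
            # Stub: nearest-neighbour lookup in the vector store.
            return ["Flight UA123 is delayed by 2 hours."]

        def call_llm_gateway(prompt: str) -> str:
            # Stub: POST to the LLM API gateway fronting the model instances.
            return "Your flight is delayed by 2 hours."

        history: List[str] = []

        def chat(question: str) -> str:
            context = "\n".join(retrieve_context(question))
            prompt = (f"Context:\n{context}\n\nConversation so far:\n"
                      + "\n".join(history) + f"\n\nUser: {question}")
            answer = call_llm_gateway(prompt)
            history.extend([f"User: {question}", f"Assistant: {answer}"])
            return answer

        print(chat("What is the status of my flight to New York?"))
        print(chat("Is there an earlier flight to the same city?"))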
  24. How Kafka + Flink Work: Workflows

    Parse natural language to synthesize necessary information and apply contextual reasoning. A reasoning agent sits between the web application and the LLM service, consulting both the vector store and an operational data store (RDBMS); Confluent Cloud consumer groups and Flink move the data, with Schema Registry schema specs versioned as data products in git and managed with Terraform.
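    A toy version of the reasoning agent's job, routing a question to the right data source before the LLM sees it (the routing rule and both lookups are illustrative):

        def lookup_rdbms(query: str) -> str:
            # Stub: parameterized SQL against the operational data store.
            return "seat 12A, first class, $1,500"

        def lookup_vector_store(query: str) -> str:
            # Stub: semantic search over the knowledge base.
            return "refund policy allows rebooking on delays over 2 hours"

        def reasoning_agent(question: str) -> str:
            # Toy routing rule: exact operational facts come from the RDBMS,
            # fuzzy/policy questions from the vector store.
            if any(word in question.lower() for word in ("seat", "cost", "price")):
                fact = lookup_rdbms(question)
            else:
                fact = lookup_vector_store(question)
            return f"Grounding fact for the LLM: {fact}"

        print(reasoning_agent("What are the seating options and cost?"))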
  25. How Kafka + Flink Work: Post-Processing

    Enforce business logic and compliance requirements on LLM outputs. A dedicated post-processing consumer group, running on Flink, validates responses coming out of the LLM service (an LLM API gateway in front of LLM instances) before they reach the web application; the reasoning agent, vector store and operational data store (RDBMS) stay in the loop, with Schema Registry schema specs versioned as data products in git and managed with Terraform.
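    A sketch of post-processing checks: validate an LLM answer against business rules before it reaches the user (the rules here are invented for illustration):

        import re

        def validate(answer: str, known_flights: set) -> str:
            # Rule 1: any flight number mentioned must exist in operational data
            # (a cheap hallucination check).
            for flight in re.findall(r"\b[A-Z]{2}\d{2,4}\b", answer):
                if flight not in known_flights:
                    return "I could not verify that information; escalating to an agent."
            # Rule 2: never quote a fare without a currency symbol (compliance rule).
            if re.search(r"\b\d{3,}\b", answer) and "$" not in answer:
                return "I could not verify that information; escalating to an agent."
            return answer

        print(validate("Flight UA123 is delayed; rebooking costs $1,500.", {"UA123"}))
        print(validate("Flight ZZ999 departs at 5 pm.", {"UA123"}))  # unknown flight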
  26. Building Real-Time, Intelligent AI Copilots

    Real-time weather and flight status embeddings into a knowledge database. Source: https://techcommunity.microsoft.com/t5/ai-azure-ai-services-blog/building-real-time-intelligent-ai-copilots-with-confluent-cloud/ba-p/3932183
  27. Learn Apache Kafka® with Confluent

    cnfl.io/ask-the-community: ask questions, share knowledge and chat with your fellow community members! Join your local Kafka User Group: meetup.com/taipei-kafka/
  28. THE AI FRAUD DETECTION CHALLENGE: Looking at the series of

    events surrounding a transaction to derive context in a timely manner
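    In code, "the series of events surrounding a transaction" can be approximated with a per-card sliding window; in production the windowing would run in Flink, but a toy version shows the idea:

        from collections import defaultdict, deque

        WINDOW_SECONDS = 600
        recent = defaultdict(deque)  # card_id -> deque of (timestamp, amount)

        def on_transaction(card_id: str, ts: float, amount: float) -> bool:
            window = recent[card_id]
            window.append((ts, amount))
            # Evict events that fall outside the 10-minute context window.
            while window and ts - window[0][0] > WINDOW_SECONDS:
                window.popleft()
            # Toy rule: flag a burst of 3+ transactions inside the window.
            return len(window) >= 3

        print(on_transaction("card-1", 0, 50.0))    # False
        print(on_transaction("card-1", 60, 75.0))   # False
        print(on_transaction("card-1", 120, 20.0))  # True -> suspicious burst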