From Data to Insights: Building a Bluesky Bot powered by AI

A common challenge developers face when working with data streams is collecting and analyzing this data as fast as possible to uncover meaningful insights. It’s a complex problem that requires the right combination of real-time data technologies and AI for instant, intelligent decision-making.

In this talk, I’ll show you how I tackled this by building a Bluesky bot that turns raw data into actionable insights using GenAI. We’ll dive into the process of collecting data, transforming it into streams, and using Redis 8 to power real-time analysis. Along the way, I’ll explore how probabilistic data structures, like Count-Min Sketch and Bloom filters, help optimize performance and enable scalable analytics by trading a small, bounded amount of accuracy for large savings in memory.
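To make the Count-Min Sketch idea concrete, here is a minimal plain-Java sketch. It is not the Redis implementation and not code from the talk: the class name, the CRC32-based hashing, and the sizing are assumptions chosen for illustration. It shows the core trade-off: counts live in a fixed-size table, and an estimate can be too high but never too low.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Illustrative Count-Min Sketch: approximate counts in fixed memory. */
final class CountMinSketch {
    private final int depth;      // number of hash rows
    private final int width;      // counters per row
    private final long[][] table;

    CountMinSketch(int depth, int width) {
        this.depth = depth;
        this.width = width;
        this.table = new long[depth][width];
    }

    /** One bucket per row; the row index salts the hash (an assumption, not Redis' scheme). */
    private int bucket(String item, int row) {
        CRC32 crc = new CRC32();
        crc.update((row + ":" + item).getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % width);
    }

    /** Add one occurrence of the item to every row. */
    void increment(String item) {
        for (int row = 0; row < depth; row++) {
            table[row][bucket(item, row)]++;
        }
    }

    /** The minimum across rows; collisions can only inflate it, never shrink it. */
    long estimate(String item) {
        long min = Long.MAX_VALUE;
        for (int row = 0; row < depth; row++) {
            min = Math.min(min, table[row][bucket(item, row)]);
        }
        return min;
    }
}
```

Calling `increment("llm")` three times guarantees `estimate("llm")` is at least 3. A wider table makes overestimates less likely at the cost of more memory.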

I’ll also demonstrate how Redis 8 supports vector similarity search, making it possible to compare and classify data efficiently—an essential step for enhancing AI-driven insights. You’ll see how this can be applied to find patterns, group similar content, and make smarter recommendations.
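As a rough illustration of similarity-based classification, the following plain-Java sketch compares a post's vector against a set of sample vectors using cosine similarity. It assumes the embeddings have already been produced by some model, and it uses a linear scan in place of a real vector index such as the one Redis provides. The class name, method names, and threshold are hypothetical.

```java
/** Illustrative classifier: a post matches if it is close enough to any sample vector. */
final class SimilarityClassifier {

    /** Cosine similarity between two vectors of equal length; 0.0 if either is all zeros. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** True if at least one sample reaches the similarity threshold (linear scan). */
    static boolean matches(double[] post, double[][] samples, double threshold) {
        for (double[] sample : samples) {
            if (cosine(post, sample) >= threshold) {
                return true;
            }
        }
        return false;
    }
}
```

The threshold is a tuning knob: a higher value lets fewer off-topic posts through but may also reject relevant ones.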

Finally, I’ll bring it all together by showing how Redis and GenAI work hand in hand to extract patterns and generate insights, with practical examples implemented in Java.

Whether you’re curious about GenAI, interested in data-driven analytics, or simply love experimenting with creative tech solutions, this session will inspire you with practical techniques and real-world applications.

Raphael De Lio

May 26, 2025

Transcript

  1. What are we building today?
    • Listen to Bluesky’s Jetstream
    • Filter in posts related to AI
    • Enrich the data from these posts with sub topics
    • Keep track of frequency of topics
    • Allow Bluesky users to interact with our bot and ask questions
  2. How are we gonna do that?
    • Redis Streams
    • Semantic Classification
    • Semantic Routing
    • Semantic Caching
    • Probabilistic Data Structures
  3. Why?
    • Messages are ephemeral
    • We want to process these messages with multiple services.
  4. In-memory first (Fast) • Easy setup • Support for Consumer Groups • Perfect fit for realtime pipelines
  5. Filtering Service • Is this post about AI? • Filtered Bluesky Posts • Store the post in Redis as a Hash or a JSON
  6. Generate 500 Bluesky Posts about AI • Turn these samples into Vectors • Store the vectors in a Vector Database
  7. Filtering Service • Turn the post into a vector • Compare the vector of the post with the vector of the samples in the Vector Database • Write the post to the filtered-posts stream • Store the post in Redis as a Hash or a JSON
  8. Topic Extractor Service • What topics can be implied from this post? • Update post in Redis (Hash or JSON) • Increment topics counters in the TopK
  9. Bot • These are the possible functions: […] • What function should I call based on the post? • Bot • Invokes the specific function • Generates final response with information from the function • Bot • Reads post that mentions the bot • Posts response
  10. Generate references for a certain route • Turn these references into Vectors alongside their route • Store the vectors in a Vector Database
  11. Bot • Turn the post into a vector • Compare the vector of the post with the vector of the routes in the Vector Database • Invoke returned route (function) if similarity is high enough • Bot • Generates final response with information from the function • Bot • Post response
  12. Consumer • Stream • Process Message • Check if ID exists in Bloom Filter • Consumer • Add ID to the Bloom Filter
  13. Recap
    • How to use Redis Streams for consuming realtime data
    • How to use Semantic Classification for filtering data
    • How to use LLMs to extract data
    • How to use Semantic Routing for calling functions
    • How to use Semantic Caching for saving money & time
    • How to efficiently analyze huge data streams with TopK and Bloom Filter
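
The deduplication flow on slide 12 (check whether a message ID is in the Bloom filter, process the message, then add the ID) can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the talk's code and not Redis' Bloom filter: the class names, the CRC32-based hashing, the filter size, and the string message IDs are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;
import java.util.zip.CRC32;

/** Illustrative Bloom filter: no false negatives, occasional false positives. */
final class BloomFilter {
    private final BitSet bits;
    private final int size;    // number of bits
    private final int hashes;  // bit positions set per item

    BloomFilter(int size, int hashes) {
        this.size = size;
        this.hashes = hashes;
        this.bits = new BitSet(size);
    }

    /** Derive the i-th bit position by salting the item with a seed (assumed scheme). */
    private int index(String item, int seed) {
        CRC32 crc = new CRC32();
        crc.update((seed + ":" + item).getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % size);
    }

    void add(String item) {
        for (int i = 0; i < hashes; i++) {
            bits.set(index(item, i));
        }
    }

    /** false means definitely never added; true means probably added. */
    boolean mightContain(String item) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(index(item, i))) {
                return false;
            }
        }
        return true;
    }
}

/** Hypothetical stream consumer that skips message IDs it has probably seen before. */
final class DedupConsumer {
    private final BloomFilter seen = new BloomFilter(1 << 16, 4);

    /** Returns true if the message was processed, false if skipped as a probable duplicate. */
    boolean handle(String messageId) {
        if (seen.mightContain(messageId)) {
            return false;
        }
        // Real processing of the message would happen here.
        seen.add(messageId);
        return true;
    }
}
```

Because a Bloom filter can report false positives, this consumer may occasionally skip a message it has never seen; it will never process the same ID twice.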