Taming IoT Data - Making Sense of Sensors with SQL Streaming @VoxxedCluj 2019

* Abstract:

We are living in a data streaming era, yet until recently it has been particularly hard to leverage existing stream processing technologies. One reason is that dealing with data in motion has its inherent challenges. Another is that most frameworks and APIs for stream processing are very hard to use and/or operate. NOT so for KSQL, the newest kid in Apache Kafka’s ecosystem.

Based on a simplified version of an IoT use case, this session gives a gentle introduction to KSQL, Kafka’s SQL streaming engine for the masses. Join this fast-paced tour of a streaming IoT architecture. Concretely, we are going to:

(1) ingest smart home energy data into Apache Kafka,
(2) use KSQL for flexible, powerful and scalable SQL-only stream processing,
(3) send raw data as well as pre-processed results to an operational NoSQL data store,
(4) reactively serve data to clients in near real-time and
(5) finally feed informative live charts.

* Video Recording:
https://www.youtube.com/watch?v=BkIwgWYRTYc&list=PLRsbF2sD7JVo4wqpokeojf07YfZsn5iUq&index=11

Hans-Peter Grahsl

October 31, 2019

Transcript

  1. "... data processing that is designed with infinite data sets in mind." - Tyler Akidau
     @hpgrahsl | #VoxxedDays Cluj-Napoca, 31st Oct 2019, România
  2. Streaming Technologies
     ✓ purpose-built for data-in-motion
     ✓ events are 1st class citizens
     ✓ faster results & accurate answers
  3. Apache Kafka
     ✓ pub / sub to event streams
     ✓ (permanently) store event streams
     ✓ process streams in near real-time
     ➔ horizontal scalability
     ➔ high fault-tolerance
  4. KSQL in a Nutshell
     ✓ built on top of Kafka Streams
     ✓ NO(!) coding skills required
     ✓ SQL only ➔ not embedded
     ✓ extremely low entry barrier
     ✓ familiar syntax & semantics
     ✓ concise & expressive
  5. KSQL in a Nutshell
     the usual suspects OOTB:
     ✓ projections, filters
     ✓ joins, aggregations
     ✓ windowing
     something missing?
     ✓ UDF & UDAF
     ✓ UDTF pending
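
To make the building blocks on this slide concrete, here is a minimal Python sketch of what a filter, a projection and a tumbling-window aggregation do, record by record. It is not KSQL and not taken from the talk: the sample readings, the one-minute window and the names `READINGS`, `window_start` and `total_power_per_window` are assumptions made for illustration. In KSQL the same logic would be written declaratively in SQL.

```python
from collections import defaultdict

# Hypothetical smart-home readings: (timestamp in ms, device id, power in watts).
READINGS = [
    (1_000, "fridge", 120.0),
    (20_000, "oven", 2000.0),
    (45_000, "fridge", 130.0),
    (61_000, "fridge", 110.0),
    (90_000, "oven", 0.0),
]

WINDOW_MS = 60_000  # assumed tumbling window size: one minute


def window_start(ts_ms, size_ms=WINDOW_MS):
    """Return the start of the tumbling window that contains ts_ms."""
    return ts_ms - (ts_ms % size_ms)


def total_power_per_window(readings, min_watts=1.0):
    """Filter, project and aggregate readings, one record at a time."""
    totals = defaultdict(float)
    for ts_ms, device, watts in readings:
        if watts < min_watts:
            continue  # filter: drop idle readings (like a WHERE clause)
        key = (device, window_start(ts_ms))  # projection plus grouping key
        totals[key] += watts  # aggregation: running sum per device and window
    return dict(totals)


result = total_power_per_window(READINGS)
for (device, start), watts in sorted(result.items()):
    print(device, start, watts)
```

Because the windows are non-overlapping and fixed-size, each reading belongs to exactly one window, which is what makes the running sum per key sufficient.
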
  6. KSQL Queries
     ✓ per-record streaming with ms latency
     ✓ compiled into Kafka Streams apps
     ✓ distributed execution: KSQL servers
     ✓ 2 modes: interactive vs. headless
  7. KSQL's Interactive Mode
     ✓ KSQL servers accessed via REST API
     ✓ offers ad-hoc analytics of streams
     ✓ users can share streams & tables
     ✓ used for exploration and during development
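
As a rough illustration of what "accessed via REST API" can look like from a client, the Python sketch below builds a request for the server's `/ksql` endpoint using only the standard library. The host and port (`localhost:8088`), the statement `LIST STREAMS;` and the helper name `build_ksql_request` are assumptions for this example, not details from the talk. The request is only constructed here; sending it requires a running KSQL server.

```python
import json
import urllib.request

# Assumption: a KSQL server in interactive mode listening on localhost:8088.
KSQL_URL = "http://localhost:8088"


def build_ksql_request(statement, base_url=KSQL_URL):
    """Build (but do not send) a POST request for the /ksql endpoint."""
    body = json.dumps({"ksql": statement, "streamsProperties": {}}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/ksql",
        data=body,
        headers={"Content-Type": "application/vnd.ksql.v1+json; charset=utf-8"},
        method="POST",
    )


request = build_ksql_request("LIST STREAMS;")
print(request.get_method(), request.full_url)

# To run the statement against a live server (not executed here):
#     with urllib.request.urlopen(request) as response:
#         print(json.load(response))
```

In headless mode this kind of call would not be available, since the REST API is locked down there.
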
  8. KSQL's Headless Mode
     ✓ application == SQL file
     ✓ KSQL servers run streaming queries
     ✓ use case specific isolation
     ✓ "locked-down" ➡ NO REST API access
     ✓ used for production deployments
  9. "You think that's a database table you're querying now?" - Morpheus
  10. "Instead, only try to realize the truth - there is no database table." - Spoon Boy
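
The slides do not spell out what these two quotes refer to. One plausible reading, common in the Kafka ecosystem, is the stream-table duality: a table is just the latest value per key of a changelog stream. The Python sketch below illustrates that idea under this assumption; the events and the name `materialize` are hypothetical.

```python
# Hypothetical changelog stream: each event is (device id, latest power in watts).
EVENTS = [
    ("fridge", 120.0),
    ("oven", 2000.0),
    ("fridge", 130.0),
    ("oven", 0.0),
]


def materialize(events):
    """Fold a changelog stream into a table: the last value per key wins."""
    table = {}
    snapshots = []
    for key, value in events:
        table[key] = value  # upsert: insert a new key or overwrite an old value
        snapshots.append(dict(table))  # copy of the table state after this event
    return table, snapshots


table, snapshots = materialize(EVENTS)
print(table)
```

Each entry in `snapshots` is the "table" as it looked at one point in the stream, which is why a query over it can be seen as a query over the stream itself.
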
  11. KSQL Wrap-Up
     ✓ SQL... and nothing but SQL
     ✓ use cases of any size (XS ... XXXL)
     ✓ scalable & fault-tolerant
     ✓ deployable anywhere in any way
     ✓ no additional infrastructure
  12. Even if your obsession tells you to do batching, I'd like you to walk away and stream with KSQL. The choice is yours, folks!