
Advanced Zenoh Tutorial -- Part II

This three-part webinar series is designed for developers who want to master Zenoh’s full potential. We’ll cover everything from core concepts to advanced programming techniques, performance tuning, and deployment strategies. The sessions use Rust as the primary language with some complementary examples in Python. Whether you're building robotics systems, telemetry pipelines, or IoT infrastructures—this series will help you unlock the true power of Zenoh.


Angelo Corsaro

May 14, 2025

Transcript

  1. https://github.com/kydos/zsak Zenoh Swiss Army Knife (ZSAK). We'll extensively use ZSAK,

    an experimental command-line tool designed for learning and experimenting with Zenoh.
  2. Get It. Build It. Run It. $ git clone git@github.com:kydos/zsak.git

    $ cd zsak $ cargo build --release $ ./target/release/zsak -h
  3. Declaring a Queryable The declaration of a queryable minimally requires

    the key expression associated with it: let queryable = z.declare_queryable("daily/quote/**") .complete(is_complete) .await.unwrap(); You can also specify the completeness of this queryable w.r.t. the set of keys represented by its key expression. If the queryable “has answers” for any key in the set expressed by its key expression then it is complete, otherwise it is not.
  4. Query In Zenoh queries are issued by using the get

    operation: z.get(k) .target(query_target) .consolidation(consolidation) .await.unwrap(); The result of a query is a finite stream of Replies. The target controls who, among the matching queryables, is eligible to be targeted; the consolidation controls if and how replies are consolidated.
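A minimal sketch of the calling side, assuming the Zenoh 1.x Rust API and a session z opened with zenoh::open; the key expression, target, consolidation mode and payload handling are illustrative assumptions:

    use zenoh::query::{ConsolidationMode, QueryTarget};

    // Issue a query and drain the finite stream of replies.
    let replies = z.get("daily/quote/**")
        .target(QueryTarget::All)               // target every matching queryable
        .consolidation(ConsolidationMode::None) // keep all replies, no consolidation
        .await.unwrap();
    while let Ok(reply) = replies.recv_async().await {
        match reply.result() {
            Ok(sample) => println!("{} => {} bytes", sample.key_expr(), sample.payload().len()),
            Err(e) => println!("query failed: {:?}", e),
        }
    }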
  5. Implementing a Queryable Implementing a queryable is about “iterating” through

    queries and providing replies. while let Ok(query) = queryable.recv_async().await { let key_expr = query.key_expr().to_string(); let payload = query.payload() … let result = …; query.reply(query.key_expr(), &result) .source_info(si.clone()) .timestamp(z.new_timestamp()) .await.unwrap(); }
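A self-contained variant of the loop above, with the elided parts filled in by illustrative assumptions (a fixed quote as the reply payload, no source info or timestamp):

    // Reply to every incoming query with a fixed, illustrative payload.
    let queryable = z.declare_queryable("daily/quote/latin").await.unwrap();
    while let Ok(query) = queryable.recv_async().await {
        query.reply(query.key_expr(), "Carpe diem.")
            .await.unwrap();
    }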
  6. Zenoh Features By default Zenoh is built with these features.

    Supported features can be listed with the following command: $ cargo read-manifest | jq '.features' "default": [ "auth_pubkey", "auth_usrpwd", "transport_multilink", "transport_compression", "transport_quic", "transport_tcp", "transport_tls", "transport_udp", "transport_unixsock-stream", "transport_ws" ] Additional features can be added when building: $ cargo build --features "shared-memory"
  7. Supported Links To date, Zenoh supports the following links: • TCP/IP

    • UDP/IP • TLS • WebSockets • Virtual Socket (VSock) • QUIC • Serial • Unix Domain Socket (Stream) • Unix Pipes
  8. Selecting Links Zenoh applications can use multiple links of multiple

    kinds. Zenoh transparently routes, if needed, across different link kinds. Links can be provided by means of a configuration file or set programmatically. [Diagram: peers interconnected over Unix sockets, Unix pipes, UDP/IP, TCP/IP, and QUIC links]
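Links can also be set programmatically, as noted above; a minimal sketch of doing so from Rust (the endpoints are illustrative, and insert_json5 is the assumed way to patch a Config):

    // Build a listening configuration in code rather than from a file.
    let mut config = zenoh::Config::default();
    config
        .insert_json5("listen/endpoints", r#"["tcp/127.0.0.1:7447", "udp/127.0.0.1:7447"]"#)
        .unwrap();
    let z = zenoh::open(config).await.unwrap();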
  9. Link Configuration This configuration fragment shows

    how to configure a peer to accept sessions on TCP/IP, UDP/IP, Unix sockets and pipes: listen: { endpoints: { peer: [ "tcp/127.0.0.1:7447", "udp/127.0.0.1:7447", "unixsock-stream/./zsak/usock", "unixpipe/./zsak/pipe" ] } },
  10. Zenoh’s Reliability In order to do reliable communication, Zenoh should

    have at least one reliable link. By default, data published in Zenoh is reliable. By default, reliability is hop-to-hop.
  11. Hop-to-Hop Reliability Hop-to-hop reliability means that Zenoh ensures that data

    is transmitted reliably between the two ends of a link, in other terms between A and B, B and C, … F and G. Hop-to-hop reliability is good for scalability and ensures that data is reliably delivered end-to-end if the graph is stable. [Diagram: routing path A → B → C → D → E → F → G]
  12. Failures If one of the nodes in the routing path

    fails, Zenoh will look for another path. [Diagram: routing path A → B → C → D → E → F → G with a failed node]
  13. Failures If one of the nodes in the routing path

    fails, Zenoh will look for another path. In this case it may happen that some of the messages received by D were not delivered to E, thus causing some losses. [Diagram: the path is re-routed through nodes H and I after the failure]
  14. Reliability Reliability in Zenoh can be controlled for individual put

    operations: z.put("daily/quote/brazilian", "A união faz a força.") .reliability(Reliability::Reliable) .await.unwrap(); Or it can be configured for the Publisher: let z_pub = z.declare_publisher("daily/quote/latin") .reliability(Reliability::Reliable) .await.unwrap(); pub enum Reliability { BestEffort, Reliable, }
  15. Congestion Control The producer has control over the congestion control

    policy: pub enum CongestionControl { #[default] Drop = 0, Block = 1, } z.put("daily/quote/latin", "Per aspera ad astra.") .congestion_control(CongestionControl::Drop) .await.unwrap(); The default congestion control can also be provided when creating a publisher: let z_pub = z.declare_publisher("daily/quote/latin") .congestion_control(CongestionControl::Drop) .await.unwrap();
  16. Controlling Priority Zenoh supports traffic scheduling based on 7 priority

    levels: pub enum Priority { RealTime = 1, InteractiveHigh = 2, InteractiveLow = 3, DataHigh = 4, #[default] Data = 5, DataLow = 6, Background = 7, } Higher priority samples overtake lower priority ones. The priority can be individually controlled for each put operation: z.put("daily/quote/sicilian", "Cu mangia fa muddichi.") .priority(Priority::DataHigh) .await.unwrap(); Or defined when creating a publisher: let z_pub = z.declare_publisher("daily/quote/sicilian") .priority(Priority::DataHigh) .await.unwrap();
  17. Reliability and Links By default Zenoh operates over a single

    TCP/IP link. As a consequence both reliable and best-effort data is sent over this link. The main difference is that best-effort data will always be immediately dropped in case of congestion. { listen: { endpoints: { peer: [ "tcp/127.0.0.1:7447"] } } }
  18. Reliability and Links By default Zenoh operates over TCP/IP,

    thus if you want to ensure that a best-effort link is used for sending best-effort data you’ll have to enable a best-effort link, e.g. UDP/IP. You’ll also have to increase the number of links allowed per transport, which by default is 1. { listen: { endpoints: { peer: [ "tcp/127.0.0.1:7447", "udp/127.0.0.1:7447", ] } }, transport: { unicast: { max_links: 2 } } }
  19. Advanced Pub/Sub The Publisher keeps a local cache that can

    be queried by the Subscriber when missing samples are detected. [Diagram: Pub and Sub connected through intermediate peers]
  20. Advanced Pub/Sub [Diagram: the publisher cache holds samples 18–21; the subscriber sees sample 21 after a gap]

    A gap will be detected and the data recovered by means of a query to the publisher cache.
  21. Advanced Pub/Sub [Diagram: having received sample 21 but not sample 20, the subscriber queries the publisher cache with “get sample SN=20”]
  22. Advanced Pub/Sub [Diagram: the publisher cache replies with sample 20 (ID 20) and the subscriber’s stream is repaired]
  23. Observations The advanced Pub/Sub provides the user with the ability

    to configure the use of a heartbeat or a periodic query for the last missed sample. By default the cache keeps the last K values that were put; anything that goes out of the cache is not recoverable.
  24. Advanced Publisher API Declares an advanced publisher that will use

    heartbeats to allow subscribers to detect losses and which will automatically declare a liveliness token to track its liveliness: use std::time::Duration; use zenoh_ext::{ AdvancedPublisherBuilderExt, CacheConfig, MissDetectionConfig}; let publisher = session .declare_publisher(&key_expr) .cache(CacheConfig::default().max_samples(history)) .sample_miss_detection( MissDetectionConfig::default() .heartbeat(Duration::from_millis(500))) .publisher_detection() .await .unwrap();
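Publishing through the resulting advanced publisher is then a regular put; the sample is also retained in the publisher-side cache for recovering subscribers (the payload is an illustrative assumption):

    // The put also lands in the cache configured above.
    publisher.put("Festina lente.").await.unwrap();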
  25. Advanced Subscriber API Declares an advanced subscriber that will use

    heartbeats to detect losses and which will automatically declare a liveliness token to track its liveliness: use zenoh_ext::{ AdvancedSubscriberBuilderExt, HistoryConfig, RecoveryConfig}; let subscriber = session .declare_subscriber(key_exp) .history(HistoryConfig::default() .detect_late_publishers()) .recovery(RecoveryConfig::default().heartbeat()) .subscriber_detection() .await .unwrap();
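Receiving works as with a plain subscriber, assuming the default FIFO handler; recovered and live samples arrive on the same stream (a minimal sketch):

    // Drain live samples as well as samples recovered from the publisher cache.
    while let Ok(sample) = subscriber.recv_async().await {
        println!("{} ({} bytes)", sample.key_expr(), sample.payload().len());
    }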
  26. Assumptions Zenoh doesn’t try to detect partitions. Instead it focuses

    on detecting misalignment and repairing it. Zenoh also assumes that all events in the system are properly ordered, by default using an HLC (Hybrid Logical Clock). Storage replicas have to have exactly the same key expression. [Diagram: a Zenoh system of routers, peers, and clients]
  27. Time Discretization Zenoh’s storage alignment algorithm divides time into

    (configurable) fixed-size intervals starting from a common origin (November 10th 2017 :-). This way intervals can be identified with an integer without any ambiguity and in spite of acceptable (in the sense of HLC) clock misalignment. [Diagram: a timeline starting at origin O, divided into numbered intervals 0, 1, 2, …, n]
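A back-of-the-envelope illustration of the interval numbering (the 10-minute interval length and the timestamps are assumptions, not Zenoh's actual configuration defaults):

    /// Interval id = floor((t - origin) / interval_length).
    /// Clocks that agree within HLC tolerance map a sample to the same id,
    /// provided the skew is small relative to the interval length.
    fn interval_id(t_secs: u64, origin_secs: u64, interval_secs: u64) -> u64 {
        (t_secs - origin_secs) / interval_secs
    }

    fn main() {
        const ORIGIN: u64 = 1_510_272_000; // ~ November 10th 2017, 00:00 UTC
        // A sample timestamped in May 2025, with assumed 10-minute intervals:
        println!("interval {}", interval_id(1_747_000_000, ORIGIN, 600));
    }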
  28. Era Classification Zenoh classifies intervals into

    different kinds of eras depending on how likely it is to receive “live” updates within them. Specifically: Hot Era. This is at the very least the interval preceding the current one, in which live updates can still arrive. Warm Era. This is a configurable number of intervals preceding the Hot Era. Cold Era. This is the collection of intervals preceding the Warm Era, down to the time origin. [Diagram: the timeline partitioned into Cold, Warm, and Hot Eras up to now]
  29. Digest To efficiently detect misalignment and identify where

    it happened, a digest is computed and distributed.
  30. Hot Era Digest Each interval in the hot era is

    divided into sub-intervals. For each sub-interval a fingerprint is computed on the data received.
  31. Warm Era Digest The warm era computes fingerprints by

    combining those of the included sub-intervals (notice these were already computed).
  32. Cold Era Digest The cold era digest is likewise computed by

    combining the fingerprints for all the warm eras that have slipped into the cold era.
  33. Anti-Entropy Action & Alignment Replicas periodically distribute a digest

    (anti-entropy action) that combines the Cold, Warm and Hot Era fingerprints. If the received digest is the same as a replica’s own digest, there is no misalignment. Otherwise, a misalignment is detected and a procedure is started to identify the set of sub-intervals in which there is a disagreement.
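A toy sketch of the idea behind the digest-based alignment above (the XOR-of-hashes fingerprint and the in-memory layout are illustrative assumptions; Zenoh's actual digest encoding differs):

    use std::collections::BTreeMap;
    use std::collections::hash_map::DefaultHasher;
    use std::hash::{Hash, Hasher};

    // Fingerprint of one (sub-)interval: an order-independent combination (XOR)
    // of the hashes of the (key, timestamp) pairs it contains.
    fn fingerprint(entries: &[(&str, u64)]) -> u64 {
        entries.iter().fold(0u64, |acc, e| {
            let mut h = DefaultHasher::new();
            e.hash(&mut h);
            acc ^ h.finish()
        })
    }

    // An era fingerprint combines the fingerprints of its intervals, so a
    // mismatch can be narrowed down era -> interval -> sub-interval.
    fn era_fingerprint(intervals: &BTreeMap<u64, u64>) -> u64 {
        intervals.values().fold(0u64, |acc, f| acc ^ f)
    }

    fn main() {
        let mut replica_a = BTreeMap::new();
        replica_a.insert(41, fingerprint(&[("daily/quote/latin", 100)]));
        replica_a.insert(42, fingerprint(&[("daily/quote/latin", 101)]));

        // Replica B missed the update that fell into interval 42.
        let mut replica_b = replica_a.clone();
        replica_b.insert(42, fingerprint(&[]));

        if era_fingerprint(&replica_a) != era_fingerprint(&replica_b) {
            // Walk the intervals to localize the disagreement, then align only those.
            for (i, f) in &replica_a {
                if replica_b.get(i) != Some(f) {
                    println!("misalignment in interval {i}");
                }
            }
        }
    }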
  34. Workshop Registration [Diagram: Replicas 1–3, each behind its own router, connected to a main router]

    Suppose we have multiple points of service (POS) for registering attendees to today’s workshop. Attendees will check in when arriving and check out before leaving. Eventually any of the replicas should have the full list of participants in the room, in spite of network failures, partitioning, etc.
  35. Workshop Registration [Diagram: Replicas 1–3, routers, and main router]

    To begin with, we’ll assume that the storages on each of the POS are disconnected from the rest of the Zenoh infrastructure because of networking issues. As attendees are registered, each replica will have a view of just the registrants that used the given POS. Then we’ll re-establish connections and see what happens…
  36. Key Highlights Zenoh is a one-of-a-kind protocol:

    it unifies data in motion, data at rest and computations. It is the only protocol able to run from the microcontroller up to the datacenter. It has no topological constraints. It is extremely easy to use!