Recursion Pharmaceuticals is turning drug discovery into a data science problem. This entails producing and processing petabytes of microscopy images from carefully designed biological experiments. In early 2017, data production in our laboratory scaled to the point where the existing naive batch processing system could no longer process the data reliably. The batch approach also introduced unwanted lag between image capture and analysis results, since an entire experiment, potentially 8 TB or more, would not begin processing until all of its images were available. This was particularly troublesome for our laboratory, which wanted real-time quality control metrics on the images. These problems motivated us to replace the batch processing system with a streaming approach. The original data pipeline was implemented as microservices with no central orchestrator, relying instead on implicit flow between the services. The resulting lack of visibility and robustness made the pipeline difficult and costly to operate. We wanted to address these concerns while avoiding a rewrite of the existing microservices. By building on top of Kafka Streams, we created a flexible, highly available, and robust pipeline that leverages our existing microservices, which gave us a clear migration path. This presentation will walk you through our thought process and explain the tradeoffs between Kafka Streams and Spark for our specific use case. We’ll dive into the details of the workflow system we built on top of Kafka Streams to orchestrate these microservices. We’ve been operating this system since mid-2017, and the additional scale and robustness have played a key role in enabling Recursion to pursue its mission of discovering new treatments for various diseases. The messages flowing through our Kafka Streams pipeline have already led to clinical trials in humans and will hopefully translate into a meaningful impact on patients’ lives one day.