Introducing Apache Airflow® 3 – The Next Evolution in Orchestration

Apache Airflow® 3 is here, bringing major improvements to data orchestration. In this keynote, core Airflow contributors will walk through key enhancements that boost flexibility, efficiency, and user experience.

Vikram Koka will kick things off with an overview of Airflow 3, followed by deep dives into DAG versioning (Jed Cunningham), enhanced backfilling (Daniel Standish), and a modernized UI (Brent Bovenzi & Pierre Jeambrun).

Next, Kaxil Naik and Amogh Desai will introduce the Task Execution Interface and Task SDK, enabling tasks in any environment and language. Jens Scheffler will showcase the Edge Executor, while Constance Martineau, Tzu-ping Chung, and Vincent Beck will demo event-driven scheduling and data assets. Finally, Buğra Öztürk will unveil CLI enhancements for automation and debugging.

This keynote sets the stage for Airflow 3—don’t miss the chance to learn from the experts shaping the future of workflow orchestration!

Kaxil Naik

October 10, 2025

Transcript

  1. • Please record all demos using 3.1 • Please use

    light mode for demos and screenshots • Use aspect ratio 16:9 for all demo recordings (reference photo in next slide)
  2. Publish RCs Mar 31 Beta complete Community buy-in and definition

    Development Feature iteration Beta Release candidates AIP voting Aug 1 Dev complete Jan 1 June July Aug Sept Oct Nov Dec Jan Feb Mar Airflow 3.0 timeline (dated Sep 2024)
  3. Airflow 3.0 released April 22nd, 2025! 330+ developers worldwide, 500+

    commits / month 13 AIPs (Airflow Improvement Proposals) Already, the third most downloaded Airflow version!
  4. What does this mean for you? Better user experience -

    Dag Versioning, New UI, Backfills Improved security posture - Task Isolation, Updated Architecture Run tasks anywhere, at any time and in any language - Data Assets, Event driven
  5. Airflow 3.1 released Sept 25, 2025! 1400+ commits into 3.1

    (after 3.0 163 developers Accelerated development on the 3.0 foundation!
  6. What does Airflow 3.1 bring? Improved User Experience - UI

    plugins, Translations 17 so far) Expanded Event Driven Scheduling - Supports: SQS, Apache Kafka, Redis, Azure WIP AI enablement - Inference Execution - Human-in-the-loop! Improved authoring - Deadline Alerts
  7. Airflow release timeline (years on the axis: 2014, 2015, 2019, 2020, 2021, 2024, 2025)

    0.1: Creation, Airflow Started. 1.0. 1.10.2: Apache Top Level Project. 2.0: Enterprise, Production-Ready (HA Schedulers, Fully specified REST API, TaskFlow API). 2.1 to 2.10: Efficiency & Ease of Use (Async Operators, Dynamic Tasks, Setup & Tear-down, Airflow ObjectStore); Data Awareness (Data-aware scheduling, Conditional scheduling, Combined dataset + time scheduling, Dataset Event API). 3.0: Easier to Use (Dag Versioning, UI modernization, Backfills at Scale); Improved Security (Better Security: Task Isolation); Run anywhere, at any time, in any language (Remote Execution, Event Driven Scheduling, Data Assets, Python TaskSDK). 3.1 (2025): Easier to Use (UI Plugins, Translations); AI enablement (Human-in-the-Loop, Inference Execution); Dag authoring (Deadline alerts); Run in any language (Golang TaskSDK, Java TaskSDK WIP).
  8. 30.6% of users with 5 or more years of experience

    use Airflow for MLOps, 13.3% for GenAI Use Cases Insight from the State of Airflow survey
  9. Airflow is a “core technology” to a growing range of

    engineers and personas Data Engineers ML Engineers AI Engineers
  10. Summit sessions reflect this Around a third of use case

    sessions about AI Enterprise Financial Healthcare Consumer Tech Gov IBM / Red Hat Financial Times Kaiser Permanente Pinterest King / Microsoft INTRVL (voting) SAP Weav iKang Duolingo Foundational AI companies: Note: Only a representative selection shown above
  11. Integration Svcs (Java,..) Airflow ecosystem Expanded with Airflow 3 Message

    Bus (Kafka,..) Integration Svcs (Java,..) Message Bus (Kafka,..) Airflow API UI / CLI Ext data apps Enterprise apps Ext data apps Enterprise apps UI Plugins, Inference Execution Human in the Loop Data Engineers ML Engineers AI Engineers App Engineers Business Users
  12. UX features Slides intro Slide on API + React being

    a foundation - (Pierre - 3 mins) UX features via demo (Brent - 10 mins) Demo new features Released since Airflow 3.0 in 3.1: Internationalization, Gantt, Calendar (Brent - 2 mins) and Plugins (Pierre - 5 mins) Dag Versioning - Jed Cunningham Backfills - Daniel Standish
  13. Whole New Stack The entire API and UI were rebuilt

    for Airflow 3. FastAPI & React Thanks to our amazing community! Pierre Jeambrun Staff Software Engineer Astronomer
  14. FastAPI Backend Reliable, modern, easy to maintain Remove dependencies (Flask

    AppBuilder, Connexion) Code first API design Inherit FastAPI features (automatic documentation, data validation, native async and more)
  15. React UI Single Page Application, responsive Great React ecosystem (community,

    tools and packages) Flexible Deployment Opening opportunities for internationalization and more complex features
  16. Plugins Backward compatibility layer for Airflow 2 plugins in Airflow

    3 (requires fab provider) Have access to the powerful FastAPI interface via sub applications and middlewares Extend the UI through External Views and React Applications registration Pierre Jeambrun Staff Software Engineer Astronomer
  17. Plugins Takeaways Airflow 2 plugins are still supported, not a

    blocker to upgrade to Airflow 3 Flexible Interface Can’t wait to see what the community builds!
  18. Bundle Versions + DagRuns Dag code changes in git Run

    1  Sleep Run 1  Hello Seen by Airflow Run 2  Sleep Run 2  Hello New V2 Bundle V1 Bundle
  19. What is backfill in Airflow? Re-run tasks over dates Why:

    ◦ data correction ◦ new pipelines ◦ large data migrations
  20. Goals for Airflow 3 Make backfills easier to run and

    observe Give users more control over how they are run
  21. Backfills in Airflow 3 CLI Scheduler API UI • Runs

    managed by core scheduler • Feature parity across all interfaces
  22. Backfills in Airflow 3: wrap up Backfills are easier to

    use More control over execution Feature parity across UI / API / CLI
  23. 3.0 Run Tasks Anywhere: Task Isolation Kaxil Naik Sr Director

    of Engineering @ Astronomer Airflow Committer & PMC Member Amogh Desai Sr Software Engineer @ Astronomer Airflow Committer & PMC Member
  24. Airflow 2.x: Everything in one place Scheduler(s) Web server Executor

    Airflow metadata database Dag processor(s) Triggerer(s) Worker(s) Dag files Triggers Direct access to the Airflow metadata database for all components
  25. Airflow 2.x: Everything in one place Scheduler(s) Web server Executor

    Airflow metadata database Dag processor(s) Triggerer(s) Worker(s) Dag files Triggers Direct access to the Airflow metadata database for all components ✂
  26. But What About Dag Processor & Triggerer? Scheduler(s) Web server

    Executor Airflow metadata database Dag processor(s) Triggerer(s) Worker(s) Dag files Triggers Scheduler(s) API server Executor Airflow metadata database Dag processor(s) Triggerer(s) Worker(s) Dag files Triggers Database access
  27. Airflow 3 Architecture: Completing the Isolation Scheduler(s) API server Executor

    Airflow metadata database Dag processor(s) Triggerer(s) Worker(s) Dag code Triggers Core app Execution app In-process API server In-process API server User-defined code does not have direct access to the metadata database anymore Task execution interface (Task SDK)
  28. Scheduler(s) API server Executor Airflow metadata database Dag processor(s) Triggerer(s)

    Worker(s) Dag code Triggers Core app Execution app In-process API server In-process API server Airflow Server Task execution interface (Task SDK) Airflow Client
  29. Enables - Independent Upgrades Server Airflow 3.2 UI / APIServer

    Scheduler Client Task SDK 1.1 - Remote or Local Worker Worker Workers Triggerer Processor Client Task SDK 1.2 - Remote or Local Worker Worker Workers Triggerer Processor
  30. 3.0 Run Tasks Anywhere: Edge Executor Jens Scheffler Cluster Technical

    Architect @ Bosch Airflow Committer & PMC Member
  31. Edge Executor - The “Very” Remote Database Celery Worker (Pool)

    API Server Scheduler/Dag Parser Kubernetes (optional) Firewall/ AppGW/ Ingress HTTPS User Airflow Main Deployment/Cloud Corporate Network Proxy Edge Worker Client Client Client Edge Worker On-Prem System Remote Site
  32. From Time-Driven to Data-Driven Stop waiting for midnight. Start reacting

    in (near) real time. Time-Driven Data-Driven Run on a schedule (Cron jobs) Run on events (triggers & data arrivals)
  33. Refresh Tableau Update Executive Dashboard Dag Check for Daily Snapshot

    (sensor) Ingest Sales Events Transform & Validate Publish Daily Snapshot Produce Sales Daily Snapshot Dag Poking Produces Sales Daily Snapshot Assets: Orchestrate Around Data
  34. Sales Daily Snapshot Refresh Tableau Update Executive Dashboard Dag Check

    for Daily Snapshot (sensor) Ingest Sales Events Transform & Validate Publish Daily Snapshot Produce Sales Daily Snapshot Dag Annotate Asset Sales Daily Snapshot Subscribe (replaces sensor) Assets: Orchestrate Around Data
  35. Assets: Orchestrate Around Data Asset as the focus, not the

    process Equivalent (and matter of personal preference)
  36. Event-Driven Scheduling (Asset Watchers) Message Queue Sales Events Asset: Sales

    Events Dag: Realtime Sales Watcher: Sales Events Queue Queue → Watcher (polls for new events) → Asset → Dag
  37. Inference Execution: The Old Way Predict Customer Churn Dag Dag

    run with logical date = ‘2025-01-01’ Dag run with logical date = ‘2025-01-02’ Dag run with logical date = ‘2025-01-03’ Great for backfills. Painful for live inference Traditional Dags have fixed Logical Dates
  38. Inference Execution: The New Way Predict Customer Churn Dag Dag

    Run ID = req123 Dag Run ID = req124 Dag Run ID = req125 Response on Completion API Response Example User/API Request Synchronous Execution: Run → Predict → Respond, all in one flow
  39. What’s Next: Expanding the Asset Model Partitions & Watermarks →

    Define evolving assets by day, hour or key; track freshness and history Targeted Updates → Refresh only what’s new or stale, instead of reprocessing everything Asset Validations → Validation and status surfaced directly in the UI At-a-Glance Visibility → Validation and status surfaced directly in the UI End-to-End Event Tracking → Trace events from message queue → asset → downstream dags
  40. 3.1 Run Tasks in any Language Ash Berlin-Taylor Member

    of Apache Airflow PMC; and Director, Airflow Engineering Astronomer
  41. Tasks in any language: Goal Enable integrations with applications -

    Pull / push data from/to existing apps - Expand Airflow's role as “integration platform” - Enable non-Python teams to leverage Airflow First Language: Golang, Java to follow - Stepping stone for compiled language support Non-goal at this time: - Writing entire workflows i.e. Dags in other languages
  42. Introducing Go SDK: 1.0.0-beta1 Airflow Tasks in Go Full power

    of Airflow - Connections, Variables, XCom and more Feels like natural part of Go ecosystem
  43. Airflow 3 is already being widely used Summit sessions include:

    Uber, Qualcomm, and DataDog Hundreds of Enterprise deployments running Airflow 3 Most have taken: Upgrade, then expand approach
  44. Airflow 3: Foundation for more contributions Rewritten React-based UI -

    UI plugins Airflow as a platform) - Translations, expanded contributors Data Assets: - Watchers / Common Msg Interface: Event Driven Scheduling Task SDK in additional languages Around 3,500 contributors now
  45. Integration Svcs (Java,..) Airflow ecosystem Expanded with Airflow 3 Message

    Bus (Kafka,..) Integration Svcs (Java,..) Message Bus (Kafka,..) Airflow API UI / CLI Ext data apps Enterprise apps Ext data apps Enterprise apps UI Plugins, Inference Execution Human in the Loop Data Engineers ML Engineers AI Engineers App Engineers Business Users
  46. Upgrading to Airflow 3 Min Versions: - Airflow 2.7 and

    Python 3.10 - Database cleanup: Can save a lot of time! Pre-upgrade: use the checks - Dag Code: ruff check dags/ --select AIR3 - For Airflow config, use airflow config update from Airflow 2.11 Upgrade: - Database upgrade - Startup scripts: web server to api server Note: Upgrade guide at Upgrading to Airflow 3
  47. TODO: Airflow 3 - Expert opinions Best practices: - Use

    Assets and Tasks together for best results. Use Assets as the interface between Airflows, rather than using “Trigger Dag”. - Use common.abc interfaces whenever possible Anti-patterns: - Storing credentials in Airflow db, use Secrets back-end instead - Using the Airflow metadata database for large data sharing between Tasks. XComs are for passing small data, ideally references
  48. Tues Oct 7 • 11:30AM Security made us do it:

    Airflow’s new Task Execution Architecture Columbia A Dive Deeper into Airflow 3! Tues Oct 7 • 12:15PM Airflow That Remembers: The Dag Versioning Era is here! Columbia A Tues Oct 7 • 2:00PM Unlocking Event-Driven Scheduling in Airflow 3: A New Era of Reactive Data Pipelines Columbia A Tues Oct 7 • 4:15PM EdgeExecutor / Edge Worker - The new option to run anywhere Columbia C Wed Oct 8 • 12:30PM Airflow 3 UI is not enough? Add a Plugin! Columbia A Weds Oct 8 • 3:45PM Assets: Past, Present, Future Columbia A
  49. Wed Oct 8 • 10:30AM Get started with Airflow 3.0

    Workshop Room 301 Dive Deeper into Airflow 3! Wed Oct 8 • 12:30PM Airflow 3 UI is not enough? Add a Plugin! Columbia A Weds Oct 8 • 3:45PM Assets: Past, Present, Future Columbia A
  50. Fill it out for a free Airflow 3 Fundamentals or

    Dag Authoring in Airflow 3 certification code The 2025 Apache Airflow® Survey is here!
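
As a rough illustration of the asset slides (33 to 36), the sketch below models the two flows they describe: a Dag that subscribes to an asset instead of polling with a sensor, and the Queue → Watcher → Asset → Dag path for event-driven scheduling. It is a toy model in plain Python, not Airflow code. `ToyScheduler`, `drain`, and all Dag and asset names are hypothetical and chosen only to mirror the slide labels; the real watcher and asset interfaces are documented by the Airflow project.

```python
from collections import deque


class ToyScheduler:
    """Hypothetical stand-in for a scheduler; this is not an Airflow class."""

    def __init__(self):
        self.subscriptions = {}  # asset name -> list of subscribed Dag ids
        self.dag_runs = []       # (dag_id, asset_name, payload), in creation order

    def subscribe(self, dag_id, asset_name):
        # A Dag declares interest in an asset instead of polling with a sensor.
        self.subscriptions.setdefault(asset_name, []).append(dag_id)

    def asset_updated(self, asset_name, payload=None):
        # Each asset event creates one run for every subscribed Dag.
        for dag_id in self.subscriptions.get(asset_name, []):
            self.dag_runs.append((dag_id, asset_name, payload))


def drain(queue, asset_name, scheduler):
    """Toy watcher: turn each queued message into one asset event."""
    handled = 0
    while queue:
        scheduler.asset_updated(asset_name, queue.popleft())
        handled += 1
    return handled


scheduler = ToyScheduler()
scheduler.subscribe("update_executive_dashboard", "sales_daily_snapshot")
scheduler.subscribe("realtime_sales", "sales_events")

# Slides 33-34: the producer Dag publishes the snapshot, so the subscriber runs.
scheduler.asset_updated("sales_daily_snapshot")

# Slide 36: messages arrive on a queue and the watcher emits asset events.
sales_queue = deque([{"order_id": 1}, {"order_id": 2}])
drain(sales_queue, "sales_events", scheduler)

for run in scheduler.dag_runs:
    print(run)
```

In this model the dashboard Dag runs once when the snapshot is published, and the realtime Dag runs once per queued message, which is the contrast between the data-driven and time-driven columns on slide 32.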