Noam Elfanbaum: Hello Airflow, Farewell Cronjobs

Session language – English

Target audience – Developers, Data Scientists, R&D

At Bluevine we use Airflow to drive all of our "offline" processing. In this talk, I'll present the challenges and gains we had in transitioning from a single server running Python scripts with cron to a full-blown Airflow setup.

At Bluevine, we were looking to upgrade our backend processing infrastructure from a server running Python scripts with cron to a more scalable solution that allows for workflows (DAGs) and better observability of the application state. Airflow proved to be a valuable tool, though not without some sharp edges. Some of the points that I'll cover are:

  • Supporting multiple Python versions
  • Event-driven DAGs
  • Airflow performance issues and how we circumvented them
  • Building Airflow plugins to enhance observability
  • Monitoring Airflow using Grafana
  • CI for Airflow DAGs (super useful!)
  • Patching the Airflow scheduler
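
To make the cron-versus-workflow contrast concrete: cron can only say when a script starts, while a workflow (DAG) says which task must finish before another begins. The snippet below is a minimal sketch of that idea in plain Python using the standard library's `graphlib`. It is not Airflow's API and not the setup described in the talk; the task names and the `run_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each key maps to the set of tasks it depends on.
# With cron, this ordering would have to be faked with staggered start times.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}


def run_order(deps):
    """Return an execution order in which every task follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)
```

A workflow engine such as Airflow adds scheduling, retries, and state tracking on top of this kind of dependency graph, which is where the observability gains mentioned above come from.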