| | Luigi | Airflow | Pinball |
|---|---|---|---|
| repo | Github.com/spotify/lui… | Github.com/airbnb/airf… | Github.com/pinterest/p… |
| docs | luigi.readthedocs.org | airflow.readthedocs.org | none |
| review | bytepawn.com/luigi.html | Bytepawn.com/airflow.htm… | Bytepawn.com/pinball.htm… |
| github forks | 750 | 345 | 58 |
| github stars | 4029 | 1798 | 506 |
| github watchers | 319 | 166 | 47 |
| commits in last 30 days | lots of commits | lots of commits | 3 commits |
| architecture | | | |
| web dashboard | not really, minimal | very nice | yes |
| code/dsl | code | code | python dict + python code |
| files/datasets | yes, targets | not really, as special tasks | ? |
| calendar scheduling | no, use cron | yes, LocalScheduler | yes |
| datadoc’able [1] | maybe, doesn’t really fit | probably, by convention | yes, dicts would be easy to parse |
| backfill jobs | yes | yes | ? |
| persists state | kind of | yes, to db | yes, to db |
| tracks history | yes | yes, in db | yes, in db |
| code shipping | no | yes, pickle | workflow is shipped using pickle, jobs are not? |
| priorities | yes | yes | ? |
| parallelism | yes, workers, threads per worker | yes, workers | ? |
| control parallelism | yes, resources | yes, pools | ? |
| cross-dag deps | yes, using targets | yes, using sensors | yes |
| finds new deployed tasks | no | yes | ? |
| executes dag | no, have to create special sink task | yes | yes |
| multiple dags | no, just one | yes, also several dag instances (dagruns) | yes |
| scheduler/workers | | | |
| starting workers | users start worker processes | scheduler spawns worker processes | users start worker processes |
| comms | scheduler’s HTTP API | minimal, in state db | through master module using Swift |
| workers execute | worker can execute tasks that it has locally | worker reads pickled tasks from db | worker can execute tasks that it has locally? |
| contrib | | | |
| hadoop | yes | yes | yes |
| pig | yes | doc mentions PigOperator, it’s not in the source | no |
| hive | yes | yes | yes |
| pgsql | yes | yes | no |
| mysql | yes | yes | no |
| redshift | yes | no | no |
| s3 | yes | yes | yes |
| source | | | |
| written in | python | python | python |
| loc | 18000 | 21000 | 18000 |
| tests | lots | minimal | lots |
| maturity | fair | low | low |
| other serious users | yes | not really | no |
| pip install | yes | yes | broken |
| niceties | | sla, xcom, variables, trigger rules, celery, charts | pass data between jobs |
| does it for you | | | |
| sync tasks to workers | no | yes | no |
| scheduling | no | yes | yes |
| monitoring | no | no | no |
| alerting | no | slas, but probably not enough | sends emails |
| dashboards | no | yes | yes |
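
The "executes dag" row says that Luigi needs a special sink task. A rough sketch of why, in plain Python rather than Luigi's actual API: a scheduler that is asked for one task runs only that task and whatever it requires, so running the whole graph means adding one extra task that requires every leaf. The task names, the `REQUIRES` dict, and the helper functions below are all hypothetical, made up for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical task graph: each task maps to the tasks it requires.
REQUIRES = {
    "extract": [],
    "clean": ["extract"],
    "report_a": ["clean"],
    "report_b": ["clean"],
}


def upstream(task, requires):
    """Return the task plus everything it transitively requires."""
    seen = set()
    stack = [task]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(requires[current])
    return seen


def run_order(task, requires):
    """Order in which a requested task and its dependencies would run."""
    needed = upstream(task, requires)
    graph = {name: list(requires[name]) for name in needed}
    # static_order() yields dependencies before the tasks that need them.
    return list(TopologicalSorter(graph).static_order())


def with_sink(requires, sink="all_reports"):
    """Add a sink task that requires every task nothing else depends on."""
    depended_on = {dep for deps in requires.values() for dep in deps}
    leaves = [name for name in requires if name not in depended_on]
    return {**requires, sink: leaves}
```

Asking for `run_order("report_a", REQUIRES)` leaves `report_b` out, while `run_order("all_reports", with_sink(REQUIRES))` covers every task. In Luigi itself the sink would be an ordinary task whose requirements list the leaf tasks.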