
Issues Running Airflow Scheduler As A Daemon Process

I have an EC2 instance that is running Airflow 1.8.0 using the LocalExecutor. Per the docs, I would have expected that one of the following two commands would have started the scheduler as a daemon process:

Solution 1:

The documentation might be outdated.

I normally start Airflow as follows:

airflow kerberos -D
airflow scheduler -D
airflow webserver -D

Here's the airflow webserver --help output (from version 1.8):

-D, --daemon Daemonize instead of running in the foreground

Notice that the flag does not take a boolean value. The documentation has to be fixed.
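As a quick sanity check after starting the daemons, you can confirm they are alive via the pid files they drop under $AIRFLOW_HOME. This is a sketch assuming the default 1.8 layout; the scheduler's pid file name is documented below, the others may differ on your install:

ls $AIRFLOW_HOME/*.pid
# Verify the process behind the scheduler's pid file is still running
ps -p $(cat $AIRFLOW_HOME/airflow-scheduler.pid)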

Quick note in case airflow scheduler -D fails:

This is covered in the comments, but it seems worth mentioning here. When you run the airflow scheduler, it creates the file $AIRFLOW_HOME/airflow-scheduler.pid. If you try to re-run the scheduler daemon while that file exists, it will almost certainly produce the file $AIRFLOW_HOME/airflow-scheduler.err, which will tell you that lockfile.AlreadyLocked: /home/ubuntu/airflow/airflow-scheduler.pid is already locked. If your scheduler daemon is indeed out of commission and you find yourself needing to restart it, execute the following commands:

sudo rm $AIRFLOW_HOME/airflow-scheduler.err $AIRFLOW_HOME/airflow-scheduler.pid
airflow scheduler -D 

This got my scheduler back on track.
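If you want to be safe, check that the old scheduler process is really dead before deleting the lock files. A minimal sketch, assuming the default pid-file location:

PID_FILE=$AIRFLOW_HOME/airflow-scheduler.pid
# Only clean up if no process with the recorded pid is still running
if [ -f "$PID_FILE" ] && ! ps -p "$(cat "$PID_FILE")" > /dev/null 2>&1; then
    rm -f "$PID_FILE" "$AIRFLOW_HOME/airflow-scheduler.err"
    airflow scheduler -D
fi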


Solution 2:

About starting tasks via systemd:

I had a problem where the PATH variable, when Airflow is run this way, is initially empty. That is, when you write in the file /etc/sysconfig/airflow:

PATH=/home/ubuntu/bin:/home/ubuntu/.local/bin:$PATH

you literally write:

PATH=/home/ubuntu/bin:/home/ubuntu/.local/bin

Thus, the PATH variable doesn't contain /bin, the directory holding the bash utility that LocalExecutor uses to run tasks.

I also do not understand why AIRFLOW_HOME is not specified in this file, i.e., the directory in which Airflow looks for its configuration file.
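For reference, here is a sketch of what /etc/sysconfig/airflow could look like with both issues addressed; the exact paths are assumptions and depend on your install. Since systemd does not shell-expand EnvironmentFile entries, $PATH cannot be referenced and the full value must be spelled out:

# /etc/sysconfig/airflow -- sketch; adjust paths to your environment.
# No variable expansion happens here, so list every directory explicitly,
# including /bin so that LocalExecutor can find bash.
PATH=/home/ubuntu/bin:/home/ubuntu/.local/bin:/usr/local/bin:/usr/bin:/bin
# Tell Airflow where to find airflow.cfg
AIRFLOW_HOME=/home/ubuntu/airflow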

