
How Can I Log From My Python Application To Splunk, If I Use Celery As My Task Scheduler?

I have a Python script running on a server that should be executed once a day by the Celery scheduler. I want to send my logs directly from the script to Splunk. I am trying to u

Solution 1:

After hours of figuring out what could be wrong with my code, I now have a result that satisfies me. First, I created a file loggingsetup.py where I configured my Python loggers with dictConfig:

import os # needed for os.getenv below

LOGGING = {
    'version': 1,
    'disable_existing_loggers': True,
    'formatters': { # Sets up the format of the logging output
        'simple': {
            'format': '%(asctime)s - %(name)s - %(levelname)s - %(message)s',
             'datefmt': '%y %b %d, %H:%M:%S',
            },
        },
    'filters': {
        'filterForSplunk': { # custom logging filter: drops records whose logger name contains 'celery', so they are not sent to Splunk
            '()': 'loggingsetup.RemoveCeleryLogs', # filter class defined at the top of this file
            'logsToSkip': 'celery' # substring that is filtered out
        },
    },
    'handlers': {
        'splunk': { # handler for Splunk, level WARNING, so not too many logs are sent to Splunk
            'level': 'WARNING',
            'class': 'splunk_logging_handler.SplunkLoggingHandler',
            'url': os.getenv('SPLUNK_HTTP_COLLECTOR_URL'),
            'splunk_key': os.getenv('SPLUNK_TOKEN'),
            'splunk_index': os.getenv('SPLUNK_INDEX'),
            'formatter': 'simple',
            'filters': ['filterForSplunk']
        },
        'console': { 
            'level': 'DEBUG',
            'class': 'logging.StreamHandler',
            'stream': 'ext://sys.stdout',
            'formatter': 'simple',
        },
    },
    'loggers': { # the logger, root is used
        '': {
            'handlers': ['console', 'splunk'],
            'level': 'DEBUG',
            'propagate': False, # must be the boolean False (the string 'False' is truthy); does not pass records on to ancestor loggers
        }
    }
}

For the logging filter, I had to create a class that inherits from logging.Filter. The class also lives in the file loggingsetup.py:

class RemoveCeleryLogs(logging.Filter): # custom class to filter out celery logs (so they are not sent to Splunk)
    def __init__(self, logsToSkip=None):
        super().__init__() # initialize the base Filter class
        self.logsToSkip = logsToSkip

    def filter(self, record): # allow the record unless its logger name contains the configured substring
        if self.logsToSkip is None:
            return True
        return self.logsToSkip not in record.name
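The filter can be sanity-checked in isolation by handing it hand-built LogRecords. A small standalone demo (the logger names below are just examples, and the class is repeated so the snippet runs on its own):

```python
import logging

# Same filter class as above, repeated so this snippet runs standalone.
class RemoveCeleryLogs(logging.Filter):
    def __init__(self, logsToSkip=None):
        super().__init__()
        self.logsToSkip = logsToSkip

    def filter(self, record):
        if self.logsToSkip is None:
            return True
        return self.logsToSkip not in record.name

f = RemoveCeleryLogs(logsToSkip='celery')

# Build two bare LogRecords by hand; the logger names are illustrative.
app_record = logging.LogRecord('myapp.tasks', logging.WARNING, __file__, 1,
                               'task failed', None, None)
celery_record = logging.LogRecord('celery.worker.strategy', logging.INFO, __file__, 1,
                                  'received task', None, None)

print(f.filter(app_record))     # True  -> record would reach the Splunk handler
print(f.filter(celery_record))  # False -> record is dropped by the filter
```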

After that, you can configure the loggers like this:

import logging.config

logging.config.dictConfig(loggingsetup.LOGGING)
logger = logging.getLogger('') # the root logger configured above

And because Celery redirected its logs and every log line showed up twice, I had to update the app.conf:

app.conf.update({
    'worker_hijack_root_logger': False, # so celery does not replace the root logger's handlers
    'worker_redirect_stdouts': False, # so celery does not redirect stdout/stderr into its own logs
})
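If the two flags alone do not stop Celery from touching the loggers, connecting a receiver to Celery's setup_logging signal is another common route: once any receiver is connected, Celery skips its own logging setup entirely. A sketch under that assumption (the broker URL is a placeholder):

```python
from celery import Celery
from celery.signals import setup_logging

import loggingsetup  # the module with the LOGGING dict from above

app = Celery('tasks', broker='redis://localhost:6379/0')  # placeholder broker URL

@setup_logging.connect
def configure_logging(**kwargs):
    # Celery calls this instead of configuring logging itself.
    import logging.config
    logging.config.dictConfig(loggingsetup.LOGGING)
```

This is a config fragment that only takes effect inside a running Celery worker, so it is not something to execute standalone.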

The next problem I was facing was that my chosen Splunk logging library mangled the URL. So I had to create my own Splunk handler class that inherits from logging.Handler. The important lines are the following:

auth_header = {'Authorization': 'Splunk {0}'.format(self.splunk_key)}
json_message = {"index": str(self.splunk_index), "event": data}
r = requests.post(self.url, headers=auth_header, json=json_message)
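For completeness, here is a minimal sketch of what such a handler class could look like around those lines. The class and parameter names are my assumptions, and the `post` argument is only a test seam that defaults to requests.post:

```python
import logging

class SplunkLoggingHandler(logging.Handler):
    """Minimal sketch of a Splunk HTTP Event Collector handler (names are assumptions)."""

    def __init__(self, url, splunk_key, splunk_index, post=None):
        super().__init__()
        self.url = url
        self.splunk_key = splunk_key
        self.splunk_index = splunk_index
        # 'post' is injectable for testing; by default use requests.post.
        if post is None:
            import requests  # imported lazily so the sketch stays self-contained
            post = requests.post
        self._post = post

    def emit(self, record):
        try:
            data = self.format(record)  # apply the configured formatter
            auth_header = {'Authorization': 'Splunk {0}'.format(self.splunk_key)}
            json_message = {'index': str(self.splunk_index), 'event': data}
            self._post(self.url, headers=auth_header, json=json_message)
        except Exception:
            self.handleError(record)  # never let logging crash the application
```

Wrapping the send in emit() and routing failures through handleError() keeps a Splunk outage from taking the task down with it.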

I hope this answer helps someone who is facing similar problems with Python, Splunk, and Celery logging! :)
