Saturday, April 21, 2012

The art of crontab jobs monitoring

In a typical production or development environment, servers run many crontab jobs. The jobs can be part of deployed applications or can perform system administration tasks such as backups and reporting. This post describes several key points to consider when configuring crontab jobs in a large environment.

Automatic deployment of crontab configuration
Linux distributions provide a convenient way to deploy new crontab jobs automatically, whether you are installing a software package or using a centralized configuration management system like Puppet or Chef. You simply drop a properly formatted crontab configuration file into the /etc/cron.d/ directory, and the running cron daemon will automatically read the new file and schedule the jobs specified in it.
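
As a rough illustration of what such a drop-in file looks like, here is a minimal shell sketch that writes a one-line job file. The function name install_cron_job, the file name myapp-report, the account appuser and the script path are all hypothetical. Note that entries in /etc/cron.d/ carry an extra user field that per-user crontabs do not have, and that on Debian-based systems the file name should contain only letters, digits, underscores and hyphens.

```shell
#!/bin/sh
# Sketch only: every name and path used here is a hypothetical example.
# Usage: install_cron_job <target_dir> <file_name> <schedule> <user> <command>
install_cron_job() {
    target_dir=$1
    file_name=$2
    schedule=$3
    run_as=$4
    job_cmd=$5

    # Line format in cron.d files: five schedule fields, the user to run as,
    # then the command.
    printf '%s %s %s\n' "$schedule" "$run_as" "$job_cmd" > "$target_dir/$file_name"

    # The file must not be writable by group or others.
    chmod 644 "$target_dir/$file_name"
}

# Example call (needs root on a real server):
# install_cron_job /etc/cron.d myapp-report '15 3 * * *' appuser /usr/local/bin/daily-report.sh
```

In practice a package or a Puppet/Chef recipe would ship the file directly; the function only shows the expected content and permissions.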

Configuration of email subsystem
To properly monitor the execution of deployed crontab jobs (as described below), it is highly recommended to configure an email subsystem on each deployed server, so that the cron daemon can email the output produced on each job run to the script owner (the user account the script runs under). A standard Postfix or Sendmail MTA, available in all Linux distributions, will do the job just fine.

It might be wise to redirect email for the user accounts used to run the crontab jobs to a centralized sysadmin email account, so that the generated emails do not pile up in unmonitored local mailboxes.
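
One possible way to set up such a redirection is an entry in the MTA aliases file. The sketch below assumes the aliases file lives at /etc/aliases (the location varies between distributions) and uses a hypothetical account appuser and a placeholder address sysadmin@example.com; the helper name add_mail_alias is also made up for this example.

```shell
#!/bin/sh
# Sketch only: account name, address and file location are assumptions.
# Usage: add_mail_alias <aliases_file> <local_user> <destination>
add_mail_alias() {
    aliases_file=$1
    local_user=$2
    destination=$3

    # Leave the file alone if the account already has an alias line.
    if grep -q "^${local_user}:" "$aliases_file" 2>/dev/null; then
        return 0
    fi

    printf '%s: %s\n' "$local_user" "$destination" >> "$aliases_file"
}

# Example call (as root), followed by rebuilding the alias database:
# add_mail_alias /etc/aliases appuser sysadmin@example.com && newaliases
```

Both Postfix and Sendmail provide the newaliases command; the redirection takes effect only after it has been run.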

Output of crontab jobs
In many cases deployed crontab scripts produce some text output (such as monitoring or error messages), and I recommend the following approach to managing these messages:
  1. All error messages can be sent directly to stderr (standard error), and the cron daemon will automatically email them to the script owner at the end of each run. Another option (when available) is to log the error messages using the local syslog facility, and monitor the events using tools like SEC or logcheck.
  2. All script debug messages (when debug mode is activated using a script parameter) can be sent to standard output, to be emailed at the end of each script run.
  3. When it is important to see the script's regular output AND the script runs infrequently (such as once a day), it is reasonable to send the script's activity messages to standard output (and receive the output by email). Examples of such scripts are daily backup or reporting tools.
  4. For frequently running jobs it is really annoying to be emailed the script output on every run. In some cases it is acceptable to silence the script's standard output by redirecting it to /dev/null. Another option is to log the messages using the local syslog facility, and monitor them using a tool like logcheck.
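
The list above can be sketched as a few small shell helpers. This is only an illustration: the function names and the DEBUG variable (standing in for the debug script parameter) are my own assumptions, and the crontab lines in the comments use hypothetical paths and accounts.

```shell
#!/bin/sh
# Sketch only: helper names, DEBUG switch, paths and accounts are hypothetical.
DEBUG=${DEBUG:-0}

# Errors go to stderr, so cron still mails them when stdout is discarded.
log_error() {
    printf 'ERROR: %s\n' "$*" >&2
}

# Debug messages go to stdout, and only when debug mode is switched on.
log_debug() {
    if [ "$DEBUG" = 1 ]; then
        printf 'DEBUG: %s\n' "$*"
    fi
}

# Regular activity messages go to stdout.
log_info() {
    printf '%s\n' "$*"
}

# Matching cron.d entries:
#   daily job, full output mailed to the owner:
#     15 3 * * * appuser /usr/local/bin/daily-backup.sh
#   frequent job, stdout discarded, errors still mailed:
#     */5 * * * * appuser /usr/local/bin/check.sh >/dev/null
#   frequent job, everything sent to syslog instead of mail:
#     */5 * * * * appuser /usr/local/bin/check.sh 2>&1 | logger -t check
```

The key point is the split between the two streams: redirecting only standard output keeps the mailbox quiet while a failure still produces an email.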

Cron jobs overrun protection
I highly recommend configuring overrun protection for cron jobs using a tool like lockrun. The mechanism is especially useful when a job runs at high frequency (such as every one or five minutes), and there is a requirement that only one instance of the script runs at a time.
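
To show the idea, here is a minimal sketch that uses flock(1) from util-linux rather than lockrun itself (a different tool, swapped in only because it is commonly preinstalled). The wrapper name run_exclusive and the lock file and script paths are hypothetical.

```shell
#!/bin/sh
# Sketch only: uses flock(1), not lockrun; names and paths are hypothetical.
# Usage: run_exclusive <lock_file> <command> [args...]
run_exclusive() {
    lock_file=$1
    shift

    # -n makes flock give up immediately (non-zero exit) when another
    # instance already holds the lock, instead of queueing behind it.
    flock -n "$lock_file" "$@"
}

# Equivalent cron.d entry, calling flock directly:
#   * * * * * appuser flock -n /var/lock/myjob.lock /usr/local/bin/myjob.sh
```

Because the lock is held by the kernel for the lifetime of the process, it is released automatically even if the job crashes, so there are no stale lock files to clean up. The lock file location must be writable by the account running the job.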

Implementing the simple approaches described above will allow you to deploy a manageable and scalable set of regular crontab jobs.
