Prevent cronjobs from overlapping in Linux

It’s an unfortunately common problem on many systems: you have scheduled tasks defined as cronjobs and, for some reason, they take longer to execute than anticipated. Eventually they start to overlap and run at the same time. If those cronjobs act on the same data in a database, the result may be data corruption.

If they’re doing heavy processing of data, the server load may rise too high. Because of that high load, the cronjobs take longer than usual, and before you know it there’s a vicious circle in which cronjobs keep launching and overlapping each other.

Obviously, you don’t want this. The good news is, this is fairly easy to prevent.

Using flock

flock is a handy tool for managing lock files. A lock file is used to determine whether a script or application is already running (comparable to a PID file that contains the Process ID of the running script). If the lock is already held, the cronjob won’t start. If the lock is free, it’s safe to launch the cronjob.

Take the following common example, where a cronjob runs every minute on the server.

$ crontab -l
* * * * * /usr/bin/php /path/to/cron.php

If the script takes longer than a minute to execute, consecutive runs will begin to overlap. To prevent that, you can wrap the command in flock, as in the example below.

$ crontab -l
* * * * * /usr/bin/flock -w 0 /path/to/cron.lock /usr/bin/php /path/to/cron.php

The example above requires flock to manage those lock files. If it is not yet installed on your system, installation should be as simple as yum install util-linux or apt-get install util-linux, depending on your Linux distribution (see: how to find your current Linux distribution).

The moment flock starts, it locks the lock file you specify in the command. You can verify that by asking which user and process are holding that file.

$ fuser -v /path/to/cron.lock
                     USER        PID ACCESS COMMAND
cron.lock:           root       7836 f.... flock
                     root       7837 f.... php

It will show you the Process IDs (PIDs) of the processes that are holding the lock file. If no process is holding it, the fuser command will simply return nothing.

$ fuser -v /path/to/cron.lock

So flock is a pretty good way to prevent cronjobs from overlapping, at the cost of one extra command-line tool.
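To see the locking behaviour in isolation, here is a minimal sketch rather than a production setup. It assumes flock from util-linux is installed and uses a throwaway lock file under /tmp (a hypothetical path chosen only for this demo). It uses -n, flock's non-blocking flag, which gives up immediately when the lock is taken and so has the same effect as -w 0.

```shell
#!/bin/bash
# Hypothetical lock path, used only for this demonstration.
lock_file="${TMPDIR:-/tmp}/flock-demo.$$.lock"

# First job: take the lock and hold it for a few seconds in the background.
flock -n "$lock_file" sleep 3 &
sleep 1  # give the first job time to acquire the lock

# Second job: -n makes flock give up immediately instead of waiting.
if flock -n "$lock_file" echo "second job ran"; then
    result="ran"
else
    result="skipped"  # flock exits with a non-zero status when the lock is taken
fi
echo "second job: $result"

# Wait for the first job to finish, then clean up the demo lock file.
wait
rm -f "$lock_file"
```

Run as-is, the second job should report that it was skipped, because the first job still holds the lock at that point.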

If flock isn’t installed on your system yet, install the util-linux package, which includes flock.

$ yum install util-linux

And you’re set.

Using pgrep

Another method, without lock files, is a rather simple Bash one-liner that checks whether the script is currently running and executes it only if it’s not. The trick is to wrap your cron task in a uniquely named Bash script, like so.

$ cat /path/to/cron.sh
#!/bin/bash
/usr/bin/php /path/to/cron.php

$ chmod +x /path/to/cron.sh

In your crontab, it should now be listed like this.

$ crontab -l
* * * * * /path/to/cron.sh

The command above will, just like the first example, execute our PHP script every minute, this time through a Bash script. To prevent it from overlapping, it can be changed to this.

$ crontab -l
* * * * * /usr/bin/pgrep -f /path/to/cron.sh > /dev/null 2> /dev/null || /path/to/cron.sh

The pgrep command exits with a non-zero status (false) if it does not find a running process matching its argument, /path/to/cron.sh. In that case the shell moves on to the second part of the OR list (the double vertical line, ||) and starts the script. If a matching process is found, pgrep prints its Process ID (PID) and exits with status 0 (true), so the shell does not continue to the second part of the OR list.

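The trick here is to use unique script names. If the name is too generic (such as “cron.sh”), pgrep may match other running cron jobs, and the cron you wanted will not be executed. Note also that pgrep -f matches against full command lines, so it can match the shell that cron starts to run this very crontab line; test the one-liner under cron before relying on it.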
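The same check can be written out as a small function, which makes the exit-status logic easier to read. This is only a sketch: run_unless_running is a hypothetical helper name, and it assumes pgrep (from the procps package) is available.

```shell
#!/bin/bash
# Hypothetical helper: run a command only if no running process has a
# command line matching the given pattern.
run_unless_running() {
    local pattern="$1"
    shift
    # pgrep exits with status 0 when at least one process matches and
    # with a non-zero status when none do; the PIDs it prints are discarded.
    if pgrep -f -- "$pattern" > /dev/null 2>&1; then
        echo "already running: $pattern" >&2
        return 0
    fi
    # Nothing matched, so start the command.
    "$@"
}

# Example call (hypothetical paths):
#   run_unless_running "/path/to/cron.php" /usr/bin/php /path/to/cron.php
```

The pattern should not appear in the command line of the script that calls the helper, otherwise pgrep will match the caller itself and the command will never start.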

Using lock-files within the script

If the tools above are not available to you, you can still use the concept of lock files in your application. One of the first commands in your script could be a check for the existence of a lock file. If it exists, the script simply exits with exit(1) and stops running. If the lock file does not exist, the script creates it, which prevents the next job from executing.

As a last step in your script, you remove the lock file to indicate that the script has finished and to allow the next run to continue.
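For a shell script, those steps could be sketched as follows. The function name run_exclusively and the paths in the usage comment are hypothetical. One deliberate deviation from the description above: instead of checking for the file and then creating it in two separate steps, the sketch uses Bash's noclobber option so that the check and the creation happen in a single step, which avoids two jobs passing the check at the same moment.

```shell
#!/bin/bash
# Hypothetical helper: run a command only if the lock file does not exist
# yet, and remove the lock file again once the command has finished.
run_exclusively() {
    local lock_file="$1"
    shift
    # With noclobber set, the redirect fails if the file already exists,
    # so checking for the lock and creating it is one step.
    if ! ( set -o noclobber; echo "$$" > "$lock_file" ) 2> /dev/null; then
        echo "lock $lock_file exists, not starting" >&2
        return 1
    fi
    # Run the actual job and remember its exit status.
    "$@"
    local status=$?
    # Last step: remove the lock so the next run can start.
    rm -f "$lock_file"
    return "$status"
}

# Example call (hypothetical paths):
#   run_exclusively /path/to/cron.lock /usr/bin/php /path/to/cron.php
```

If the job is killed before it reaches the final step, the lock file stays behind and blocks later runs until someone removes it. That is one reason flock is usually the more robust choice when it is available.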