Friday, April 22, 2011

cron

Cron is a time-based job scheduler in Unix-like computer operating systems. The name cron comes from the word “chronos”, Greek for “time”. Cron enables users to schedule jobs (commands or shell scripts) to run periodically at certain times or dates. It is commonly used to automate system maintenance or administration, though its general-purpose nature means that it can be used for other purposes, such as connecting to the Internet and downloading email.

Overview

Cron is driven by a crontab (cron table) file, a configuration file that specifies shell commands to run periodically on a given schedule. Crontab files hold the lists of jobs and other instructions for the cron daemon. Users can have their own individual crontab files, and there is often a system-wide crontab file (usually in /etc or a subdirectory of /etc) that only system administrators can edit.
Each line of a crontab file represents a job and is composed of a CRON expression, followed by a shell command to execute. Some implementations of cron, such as that in the popular 4th BSD edition written by Paul Vixie and included in many Linux distributions, add a sixth field to the format: the account username under which the specified job will run (subject to user existence and permissions). This field is allowed only in the system crontabs, not in per-user crontabs, each of which is assigned to a single user to configure.
For “day of the week” (field 5), both 0 and 7 are considered Sunday, though some versions of Unix such as AIX do not list “7” as acceptable in the man page. Normally a job is executed when all of the time/date specification fields match the current time and date. There is one exception: if both “day of month” and “day of week” are restricted (not “*”), then it is enough for either the “day of month” field (3) or the “day of week” field (5) to match the current day.
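The day-matching exception described above is easy to get wrong, so here is a minimal Python sketch of it. The function names (cron_weekday, day_matches) and the convention of passing None for a “*” field are assumptions made for illustration; they are not taken from any cron implementation.

```python
from datetime import date
from typing import Optional, Set


def cron_weekday(d: date) -> int:
    """Return the cron day-of-week number for d (Sunday=0 ... Saturday=6)."""
    # date.isoweekday() is Monday=1 ... Sunday=7; modulo 7 maps Sunday to 0.
    return d.isoweekday() % 7


def day_matches(dom: Optional[Set[int]], dow: Optional[Set[int]], d: date) -> bool:
    """Decide whether the day part of a schedule matches date d.

    dom and dow hold the allowed values of the "day of month" and
    "day of week" fields; None stands for an unrestricted "*" field
    (a convention assumed for this sketch).
    """
    dom_hit = dom is not None and d.day in dom
    # Treat 7 as an alias for Sunday (0) by reducing modulo 7.
    dow_hit = dow is not None and cron_weekday(d) in {v % 7 for v in dow}
    if dom is not None and dow is not None:
        # Both fields restricted: a match on either one is enough.
        return dom_hit or dow_hit
    # Otherwise an unrestricted field always matches.
    return (dom is None or dom_hit) and (dow is None or dow_hit)
```

With dom={13} and dow={5}, a job would run on every 13th of the month and also on every Friday, not only on a Friday the 13th.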

Examples

The following will clear the Apache error log at one minute past midnight (00:01 of every day of the month, of every day of the week), assuming that the default shell for the cron user is Bourne shell compliant.
1 0 * * *  echo -n "" > /www/apache/logs/error_log
The following runs the script /home/user/test.pl every two hours, on the hour (midnight, 2am, 4am, 6am, 8am and so on):
0 */2 * * *  /home/user/test.pl
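To make the meaning of a step value such as “*/2” concrete, the following Python sketch expands a single field into the list of values it matches. The helper name expand_field is hypothetical, and the sketch covers only plain numbers, ranges, lists and steps; real cron implementations may accept more (for example month or weekday names).

```python
from typing import List


def expand_field(field: str, low: int, high: int) -> List[int]:
    """Expand one crontab field into the sorted list of values it matches.

    low and high are the bounds of the field, e.g. 0 and 23 for hours.
    """
    values = set()
    for part in field.split(","):
        # Split off an optional "/step" suffix.
        span, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if span == "*":
            start, end = low, high
        elif "-" in span:
            first, last = span.split("-")
            start, end = int(first), int(last)
        else:
            start = end = int(span)
        if step < 1 or not (low <= start <= end <= high):
            raise ValueError("field out of range: %r" % field)
        values.update(range(start, end + 1, step))
    return sorted(values)
```

For the hour field of the example above, expand_field("*/2", 0, 23) yields the even hours from 0 through 22, which is why the job also runs at midnight.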

Predefined scheduling definitions

There are several special predefined values which can be used to substitute the CRON expression.
Entry                     Description         Equivalent to
@yearly (or @annually)    Run once a year     0 0 1 1 *
@monthly                  Run once a month    0 0 1 * *
@weekly                   Run once a week     0 0 * * 0
@daily (or @midnight)     Run once a day      0 0 * * *
@hourly                   Run once an hour    0 * * * *
@reboot                   Run at startup
For reference, the five time fields of a CRON expression are laid out as follows:
*     *     *   *    *        command to be executed
-     -     -   -    -
|     |     |   |    |
|     |     |   |    +----- day of week (0 - 6) (Sunday=0)
|     |     |   +------- month (1 - 12)
|     |     +--------- day of month (1 - 31)
|     +----------- hour (0 - 23)
+------------- min (0 - 59)
@reboot configures a job to run once when the daemon is started. Since cron is rarely restarted, this usually corresponds to the machine being booted. Some variations of cron, such as that provided in Debian[3], enforce this behavior so that simply restarting the daemon does not re-run @reboot jobs.
@reboot can be useful if there is a need to start up a server or daemon under a particular user, and the user does not have access to configure init to start the program.
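As an illustration of how the predefined values stand in for a CRON expression, here is a small Python sketch that splits a user crontab line into its schedule and its command. The names NICKNAMES and split_entry are assumptions for this example; @reboot is passed through unchanged because it has no five-field equivalent.

```python
from typing import Tuple

# Predefined values and the five-field expressions they stand for.
NICKNAMES = {
    "@yearly": "0 0 1 1 *",
    "@annually": "0 0 1 1 *",
    "@monthly": "0 0 1 * *",
    "@weekly": "0 0 * * 0",
    "@daily": "0 0 * * *",
    "@midnight": "0 0 * * *",
    "@hourly": "0 * * * *",
}


def split_entry(line: str) -> Tuple[str, str]:
    """Split a per-user crontab line into (schedule, command)."""
    line = line.strip()
    if line.startswith("@"):
        parts = line.split(None, 1)
        name = parts[0]
        command = parts[1] if len(parts) > 1 else ""
        if name == "@reboot":
            # No time-based equivalent; keep the marker as the schedule.
            return name, command
        if name not in NICKNAMES:
            raise ValueError("unknown predefined value: %r" % name)
        return NICKNAMES[name], command
    # Five whitespace-separated time fields, then the rest is the command.
    fields = line.split(None, 5)
    if len(fields) < 6:
        raise ValueError("expected five time fields and a command")
    return " ".join(fields[:5]), fields[5]
```

This sketch assumes the per-user format; a system crontab with the extra username field would need one more split.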

Cron permissions

The following two files play an important role:
  • /etc/cron.allow – If this file exists, your username must be listed in it for you to be allowed to use cron jobs.
  • /etc/cron.deny – If the cron.allow file does not exist but the /etc/cron.deny file does, you must not be listed in /etc/cron.deny in order to use cron jobs.
Note that if neither of these files exists, then, depending on site-dependent configuration parameters, either only the superuser or all users will be allowed to use cron jobs.
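The decision order described above can be summarised in a few lines of Python. This is only a sketch of the rules as stated here: the function name may_use_cron and the default_allow_all flag (standing in for the site-dependent configuration) are assumptions, and None is used to mean that a file does not exist.

```python
from typing import Iterable, Optional


def may_use_cron(
    user: str,
    allow: Optional[Iterable[str]],
    deny: Optional[Iterable[str]],
    default_allow_all: bool,
) -> bool:
    """Return True if user may use cron jobs.

    allow and deny are the usernames listed in /etc/cron.allow and
    /etc/cron.deny, or None when the corresponding file is absent.
    default_allow_all models the site-dependent choice that applies
    when neither file exists (an assumption of this sketch).
    """
    if allow is not None:
        # cron.allow exists: only listed users may use cron.
        return user in set(allow)
    if deny is not None:
        # Only cron.deny exists: everyone except listed users may use cron.
        return user not in set(deny)
    # Neither file exists: either everyone or only the superuser.
    return default_allow_all or user == "root"
```

Note that cron.deny is never consulted when cron.allow exists, even if the user appears in both.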

Timezone handling

Most cron implementations simply interpret crontab entries in the system time zone setting under which the cron daemon itself is run. This can be a source of dispute if a large multiuser machine has users in several time zones, especially if the system default timezone includes the potentially confusing DST. Thus, a cron implementation may special-case any “TZ=<timezone>” environment variable setting lines in user crontabs, interpreting subsequent crontab entries relative to that timezone.[4]
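To see why this matters, the Python sketch below converts a time a user has in mind in their own zone into the wall-clock time of a daemon running in another zone. The function name daemon_wall_time is hypothetical, and the sketch deliberately uses fixed UTC offsets, so it ignores DST entirely; it only illustrates the size of the shift, not how any cron implementation handles TZ lines.

```python
from datetime import datetime, timedelta, timezone


def daemon_wall_time(
    user_local: datetime, user_offset_hours: float, daemon_offset_hours: float
) -> datetime:
    """Convert a naive local time in the user's zone to the daemon's zone.

    Both zones are given as fixed offsets from UTC in hours (an
    assumption of this sketch; real zones may observe DST).
    """
    user_zone = timezone(timedelta(hours=user_offset_hours))
    daemon_zone = timezone(timedelta(hours=daemon_offset_hours))
    aware = user_local.replace(tzinfo=user_zone)
    # Shift to the daemon's zone, then drop the zone to get wall-clock time.
    return aware.astimezone(daemon_zone).replace(tzinfo=None)
```

A user at UTC+1 who wants a job at 09:00 on a machine whose daemon runs at UTC-5 would have to write the entry for 03:00, and a job near midnight can land on a different calendar day.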

History

Early versions

The cron in Version 7 Unix, written by Brian Kernighan, was a system service (later called a daemon) invoked from /etc/rc when the operating system entered multi-user mode. Its algorithm was straightforward:
  1. Read /usr/lib/crontab
  2. Determine if any commands are to be run at the current date and time, and if so, run them as the superuser, root.
  3. Sleep for one minute
  4. Repeat from step 1.
This version of cron was basic and robust but it also consumed resources whether it found any work to do or not. In an experiment at Purdue University in the late 1970s to extend cron’s service to all 100 users on a time-shared VAX, it was found to place too much load on the system.
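The four-step loop above can be sketched in Python as follows. This is an illustration of the polling approach, not the original code: the function name polling_cron, the representation of a crontab entry as an (is_due, command) pair, and the injectable now, sleep and max_cycles parameters are all assumptions made so that the sketch can be exercised without waiting.

```python
import time
from datetime import datetime
from typing import Callable, List, Optional, Tuple

# One entry: a predicate saying whether the job is due, and its command.
Entry = Tuple[Callable[[datetime], bool], str]


def polling_cron(
    read_crontab: Callable[[], List[Entry]],
    run: Callable[[str], None],
    now: Callable[[], datetime] = datetime.now,
    sleep: Callable[[float], None] = time.sleep,
    max_cycles: Optional[int] = None,
) -> None:
    """Poll the crontab once a minute and run whatever is due."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        entries = read_crontab()      # step 1: re-read the table every cycle
        current = now()
        for is_due, command in entries:
            if is_due(current):       # step 2: run anything due right now
                run(command)
        sleep(60)                     # step 3: sleep for one minute
        cycles += 1                   # step 4: repeat from step 1
```

The cost noted above is visible in the structure: the table is read and every entry is checked once per minute, whether or not anything is due.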

Multi-user capability

The next version of cron, with the release of Unix System V, was created to extend the capabilities of cron to all users of a Unix system, not just the superuser. Though this may seem trivial today with most Unix and Unix-like systems having powerful processors and small numbers of users, at the time it required a new approach on a 1 MIPS system having roughly 100 user accounts.
In the August, 1977 issue of the Communications of the ACM, W. R. Franta and Kurt Maly published an article entitled “An efficient data structure for the simulation event set” describing an event queue data structure for discrete event-driven simulation systems that demonstrated “performance superior to that of commonly used simple linked list algorithms,” good behavior given non-uniform time distributions, and worst case complexity O(√n), “n” being the number of events in the queue.
A graduate student, Robert Brown, reviewing this article, recognized the parallel between cron and discrete event simulators, and created an implementation of the Franta-Maly event list manager (ELM) for experimentation. Discrete event simulators run in “virtual time”, peeling events off the event queue as quickly as possible and advancing their notion of “now” to the scheduled time of the next event. By running the event simulator in “real time” instead of virtual time, a version of cron was created that spent most of its time sleeping, waiting for the moment in time when the task at the head of the event list was to be executed.
The following school year brought new students into the graduate program, including Keith Williamson, who joined the systems staff in the Computer Science department. As a “warm up task” Brown asked him to flesh out the prototype cron into a production service, and this multi-user cron went into use at Purdue in late 1979. This version of cron wholly replaced the /etc/cron that was in use on the Computer Science department’s VAX 11/780 running 32/V.
The algorithm used by this cron is as follows:
  1. On start-up, look for a file named .crontab in the home directories of all account holders.
  2. For each crontab file found, determine the next time in the future that each command is to be run.
  3. Place those commands on the Franta-Maly event list with their corresponding time and their “five field” time specifier.
  4. Enter main loop:
    1. Examine the task entry at the head of the queue, compute how far in the future it is to be run.
    2. Sleep for that period of time.
    3. On awakening and after verifying the correct time, execute the task at the head of the queue (in background) with the privileges of the user who created it.
    4. Determine the next time in the future to run this command and place it back on the event list at that time value.
Additionally, the daemon would respond to SIGHUP signals to rescan modified crontab files and would schedule special “wake up events” on the hour and half hour to look for modified crontab files. Much detail is omitted here concerning the inaccuracies of computer time-of-day tracking, Unix alarm scheduling, explicit time-of-day changes, and process management, all of which account for the majority of the lines of code in this cron. This cron also captured the output of stdout and stderr and e-mailed any output to the crontab owner.
The resources consumed by this cron scale only with the amount of work it is given and do not inherently increase over time with the exception of periodically checking for changes.
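The sleep-until-next-event loop described above can be sketched in Python. Note the substitution: this sketch uses a binary heap from the standard heapq module as the event list, not the Franta-Maly structure, and it leaves out per-user privileges, crontab rescanning and output mailing. The name event_cron, the (command, next_run) job representation and the injectable now, sleep and max_events parameters are assumptions for illustration.

```python
import heapq
import time
from typing import Callable, List, Optional, Tuple

# One job: its command and a function giving the next run time after t.
Job = Tuple[str, Callable[[float], float]]


def event_cron(
    jobs: List[Job],
    run: Callable[[str], None],
    now: Callable[[], float] = time.time,
    sleep: Callable[[float], None] = time.sleep,
    max_events: Optional[int] = None,
) -> None:
    """Run jobs from a time-ordered event list, sleeping in between."""
    queue = []
    start = now()
    for index, (command, next_run) in enumerate(jobs):
        # The unique index breaks ties so commands are never compared.
        heapq.heappush(queue, (next_run(start), index, command, next_run))
    handled = 0
    while queue and (max_events is None or handled < max_events):
        due, index, command, next_run = queue[0]   # head of the event list
        delay = due - now()
        if delay > 0:
            sleep(delay)
            continue            # on waking, verify the time before running
        heapq.heappop(queue)
        run(command)
        # Put the job back on the list at its next scheduled time.
        heapq.heappush(queue, (next_run(due), index, command, next_run))
        handled += 1
```

Unlike the polling loop, this daemon wakes only when a job is due, so its cost depends on the number of scheduled events rather than on the passage of time.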
Williamson completed his studies and left the university with a Master of Science in Computer Science, then joined AT&T Bell Labs in Murray Hill, New Jersey, taking this cron with him. At Bell Labs, he and others incorporated the Unix at command into cron, moved the crontab files out of users’ home directories (which were not host-specific) and into a common host-specific spool directory, and of necessity added the crontab command to allow users to copy their crontabs to that spool directory.
This version of cron later appeared largely unchanged in Unix System V and in BSD and their derivatives, the Solaris Operating System from Sun Microsystems, IRIX from Silicon Graphics, HP-UX from Hewlett-Packard, and IBM AIX. Technically, the original license for these implementations should be with the Purdue Research Foundation, which funded the work, but this took place at a time when little concern was given to such matters.

Modern versions

With the advent of the GNU Project and Linux, new crons appeared. The most prevalent of these is the Vixie cron, originally coded by Paul Vixie in 1987. Version 3 of Vixie cron was released in late 1993. Version 4.1 was renamed to ISC Cron and was released in January 2004. Version 3, with some minor bugfixes, is used in most distributions of Linux and BSDs.
In 2007, Red Hat forked vixie-cron 4.1 into the cronie project, and in 2009 it included anacron 2.3.
Other popular implementations include anacron and fcron. However, anacron is not an independent cron program; it relies on another cron program to call it in order to do its work.
A webcron solution schedules recurring tasks to run on a regular basis wherever cron implementations may not be available in a web hosting environment.
Source: http://en.wikipedia.org/wiki/Cron
