Repost: All About Sitecore Scheduling: Agents and Tasks

This blog post provides information about agents and tasks that you can use to schedule processes with the Sitecore ASP.NET Web Content Management System (CMS).

This is a repost of http://sitecorejohn.spaces.live.com/Blog/cns!960125F1D4A59952!698.entry.

Introduction

There are at least three ways to schedule processes with the Sitecore .NET CMS:

  1. Configure agents in web.config.
  2. Define scheduled tasks in a database.
  3. Use the Windows Task Scheduler to call a Web service in a Sitecore instance.

Agents in web.config are very straightforward and by far the most common way to schedule processes in Sitecore, especially for things that run perpetually, such as administrative work.

Scheduling tasks in the database is a good solution when you need to control tasks dynamically and/or programmatically, since this approach doesn’t require that you update web.config, which would restart the ASP.NET worker process.

The Windows Task Scheduler works around two limitations of the ASP.NET architecture: it is unrealistic to expect ASP.NET to invoke a process at a very specific time, and the ASP.NET worker process might not be active when you want Sitecore to invoke a process.

Note: The Taks Scheduling (sic) thread on the Sitecore Developer Network (SDN) forums indicates that Sitecore may interpret 24:00:00 as 00:00:00; use a value such as 23:59:59 instead.

Agents in web.config

Sitecore uses web.config to enable several default agents. You can also enable additional agents that Sitecore disables by default, or create your own agents.

Each /configuration/sitecore/scheduling/agent element defines an agent. The type attribute of each <agent> element specifies the .NET class to invoke. The method attribute defines the method of the class that Sitecore will call. The interval attribute defines the minimal interval between invocations of the agent in HH:mm:ss format. A value of 00:00:00 for the interval attribute disables an agent. Comments above the agent definitions in the web.config file describe their functions.

Sitecore passes parameters to the constructor of the agent class and sets properties of that object in the same way it does for any other type defined in web.config. You can read about how to define properties and constructor arguments in the SDN forum thread Validation of Content Item name.

The polling frequency determines how often Sitecore checks for agents that it needs to invoke. You can control the polling frequency by setting the value of the /configuration/sitecore/scheduling/frequency element in the web.config file (in HH:mm:ss format). I like to set the polling frequency to half of the smallest interval attribute value among the enabled agents defined in the web.config file.
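As a rough sketch of that rule of thumb, the following standalone C# computes half of the smallest enabled interval. The PollingFrequency class and Suggest method are hypothetical names invented for this illustration; they are not part of the Sitecore API, and you would still enter the result in the <frequency> element by hand.

```csharp
using System;
using System.Globalization;
using System.Linq;

// Hypothetical helper for illustration only; not part of Sitecore.
public static class PollingFrequency
{
    // Returns half of the smallest non-zero interval.
    // Intervals of 00:00:00 are skipped because that value disables an agent.
    // Throws InvalidOperationException if no enabled interval is supplied.
    public static TimeSpan Suggest(params string[] intervals)
    {
        TimeSpan smallest = intervals
            .Select(text => TimeSpan.Parse(text, CultureInfo.InvariantCulture))
            .Where(interval => interval > TimeSpan.Zero)
            .Min();

        return TimeSpan.FromTicks(smallest.Ticks / 2);
    }

    public static void Main()
    {
        // One agent every minute, one every hour, one disabled.
        TimeSpan frequency = Suggest("00:01:00", "01:00:00", "00:00:00");

        // Prints 00:00:30
        Console.WriteLine(frequency.ToString(@"hh\:mm\:ss", CultureInfo.InvariantCulture));
    }
}
```

With a one-minute agent as the most frequent entry, the suggested value is 00:00:30.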

To create a custom agent, just create a class that implements a method, and register that class as an agent. Here’s an example class:

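namespace Sitecore.Sharedsource.Tasks
{
  // Example agent that writes a configurable message to the Sitecore log.
  public class LogSomethingAgent
  {
    // Sitecore sets this property from the matching child element of the <agent> element.
    public string Message
    {
      get;
      set;
    }

    // The method named in the method attribute of the <agent> element.
    public void Run()
    {
      Sitecore.Diagnostics.Log.Info(this + " : " + this.Message, this);
    }
  }
}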

After setting the value of the <frequency> element to 30 seconds (00:00:30), here’s how I configured the agent in web.config for testing:

<agent type="Sitecore.Sharedsource.Tasks.LogSomethingAgent" method="Run" interval="00:01:00">
  <message>Hello, World!</message>
</agent>

With the default Sitecore verbosity level, Sitecore logs the invocation of each agent. So Sitecore actually writes three lines to the log each time it invokes this agent. For example:

ManagedPoolThread #8 13:05:11 INFO  Job started: Sitecore.Sharedsource.Tasks.LogSomethingAgent
ManagedPoolThread #8 13:05:11 INFO  Sitecore.Sharedsource.Tasks.LogSomethingAgent : Hello, World!
ManagedPoolThread #8 13:05:11 INFO  Job ended: Sitecore.Sharedsource.Tasks.LogSomethingAgent

For an example of an agent that removes old versions, see my blog post Remove Old Versions of Items in the Sitecore ASP.NET CMS.

The UrlAgent

If you find that ASP.NET is inactive when you want your scheduled task to run, you can reduce the interval attribute of the UrlAgent. The goal of this agent is to keep the ASP.NET worker process alive by periodically requesting an ASP.NET page.

The URL that this agent requests can be wrong in one or two ways. This is the default configuration of the UrlAgent that I found on my system:

<agent type="Sitecore.Tasks.UrlAgent" method="Run" interval="01:00:00">
  <param desc="url">/sitecore/service/keepalive.aspx</param>
  <LogActivity>true</LogActivity>
</agent>

If the URL does not contain a protocol and hostname, Sitecore assumes http://127.0.0.1, which may or may not correspond to the Sitecore instance. This will most likely show up in the Sitecore log as something like the following:

ManagedPoolThread #3 10:26:51 INFO  Job started: Sitecore.Tasks.UrlAgent
ManagedPoolThread #3 10:26:51 INFO  Scheduling.UrlAgent started. Url: http://127.0.0.1/sitecore/service/keepalive.aspx
ManagedPoolThread #3 10:26:51 ERROR Exception in UrlAgent (url: /sitecore/service/keepalive.aspx)
Exception: System.Net.WebException
Message: The remote server returned an error: (404) Not Found.

A solution is to update the first parameter to the constructor to include the protocol and domain:

  <param desc="url">http://sitename/sitecore/service/keepalive.aspx</param>
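To make the difference between the two forms concrete, here is a minimal sketch that resolves a keepalive URL the way the log output above suggests. It is not Sitecore's implementation: the KeepAliveUrl class and Resolve method are hypothetical, and the http://127.0.0.1 fallback is taken only from the log lines shown earlier.

```csharp
using System;

// Hypothetical illustration; the UrlAgent's real logic may differ.
public static class KeepAliveUrl
{
    // Returns the URL unchanged when it already has an http(s) scheme and host;
    // otherwise resolves it against the supplied fallback base address.
    public static string Resolve(string url, string fallbackBase)
    {
        Uri parsed;
        bool isAbsoluteHttp =
            Uri.TryCreate(url, UriKind.Absolute, out parsed)
            && (parsed.Scheme == Uri.UriSchemeHttp || parsed.Scheme == Uri.UriSchemeHttps);

        if (isAbsoluteHttp)
        {
            return parsed.AbsoluteUri;
        }

        return new Uri(new Uri(fallbackBase), url).AbsoluteUri;
    }

    public static void Main()
    {
        // Prints http://127.0.0.1/sitecore/service/keepalive.aspx
        Console.WriteLine(Resolve("/sitecore/service/keepalive.aspx", "http://127.0.0.1"));

        // Prints http://sitename/sitecore/service/keepalive.aspx
        Console.WriteLine(Resolve("http://sitename/sitecore/service/keepalive.aspx", "http://127.0.0.1"));
    }
}
```

The first call shows why a relative path can end up at an address that no Sitecore site answers; the second shows that a fully qualified URL bypasses the fallback.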

I seem to remember that some old Sitecore versions had a typo in the path in this URL, which I expect would also result in a 404, so check that the referenced file exists, or find it and update the path. It’s probably not a problem if the file doesn’t exist, as the ASP.NET process has to be running to handle the HTTP 404 condition, but this problem will generate some noise in the log.

Scheduled Tasks

You can create items in a Sitecore database to schedule tasks. First, create a .NET class that contains the logic, then create a command definition item that references that class, and then create one or more schedule definition items to invoke that command.

To define the scheduled task logic, create a class that contains a method with the following signature:

public void MethodName(Sitecore.Data.Items.Item[] items, Sitecore.Tasks.CommandItem command, Sitecore.Tasks.ScheduleItem schedule)

The text below explains how to pass items to your command in the first parameter. The second parameter is the command definition item. The third parameter is the schedule definition item.

For example:

namespace Sitecore.Sharedsource.Tasks
{
  public class LogSomethingDatabase
  {
    public void WriteToLogFile(
      Sitecore.Data.Items.Item[] items,
      Sitecore.Tasks.CommandItem command,
      Sitecore.Tasks.ScheduleItem schedule)
    {
      if (items != null)
      {
        foreach(Sitecore.Data.Items.Item item in items)
        {
          Sitecore.Diagnostics.Log.Info(this + " : item : " + item.Paths.FullPath, this);
        }
      }
      Sitecore.Diagnostics.Log.Info(this + " : command : " + command.InnerItem.Paths.FullPath, this);
      Sitecore.Diagnostics.Log.Info(this + " : schedule : " + schedule.InnerItem.Paths.FullPath, this);
    }
  }
}

To define the command for the scheduled task to invoke:

  1. If appropriate, first select a database in the Sitecore desktop.
  2. In the Content Editor, navigate to the /Sitecore/System/Tasks/Commands item.
  3. In the Content Editor, insert a command definition item using the System/Tasks/Command data template.
  4. In the command definition item, in the Data section, in the Type field, enter the fully qualified name of the .NET class followed by a comma and the name of the assembly that contains it, such as Sitecore.Sharedsource.Tasks.LogSomethingDatabase, assembly.
  5. In the command definition item, in the Data section, in the Method field, enter the name of the method to invoke in that class, such as WriteToLogFile.

That was actually the easy part.

To define the schedule to invoke the command:

  1. If appropriate, first select the appropriate database in the Sitecore desktop.
  2. In the Content Editor, navigate to the /Sitecore/System/Tasks/Schedules item.
  3. In the Content Editor, insert a schedule definition item using the System/Tasks/Schedule data template.
  4. In the schedule definition item, in the Data section, in the Command field, select the command to invoke.
  5. In the schedule definition item, in the Data section, in the Items field, you can specify a list of items to pass to the command, separated by pipe (“|”) characters. Alternatively, you can enter a Sitecore query (without the query: prefix), but remember that the Query.MaxItems setting in the web.config file applies to this query.
  6. In the schedule definition item, in the Data section, in the Schedule field, enter four parameters to control the schedule, separated by pipe (“|”) characters. The first parameter indicates the start date for the schedule in yyyyMMdd format. The second parameter indicates the end date for the schedule in the same format. The third parameter indicates the days of the week on which to run the task. This works like a bit mask, where 1=Sunday, 2=Monday, 4=Tuesday, 8=Wednesday, 16=Thursday, 32=Friday, and 64=Saturday. So Monday through Friday is 2+4+8+16+32=62, while every day is 1+2+4+8+16+32+64=127. These values come from the Sitecore.DaysOfWeek enum. The fourth parameter is the minimum interval between invocations of the task in HH:mm:ss format.
  7. In the schedule definition item, in the Data section, in the Last Run field, you can enter a date and time to control when Sitecore thinks it last invoked the command due to the existence of this schedule definition item, in the ISO format Sitecore uses for all dates (yyyyMMddTHHmmss). Sitecore automatically updates this field after invoking the command to control the processing schedule.
  8. In the schedule definition item, in the Data section, select the Async checkbox to cause the task to run asynchronously.
  9. In the schedule definition item, in the Data section, select the Auto Remove checkbox to cause Sitecore to remove the schedule definition item after invoking the task. This setting only comes into play after the expiration date defined in the Schedule field.
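Because the Schedule and Last Run values are easy to get wrong by hand, here is a small standalone sketch that builds them. The Days enum is a hypothetical stand-in that copies the bit values listed in step 6 (in a Sitecore project you would presumably use Sitecore.DaysOfWeek instead), and the ScheduleField class is invented for this example.

```csharp
using System;
using System.Globalization;

// Hypothetical stand-in that mirrors the bit values listed in step 6.
[Flags]
public enum Days
{
    None = 0, Sunday = 1, Monday = 2, Tuesday = 4, Wednesday = 8,
    Thursday = 16, Friday = 32, Saturday = 64
}

// Hypothetical helper for illustration only; not part of Sitecore.
public static class ScheduleField
{
    // Builds "start|end|days|interval" as described in step 6.
    // The hh specifier only shows the hours component, so this sketch
    // assumes the interval is shorter than one day.
    public static string Build(DateTime start, DateTime end, Days days, TimeSpan interval)
    {
        return string.Join("|",
            start.ToString("yyyyMMdd", CultureInfo.InvariantCulture),
            end.ToString("yyyyMMdd", CultureInfo.InvariantCulture),
            ((int)days).ToString(CultureInfo.InvariantCulture),
            interval.ToString(@"hh\:mm\:ss", CultureInfo.InvariantCulture));
    }

    // Formats a Last Run value in the yyyyMMddTHHmmss form from step 7.
    public static string FormatLastRun(DateTime lastRun)
    {
        return lastRun.ToString("yyyyMMdd'T'HHmmss", CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        Days weekdays = Days.Monday | Days.Tuesday | Days.Wednesday | Days.Thursday | Days.Friday;

        // Prints 62
        Console.WriteLine((int)weekdays);

        // Prints 20000101|21000101|62|00:30:00
        Console.WriteLine(Build(new DateTime(2000, 1, 1), new DateTime(2100, 1, 1),
            weekdays, TimeSpan.FromMinutes(30)));

        // Prints 20100504T140815
        Console.WriteLine(FormatLastRun(new DateTime(2010, 5, 4, 14, 8, 15)));
    }
}
```

Combining the flags with the bitwise OR operator gives the same result as adding the values by hand, without the arithmetic mistakes.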

Obviously something that only a computer program could love. You can also use the Sitecore.Globals.TaskDatabase API to manipulate scheduled tasks in a database.

I created a test command definition item named Command and a test schedule definition item named Scheduled Task in the Master database. In my schedule definition item, for Command, I selected my command definition item and used /sitecore|/sitecore/content|/sitecore/content/home for Items and 20000101|21000101|127|00:00:01 for Schedule, leaving everything else blank. Afterwards, I found the following in my Sitecore log:
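To see how a Schedule value like the one above decomposes, the following sketch parses it and checks whether a given calendar day falls inside it. This is only my reading of the field format, not Sitecore's own evaluation logic: the ScheduleCheck class is hypothetical, and treating the end date as inclusive is an assumption.

```csharp
using System;
using System.Globalization;

// Hypothetical illustration; Sitecore's own evaluation may differ.
public static class ScheduleCheck
{
    // Returns true if 'day' lies between the start and end dates
    // (end date treated as inclusive, which is an assumption)
    // and the bit for its weekday is set in the mask.
    public static bool CoversDay(string schedule, DateTime day)
    {
        string[] parts = schedule.Split('|');
        DateTime start = DateTime.ParseExact(parts[0], "yyyyMMdd", CultureInfo.InvariantCulture);
        DateTime end = DateTime.ParseExact(parts[1], "yyyyMMdd", CultureInfo.InvariantCulture);
        int mask = int.Parse(parts[2], CultureInfo.InvariantCulture);

        // System.DayOfWeek numbers Sunday as 0, so shifting 1 left by that
        // number yields 1 for Sunday through 64 for Saturday.
        int dayBit = 1 << (int)day.DayOfWeek;

        return day.Date >= start && day.Date <= end && (mask & dayBit) != 0;
    }

    public static void Main()
    {
        // Every day of the week is enabled (127), so this prints True.
        Console.WriteLine(CoversDay("20000101|21000101|127|00:00:01", new DateTime(2010, 5, 4)));

        // 2 May 2010 was a Sunday and 62 covers Monday through Friday, so this prints False.
        Console.WriteLine(CoversDay("20000101|21000101|62|00:00:01", new DateTime(2010, 5, 2)));
    }
}
```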

ManagedPoolThread #19 14:08:15 INFO  Job started: Sitecore.Tasks.DatabaseAgent
ManagedPoolThread #19 14:08:15 INFO  Scheduling.DatabaseAgent started. Database: master
ManagedPoolThread #19 14:08:15 INFO  Examining schedules (count: 1)
ManagedPoolThread #19 14:08:15 INFO  Starting: Scheduled Task
ManagedPoolThread #19 14:08:15 INFO  Sitecore.Sharedsource.Tasks.LogSomethingDatabase : item : /sitecore
ManagedPoolThread #19 14:08:15 INFO  Sitecore.Sharedsource.Tasks.LogSomethingDatabase : item : /sitecore/content
ManagedPoolThread #19 14:08:15 INFO  Sitecore.Sharedsource.Tasks.LogSomethingDatabase : item : /sitecore/content/Home
ManagedPoolThread #19 14:08:15 INFO  Sitecore.Sharedsource.Tasks.LogSomethingDatabase : command : /sitecore/system/Tasks/Commands/Command
ManagedPoolThread #19 14:08:15 INFO  Sitecore.Sharedsource.Tasks.LogSomethingDatabase : schedule : /sitecore/system/Tasks/Schedules/Scheduled Task
ManagedPoolThread #19 14:08:15 INFO  Ended: Scheduled Task
ManagedPoolThread #19 14:08:15 INFO  Job ended: Sitecore.Tasks.DatabaseAgent (units processed: 1)

The default web.config file includes two agents that check for scheduled tasks to invoke as defined in one of the Sitecore databases. These agents define parameters to check for tasks in the Master and the Core database. By default, there don’t seem to be any scheduled task definitions in either database. You can disable, update or copy these agents and change parameters to check for tasks scheduled in a publishing target database (such as the Web database) or any other Sitecore database.

The agent that checks for scheduled tasks requires a specific data template for scheduled task definition items. A thread on the Sitecore Developer Network forums with the title Custom Task Item discusses a solution that uses a custom data template for scheduled task definition items. In that thread I posted a replacement that processes any descendant of the /Sitecore/System/Tasks/Schedules item that contains a value in the Command field as a scheduled task definition item. Instead of using custom templates, you could add base templates to the data template for scheduled tasks.

Troubleshooting Scheduled Processes

The ASP.NET worker process must be active for Sitecore to poll for scheduled processes to invoke. If a scheduled process does not run, you have likely updated web.config or otherwise caused the ASP.NET process to terminate. If the ASP.NET worker process is not already active, you can bring it up by requesting any ASP.NET resource.

Another reason that a scheduled process might not run is that you have not set the polling frequency to a small enough value.

Scheduling Publication

Sitecore customers frequently want to schedule publication. You can read about how to set the publishing schedule for an item and each version of an item in the Content Cookbook. The next publishing operation after the item or version publication date will publish the item or version. You can enable the PublishAgent in the web.config file to schedule publishing operations at some interval, but as I wrote earlier, it’s not feasible to get ASP.NET to do something at a very specific time. Because publishing clears caches, publishing frequently just to see if there is anything to publish can have an adverse performance impact. For more information about publishing, see the Sitecore Publishing Operations page on the Sitecore Developer Network (SDN). See also the Publishing strategies post on the molten core blog, as well as the Publishing site at a specific time - another idea? thread on the SDN forums.

For one approach for scheduled publishing, see the Sitecore Shared Source Automated Publisher and this blog post about it.

In how to publish at a specific time, Alex Shyba blogged about another approach, but the code seems to have disappeared. Luckily I had kept a copy and made it available through the Publish at a specific time thread on the Sitecore Developer Network forums. I have to assume that Alex’s code is better than my own would be. I think his solution uses the Windows Task Scheduler to invoke a command line tool that calls a Web service in the Sitecore instance to do the processing, which seems relatively straightforward. In addition to the advantage of scheduling a task for a very specific time, the Windows Task Scheduler does not depend on the ASP.NET worker process (in fact, the Web service call will bring up ASP.NET if it is not already active). The downside is that the Windows Task Scheduler is not integrated into Sitecore, so you need to configure it separately. Maybe you could hook into Sitecore events or elsewhere to schedule the Windows task.

Conclusion

This is a complex topic and a long post, so I probably left some details out and got some others wrong, in which case please comment below. The options and possibilities for scheduling processes are just one factor that makes Sitecore the best .NET CMS available!

More posts All About the Sitecore ASP.NET CMS.