This entry is the first in a series of blog posts about an approach that attempts to minimize the number of output caches cleared after publishing in the Sitecore ASP.NET web Content Management System (CMS) and Experience Platform (XP). For more information, see the Resources section at the end of this blog post.
In general, the more output you can cache, the better your site will perform. The fewer and more general the criteria by which you cache that output, the less memory that cache will consume. You may even wish to implement most or all dynamic aspects of your solution with AJAX and other techniques rather than generating HTML dynamically. In large-scale solutions, output caching can reduce hardware requirements and hence licensing requirements.
By default, Sitecore uses event handlers to clear output caches. These handlers clear output caches after publishing completes and after search indexes rebuild. The disabled HtmlCacheClearAgent in web.config provides an alternative to this event-based approach, and as in this custom solution, you can invoke the relevant APIs to clear caches as needed.
Sitecore actually provides two handlers for the two different types of events. For the publish:end and publish:end:remote events, the HtmlCacheClearer event handler clears the output caches for the sites specified in the event handler definition. For the indexing:end and indexing:end:remote events, the IndexDependentHtmlCacheManager event handler trawls the output caches for all managed sites to remove entries with cache keys that contain "_#index", which Sitecore includes in cache keys when you set the Clear on Index Update property of a rendering.
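To make the index-dependent behavior concrete, here is a minimal, language-neutral sketch in Python. It is not Sitecore code: the caches are plain dictionaries, and the function names and sample cache keys are hypothetical. The only detail taken from the description above is the "_#index" marker in the cache key.

```python
# Illustrative model only: each site's output cache is a dict of key -> HTML.
INDEX_MARKER = "_#index"  # marker described in the text; everything else is assumed


def scavenge_index_dependent(cache: dict[str, str]) -> int:
    """Remove entries whose key contains INDEX_MARKER and return how many were removed."""
    # Collect the keys first so the dict is not modified while iterating over it.
    stale = [key for key in cache if INDEX_MARKER in key]
    for key in stale:
        del cache[key]
    return len(stale)


def on_indexing_end(site_caches: dict[str, dict[str, str]]) -> dict[str, int]:
    """Scavenge every site's cache, mirroring the 'all managed sites' behavior."""
    return {site: scavenge_index_dependent(cache) for site, cache in site_caches.items()}
```

Entries without the marker survive an index rebuild, which is the selective part; the publishing handler, by contrast, empties the whole cache for each listed site.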
This implementation is somewhat inconsistent: in the case of publishing we must specify the sites; in the case of index rebuilds the handler processes output caches for all sites automatically. Additionally, there is room for optimization:
Especially considering concurrent publishing options introduced in Sitecore 7.2, it did not look very easy to intercept every possible point that can trigger publication. As a hedge, I am sorry to say that I implemented a static class used by the publishItem pipeline and a custom event handler. My untested prototype includes:
This solution depends on the site definitions in the content management environment matching those in the content delivery environment. Specifically, to determine the managed sites associated with an item, it matches the paths of published items against the attributes of the managed sites that indicate the start item in the publishing environment, as well as the cacheHtml attribute of those site definitions.
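The matching logic described above can be sketched roughly as follows. This is a Python illustration rather than the prototype's code. It assumes the relevant site attributes are rootPath and startItem (combined into one start path) together with cacheHtml; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteDefinition:
    """Hypothetical stand-in for a managed site definition."""
    name: str
    root_path: str    # assumed to correspond to the rootPath attribute
    start_item: str   # assumed to correspond to the startItem attribute
    cache_html: bool  # assumed to correspond to the cacheHtml attribute

    @property
    def start_path(self) -> str:
        # Join the two attributes into one normalized, lowercase path.
        parts = [p.strip("/") for p in (self.root_path, self.start_item)]
        return ("/" + "/".join(p for p in parts if p)).lower()


def sites_for_item(item_path: str, sites: list[SiteDefinition]) -> list[str]:
    """Return names of cache-enabled sites whose start item is the item or an ancestor of it."""
    path = item_path.lower().rstrip("/")
    return [
        site.name
        for site in sites
        if site.cache_html
        and (path == site.start_path or path.startswith(site.start_path + "/"))
    ]
```

Comparing against the start path plus a trailing slash avoids false matches such as /sitecore/content/homepage being treated as a descendant of /sitecore/content/home.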
The following diagram shows the solution in effect:
On the publishing instance, the TrackPublishing processor intercepts the publishItem pipeline to maintain information about publishing in a ClearCacheOptions object exposed by a property of the ClearSiteOutputCaches static class. The publishing process then raises the publish:end event. The OutputCacheClearingEvent event handler on the publishing instance traps the publish:end event and passes the values from ClearSiteOutputCaches to create the custom:publish:end:remote event. The OutputCacheClearingEvent event handler on the other instances traps the custom:publish:end:remote event and invokes the clearOutputCaches pipeline, which can in turn call the scavengeOutputCacheKey pipeline. Meanwhile, the OutputCacheClearingEventHandler on the publishing instance has likely continued, invoking the same clearOutputCaches pipeline and then resetting the ClearSiteOutputCaches static class. Not shown are the even-less-tested indexing:end and indexing:end:remote events.
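As a rough model of that sequence, the following Python sketch tracks affected sites during a publish, serializes them into a payload for the remote event, clears the local caches, and resets the tracker. It is a simplification under assumptions: the payload format (JSON with a list of site names), the PublishTracker class, and all function names are hypothetical and do not reflect what ClearCacheOptions actually contains.

```python
import json


class PublishTracker:
    """Stand-in for the static class that accumulates state during a publish."""

    def __init__(self) -> None:
        self.sites: set[str] = set()

    def track(self, site_names) -> None:
        # Called once per published item with the sites that item belongs to.
        self.sites.update(site_names)

    def reset(self) -> None:
        self.sites.clear()


def clear_output_caches(site_caches: dict, site_names) -> list[str]:
    """Clear only the named sites' caches; return the names actually cleared."""
    cleared = []
    for name in site_names:
        cache = site_caches.get(name)
        if cache is not None:
            cache.clear()
            cleared.append(name)
    return cleared


def on_publish_end(tracker: PublishTracker, local_caches: dict, send_remote) -> list[str]:
    """Publishing instance: raise the remote event, clear locally, then reset."""
    payload = json.dumps({"sites": sorted(tracker.sites)})
    send_remote(payload)  # hypothetical transport for the custom remote event
    cleared = clear_output_caches(local_caches, json.loads(payload)["sites"])
    tracker.reset()
    return cleared


def on_remote_event(payload: str, local_caches: dict) -> list[str]:
    """Other instances: clear caches named in the payload received from the publisher."""
    return clear_output_caches(local_caches, json.loads(payload)["sites"])
```

In this model, a site that received no published items keeps its output cache on every instance, which is the point of tracking what was published instead of clearing a fixed list of sites.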
This approach could be especially useful in organizations with many managed sites, especially when those sites use different languages and publishing targets. To summarize the features:
Additionally, this prototype demonstrates one way to pass custom parameters to custom remote events.
I have not tested this solution and do not expect to explore it further or maintain this code. If you have a chance to work with it, or have any suggestions or other feedback, please comment on this blog post. It would be especially interesting to hear whether this improves or worsens performance or capacity in any way.
When considering the HtmlCacheClearer, consider also the RenderingParametersCacheClearer event handler in newer versions of Sitecore.