This blog post explains how you can use the publish replacer to change text in field values when you publish items in the Sitecore ASP.NET web Content Management System (CMS). The Replacers? and Search and Replace content during Publish threads on the Sitecore Developer Network (SDN) forums prompted me to write this blog post. For more information about publishing, see Sitecore Publishing Operations on SDN.
You can use the publishing replacer for any field values that should differ between the Master database and the publishing target database(s). For example, consider external links to other systems. You may want CMS users to insert links using hostnames that correspond to development, test, or other internal systems, and have the system transform those references to the production hostnames during publication to the production content delivery environment. In reality, I have more often seen replacers used to address various defects in Sitecore, most or all of which I think Sitecore subsequently resolved.
Before explaining how to use publishing replacers, consider the difference between Sitecore's definitions of the terms replacers and replacements in this context:
Sitecore provides a default replacer (Sitecore.Text.Replacer ) and two default types of replacements, each of which derives from the Sitecore.Text.Replacer.Replacement abstract base class:
Sitecore invokes the publishItem pipeline to publish each item. For more information about pipelines, see the blog post All About Pipelines in the Sitecore ASP.NET CMS. For more information about the publishItem pipeline, see the blog post Intercept Item Publishing with the Sitecore ASP.NET CMS.
The PerformAction processor in the publishItem pipeline invokes the publishing replacer. Specifically, the constructor for the Sitecore.Publishing.PublishOptions class used by publication passes "publish" to Sitecore.Configuration.Factory.GetReplacer(), causing the configuration factory to create and configure an instance of the class specified by the type attribute of the /configuration/sitecore/replacers/replacer element in the Web.config file with a value of publish for the id attribute. In other words, because the id attribute selects the replacer, it is possible to use replacers in contexts other than publication. For information about the configuration factory, see the blog post The Sitecore ASP.NET CMS Configuration Factory.
By default, the value of that type attribute is Sitecore.Text.Replacer, which is the default replacer class. By default, the value of the mode attribute of the /configuration/sitecore/replacers/replacer element with a value of publish for the id attribute is off. To enable this replacer, change the value of this mode attribute to on.
The contents of that /configuration/sitecore/replacers/replacer element in the Web.config file specify any number of simple and regex replacements using <simple> and <regex> elements, respectively, nested within the <replacements> element.
For each <simple> element, the find attribute specifies text to match and the replaceWith attribute specifies characters with which to replace that text. The ignoreCase attribute controls whether Sitecore matches the find attribute with character case sensitivity.
For each <regex> element, the find attribute specifies a regular expression to match and the replaceWith attribute specifies characters with which to replace tokens that match that regular expression. The ignoreCase attribute controls whether Sitecore evaluates the regular expression with character case sensitivity. Because regular expressions can be expensive, if the <regex> element includes the simpleTest attribute, Sitecore uses System.String.IndexOf to check for that value before applying the regular expression replacement.
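To make the attributes described above concrete, here is an untested sketch of a replacer with one simple and one regex replacement. The element and attribute names come from this post; the find and replaceWith values are hypothetical, and the stock Web.config may carry further attributes on these elements that I do not show here, so keep whatever your Web.config already contains.

```xml
<!-- Sketch only: the find and replaceWith values are hypothetical examples. -->
<replacers>
  <replacer id="publish" mode="on" type="Sitecore.Text.Replacer">
    <replacements>
      <!-- Literal match, ignoring case: swap a test hostname for the production hostname. -->
      <simple find="http://test.example.com" replaceWith="http://www.example.com" ignoreCase="true" />
      <!-- Regex match; simpleTest lets Sitecore skip the regex unless the value contains the text "<hr". -->
      <regex find="&lt;hr[^&gt;]*&gt;" replaceWith="&lt;hr /&gt;" ignoreCase="true" simpleTest="&lt;hr" />
    </replacements>
  </replacer>
</replacers>
```

Note that the angle brackets in the attribute values appear as &lt; and &gt; because the values live inside an XML file.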
Both <regex> and <simple> elements support a forPublish attribute. If the value is true, Sitecore increments the Publishing.Replacements performance counter in the Sitecore.Jobs category.
Other than understanding the attributes of the <regex> and <simple> elements, you do not need to understand much of the explanation in the previous section just to use publishing replacers. What you need to do is set the mode attribute of the /configuration/sitecore/replacers/replacer element in the Web.config file with a value of publish for the id attribute to on and add your own /configuration/sitecore/replacers/replacer/replacements/simple and/or /configuration/sitecore/replacers/replacer/replacements/regex elements within that <replacer> element. You can do this with a Web.config include file such as this example, which enables the replacer and moves the default <simple> and <regex> examples to the include file to give you a starting place (be sure to remove any examples that you do not use). For more information about Web.config include files, see the blog post All About web config Include Files with the Sitecore ASP.NET CMS.
The process is not exactly trivial because Sitecore apparently did not intend for it, but you can implement your own replacements. You might implement a replacement, for example, if you need to determine the replacement string at runtime rather than specifying it in the Web.config file.
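As a minimal, untested sketch of that approach, an include file along these lines should switch the replacer on and add one replacement. The file name and the find and replaceWith values are hypothetical, and I am assuming that the patching logic matches the existing <replacer> element by its id attribute; this is not the example file referenced above.

```xml
<!-- Hypothetical /App_Config/Include/PublishReplacer.config; a sketch under the assumptions stated above. -->
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <replacers>
      <replacer id="publish">
        <!-- Switch the publishing replacer on without editing the Web.config file itself. -->
        <patch:attribute name="mode">on</patch:attribute>
        <replacements>
          <!-- Hypothetical replacement: rewrite a test hostname to the production hostname. -->
          <simple find="test.example.com" replaceWith="www.example.com" ignoreCase="true" />
        </replacements>
      </replacer>
    </replacers>
  </sitecore>
</configuration>
```

After deploying a file like this, check the merged configuration to confirm that the mode attribute and the new <simple> element ended up where you expect.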
Creating a replacement is simple:
The challenge is that you need to add your replacement to the replacer, which means mapping the element you use to configure that type of replacement (similar to <simple> and <regex>) to the class that implements the replacement. Unfortunately, the default replacer uses a private variable to store the list of replacements, and hard-codes the mapping of element names to replacement classes. This requires you to override the replacer, such as by creating a class that inherits from the default implementation (Sitecore.Text.Replacer). In that class:
This untested example implements a replacement that transforms a token such as $random to a random number and a replacer that uses <random> elements to configure that type of replacement, and includes a Web.config include file to enable that replacement.
I did not confirm, but would assume that Sitecore invokes replacements in the order they appear in the Web.config file. This might be important if you have two replacements that transform the same value, or if one replacement generates values that another replacement might transform. This could also affect the way you implement a custom replacer. Because you cannot add your replacements to the default ordered list of replacements, you may want your replacer to process your replacements before the base class processes its replacements, or afterwards, or you may want to override the components of the base class that define and use the private variable so that you can add all replacements to a single list. The example provided with this post applies its own types of replacements before the default replacements (its Replace() method applies its replacements and then calls Replace() in the base class).
It appears that the GetPublishedVersionOfItem processor in the filterItem pipeline invokes the replacer used by publishing. This affects managed sites for which the filterItems attribute in the corresponding /configuration/sitecore/sites/site element is true (also known as live mode). For more information about the filterItems attribute, see the comments above the /configuration/sitecore/sites element in the Web.config file. For information about managed sites, see the blog post Managed Web Sites in the Sitecore ASP.NET CMS. For information about live mode, see Live Mode on SDN, but note also the /App_Config/Include/LiveMode.config.example sample Web.config include file distributed with Sitecore CMS to easily enable live mode (rename without the .example extension). Note that I do not personally recommend live mode.
Replacements and replacers have no awareness of the context in which they run, such as the item or field they transform. They operate as simple filters on all field values without any knowledge of the fields that contain those values or the items that contain those fields.
According to the SDN forum thread HTML editor links - invalid xhtml, you may need to escape the ampersand character (&) in the find attribute of <regex> replacement elements as &amp;. You may need to escape other characters in a similar manner; I expect this applies to quote characters (" and ') and angle brackets (< and >).
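For illustration, this sketch shows how such characters look once escaped with the standard XML entities. The replacements themselves are hypothetical and untested; only the escaping is the point.

```xml
<!-- Hypothetical replacements; inside attribute values & is written as &amp;, " as &quot;, < as &lt;, and > as &gt;. -->
<replacements>
  <!-- Matches the literal text: <b> -->
  <simple find="&lt;b&gt;" replaceWith="&lt;strong&gt;" ignoreCase="true" />
  <!-- Matches the literal text &id= followed by digits and a double quote, for example: &id=42" -->
  <regex find="&amp;id=\d+&quot;" replaceWith="&quot;" ignoreCase="true" simpleTest="&amp;id=" />
</replacements>
```

The XML parser unescapes these entities before Sitecore sees the values, so the regular expression engine receives the plain characters.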
Excessive and expensive replacements could affect publishing performance. Replacements work, so if you accidentally configure them to replace data that the system should not transform, you may experience unexpected results.
Sitecore provides a number of facilities that support replacement and that may be more appropriate than the publishing replacer for various requirements. For example: