Sitecore 7: Computed Index Fields

This blog post explains how you can add computed fields to search indexes in version 7 of the Sitecore ASP.NET web Content Management System (CMS). Computed fields allow you to index values calculated while indexing, such as the URL of each item. Before you read this blog post, please read the Sitecore 7: Introduction blog post linked in the list of resources at the end of this page.

Adding fields to an index can improve runtime performance by making data available in the index rather than requiring a visit to a data source, such as an item in a Sitecore database. One tradeoff involved in adding fields to an index is that each such index field increases the weight of that index, meaning the amount of resources required to generate and store the index (for example, processing time and disk space).

At this point it might be valuable to indicate and differentiate at least three definitions of the term field in the context of Sitecore development:

  • In .NET programming, a field is a variable of any type declared directly in a class or struct (structure).
  • In Sitecore, fields contain values that constitute most of the data that makes up an item.
  • In common search index terminology, a field is a discrete indexed value. In the context of Sitecore, indexed documents usually correspond to items, although search indexes can also contain documents that do not correspond to items. For documents that do correspond to items, many index fields correspond directly to the fields in those items, but some index fields have no relation to any field in the item.

It might also be worthwhile to mention that search indexes typically have no schema. In other words, you can think of a document as a flat list of named field values, where any document can contain any fields. This makes it very easy to add fields to the index, without the need to update a database schema or even a Sitecore data template.
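
To make the schema-less idea concrete, here is a minimal sketch in plain C# that models two documents as flat dictionaries of named values. It does not use the Sitecore or Lucene APIs; the class name, field names, and values are hypothetical and exist only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class SchemalessSketch
{
  // Hypothetical helper: builds two "documents" as flat name/value lists.
  public static List<Dictionary<string, object>> BuildDocuments()
  {
    // A document that corresponds to an item and carries a computed hasurl value.
    var page = new Dictionary<string, object>
    {
      { "_name", "Home" },
      { "title", "Welcome" },
      { "hasurl", true }
    };

    // Another document in the same index with a different set of fields.
    var folder = new Dictionary<string, object>
    {
      { "_name", "Settings" }
    };

    return new List<Dictionary<string, object>> { page, folder };
  }

  public static void Main()
  {
    foreach (var document in BuildDocuments())
    {
      // No schema: each document lists only the fields it actually has.
      Console.WriteLine(string.Join(", ", document.Keys));
    }
  }
}
```

Adding a field such as hasurl to one document requires no change to the other, which is what makes computed fields cheap to introduce.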

Sitecore 7 ships configured to index a number of fields. In fact, one objective of this version is to reduce the use of the Sitecore.Data.Items.Item class by allowing developers to retrieve data directly from the index. 

In content delivery environments, presentation components that use APIs to access search indexes often need to limit results by excluding items that do not have URLs. For example, a search results page should not contain links to items that Sitecore cannot render as pages. 

The code and configuration in this blog post add a field named hasurl to the index. That field contains a Boolean value that indicates whether a document (item) has a URL.

To use this code, your Visual Studio project should reference the new Sitecore.ContentSearch.dll assembly shipped with Sitecore 7 (in the Website/bin subdirectory of your Sitecore installation). Remember to set the Copy Local property of the reference to false. I assume your project already references the Sitecore.Kernel.dll assembly.

To code a computed field, create a class that implements the Sitecore.ContentSearch.ComputedFields.IComputedIndexField interface. This interface requires that your class implement simple string properties named FieldName and ReturnType, but more importantly, a method named ComputeFieldValue(). This method accepts an argument that implements the Sitecore.ContentSearch.IIndexable interface, which specifies the data to index, and returns an object, which represents the value for the field. In the case of Sitecore items, this interface abstracts the underlying Sitecore.Data.Items.Item object. We can retrieve the item (content, media, or other) from the Sitecore.ContentSearch.IIndexable argument passed to the ComputeFieldValue() method. 

Here is some sample code for adding a computed Boolean field to the index to indicate whether each document has a URL:

namespace Sitecore.Sharedsource.ContentSearch.ComputedFields
{
  using System.Linq;
 
  using Assert = Sitecore.Diagnostics.Assert;
  using Log = Sitecore.ContentSearch.Diagnostics.CrawlingLog;
 
  using SC = Sitecore;
 
  public class HasUrl : Sitecore.ContentSearch.ComputedFields.IComputedIndexField
  {
    public string FieldName { get; set; }
 
    public string ReturnType { get; set; }
 
    public object ComputeFieldValue(Sitecore.ContentSearch.IIndexable indexable)
    {
      Assert.ArgumentNotNull(indexable, "indexable");
      SC.ContentSearch.SitecoreIndexableItem scIndexable =
        indexable as SC.ContentSearch.SitecoreIndexableItem;
 
      if (scIndexable == null)
      {
        Log.Log.Warn(
          this + " : unsupported IIndexable type : " + indexable.GetType());
        return false;
      }
 
      SC.Data.Items.Item item = (SC.Data.Items.Item)scIndexable;
 
      if (item == null)
      {
        Log.Log.Warn(
          this + " : unsupported SitecoreIndexableItem type : " + scIndexable.GetType());
        return false;
      }
 
      // optimization to reduce indexing time
      // by skipping this logic for items in the Core database
      if (System.String.Compare(
        item.Database.Name,
        "core",
        System.StringComparison.OrdinalIgnoreCase) == 0)
      {
        return false;
      }
 
      if (item.Paths.IsMediaItem)
      {
        return item.TemplateID != SC.TemplateIDs.MediaFolder
          && item.ID != SC.ItemIDs.MediaLibraryRoot;
      }
 
      if (!item.Paths.IsContentItem)
      {
        return false;
      }
 
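      // The item has a URL if it has a layout for at least one device;
      // for feed items, the feed device does not count.
      return item.Database.Resources.Devices.GetAll()
        .Where(device => device.ID != SC.Syndication.FeedUtil.FeedDeviceId
          || !SC.Syndication.FeedUtil.IsFeed(item))
        .Any(device => item.Visualization.GetLayout(device) != null);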
    }
  }
}

Different implementations may use different logic to determine whether an item has a URL (and therefore the logic probably belongs in a pipeline or provider).

Here is a sample Web.config include file (Sitecore.Sharedsource.IndexHasUrl.config in my case) to add this computed field to all of the new indexes:

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <contentSearch>
      <configuration>
        <DefaultIndexConfiguration>
          <fields hint="raw:AddComputedIndexField">
            <field fieldName="hasurl" storageType="no" indexType="tokenized">Sitecore.Sharedsource.ContentSearch.ComputedFields.HasUrl,Sitecore.Sharedsource</field>
          </fields>
        </DefaultIndexConfiguration>
      </configuration>
    </contentSearch>
  </sitecore>
</configuration>

As you can see from this configuration, in addition to the class that implements the logic to calculate the value to index, you can specify whether the index should store the indexed value (useful if you need to retrieve the value from the index as opposed to just using it for matching) and whether to tokenize that value (treat a string containing multiple words as separate words rather than as a single phrase; for example, matching "John" separately from "West" rather than only "John West"). In this example, there is no need to store a value for the hasurl field, as the default implementation of Boolean fields indicates their value.
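
The difference between tokenized and untokenized indexing can be illustrated with a small sketch in plain C#. This is not how Lucene or Sitecore analyze text internally; the class and method names are hypothetical, and lowercasing plus splitting on spaces merely stands in for a real analyzer.

```csharp
using System;
using System.Collections.Generic;

public static class TokenizationSketch
{
  // Tokenized: each word becomes a separate searchable term.
  public static HashSet<string> IndexTokenized(string value)
  {
    return new HashSet<string>(
      value.ToLowerInvariant()
        .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));
  }

  // Untokenized: the whole value is a single searchable term.
  public static HashSet<string> IndexUntokenized(string value)
  {
    return new HashSet<string> { value.ToLowerInvariant() };
  }

  public static void Main()
  {
    var tokenized = IndexTokenized("John West");
    var untokenized = IndexUntokenized("John West");

    Console.WriteLine(tokenized.Contains("john"));        // True
    Console.WriteLine(untokenized.Contains("john"));      // False
    Console.WriteLine(untokenized.Contains("john west")); // True
  }
}
```

With the tokenized form, a query for "john" alone finds the value; with the untokenized form, only the complete value matches.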

Because we placed this field definition within the <DefaultIndexConfiguration> element in the Web.config file, all indexes that inherit that configuration (which means all of the new Sitecore 7 indexes) inherit this computed field. Instead of putting logic in the code to ignore elements in the Core database, we could configure the index for the core database to exclude this field. 

Remember that you must re-index the data to cause the new field to appear in the index. I know that the following could use more contextual information, but this blog post is already too long. The following class abstracts documents/items and exposes a HasUrl Boolean property based on this computed index field. Sitecore sets this property for us automatically from the indexed computed value, because the property name matches the name of the field in the index case-insensitively (field names in the index are lowercase by default):

namespace Sitecore.Sharedsource.ContentSearch.SearchTypes
{
  using SC = Sitecore;
 
  public class SearchResultItem : SC.ContentSearch.SearchTypes.SearchResultItem
  {
    public bool HasUrl { get; set; }
  }
}

You can use code such as the following to retrieve instances of this class representing all documents/items that have a URL in the default index associated with an item (normally you would include additional criteria to limit the results):

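// Assumes using System.Linq, the SC and Assert aliases from the HasUrl listing,
// and a text writer named output (its declaration is not shown here).
SC.Data.Items.Item item = Sitecore.Context.Item;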
Assert.IsNotNull(item, "item");
Sitecore.ContentSearch.SitecoreIndexableItem sItem =
  new Sitecore.ContentSearch.SitecoreIndexableItem(item);
 
using (
  Sitecore.ContentSearch.IProviderSearchContext context =
    SC.ContentSearch.SearchManager.CreateSearchContext(sItem))
{
  foreach (SC.Sharedsource.ContentSearch.SearchTypes.SearchResultItem result
    in context.GetQueryable<SC.Sharedsource.ContentSearch.SearchTypes.SearchResultItem>().Where(x => x.HasUrl))
  {
    output.WriteLine(result.Path + " : " + result.HasUrl + "<br />");
  }
}

Resources

  • Hi Hemant,

    Instead of searching in Business template items, search in product template items with the condition "BusinessMultiListField_sm = businessName". In this case, product items will be the results, and facets can be applied to them.

    If you have a constraint to search only in Business template items, then add another computed field to the business items: a string collection that stores the facet field values of the tagged products. Your other computed field will have the product URLs, and this computed field will have the facet field values of those products. In this case, the facet can be applied to this new computed field.

    Please let me know if I suggested correctly or I misunderstood your requirements.

    Thanks, G. Naresh Kumar

  • Hi John,

    I have a custom search index with a crawler rooted at /sitecore/content/Home. This means it will index all items under /Home, depending on indexConfigurations.

    My question: if a page under Home has controls with a data source (for example, a Content Repository) that is not under Home, is a computed field a good option for indexing values from the data source?