Using Custom Contact Data Part 2 - Search

Starting with Sitecore 7.5 an individual visitor is represented using a contact. Information that is collected about the contact is stored in contact facets. This is part 2 of a 3-part series that explores how data stored in contact facets can be used throughout the Sitecore Experience Platform.

In this post I will cover how data stored in contact facets can be indexed and searched along with the rest of the xDB data.

Prerequisites

In my previous post I covered how to get custom contact data into xDB using contact facets. This post assumes you have read that post.

Introduction to Analytics and Search

In my opinion one of the most exciting new features included in Sitecore 7.5 is that Sitecore analytics are now being indexed. This means that all of the search capabilities available to managed content (such as the LINQ-based search provider) are now available to analytics.

The implications of this new, powerful and usable feature are huge. It will serve as the foundation of a whole new generation of marketing tools. But covering this topic - both what it can do and how it works - is far beyond the scope of this humble blog post.

However, I do need to introduce the basic concepts of how analytics data is indexed, because the goal of this post is to explain how to get custom contact facet data indexed.

At a minimum there are 3 components to understand:

  1. Indexable object - contains the data that will be indexed
  2. Aggregator - makes the indexable object available to the crawler
  3. Crawler - determines and issues the commands needed in order to get the indexable object indexed

I need to create classes for these components and configure Sitecore to use them.

Step 1 - Add Assembly References to Visual Studio Project

Add the following references to the Visual Studio Project:

  • Sitecore.ContentSearch.dll
  • Sitecore.ContentSearch.Analytics.dll
  • Sitecore.ContentSearch.Linq.dll
  • Sitecore.Analytics.Aggregation.dll

Step 2 - Implement the Indexable Type

The main job of the indexable type is to populate a property that holds a list of Sitecore.ContentSearch.IIndexableDataField objects. During indexing, each IIndexableDataField is matched against the search configuration settings to determine the appropriate indexing settings (such as whether to tokenize the value, what type of value is being handled and so on).

The following screenshot shows an example of the document in the search engine (in this case Lucene) that represents the employee data that I am storing in a contact facet. Some of the fields are generated by the indexing process that Sitecore controls, but most of the fields are the result of the code below.

Add the following code to the Visual Studio project:

using System;
using System.Linq;
using System.Collections.Generic;
using Sitecore.ContentSearch;
using Testing.ContactFacets.Model;
 
namespace Testing.ContactFacets.Search
{
    public class EmployeeDataIndexable : AbstractIndexable
    {
        public EmployeeDataIndexable(Guid contactGuid, IEmployeeData data)
        {
            var str = contactGuid + "employeedata";
            base.Id = new IndexableId<string>(str);
            base.UniqueId = new IndexableUniqueId<string>(string.Format("{0}|{1}", "employee", str));
            base.DataSource = "sitecore_aggregation";
            base.AbsolutePath = string.Empty;
            base.Culture = System.Globalization.CultureInfo.CurrentCulture;
            this.LoadFields(contactGuid, data);
        }
        public override void LoadAllFields()
        {
            //Override this method, otherwise the fields
            //specified in LoadFields will not be indexed!
        }
        protected virtual void LoadFields(Guid contactGuid, IEmployeeData data)
        {
            var list = new List<IIndexableDataField>
            {
                new IndexableDataField<string>("type", "employee"),
                new IndexableDataField<Guid>("contact.ContactId", contactGuid),
                new IndexableDataField<string>("employee.id", data.EmployeeId)
            };
            base.Fields = list;
        }
    }
}

Step 3 - Implement the Aggregator

The aggregator reads the employee data from the contact facet and creates an instance of the indexable type.

Add the following code to the Visual Studio project:

using System;
using System.Linq;
using System.Collections.Generic;
using Sitecore.ContentSearch.Analytics.Aggregators;
using Sitecore.Analytics.Aggregation.Pipeline;
using Testing.ContactFacets.Model;
 
namespace Testing.ContactFacets.Search
{
    public class AnalyticsEmployeeDataAggregator : ObservableAggregator<EmployeeDataIndexable>
    {
        public AnalyticsEmployeeDataAggregator(string name) : base(name) { }
 
        protected override IEnumerable<EmployeeDataIndexable> ResolveIndexables(AggregationPipelineArgs args)
        {
            if (args.Context.Contact == null)
            {
                yield break;
            }
            var data = args.Context.Contact.GetFacet<IEmployeeData>("EmployeeData");
            yield return new EmployeeDataIndexable(args.Context.Contact.Id.Guid, data);
        }
    }
}

Step 4 - Implement the Crawler

I don't need to do much in order to implement the crawler. It is basically just a matter of specifying the type for the generic parameter. The base crawler handles everything else.

Add the following code to the Visual Studio project:

using System;
using System.Linq;
using Sitecore.ContentSearch.Analytics.Crawlers;
 
namespace Testing.ContactFacets.Search
{
    public class AnalyticsEmployeeDataCrawler : ObserverCrawler<EmployeeDataIndexable>
    {
    }
}

Step 5 - Register Search Components

Sitecore needs to be configured to use my aggregator and crawler and to know how to handle the fields in my indexable object.

I need to add the following configuration into my config file. This markup goes inside the /configuration/sitecore node:

<contentSearch>
  <configuration type="Sitecore.ContentSearch.ContentSearchConfiguration, Sitecore.ContentSearch">
    <indexes hint="list:AddIndex">
      <index id="sitecore_analytics_index" type="Sitecore.ContentSearch.LuceneProvider.LuceneIndex, Sitecore.ContentSearch.LuceneProvider">
        <param desc="name">$(id)</param>
        <param desc="folder">$(id)</param>
        <param desc="propertyStore" ref="contentSearch/indexConfigurations/databasePropertyStore" param1="$(id)" />
        <configuration ref="contentSearch/indexConfigurations/defaultLuceneIndexConfiguration">
          <fieldMap ref="contentSearch/indexConfigurations/defaultLuceneIndexConfiguration/fieldMap">
            <fieldNames hint="raw:AddFieldByFieldName">
              <field fieldName="employee.id" storageType="YES" indexType="UNTOKENIZED" vectorType="WITH_POSITIONS_OFFSETS" boost="1f" emptyString="_EMPTY_" nullValue="_NULL_" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider" />
            </fieldNames>
          </fieldMap>
        </configuration>
        <locations hint="list:AddCrawler">
          <crawler type="Testing.ContactFacets.Search.AnalyticsEmployeeDataCrawler, Testing.ContactFacets">
            <CrawlerName>Lucene Employee Data Crawler</CrawlerName>
            <ObservableName>EmployeeDataObservable</ObservableName>
          </crawler>
        </locations>
      </index>
    </indexes>
  </configuration>
</contentSearch>
<pipelines>
  <employeedataobservable.filter.inbound />
  <group groupName="analytics.aggregation">
    <pipelines>
      <interactions>
        <processor type="Testing.ContactFacets.Search.AnalyticsEmployeeDataAggregator, Testing.ContactFacets">
          <param desc="name">EmployeeDataObservable</param>
        </processor>
      </interactions>
    </pipelines>
  </group>
</pipelines>
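
For readers unsure where this markup lives, the following is a minimal sketch of a complete include file that wraps it. The file name and location (for example App_Config/Include/Testing.ContactFacets.Search.config) are my assumption, not something the setup above requires; any include file that Sitecore merges into the /configuration/sitecore node should work the same way.

```xml
<!-- Hypothetical include file: App_Config/Include/Testing.ContactFacets.Search.config -->
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <contentSearch>
      <!-- children of the <contentSearch> element shown above go here -->
    </contentSearch>
    <pipelines>
      <!-- children of the <pipelines> element shown above go here -->
    </pipelines>
  </sitecore>
</configuration>
```

After deploying the file, /sitecore/admin/showconfig.aspx shows the merged configuration, which is a quick way to check that the crawler and the aggregator processor ended up where expected.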

Step 6 - Deploy the Component

I need to compile my code and deploy my assembly and config file to my Sitecore server.

Confirming the Component Works

There are a variety of ways I can confirm this code works.

The quickest and easiest way is to query the search engine directly. I'm using Lucene as my search engine, so I can use the Luke tool to open the index named sitecore_analytics_index.

A more exciting way is to use the Sitecore content search API. I am a big LINQPad user (with the LINQPad driver for Sitecore, of course) so I open up LINQPad, connect to my Sitecore server, select the index named sitecore_analytics_index and run the following code:

void Main()
{
  var index = ContentSearchManager.GetIndex("sitecore_analytics_index");
  using (var context = index.CreateSearchContext())
  {
    context.GetQueryable<EmployeeData>()
      .Where(data => data.EmployeeId == "ABC123").Dump();
  }
}
 
public class EmployeeData
{
  [IndexField("contact.contactid")]
  public ID ContactId { get; set; }
   
  [IndexField("employee.id")]
  public string EmployeeId { get; set; }
}

You can also use the Sitecore logs to monitor the progress of the indexing process. Look for the crawling log.

Next Steps

My next post will conclude this 3-post series on how to use the custom contact data available in contact facets. If you haven't read part 1, now might be a good time.

  • Any directions on how to do this using Solr? I can't seem to get anything in the index.

  • @Peter Clark:  I got something similar to this working in SOLR using the following modifications:  1. Use "ContactChangeProcessor<T>" as the base class for your pipeline processor.  The differences between the two base classes are minor and you should be able to figure them out pretty easily.  2. Put your pipeline processor in the "/configuration/sitecore/pipelines/contacts" pipeline instead of the "interactions" pipeline.  If you look at "Sitecore.Analytics.Processing.Aggregation.config" configuration file you should see the two pipelines, or use "/sitecore/admin/showconfig.aspx" to figure things out.  I don't know whether the approach of not using the "interactions" pipeline works for your use case, but maybe it can point you in an alternate direction.