
Language Fallback with the SOLR Document Builder

In my post Sitecore 7 and Language Fallback, I gave, among other things, some tips and code for customizing the Lucene Document Builder so that it uses language fallback when creating Lucene indexes.

Since then, I have successfully done the same with Solr and wanted to share that here.

To make this work, you need to:

  1. Add a class that inherits from SolrDocumentBuilder, overrides the AddField method, and includes its own versions of the StoreField and GetFieldConfigurationBoost methods
  2. Update the documentBuilderType attribute in the config (or, better yet, patch it in your Fallback.config file)
  3. Use Solr as your index

This was more complicated to override than the Lucene Document Builder. The first reason is that the StoreField and GetFieldConfigurationBoost methods in the out-of-the-box SolrDocumentBuilder are private, so you can't just override the AddField method and then call StoreField and GetFieldConfigurationBoost. I had to reflect them as well and include copies of them in my class file, keeping the logic as close as possible to the original. Unfortunately, I yet again ran into an inaccessible method within StoreField, this.settings.DefaultLanguage(), so my copy assumes that 'en' is the default language.

From here, I had three rather significant departures from the original SolrDocumentBuilder.cs and the Lucene Document Builder I gave previously:

  1. Instead of using FallbackLanguageManager.ReadFallbackValue to get the fallback value of the field, I created a recursive method. I'm not sure why, but in this scenario ReadFallbackValue did not consistently return the value when fallback was chained (e.g. fr-CA -> en-CA -> en). In most places the fallback module's method works fine, and merely calling fallbackItem[field.ID] triggers the GetStandardValue method, which is overridden by the fallback module's standard values provider. But not here.
     The following method does the trick. It is called when the field is ValidForFallback and returns the fallback value as a string (an important distinction), which is then used in the Solr document builder:
    // Recursive method that keeps checking fallback items for a value
    // until it finds one or reaches an item that no longer falls back.
    public static string GetSitecoreFallbackValue(Item item, Field fld)
    {
        string currentValue = "";
        try
        {
            // Cannot check 'item.Fields[fld.ID].HasValue' first, because it returns false
            // even though '.Value' returns a value when standard values on the template come into play.
            Field field = item.Fields[fld.ID];
            if (field != null)
            {
                var value = field.Value;
                if (!string.IsNullOrEmpty(value))
                    return value;
            }

            var fallbackItem = item.GetFallbackItem();
            if (fallbackItem != null)
            {
                // Recursive call to find the fallback item that has a value for this field.
                currentValue = GetSitecoreFallbackValue(fallbackItem, fld);
            }
        }
        catch (Exception)
        {
            // Any exception is swallowed; the method then returns an empty string.
        }
        return currentValue;
    }
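To make the chained case (fr-CA -> en-CA -> en) concrete, here is a minimal, Sitecore-free sketch of the same lookup. The class name FallbackChainDemo and both dictionaries are hypothetical stand-ins for the item's language versions and the fallback configuration; they are not part of Sitecore or the fallback module. It walks the chain in a loop rather than recursively and adds a visited set, which the method above does not have, so a misconfigured circular fallback cannot loop forever.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only: models one field's values per language
// and a language -> fallback-language map with plain dictionaries.
public static class FallbackChainDemo
{
    public static string Resolve(
        string language,
        IDictionary<string, string> valuesByLanguage,
        IDictionary<string, string> fallbackMap)
    {
        // Remember every language already checked so a cycle ends the walk.
        var visited = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        string current = language;

        while (current != null && visited.Add(current))
        {
            string value;
            if (valuesByLanguage.TryGetValue(current, out value) && !string.IsNullOrEmpty(value))
                return value; // first language in the chain that has a value wins

            string next;
            current = fallbackMap.TryGetValue(current, out next) ? next : null;
        }

        // Nothing in the chain had a value.
        return "";
    }
}
```

With a value stored only for "en" and a map of fr-CA to en-CA and en-CA to en, resolving "fr-CA" walks two steps and returns the English value.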
  2. The Solr document builder uses the various field readers to get the field value. Depending on the type of field, it casts the value into whichever type makes the most sense for storage; it is NOT just a string. This should probably be implemented in the Lucene Document Builder as well (which I have not yet done). The fallback method, however, only returns a string, and a value saved to the index that way will hinder search capabilities on the index.
     Therefore we extract the type of the value returned from the field reader with fieldValue.GetType(). Then, based on that type (List, Guid, Boolean, DateTime and String seemed to cover all of the main types of fields), we massage the string returned from the fallback method into the appropriate type of object. Example for Guid:
    else if (fieldType.Name == "Guid" && ID.IsID(thisFieldValue))
    {
        fieldValue = ShortID.Encode(thisFieldValue).ToLowerInvariant();
    }
     We can then pass that object through to the StoreField method as an object, and it will be stored appropriately in the index.
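As a rough sketch of that type-massaging step, the helper below converts a fallback string into the kind of object the field reader would have produced. It is not the code from the repository: the name FallbackValueConverter is hypothetical, it uses plain .NET types instead of Sitecore's ID and ShortID, and it assumes the usual raw formats (a checked checkbox stored as "1", dates stored like 20140315T103000, multi-value fields stored as pipe-separated GUIDs).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical helper; the raw formats handled here are assumptions, see the lead-in.
public static class FallbackValueConverter
{
    // Assumed raw date formats, with and without a trailing 'Z'.
    private static readonly string[] DateFormats =
    {
        "yyyyMMdd'T'HHmmss",
        "yyyyMMdd'T'HHmmss'Z'"
    };

    // raw: the string returned by the fallback lookup.
    // targetType: the type the field reader returned, i.e. fieldValue.GetType().
    public static object ToIndexValue(string raw, Type targetType)
    {
        if (string.IsNullOrEmpty(raw) || targetType == null)
            return raw;

        if (targetType == typeof(bool))
            return raw == "1";

        if (targetType == typeof(DateTime))
        {
            DateTime parsed;
            bool ok = DateTime.TryParseExact(raw, DateFormats,
                CultureInfo.InvariantCulture, DateTimeStyles.None, out parsed);
            return ok ? (object)parsed : raw; // leave unparseable values as strings
        }

        if (targetType == typeof(Guid))
            return NormalizeId(raw);

        if (typeof(IEnumerable<string>).IsAssignableFrom(targetType))
        {
            return raw.Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries)
                      .Select(id => NormalizeId(id))
                      .ToList();
        }

        return raw; // plain strings pass through unchanged
    }

    // Lower-case, 32-digit form without braces or dashes, the same shape
    // that ShortID.Encode(...).ToLowerInvariant() yields in the snippet above.
    private static string NormalizeId(string value)
    {
        Guid id;
        return Guid.TryParse(value, out id) ? id.ToString("N") : value;
    }
}
```

Inside a custom AddField, a call along the lines of ToIndexValue(fallbackString, fieldValue.GetType()) would produce the object to hand to StoreField.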

 

  3. The final difference from the original AddField method is that I changed the logic that gets the indexFieldName so that it does not use the culture.
     From this:
    string indexFieldName = this.fieldNameTranslator.GetIndexFieldName(name, fieldValue.GetType(), this.culture);

    To this:
    string indexFieldName = this.fieldNameTranslator.GetIndexFieldName(name, fieldValue.GetType()); 
     Perhaps I am being shortsighted, but I didn't see the value in storing each language version under a different field name with the culture appended, like headline_fr. On the front end, I want to search the index on the name of the field alone and not have to change that name based on the current culture. Instead, I should just be able to add query criteria for the language I am searching, and the name of the field stays the same regardless.
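A small sketch of what that buys on the query side: the field name stays fixed and the language becomes a separate filter. The class LanguageFilteredQuery and the field name headline_t are hypothetical. _language is the field Sitecore writes the item language to, and q and fq are the standard Solr query and filter-query parameters. The term is assumed to contain no Solr special characters; real code would need to escape them.

```csharp
using System;

// Hypothetical helper that builds the query-string portion of a Solr request.
public static class LanguageFilteredQuery
{
    public static string Build(string fieldName, string term, string language)
    {
        // The same field name is used for every language...
        string q = fieldName + ":" + term;
        // ...and the language is applied as a filter query instead.
        string fq = "_language:" + language;

        return "q=" + Uri.EscapeDataString(q) + "&fq=" + Uri.EscapeDataString(fq);
    }
}
```

Switching from French to English then only changes the language argument, for example Build("headline_t", "soleil", "fr-CA") versus Build("headline_t", "sun", "en").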

The code for this can be found here:

https://github.com/Verndale-Corp/Sitecore-7-Fallback-Solr-Indexing

PS: You'll also find a version of SearchHelper.cs that includes some customizations made for working with Solr: a different way to search location, a workaround for forcing exact-match searches that contain special characters in text fields and strings, non-plural searches that still need to find a match even if the non-plural version of the word does not actually exist within the content, and field boosting.

  • Hi Elizabeth, Great post. When reviewing your code on GitHub I found a reference to this class/method in the updated SearchHelper: Verndale.SharedSource.Utilities.TextUtilityStatic.GetSingular. It doesn't appear that it's included in the Verndale.SharedSource project. Could you by chance update with this file? Thanks, Jay