
Sitecore 7: Rebuild Lucene Indexes in Separate Subdirectories

This blog post explains how you can configure version 7 of the Sitecore ASP.NET web Content Management System (CMS) to rebuild Lucene search indexes in temporary subdirectories. Rebuilding in a temporary subdirectory prevents Lucene from resetting (deleting) the file system subdirectory that contains the index before rebuilding it. This solution is often appropriate for production and some types of testing environments, but may not be important for development and other types of testing.

Update 23.May.2013: I must have written the original title of this post before investigating. I've updated the title, but not changed the URL of the post. Contrary to what the original title indicated, Sitecore does not use temporary subdirectories with this configuration; it switches between two active directories, as described in the text.

The implementation that includes the subdirectory-switching feature (Sitecore.ContentSearch.LuceneProvider.SwitchOnRebuildLuceneIndex) inherits from the default class that Sitecore 7 uses to represent and manage Lucene indexes (Sitecore.ContentSearch.LuceneProvider.LuceneIndex). For an index such as sitecore_web_index, typically managed in the /data/indexes/sitecore_web_index subdirectory, the SwitchOnRebuildLuceneIndex provider ensures the existence of a corresponding subdirectory with the _sec suffix, such as /data/indexes/sitecore_web_index_sec.

The SwitchOnRebuildLuceneIndex provider compares the last modification dates of the two subdirectories (specifically, using the LastModified() static method of the Lucene.Net.Index.IndexReader abstract class). It uses the more recently updated of the two subdirectories for read and update operations, and the other for full index rebuilds. When a rebuild completes, the provider switches the two subdirectories.

Sitecore manages information about the current subdirectory status for such indexes in the index property store, which is not complicated but beyond the scope of this post. You can see messages about the activities of the SwitchOnRebuildLuceneIndex in the crawling log.

To configure an index to use the SwitchOnRebuildLuceneIndex provider, set the type attribute of the appropriate <index> element in the Web.config file. For example, you can use the following Web.config include file (Sitecore.Sharedsource.SwitchOnRebuildLuceneIndex.config in my case) to apply this provider for the three default indexes.

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <contentSearch>
      <configuration>
        <indexes>
          <index id="sitecore_master_index">
            <patch:attribute name="type">Sitecore.ContentSearch.LuceneProvider.SwitchOnRebuildLuceneIndex,Sitecore.ContentSearch.LuceneProvider</patch:attribute>
          </index>
          <index id="sitecore_web_index">
            <patch:attribute name="type">Sitecore.ContentSearch.LuceneProvider.SwitchOnRebuildLuceneIndex,Sitecore.ContentSearch.LuceneProvider</patch:attribute>
          </index>
          <index id="sitecore_core_index">
            <patch:attribute name="type">Sitecore.ContentSearch.LuceneProvider.SwitchOnRebuildLuceneIndex,Sitecore.ContentSearch.LuceneProvider</patch:attribute>
          </index>
        </indexes>
      </configuration>
    </contentSearch>
  </sitecore>
</configuration>
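
If only one index needs this behavior, for example the web index on a production content delivery server, the same patch can be narrowed to a single <index> element. The following is a minimal sketch under that assumption: the index id and type string are the ones used in this post, while the include file name in the comment is hypothetical and the choice of the web index is only an illustration.

```xml
<!-- Hypothetical include file, e.g. App_Config/Include/SwitchOnRebuild.WebOnly.config -->
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <contentSearch>
      <configuration>
        <indexes>
          <!-- Only the web index switches directories on rebuild;
               the master and core indexes keep their default type. -->
          <index id="sitecore_web_index">
            <patch:attribute name="type">Sitecore.ContentSearch.LuceneProvider.SwitchOnRebuildLuceneIndex,Sitecore.ContentSearch.LuceneProvider</patch:attribute>
          </index>
        </indexes>
      </configuration>
    </contentSearch>
  </sitecore>
</configuration>
```

Indexes not listed in the include file are left as configured elsewhere, so this variant may suit environments where only the published content index needs to stay available during a rebuild.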

In order to create the _sec subdirectory, you may need to rebuild an index twice after updating its configuration to use the SwitchOnRebuildLuceneIndex provider.

Resources