Replacing the Collection Database in Sitecore XP

The Sitecore Experience Platform (XP) is a lot more interesting when the Experience Database (xDB) is populated with data. You can use the experience profile, experience analytics, path analyzer and a number of other tools.

Of course, when you install Sitecore XP for the first time your xDB is going to be empty. If you're fortunate enough to find someone with a collection database that you can use, you need to get that database connected.

This post explains how to connect a collection database to an existing Sitecore server and make sure that Sitecore processes the data in that database.

What is the collection database?

The collection database is the MongoDB database that collects data that describes visitors' digital experiences.

The collection database is not the xDB. It is one part of the xDB. Specifically, it's the place where as much data as possible is collected as quickly as possible.

Do I need anything in addition to a collection database?

The various tools in Sitecore XP (such as experience profile and path analyzer) do not access the collection database directly. The data is simply not in a format where it can be used efficiently.

Periodically the data in the collection database is aggregated. This is where the data is parsed and transformed into formats that make the data more usable.

Two other components that make up the xDB are the analytics index and the reporting database. These components are populated by the aggregation process. The various tools in Sitecore XP read their data from these components.

So simply connecting a collection database is not enough. You also need to have the aggregation process run. This is accomplished by rebuilding the reporting database.

How do I connect the collection database and get it processed?

OK. Enough background. Time to import some data.

Step 1. Install Sitecore site

Make sure the site that the collection database was recorded from is installed on your Sitecore server.

Step 2. Add the collection database to MongoDB

  1. Shut down your MongoDB server.
  2. Copy the collection database files to the data folder used by your MongoDB server.
  3. Start your MongoDB server.

Step 3. Configure the collection database in Sitecore

The collection database is specified in ConnectionStrings.config under the name analytics. Make sure that this setting matches the collection database you added to MongoDB.
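As a rough illustration, the entry might look something like the fragment below. The host, port and database name are placeholders rather than values from this post. The database name in the connection string should be the name of the database you added to MongoDB.

```xml
<!-- ConnectionStrings.config (fragment). Host, port and database name are placeholders. -->
<add name="analytics" connectionString="mongodb://[mongo-host]:27017/[collection-database-name]" />
```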

Step 4. Connect a secondary reporting database

The import process doesn't write data to the primary reporting database. Instead, it populates a separate database called the secondary reporting database.

The secondary reporting database uses the same schema as the primary reporting database.

  1. Copy the primary reporting database. This is the database specified in ConnectionStrings.config under the name reporting. The name of the new database doesn't matter.
  2. Add an entry in ConnectionStrings.config for the secondary reporting database. The name of the entry must be reporting.secondary.
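A sketch of what the two entries might look like once the copy exists. The server, credentials and database names in square brackets are placeholders. Only the entry names reporting and reporting.secondary come from the steps above.

```xml
<!-- ConnectionStrings.config (fragment). Bracketed values are placeholders. -->

<!-- Existing entry: the primary reporting database. -->
<add name="reporting" connectionString="user id=[user];password=[password];Data Source=[sql-server];Database=[reporting-database]" />

<!-- New entry: the copy of the primary, used as the rebuild target. -->
<add name="reporting.secondary" connectionString="user id=[user];password=[password];Data Source=[sql-server];Database=[reporting-database-copy]" />
```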

Step 5. Set the deployment date on the secondary reporting database

When the reporting database is rebuilt, not all of the data from the collection database is used. How much data is used is determined by the configuration of each report segment.

Only data collected after the date the report segment was deployed is included. The deploy date for each report segment is stored in the Segments table of the reporting database.

You can change the deploy date using a simple SQL statement, run against the secondary reporting database. Be sure to set the deploy date to a date that is earlier than the oldest data you want to include in reports:

UPDATE Segments SET DeployDate='2010-01-01 00:00:00.000'

Step 6. Rebuild the reporting database

Next you need to start the aggregation process.

  1. Log into Sitecore as an administrator.
  2. Navigate to http://[your-host]/sitecore/admin/RebuildReportingDB.aspx
  3. Click "Start".

This process can take a bit of time, depending on how much data is in the collection database and the deploy dates for your segments.

For example, on my modest developer machine (running IIS, SQL Server and MongoDB, along with a whole collection of other software that would never be on a server), a rebuild with 10k contacts and 14k interactions spanning 8 months took 50 minutes.

Eventually you will see that the process has completed.

Step 7. Change the connection strings for the reporting databases

The database identified as the secondary reporting database is now populated with the data from the collection database. The database identified as the primary reporting database hasn't been touched.

Sitecore needs to be reconfigured so that the database currently identified as the secondary becomes the primary.

In ConnectionStrings.config, swap the connection strings of the entries named reporting and reporting.secondary.
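After the swap, the entries might look roughly like this. The bracketed values are placeholders. The point is that reporting now refers to the database the rebuild populated, and reporting.secondary refers to the old primary.

```xml
<!-- ConnectionStrings.config (fragment) after the swap. Bracketed values are placeholders. -->

<!-- reporting now points at the database that was populated by the rebuild. -->
<add name="reporting" connectionString="user id=[user];password=[password];Data Source=[sql-server];Database=[reporting-database-copy]" />

<!-- reporting.secondary now points at the original, untouched primary. -->
<add name="reporting.secondary" connectionString="user id=[user];password=[password];Data Source=[sql-server];Database=[reporting-database]" />
```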

Step 8. Rebuild Path Analyzer Maps

Next you need to rebuild the maps used by the path analyzer.

  1. Log into Sitecore as an administrator.
  2. Navigate to http://[your-host]/sitecore/admin/PathAnalyzer.aspx
  3. Click "Rebuild".


Here are some links with more information: