The Sitecore Experience Platform (XP) is a lot more interesting when the Experience Database (xDB) is populated with data. You can use the experience profile, experience analytics, path analyzer and a number of other tools.
Of course, when you install Sitecore XP for the first time your xDB is going to be empty. If you're fortunate enough to find someone with a collection database that you can use, you need to get that database connected.
This post explains how to connect a collection database to an existing Sitecore server and how to make sure Sitecore processes the data in that database.
The collection database is the MongoDB database that collects data that describes visitors' digital experiences.
The collection database is not the xDB. The collection database is a part of the xDB. Specifically it's the place where as much data is collected as quickly as possible.
The various tools in Sitecore XP (such as experience profile and path analyzer) do not access the collection database directly. The data is simply not in a format where it can be used efficiently.
Periodically the data in the collection database is aggregated. During aggregation the data is parsed and transformed into formats that are more usable.
Two other components that make up the xDB are the analytics index and the reporting database. These components are populated by the aggregation process, and they are what the various tools in Sitecore XP read from.
So simply connecting a collection database is not enough. You also need to have the aggregation process run. This is accomplished by rebuilding the reporting database.
OK. Enough background. Time to import some data.
Make sure the site that corresponds to the collection database is installed on your Sitecore server.
The collection database is specified in ConnectionStrings.config under the name analytics. Make sure that this setting matches the collection database you added to MongoDB.
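As a quick sanity check, the database name can be read back out of the config file. The following is a minimal sketch, not something from Sitecore itself: it assumes ConnectionStrings.config uses the usual `<add name="..." connectionString="..." />` entries and that the analytics entry is a `mongodb://` URI ending in the database name. The sample host, port and database name (`my_collection_db`) are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical sample of ConnectionStrings.config; replace with the real file contents.
SAMPLE_CONFIG = """<connectionStrings>
  <add name="analytics" connectionString="mongodb://localhost:27017/my_collection_db" />
</connectionStrings>"""


def analytics_database(config_xml: str) -> str:
    """Return the MongoDB database name used by the 'analytics' connection string."""
    root = ET.fromstring(config_xml)
    for entry in root.iter("add"):
        if entry.get("name") == "analytics":
            uri = entry.get("connectionString", "")
            # The database name is the path part of the mongodb:// URI.
            return urlparse(uri).path.lstrip("/")
    raise KeyError("no connection string named 'analytics'")


print(analytics_database(SAMPLE_CONFIG))
```

If the printed name is not the database you added to MongoDB, fix the connection string before going any further.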
The import process won't bring data into the reporting database. It brings data into a separate database called the secondary reporting database.
The secondary reporting database uses the same schema as the primary reporting database.
When the reporting database is rebuilt, not all of the data from the collection database is used. How much data is used is determined by the configuration of each report segment.
All data collected after the date the report segment was deployed is included. The deploy date for each report segment is stored in the reporting database, in the Segments table.
You can change the deploy date using a simple SQL statement. Be sure to set the deploy date earlier than the oldest data you want to include in reports. Note that the statement below has no WHERE clause, so it changes the deploy date of every segment:
UPDATE Segments SET DeployDate='2010-01-01 00:00:00.000'
Next you need to start the aggregation process.
This process can take a bit of time, depending on how much data is in the collection database and the deploy dates for your segments.
For example, on my modest developer machine (running IIS, SQL Server and MongoDB, along with a whole collection of other software that would never be on a server), processing 10k contacts and 14k interactions spanning 8 months took 50 minutes.
Eventually you will see that the process has completed.
The database identified as the secondary reporting database is now populated with the data from the collection database. The database identified as the primary reporting database hasn't been touched.
Sitecore needs to be reconfigured so that the database that is currently identified as the secondary is the primary.
In ConnectionStrings.config swap the entries named reporting and reporting.secondary.
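The swap is easy to do by hand, but here is a rough sketch of it in script form. It assumes the same `<add name="..." connectionString="..." />` layout for ConnectionStrings.config, and the connection string values in the sample are placeholders, not real ones. The function name `swap_reporting` is only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the connection string values are placeholders.
SAMPLE_CONFIG = """<connectionStrings>
  <add name="reporting" connectionString="primary-db-connection" />
  <add name="reporting.secondary" connectionString="secondary-db-connection" />
</connectionStrings>"""


def swap_reporting(config_xml: str) -> str:
    """Exchange the connection strings of 'reporting' and 'reporting.secondary'."""
    root = ET.fromstring(config_xml)
    entries = {e.get("name"): e for e in root.iter("add")}
    primary = entries["reporting"]
    secondary = entries["reporting.secondary"]
    old_primary = primary.get("connectionString")
    old_secondary = secondary.get("connectionString")
    # The freshly rebuilt (formerly secondary) database becomes the primary.
    primary.set("connectionString", old_secondary)
    secondary.set("connectionString", old_primary)
    return ET.tostring(root, encoding="unicode")


swapped = swap_reporting(SAMPLE_CONFIG)
print(swapped)
```

Swapping the values rather than the names keeps the entries in their original order in the file, which makes the change easier to review. Back up the config file before overwriting it.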
Next you need to rebuild the maps used by the path analyzer.
Is there a way to create a new analytics database from MongoDB and get all the historical data? My current analytics database was not set up correctly.