
First Leg of the Journey to Reporting

The key to the new reporting scheme of the Sitecore Experience Platform is the combined use of two databases: MongoDB for data collection, since it scales quickly and can manage the large volume of data gathered on site visitors, and Microsoft SQL Server for its reporting flexibility.

MongoDB is considered a NoSQL data solution, where data is stored in a document-oriented configuration instead of a traditional relational structure. This document-oriented structure allows for increased speed and scalability in the collection of our visitor data. The downside to this approach is that summarization and reporting are not nearly as clean; clean summarization is one of the benefits of a relationally structured system like MS SQL.
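To make the difference concrete, here is a minimal sketch in Python. It is not Sitecore code, and the field names (`visitor`, `pages`, `url`, `seconds`) are invented for illustration rather than taken from the xDB schema. It shows why a nested visit document is easy to write in one go, but has to be flattened into rows before a simple count per page can be asked of it.

```python
from collections import Counter

# Hypothetical visit documents, shaped the way a document store might hold them.
# One document per visit, with the pages viewed nested inside it.
visits = [
    {"visitor": "a1", "pages": [{"url": "/", "seconds": 12},
                                {"url": "/books", "seconds": 40}]},
    {"visitor": "b2", "pages": [{"url": "/", "seconds": 5}]},
]


def flatten(documents):
    """Turn nested visit documents into flat (visitor, url, seconds) rows."""
    return [
        (doc["visitor"], page["url"], page["seconds"])
        for doc in documents
        for page in doc["pages"]
    ]


# Once the data is in row form, a report is a simple grouping.
rows = flatten(visits)
views_per_url = Counter(url for _, url, _ in rows)

print(rows)
print(dict(views_per_url))
```

The flat rows are the kind of shape a relational table and a `GROUP BY` are comfortable with, which is the whole reason for the trip to MS SQL.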

The Journey to Reports

The journey we are going to explore goes from point 1 (MongoDB) to point 2 (MS SQL) to point 3 (all those cool reports talked about in the <a href="">quick start</a>).

Aggregation 1

(Map taken from http://www.lib.utexas.edu/maps/historical/txu-pclmaps-virginia_battlefields_1892.jpg)

The first leg of this journey is moving the document-oriented data in MongoDB into a relational form so we can think about reporting. As with any good journey, options are nice to have, and in this case Sitecore doesn’t disappoint.

The first travel option is referred to as ‘Rebuilding Reporting’. Rebuilding the reporting database is a process that re-aggregates all data that has been collected since the start of your site. This is a very time- and power-intensive process that requires proper planning. Rebuilds can be triggered in code or via the admin page https://<MY WEBSITE>/sitecore/admin/rebuildreportingdb.aspx.

<script src="https://gist.github.com/gillissm/a7cc09e3d64343a4d5e5.js"></script>

The second travel option is the ‘Continuous Update’ process. In its simplest form, this is a background task managed by Sitecore that gathers recent data from MongoDB, aggregates it, and ships it to the reporting database. There are a number of options that can be configured to give you different throughput.
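The two options differ mainly in how much data they touch. The sketch below is a conceptual illustration in Python, not Sitecore's implementation. The function names and the idea of a numeric `checkpoint` are assumptions made for the example. A rebuild throws away the totals and recounts everything, while a continuous update only adds what has arrived since the last run.

```python
from collections import Counter


def rebuild(all_visits):
    """Full rebuild: ignore previous totals and re-aggregate every visit."""
    return Counter(visit["url"] for visit in all_visits)


def continuous_update(totals, all_visits, checkpoint):
    """Incremental update: aggregate only the visits after the checkpoint.

    Returns the new checkpoint (hypothetical: the number of visits seen so far).
    """
    recent = all_visits[checkpoint:]
    totals.update(visit["url"] for visit in recent)
    return len(all_visits)


# Two visits arrive, the background task runs, then a third visit arrives.
visits = [{"url": "/"}, {"url": "/books"}]
totals = Counter()
checkpoint = continuous_update(totals, visits, 0)

visits.append({"url": "/"})
checkpoint = continuous_update(totals, visits, checkpoint)

# Both routes end at the same totals; the rebuild just reads far more data.
print(totals == rebuild(visits))
```

The cost of a rebuild grows with the full history of the site, which is why it needs planning, while each continuous update only pays for the new visits.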

Which Train Do I Take?

[As one reads through the Sitecore documentation on server setup, the usage of roles and services seems to get intermixed. In my writing, a server can have one or more roles, where each role provides one or more services.]

Anyone researching infrastructure setup scenarios for Sitecore Experience Platform will notice numerous references to the different roles/services a server can be configured for. The most commonly referenced are:

No matter how we process the data, we leverage the server role defined as the Processing Server. In its purest form, a Processing Server requires a Sitecore install but does NOT serve up any content. It doesn’t even need the Sitecore admin screens to function, but it does need to be licensed.

In most Sitecore installations, this role will actually be configured to run on one of your content management (CM) servers, which is a perfectly acceptable and supported installation. If you notice that your reports aren’t refreshing as quickly as the business needs, or the CM seems to be running extremely slowly for authors, this role should be the first one considered for scaling off onto its own server. For those who have purchased Sitecore’s xDB Cloud (known as xCloud) offering, the Processing Server is included as part of the service.

Traveling the Lines of the Processing Server

A Processing Server role supports two services (features). The first service is called Processing. The idea of processing is the use of the Sitecore Task Manager API to run a variety of distributed tasks against xDB and the reporting database.

The second service is called Aggregation. This is the heart of the journey to MS SQL! This service depends on the Processing service being configured on the same server. The Aggregation service (also referred to as the Aggregation Process) is the series of tasks that move the data from MongoDB to MS SQL.

Aggregation 2

The Aggregation process ‘line’ looks like the following:

  1. Sitecore Task Manager triggers Aggregation Agents as defined in the configuration files
  2. Agents collect any unprocessed data from MongoDB and other data stores as defined by the agent
  3. Data is grouped, summarized, and prepped for reporting
  4. Batches of processed data are sent to SQL as table-valued parameters (TVP)
  5. Data arrives at the reporting database destination for consumption
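The stops on this line can be sketched in a few lines of Python. This is a conceptual illustration only. It does not use the Sitecore API, the `processed` flag and the `page_views` table are invented for the example, and SQLite's `executemany` stands in for sending a batch as a table-valued parameter.

```python
import sqlite3
from collections import Counter
from itertools import islice


def collect_unprocessed(store):
    """Step 2: pick up documents not yet aggregated and mark them as taken."""
    pending = [doc for doc in store if not doc["processed"]]
    for doc in pending:
        doc["processed"] = True
    return pending


def summarize(docs):
    """Step 3: group by page and count visits, ready for reporting."""
    counts = Counter(doc["url"] for doc in docs)
    return sorted(counts.items())


def batches(rows, size):
    """Step 4: split the summarized rows into fixed-size batches."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def run_agent(store, db, batch_size=2):
    """Steps 2 to 5 in order. Step 1, the scheduler, is whatever calls this."""
    for batch in batches(summarize(collect_unprocessed(store)), batch_size):
        # Stand-in for a table-valued parameter: one call carries many rows.
        db.executemany("INSERT INTO page_views (url, visits) VALUES (?, ?)", batch)
    db.commit()


# Hypothetical collection data and an in-memory reporting database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE page_views (url TEXT, visits INTEGER)")
store = [
    {"url": "/", "processed": False},
    {"url": "/books", "processed": False},
    {"url": "/", "processed": False},
    {"url": "/old", "processed": True},  # already aggregated on an earlier run
]

run_agent(store, db)
report = db.execute(
    "SELECT url, SUM(visits) FROM page_views GROUP BY url ORDER BY url"
).fetchall()
print(report)
```

Marking documents as processed is what lets the agent run again and again without double counting, and sending rows in batches keeps the number of round trips to the reporting database low.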

Aggregation Bibliography

Overview of the process data moves through to be reported on, https://doc.sitecore.net/sitecore_experience_platform/xdb_overview/processing_overview

The go-to page for links explaining how to configure the different server roles, https://doc.sitecore.net/sitecore_experience_platform/xdb_configuration/configuring_servers

Breakdown of the different server roles that can be applied to a Sitecore server installation, https://doc.sitecore.net/sitecore_experience_platform/xdb_configuration/server_configuration_features