Sitecore 7 Making Google Part 4

Making Google Part 4: Filters, Paging and I'm feeling lucky.

Filters are a very interesting feature in Sitecore 7. The feature comes from a well-known paradigm in the search world: a filter added to a query restricts the results without affecting the score or relevancy of the rest of the search. Filters also have a nice added benefit: the filter's document bitset is cached, so further queries with that same filter are faster and more memory efficient.

NOTE: Beware that Filters do not scale indefinitely. A cached Filter holds a bit for every document in the index, so once you have more than about 1 million documents, keep an eye on how many Filters you cache.

For something like Google, you could use Filters for the adult search. Let's think of a real-life example. Let's say we have a user of our site, S. Pope, no wait, Stephen P, and we want them to have the choice of turning an adult filter on or off. First we would need to flag each document in the index as adult content or not. This could be done with a checkbox on the item or with a computed field. Then whenever we run a search we would write our query like so:

var query = context.GetQueryable<NaughtyDocument>().Where(i => i.Content.Contains("Search Query Text")).Filter(filter => filter.AdultContent);

This is how we could achieve an adult filter like Google's. The Filter part will be cached in memory, and it will store a bit for every document in the index that has the Adult Filter turned on... no pun intended. However, this is only one way of achieving this. You could also add a processor to the ApplyOutboundFilter pipeline that checks the value of "AdultContent" when you are evaluating or iterating the IQueryable<NaughtyDocument>. That way you would not have to add this filter to every LINQ query you write. If you are running on Update 2 (not out at the time of writing), you have a global LINQ query pipeline that can append this to all LINQ queries.
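To make the "cached bit per document" idea concrete, here is a minimal conceptual sketch in plain C#. It is not Sitecore or Lucene code: the Doc type, the FilterSketch class and its method names are all hypothetical, and the "index" is just an in-memory list. It only illustrates that a filter can be stored once as a bitset and then applied to scored hits without changing their scores.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical document type, for this sketch only.
public sealed class Doc
{
    public int Id;            // position of the document in the index (0..Count-1)
    public string Content;
    public bool AdultContent;
    public double Score;      // relevance score produced by the scored query
}

public static class FilterSketch
{
    // One cached bitset per filter name: bit i is set when document i passes.
    private static readonly Dictionary<string, BitArray> Cache =
        new Dictionary<string, BitArray>();

    public static BitArray GetFilterBits(string name, IList<Doc> index, Func<Doc, bool> predicate)
    {
        BitArray bits;
        if (!Cache.TryGetValue(name, out bits))
        {
            // Built once: one bit for every document in the index.
            bits = new BitArray(index.Count);
            foreach (var doc in index)
            {
                bits[doc.Id] = predicate(doc);
            }
            Cache[name] = bits;
        }
        return bits;
    }

    // Applies the bitset to already-scored hits; the scores are left untouched.
    public static List<Doc> Apply(IEnumerable<Doc> scoredHits, BitArray bits)
    {
        return scoredHits.Where(d => bits[d.Id]).ToList();
    }
}
```

The second call to GetFilterBits with the same name returns the cached bitset instead of re-evaluating the predicate, which is the benefit described above; the cost is that every cached filter holds a bit for every document.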

NOTE: Please don't confuse Filters with IFilters; they are two completely different things.

Now let's talk about paging. Paging is the key to performance: the fewer results you have per page, the faster the search will be. By default we return 20 items per page in the UI; at the LINQ layer, however, paging is completely up to you to handle. If you do not page and you have 1 million results, guess what, we will try to fetch 1 million results. We have a few ways of paging and we will show you all of them here.

// Take only the first 20 results.
var query = context.GetQueryable<NaughtyDocument>().Where(i => i.Content.Contains("Search Query Text")).Filter(filter => filter.AdultContent).Take(20);


// Skip the first 20 results, then take the next 20.
var query = context.GetQueryable<NaughtyDocument>().Where(i => i.Content.Contains("Search Query Text")).Filter(filter => filter.AdultContent).Skip(20).Take(20);


// Pass the page number and the page size in one call.
var query = context.GetQueryable<NaughtyDocument>().Where(i => i.Content.Contains("Search Query Text")).Filter(filter => filter.AdultContent).Page(1, 20);

All of the above will get back a limited number of results at one time. Paging is therefore super easy and already in the framework for you. You can also use the trick from blog post 3 of this series and call GetResults(), as this will give you the total hit count; divide that number by the page size (rounding up) to get the number of pages you need to display.
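The page arithmetic is easy to get off by one, so here is a small sketch of it in plain C#. The PagingSketch class and its method names are hypothetical, and an in-memory sequence stands in for the search results; nothing here calls the Sitecore API. It assumes a zero-based page index.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingSketch
{
    // Number of pages needed to show totalHits results, rounding up.
    public static int PageCount(int totalHits, int pageSize)
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException("pageSize");
        return (totalHits + pageSize - 1) / pageSize;
    }

    // Zero-based page index: skip the earlier pages first, then take one page.
    public static List<T> GetPage<T>(IEnumerable<T> results, int pageIndex, int pageSize)
    {
        return results.Skip(pageIndex * pageSize).Take(pageSize).ToList();
    }
}
```

With 95 hits and a page size of 20, PageCount returns 5, and the last page (index 4) holds the remaining 15 results. Note the order of Skip and Take: taking 20 items first and then skipping 20 would leave nothing.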

Finally, in what could be considered the anti-climax of the century, "I'm feeling lucky". I am sorry if anyone thought this was tricky, but it really isn't; it is simply First(). The query would look like this:

var result = context.GetQueryable<NaughtyDocument>().Where(i => i.Content.Contains("Search Query Text")).Filter(filter => filter.AdultContent).First();
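One thing to keep in mind: in standard LINQ, First() throws an InvalidOperationException when the sequence is empty, while FirstOrDefault() returns the default value (null for reference types). The sketch below shows this on an in-memory sequence of hypothetical ranked hits; the LuckySketch class is made up for illustration, and it is an assumption that you would want the same guard around a search that may return nothing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LuckySketch
{
    // Returns the top-ranked hit, or null when nothing matched.
    public static string FeelingLucky(IEnumerable<string> rankedHits)
    {
        return rankedHits.FirstOrDefault();
    }
}
```

The caller can then check for null and fall back to the normal results page instead of handling an exception.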

Just to reiterate, Google is most likely doing much more sophisticated things with paging, filters and so on. However, most sites don't require the sophistication of Google in their search, and we are hoping that the Sitecore.ContentSearch API will cover 99% of your search requirements.

- Dev Team