Switch SolrCloud Indexes During Sitecore PaaS Blue-Green Deployment

Sitecore has a great feature — Switch On Rebuild — that supports search index swapping once the rebuilding of the index completes.

In this article, I want to share my experience of implementing this feature in practice.

Problem

I worked on a project based on Sitecore 9.3 Managed Cloud and SolrCloud, where we built a custom search index that crawled products every 12 hours. The index was huge (more than 3 million products at the time of this writing), and a full rebuild took 8–10 hours.

In Sitecore, there is a mechanism for switching between active and rebuild indexes in order to avoid downtime in search functionality on the website. We, therefore, configured the Switch On Rebuild feature for this custom index so that the rebuilding process doesn’t affect the active index used on the live site.

As for continuous delivery, we used the blue-green deployment approach so as to have zero production downtime.

After each deployment, we triggered a full rebuild of our custom index, and occasionally the rebuild ran against the active collection instead of the rebuild collection.

For instance, before deployment, the aliases in SolrCloud point to the following collections:

  • active collection — product_index_rebuild (in SolrCloud it maps through aliases)
  • rebuild collection — product_index (in SolrCloud it maps through aliases)

Main alias (active index): the product_index_rebuild collection. (In SolrCloud, a search core is called a collection.)

This means that the next rebuild should target product_index. In practice, however, we saw quite the opposite: the active collection was rebuilt, the live site was broken, and search was unavailable.

What Is the Cause of the Problem?

Sitecore has a property store where it keeps key facts about search indexes, such as when an index was last updated. If an index uses Switch On Rebuild, the store also records which collection is active and which is the rebuild one. This happens inside the SwapAfterRebuild method (class: SwitchOnRebuildSolrCloudSearchIndex, assembly: Sitecore.ContentSearch.SolrProvider), which calls PreserveAliasesCollections.

All the data is located in the Properties table (core database):

As you can see, this table has three columns. Key and Value columns are of main interest to us.

The Key column comprises the following parts:

  • CORE — just the core prefix (core aka collection in SolrCloud)
  • PRODUCT_INDEX — the name of the index
  • RD50***B1F27 — server Machine Name. The machine where the CM instance is hosted as PaaS
  • MC-EDD***501-CM__6554 – WEBSITE_IIS_SITE_NAME
  • SOLR_ACTIVE_COLLECTION — the active collection in SOLR
  • REBUILD_COLLECTION — the collection that will be rebuilt next

If you are interested in how the key is populated, look at the declaration of the Set method on the IIndexPropertyStore interface (assembly: Sitecore.ContentSearch):

The IndexDatabasePropertyStore (Assembly: Sitecore.ContentSearch) type implements this interface. Below you can see the implementation of the Set method with its dependencies:
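To make the key layout concrete, here is a minimal Python sketch that composes a key from the parts listed above. It is not the Sitecore implementation: the function name build_property_key, the "_" and "-" separators, and the upper-casing of the index name are assumptions made for illustration, and the machine and site names are made up.

```python
def build_property_key(index_name: str, machine_name: str, iis_site_name: str,
                       property_name: str, prefix: str = "CORE") -> str:
    """Compose a Properties-table key from the parts described in the article.

    Assumptions (not confirmed by the article): the index name is upper-cased,
    machine name and site name are joined with "-", and all remaining parts
    are joined with "_".
    """
    instance_name = f"{machine_name}-{iis_site_name}"
    return "_".join([prefix, index_name.upper(), instance_name, property_name])


# Hypothetical machine and site names, used only for the demonstration.
active_key = build_property_key(
    "product_index", "MACHINE-A", "cm-site__1234", "SOLR_ACTIVE_COLLECTION")
rebuild_key = build_property_key(
    "product_index", "MACHINE-A", "cm-site__1234", "REBUILD_COLLECTION")
print(active_key)
print(rebuild_key)
```

Whatever the exact separators are, the point is that the machine name and the IIS site name are part of the key, so a different App Service instance reads and writes a different pair of records.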

In the Properties table, you can find more than one pair of these records per index, because Azure can change the Machine Name at any time and every new name results in a new pair of keys.

How do you check the current Machine Name and WEBSITE_IIS_SITE_NAME? In the Azure portal, navigate to the CM App Service and open Advanced Tools (aka Kudu):

Click the Environment tab, where you can see the main metadata about this App Service, including the Machine Name and WEBSITE_IIS_SITE_NAME:

And what happens when we run the blue-green deployment?

When the deployment completes, a new deployment slot becomes available. This slot is another App Service that has its own Machine Name and WEBSITE_IIS_SITE_NAME; for instance:

Sitecore does not synchronize the active/rebuild records stored in the Properties table between slots. What does that mean for us? It means the two slots can end up disagreeing about which collection is active and which is the rebuild one.

Example

Let’s assume the current (live) machine is A, and machine B is the one that will be swapped in by the blue-green deployment. Machine B has product_index_rebuild recorded as its active collection and product_index as its rebuild collection. At the moment we trigger the deployment, machine A looks like this:

  • Active Collection — product_index
  • Rebuild Collection — product_index_rebuild

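Consequently, product_index is the active collection on the live site. After the deployment, machine B is live, and when we trigger a rebuild, it rebuilds product_index, because that is the rebuild collection according to its own records. This is the issue: product_index is serving the live site, and product_index_rebuild is the collection that should have been rebuilt.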
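The scenario can be reproduced with a toy model. The Python sketch below is only an illustration of the bookkeeping, not Sitecore code: the Slot class, the finish_rebuild function, and the solr_alias dictionary are hypothetical stand-ins for the per-machine records in the Properties table and for the main alias in SolrCloud.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    """One deployment slot with its own copy of the active/rebuild records."""
    name: str
    active: str
    rebuild: str


def finish_rebuild(slot: Slot, solr_alias: dict) -> str:
    """Model the end of a rebuild: the alias moves, and only this slot's records flip.

    Returns the name of the collection that was rebuilt.
    """
    rebuilt = slot.rebuild
    solr_alias["main"] = rebuilt  # the main alias now points at the fresh collection
    slot.active, slot.rebuild = slot.rebuild, slot.active  # the other slot is not touched
    return rebuilt


# Starting point: both slots agree, and the alias points at product_index_rebuild.
solr_alias = {"main": "product_index_rebuild"}
machine_a = Slot("A", active="product_index_rebuild", rebuild="product_index")
machine_b = Slot("B", active="product_index_rebuild", rebuild="product_index")

# Machine A is live and completes a rebuild; machine B's records stay as they were.
finish_rebuild(machine_a, solr_alias)
live_collection = solr_alias["main"]

# Blue-green swap: machine B becomes live and picks the target of the next rebuild.
target = machine_b.rebuild
conflict = target == live_collection
print(f"live={live_collection}, B rebuilds={target}, conflict={conflict}")
```

In this model the conflict flag is True: machine B chooses the collection that currently serves the live site, which matches the behavior described above.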

Solution

This issue can be resolved as follows:

  1. After the blue-green deployment, run an SQL script that compares the Active/Rebuild collection records of the two deployment slots.
  2. If step 1 shows that the slots are out of sync, update the records for the active deployment slot (machine). It’s also possible to delete these two records.
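The two steps could be sketched as follows. This is a hedged illustration in Python, with an in-memory SQLite database standing in for the core database; it assumes a Properties table with Key and Value columns as described earlier, while the key layout, the instance names A and B, and the helper names read_state and sync_slot are hypothetical. It treats the slot that was live before the swap as the source of truth.

```python
import sqlite3

ACTIVE, REBUILD = "SOLR_ACTIVE_COLLECTION", "REBUILD_COLLECTION"


def read_state(conn, index_prefix, instance):
    """Return the active/rebuild values stored for one instance (None if missing)."""
    state = {}
    for suffix in (ACTIVE, REBUILD):
        row = conn.execute(
            'SELECT "Value" FROM Properties WHERE "Key" = ?',
            (f"{index_prefix}_{instance}_{suffix}",),
        ).fetchone()
        state[suffix] = row[0] if row else None
    return state


def sync_slot(conn, index_prefix, source_instance, target_instance):
    """Copy the active/rebuild records from source to target when they differ.

    Returns True when an update was made, False when the slots already agree.
    """
    source = read_state(conn, index_prefix, source_instance)
    target = read_state(conn, index_prefix, target_instance)
    if None in source.values():
        raise LookupError(f"no active/rebuild records for {source_instance}")
    if source == target:
        return False
    for suffix, value in source.items():
        conn.execute(
            'UPDATE Properties SET "Value" = ? WHERE "Key" = ?',
            (value, f"{index_prefix}_{target_instance}_{suffix}"),
        )
    conn.commit()
    return True


# Demo data: slot A performed the last rebuild, slot B still holds stale records.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Properties ("Key" TEXT PRIMARY KEY, "Value" TEXT)')
conn.executemany("INSERT INTO Properties VALUES (?, ?)", [
    ("CORE_PRODUCT_INDEX_A_SOLR_ACTIVE_COLLECTION", "product_index"),
    ("CORE_PRODUCT_INDEX_A_REBUILD_COLLECTION", "product_index_rebuild"),
    ("CORE_PRODUCT_INDEX_B_SOLR_ACTIVE_COLLECTION", "product_index_rebuild"),
    ("CORE_PRODUCT_INDEX_B_REBUILD_COLLECTION", "product_index"),
])
changed = sync_slot(conn, "CORE_PRODUCT_INDEX", "A", "B")
state_b = read_state(conn, "CORE_PRODUCT_INDEX", "B")
print(changed, state_b)
```

Against a real environment, the same comparison would run against the core database on SQL Server, and it may be safer to verify the result against the aliases actually configured in SolrCloud before updating anything.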

You can automate these steps by writing a script and executing it as part of the continuous delivery pipeline.

Note: the ContentSearch.Solr.EnforceAliasCreation setting (part of the SwitchOnRebuildSolrCloudSearchIndex class) should be False. If this setting is True, Sitecore creates aliases every time the CM instance is reset.

I hope you will find my experience useful. That’s it for today.

About the author

Vadzim Papko

Vadzim Papko is a Sitecore MVP, Sitecore Architect and Chief .NET Technologist at SaM Solutions. Adhering to the principles of non-stop self-development, he devotes himself to Sitecore innovation and popularization. A certified Sitecore and Xamarin developer. Find him on twitter: @VadzimPapko
