[TechSpeak] Sitecore Search: How SOLR Timeout Exception Affects Index Optimization

Updated Jun 16, 2023

If you have a custom SOLR index in your Sitecore-based project and it grows too large (more than 1 million documents), the rebuild process may fail with a SOLR connection timeout exception. Today, I’ll tell you how we dealt with this issue.

Problem

In our Sitecore project, we built a custom product index containing more than two million documents. This index has a custom crawler with complex logic, which includes reading thousands of JSON files and calling different APIs (e.g., product ingestion, third-party data storage).

To implement the search functionality, we used the SOLR Cloud topology (hosted by SearchStax) with the Switch On Rebuild feature (zero-downtime rebuild).

A custom automation process triggers a rebuild of the index every night, outside working hours. It takes about six hours to process two million documents.

Once, when the index contained 500,000–600,000 documents, we got the following SOLR connection timeout exception:

Exception: SolrNet.Exceptions.SolrConnectionException
Message: The operation has timed out
Source: SolrNet
   at SolrNet.Impl.SolrConnection.PostStream(String relativeUrl, String contentType, Stream content, IEnumerable`1 parameters)
   at SolrNet.Impl.SolrConnection.Post(String relativeUrl, String s)
   at SolrNet.Impl.LowLevelSolrServer.SendAndParseHeader(ISolrCommand cmd)
   at Sitecore.ContentSearch.SolrProvider.SwitchOnRebuildSolrSearchIndex.PerformRebuild(Boolean resetIndex, Boolean optimizeOnComplete, IndexingOptions indexingOptions, CancellationToken cancellationToken)
   at Sitecore.ContentSearch.SolrProvider.SolrSearchIndex.Rebuild(Boolean resetIndex, Boolean optimizeOnComplete)

This happened after the rebuild itself had completed, when Sitecore sent SOLR a request to optimize the index.

[Image: SwitchOnRebuildSolrCloudSearchIndex, PerformRebuild, optimizeOnComplete]

Source: Sitecore.ContentSearch.SolrProvider.SwitchOnRebuildSolrSearchIndex.PerformRebuild(Boolean resetIndex, Boolean optimizeOnComplete, IndexingOptions indexingOptions, CancellationToken cancellationToken)

Sitecore enthusiasts have already discussed this issue.

Solution

To resolve this issue, we first increased the ConnectionTimeout value, which is set in milliseconds. We changed it to 600,000, which equals 10 minutes. And it worked!
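As a rough sketch, such a timeout could be set with a Sitecore patch file like the one below. The setting name ContentSearch.Solr.ConnectionTimeout is the standard Sitecore one and takes milliseconds (10 minutes = 10 × 60 × 1,000 = 600,000 ms). The file itself is an illustration, not our exact config.

```xml
<!-- Illustrative patch file, e.g. placed under App_Config/Include (location is an assumption) -->
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <settings>
      <!-- Value is in milliseconds: 600000 ms = 10 minutes -->
      <setting name="ContentSearch.Solr.ConnectionTimeout">
        <patch:attribute name="value">600000</patch:attribute>
      </setting>
    </settings>
  </sitecore>
</configuration>
```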

But our index was growing fast, and when it exceeded 2 million documents, we got the SOLR timeout issue again.

We increased that parameter to 1,200,000 (20 minutes), then to 3,600,000 (1 hour), but that didn’t help. Raising the timeout only postponed the failure, so we had to find the source of the problem and resolve it.

Let’s take a look at how the official SOLR documentation describes optimization:

[Image: excerpt from the official SOLR documentation on optimization]

As you can see, optimization may improve the query performance when the index has become fragmented by many updates.

In our case, we had a daily full rebuild process, which means that we had a fresh index every day. And, as noted in the SOLR documentation, we did not need to optimize it:

Optimizing is not recommended unless it can be performed regularly as it may lead to a significantly larger portion of the index consisting of deleted documents than would normally be the case.

To skip the optimization step, you can create your own index type that inherits from the one you already use.

In our case, it was the SwitchOnRebuildSolrCloudSearchIndex (we used index swapping for zero-downtime rebuilds).

Then, override the Rebuild() method so that it calls the Rebuild(resetIndex, optimizeOnComplete) overload with false for the optimizeOnComplete parameter. This means that Sitecore will not send the Optimize command to SOLR:

Rebuild Method
public class NotOptimizedSwitchOnRebuildSolrCloudSearchIndex :
        SwitchOnRebuildSolrCloudSearchIndex,
        ISolrCloudIndex,
        ISearchIndex,
        IDisposable
    {
        public NotOptimizedSwitchOnRebuildSolrCloudSearchIndex(
            string name,
            string mainalias,
            string rebuildalias,
            string activecollection,
            string rebuildcollection,
            ISolrOperationsFactory solrOperationsFactory,
            IIndexPropertyStore propertyStore) : base(
            name,
            mainalias,
            rebuildalias,
            activecollection,
            rebuildcollection,
            solrOperationsFactory,
            propertyStore)
        {
        }

        public NotOptimizedSwitchOnRebuildSolrCloudSearchIndex(
            string name,
            string mainalias,
            string rebuildalias,
            string activecollection,
            string rebuildcollection,
            IIndexPropertyStore propertyStore) : base(
            name,
            mainalias,
            rebuildalias,
            activecollection,
            rebuildcollection,
            propertyStore)
        {
        }

        public NotOptimizedSwitchOnRebuildSolrCloudSearchIndex(
            string name,
            string mainalias,
            string rebuildalias,
            string activecollection,
            string rebuildcollection,
            IIndexPropertyStore propertyStore,
            ISolrProviderContextFactory providerContextFactory,
            string @group) : base(
            name,
            mainalias,
            rebuildalias,
            activecollection,
            rebuildcollection,
            propertyStore,
            providerContextFactory,
            @group)
        {
        }

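        // Rebuild with resetIndex = true and optimizeOnComplete = false,
        // so Sitecore does not send the Optimize request to SOLR afterwards.
        public override void Rebuild()
        {
            Rebuild(true, false);
        }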
    }

Then, set the created type as the type of your index in the index config:

[Image: index config using the NotOptimizedSwitchOnRebuildSolrCloudSearchIndex type]
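A minimal sketch of such a config change is shown here as a patch that only swaps the type attribute of an existing index definition. The index id (custom_products_index), the namespace, and the assembly name (MyProject.Search) are hypothetical placeholders; replace them with your own. The existing constructor parameters (mainalias, rebuildalias, activecollection, rebuildcollection, and so on) are assumed to stay as they are in your current definition.

```xml
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <contentSearch>
      <configuration type="Sitecore.ContentSearch.ContentSearchConfiguration, Sitecore.ContentSearch">
        <indexes hint="list:AddIndex">
          <!-- Hypothetical index id; use the id of your own custom index -->
          <index id="custom_products_index">
            <!-- Hypothetical namespace and assembly for the class defined above -->
            <patch:attribute name="type">MyProject.Search.NotOptimizedSwitchOnRebuildSolrCloudSearchIndex, MyProject.Search</patch:attribute>
          </index>
        </indexes>
      </configuration>
    </contentSearch>
  </sitecore>
</configuration>
```

After deploying the patch, the merged configuration can be checked on the /sitecore/admin/showconfig.aspx page to confirm that the index now uses the new type.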

And that should do the trick! Happy coding, dear Sitecorian. See you next time.

P.S. Thanks to my colleague Vadim Birkos for brainstorming this issue.