SharePoint Search - Content Processing Pipeline Failed to Process the Item
SharePoint 2013 Search Crawls Stop Crawling All Content
Recently the search system stopped successfully crawling content. ASPX pages were crawled, but no content was being indexed. Opening up the crawl log showed thousands of errors with this message:
The content processing pipeline failed to process the item. (Index was out of range. Must be non-negative and less than the size of the collection. Parameter name: index; SearchID = [GUID])
Note: the resolution is in the Resolution section for those who cannot or refuse to read.
What does this mean? Honestly, at first it made no sense. So I learned a valuable lesson the hard way: I misguidedly spent the next 2 days trying the usual fixes, like resetting the index and rebuilding the 2013 search components, and lastly what EVERYONE does: fixing it with a sledgehammer (in this case, rebuilding the search service application).
But what if rebuilding the search service application is a horrible pain? We have thousands of query rules, 30 result sources, thousands of query suggestions, etc. Did SharePoint give us a pretty way to rebuild a search service application and retain those settings? Of course not.
So first I would like to impart some advice:
When things in SharePoint do not act right, please do some root cause analysis to trace what is really going on. The ULS logs are usually the place to do it.
Tracing the Error
- I kicked off another full crawl and monitored the ULS logs live. Once I had gathered enough log rows, I filtered the results using "ULS Viewer".
- Please note that we do not run diagnostic logging at a level higher than Medium. In this case we did not need to turn on verbose logging, but doing so before kicking off the crawl is a good idea. Just remember to reset it afterward.
- Set the filter to: Product = Search.
- I then selected all of the rows in the window (Ctrl+Shift+End), right-clicked, chose export to file, and saved the export as a .log file.
- Open this log file in Excel, choose tab delimited and adjust the column breaks.
- The problem with ULS logs is that the message column which contains the errors often contains much more. I want to be able to see if something in the content processing engine is throwing the error, because what we see in the crawl logs is the RESULT of the error - NOT the event that caused the error.
- Looking at the content processing entries, the first relevant message is:
- [Microsoft.CrawlerFlow-716406d3-5f02-4f92-8efc-f55f16baa779] Microsoft.Ceres.ContentEngine.Processing.BuiltIn.AttributeMapperEvaluator+AttributeMapperProducer : Failed to map values to field ModifiedBy
- Here we can see that the ModifiedBy field cannot be mapped. Mapping is handled by the managed properties in the search schema. While researching, I found a lot of people mentioning that if a managed property has Multi-Value set to True, but the system cannot allow it, you will get content processing errors.
- I ran a FIND function in Excel to look for this string in the message column: "Failed to map values to field". I then built a pivot table, which showed that every occurrence of this string pointed to the ModifiedBy property. No other properties were throwing errors.
- Excel Function: =IFERROR((RIGHT($H1,(LEN($H1)-(FIND("Failed to map values to field",$H1))+1))),"")
- Note: Column H is the Message column from the ULS logs.
- I went to the search service application, found the ModifiedBy managed property, and unchecked the box that allows multiple values.
- Re-ran the full crawl and the errors went away.
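The Excel FIND and pivot table steps above can also be sketched as a small script. This is only a rough equivalent, not part of the original troubleshooting: it assumes the ULS Viewer export is a plain text file with one log row per line, and the function names (`count_failed_fields`, `count_failed_fields_in_file`) and the file encoding are hypothetical choices for illustration. It searches each row for the "Failed to map values to field" string and tallies the field name that follows it.

```python
import re
import sys
from collections import Counter

# Matches the phrase seen in the ULS message column and captures the
# field name that follows it (for example, ModifiedBy).
FAILED_MAP = re.compile(r"Failed to map values to field\s+(\w+)")


def count_failed_fields(lines):
    """Tally the field names that appear after the 'Failed to map' phrase."""
    counts = Counter()
    for line in lines:
        match = FAILED_MAP.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def count_failed_fields_in_file(path, encoding="utf-8"):
    # Assumption: the exported .log file is readable text in this encoding.
    # Undecodable bytes are replaced rather than raising an error.
    with open(path, encoding=encoding, errors="replace") as handle:
        return count_failed_fields(handle)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print one "field<TAB>count" row per field, most frequent first.
    for field, hits in count_failed_fields_in_file(sys.argv[1]).most_common():
        print(f"{field}\t{hits}")
```

Run it with the path to the exported file as the only argument. If only one field name comes back, as ModifiedBy did here, that managed property is the place to look.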
Resolution
This is the same as the steps above, but as a synopsis for those who are in a hurry to triage their environment.
- In our case, someone (or a migration tool) had set the managed property ModifiedBy to allow multiple values. This should never be set; it makes no sense for more than one person to be the modifier. Unchecking that option on the managed property and re-running a full crawl cleared the errors.