SharePoint Timer Service Failed to Recycle

In the Monitoring section of SharePoint Central Administration, the Health Analyzer reports the following warning:

2 The timer service failed to recycle.

Title: The timer service failed to recycle.
Severity: 2 - Warning
Category: Performance

Explanation: The last attempt to recycle the timer service failed as have most of the other attempts during the past week. Recycling typically fails because other timer jobs are running when the recycle is scheduled. To view which jobs blocked the recycle view the history for the recycle job and click on the failed status link for more information. The error message for the failed job entry will contain a list of jobs that were still running.

Remedy: Change the schedule for the timer recycle job so that it does not conflict with other long-running timer jobs.

Failing Services

SPTimerService (SPTimerV4)

Environment

SharePoint Server 2010 Enterprise
On-Premises
3 Web Servers, 3 Application Servers, 1 FAST Server, Active/Passive 2-node SQL Cluster.
Version: 14.0.6137.5002

Root Cause

  • The Timer Service Recycle job fails if other long-running timer jobs are still running when the timer service tries to recycle.
  • In this case, the following job is blocking the Timer Service Recycle job from completing on multiple servers:
    • Microsoft SharePoint Foundation Usage Data Import

Findings

  • On one server (Server A), the Microsoft SharePoint Foundation Usage Data Import job is failing consistently.
    • Error: An update conflict has occurred, and you must re-try this action. The object SPUsageServiceInstance was updated by [Farm Account], in the OWSTIMER (3016) process, on machine [Farm Server]. View the tracing log for more information about the conflict.
    • This error occurs when the contents of the file system cache on the front-end servers are newer than the contents of the configuration database. The fix is to clear the configuration cache.
  • On two other servers, the Usage Data Import job takes 30 minutes (Server B) and 2 hours (Server C).
    • When the Usage Data Import job runs this long, the Timer Service Recycle job fails.
    • Each failed recycle logs Event ID 6398 in the Application Log.
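
The "clear the configuration cache" step in the first finding can be sketched roughly as follows. This is a minimal illustration of the commonly documented manual procedure, not the linked cache cleaner script: it assumes the SharePoint Timer service (SPTimerV4) has already been stopped on the server, that the cache folder is the GUID-named directory usually found under %ALLUSERSPROFILE%\Microsoft\SharePoint\Config, and that the service is started again afterwards. The function name `clear_config_cache` is hypothetical.

```python
from pathlib import Path


def clear_config_cache(cache_dir: Path) -> int:
    """Remove cached XML objects and reset cache.ini.

    Assumes the timer service is stopped before this runs.
    Returns the number of XML files deleted.
    """
    removed = 0
    # Delete only the cached XML files; the folder and cache.ini must stay.
    for xml_file in cache_dir.glob("*.xml"):
        xml_file.unlink()
        removed += 1
    # Resetting cache.ini to "1" prompts the timer service to rebuild
    # the cache from the configuration database when it starts again.
    (cache_dir / "cache.ini").write_text("1")
    return removed
```

The path passed in is an assumption about the local server; check the actual GUID folder on each machine before running anything like this against a live farm.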

Resolutions

  1. Performance Monitor Users group:
    1. Ensure the SharePoint Timer service account [Farm Account] is a member of the Performance Monitor Users group on the database server.
  2. Clear Configuration Cache:
    1. Clear the configuration cache on Server A. Consider doing this on all farm servers as well.
    2. Link to Cache Cleaner script: https://spcachecleaner.codeplex.com/
  3. Timer Service Recycle Settings:
    1. The recycle job is currently set to start at 2:45 and no later than 2:45, which leaves no window at all.
    2. Timer jobs run on more than one server, so the window exists to let each server pick a random start time and stagger the recycles.
    3. Without a reasonably large window, the job loses the flexibility the system needs.
    4. Recommended setting:
      1. Start: 2:45 AM
      2. No Later: 5:45 AM
    5. This allows any long-running jobs to finish first (such as the 2-hour Usage Data Import job).
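
The arithmetic behind the recommended window can be checked with a short sketch. It only compares the length of the scheduling window against the job durations reported in the findings; it does not model how SharePoint actually picks a start time inside the window. The helper `window_length` and the 24-hour time strings are assumptions made for this illustration.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%H:%M"  # 24-hour clock, e.g. "02:45"


def window_length(start: str, no_later: str) -> timedelta:
    """Length of the window between 'start' and 'no later than'."""
    begin = datetime.strptime(start, TIME_FORMAT)
    end = datetime.strptime(no_later, TIME_FORMAT)
    if end < begin:
        end += timedelta(days=1)  # window crosses midnight
    return end - begin


# Durations taken from the findings in this post.
observed_jobs = {
    "Usage Data Import (Server B)": timedelta(minutes=30),
    "Usage Data Import (Server C)": timedelta(hours=2),
}
longest = max(observed_jobs.values())

schedules = {
    "current": ("02:45", "02:45"),
    "recommended": ("02:45", "05:45"),
}
for label, (start, no_later) in schedules.items():
    length = window_length(start, no_later)
    print(f"{label}: window={length}, covers longest job: {length >= longest}")
```

The current schedule works out to a zero-length window, while the recommended one is three hours, which is longer than the 2-hour import observed on Server C.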

References

Good Article on Timer Job Recycle: http://blogs.msdn.com/b/besidethepoint/archive/2012/01/10/the-timer-recycle-job-job-timer-recycle.aspx

Microsoft Knowledge Base Article on Lengthy Usage Data Imports: http://support.microsoft.com/kb/2323530
