Saturday, February 26, 2022

How to troubleshoot High CPU Usage / Memory Leaks / Performance Issues in Sitecore - Part 1

 

Summary

This post provides guidance on how to troubleshoot high CPU usage, slow performance, memory leaks, or hangs in a process hosted in Internet Information Services (IIS). The approach is not limited to Sitecore CMS and can be used for any application hosted in IIS.

Details

Symptoms of high CPU usage include latency in web page response times or a slow-performing server. There are many tools and techniques for troubleshooting this issue, but I will limit this post to Microsoft's Debug Diagnostics Tool.

The first step is to create a memory dump of the IIS process while IIS is still processing client requests. In a Sitecore / .NET hosted app this is typically the w3wp process.

Microsoft's Debug Diagnostics Tool comes in very handy here: it can both capture and analyze memory dump files while IIS processes client requests.

How to Configure Debug Diagnostics Tool: 

  • Once the tool is downloaded and installed, open DebugDiag.Collection.exe from the install folder (default folder - C:\Program Files\DebugDiag) and navigate to Tools - Options and Settings - Performance Log, then select Enable Performance Counter Data Logging.


  • The next step is to create a dump file. This can be done in two ways: by configuring a hang rule or by using the manual method. In this series I'll explain the automated process.

Method 1 - How to create and configure Rules: 

  • Navigate to the Rules tab and add a new rule. For this series, let us assume it's a performance issue, so select and configure the Performance option.



  • A performance rule can fire on different triggers: a performance counter threshold or latency in HTTP response times. Assuming we are experiencing very slow responses, I will select the HTTP Response Times option.


  • Configure the URL of the website where we are experiencing the slowness, and update the timeout or ping options to suit the application. The default setup matches any host, with or without https, on both ports. There is also an option to specify a particular folder / path / virtual directory.


  • Prerequisite - If you choose the ETW option, make sure tracing is enabled on your server. Microsoft provides instructions on how to enable it, which differ depending on whether it's Windows 10 or Windows Server.
  • The last step is to configure the dump target, location and frequency of the dump files. While creating the targets, select the application pool or the w3wp process.

  • Once the rule is activated, browse the configured site / web service in any browser. The tool will start tracking requests and, based on the configuration provided earlier, dumps will be generated in the configured folder.
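
Before settling on a timeout for the HTTP Response Times trigger described above, it can help to measure how slow the site actually is. The following is a minimal sketch of that idea in Python, not a description of how the Debug Diagnostics Tool works internally. The function names, the `min_slow` parameter and the way samples are collected are all assumptions made for illustration.

```python
import time
import urllib.request
from typing import Callable, List


def timed_call(fetch: Callable[[], object]) -> float:
    """Run fetch() once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    fetch()
    return time.perf_counter() - start


def fetch_url(url: str, socket_timeout: float = 60.0) -> bytes:
    """Download a page body; the URL is whatever site you are investigating."""
    with urllib.request.urlopen(url, timeout=socket_timeout) as response:
        return response.read()


def should_trigger_dump(samples: List[float],
                        timeout_seconds: float,
                        min_slow: int = 1) -> bool:
    """Return True when at least min_slow samples exceeded the timeout.

    This only mirrors the *idea* of a response-time trigger: a request
    that takes longer than the configured timeout counts as slow.
    """
    slow = [s for s in samples if s > timeout_seconds]
    return len(slow) >= min_slow
```

To try it, collect a few samples with `timed_call(lambda: fetch_url(your_url))`, where `your_url` is the slow page, and pass the list to `should_trigger_dump` with the timeout you plan to use in the rule. If the check never returns True, the rule's timeout is probably too generous to capture a dump.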

In the next part of this series I'll show how to read and analyze these dumps.







Tuesday, February 22, 2022

Sitecore XConnect Operation #0, AlreadyExists, Contact and XConnect FacetOperationException: Operation #1, ReferenceNotFound Issue

Description: We have been seeing the errors below when trying to save or retrieve xConnect data:
2022-02-17 13:56:51.743 -05:00 [Error] Sitecore.XConnect.Operations.AddContactOperation: Sitecore.XConnect.Operations.EntityOperationException: Operation #0, AlreadyExists, Contact
2022-02-17 13:56:51.743 -05:00 [Error] Sitecore.XConnect.Operations.SetFacetOperation`1[Sitecore.XConnect.Facet]: Sitecore.XConnect.Operations.FacetOperationException: Operation #1, ReferenceNotFound, Contact, Classification
2022-02-17 13:56:51.743 -05:00 [Error] ["XdbContextLoggingPlugin"] XdbContext Batch Execution Exception Sitecore.XConnect.Operations.EntityOperationException: Operation #0, AlreadyExists, Contact
2022-02-17 13:56:51.743 -05:00 [Error] ["XdbContextLoggingPlugin"] XdbContext Batch Execution Exception
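
The entries above all carry an "Operation #N, Status, Entity" fragment, optionally followed by a facet name. To see how often each failure occurs in a larger log file, a small script can tally them. This is a rough sketch that assumes the log text keeps the shape shown above; the regular expression and function names are my own and not part of any Sitecore tooling.

```python
import re
from collections import Counter
from typing import List, Tuple

# Matches e.g. "Operation #1, ReferenceNotFound, Contact, Classification".
# The trailing facet name is optional.
OPERATION_RE = re.compile(r"Operation #(\d+), (\w+), (\w+)(?:, (\w+))?")


def parse_operations(log_text: str) -> List[Tuple[int, str, str, str]]:
    """Return (operation number, status, entity, facet) for every match.

    The facet is an empty string when the log entry does not name one.
    """
    return [
        (int(number), status, entity, facet or "")
        for number, status, entity, facet in OPERATION_RE.findall(log_text)
    ]


def count_by_status(log_text: str) -> Counter:
    """Count how many failed operations were logged per status."""
    return Counter(status for _, status, _, _ in parse_operations(log_text))
```

Reading the xConnect log file into a string and passing it to `count_by_status` shows at a glance whether AlreadyExists or ReferenceNotFound dominates, which is useful for checking whether a fix actually reduced the error rate.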
In order to resolve this issue there are multiple checks one has to go through:

Scenario 1

Check with Sitecore Support, as Sitecore already has a hotfix for this issue:

Sitecore XP 9.0.0: SC Hotfix 316493-1.zip
Sitecore XP 9.0.1: SC Hotfix 307306-1.zip
Sitecore XP 9.0.2: SC Hotfix 307348-1.zip
Sitecore XP 9.1.0: SC Hotfix 329879-1.zip
Sitecore XP 9.1.1: SC Hotfix-343592-1.zip
Sitecore XP 9.2.0: SC Hotfix 490409-1.zip
Sitecore XP 9.3.0: SC Hotfix 449095-1.zip

Here is the reference to the Knowledge Center Article for Sitecore Support. 

Note: Make sure to apply this fix on a Standalone instance and test it thoroughly before proceeding to the CM and CD instances. Also, the fix should be applied to CM first, before any modifications are made to the CD instances.

Tip: You can verify the patched DLL by navigating to the bin folder and looking at the file's properties:






Once the above hotfix is applied, it should fix the issue.

If the issue still persists, continue below.

Scenario 2: 

Verify the shared session state settings and make sure that: 

  • Session State Stores, both Private and Shared, are in sync between the config files. Tip: Use the instance's ShowConfig admin page, where you can view the full configuration (including the patch files).
  • Similarly, verify that the Session State Processes (In-Proc, Out-of-Proc / Custom) are in sync.
  • Finally, check the Load Balancer configuration, as the settings above depend on whether Sticky Sessions are configured. Reminder - We can only configure one Session State Server for one CD Cluster.
Refer to this Knowledge Center Article to tune these settings. 
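
One way to check that session state settings are in sync is to compare them mechanically instead of by eye. The sketch below is an illustration under stated assumptions: it only looks at the standard ASP.NET `<sessionState>` element and a few of its attributes (`mode`, `customProvider`, `timeout`). Sitecore's shared session state is configured separately and is not covered here. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Attributes of the ASP.NET <sessionState> element that we compare.
COMPARED_ATTRIBUTES = ("mode", "customProvider", "timeout")


def session_state_settings(config_xml: str) -> Dict[str, str]:
    """Extract the compared <sessionState> attributes from a config document.

    Returns an empty dict when the document has no <sessionState> element;
    missing attributes are reported as empty strings.
    """
    root = ET.fromstring(config_xml)
    element = root.find(".//sessionState")
    if element is None:
        return {}
    return {name: element.get(name, "") for name in COMPARED_ATTRIBUTES}


def session_state_differences(first_xml: str, second_xml: str) -> List[str]:
    """List the attribute names whose values differ between two configs."""
    first = session_state_settings(first_xml)
    second = session_state_settings(second_xml)
    return [
        name for name in COMPARED_ATTRIBUTES
        if first.get(name, "") != second.get(name, "")
    ]
```

Feeding it the web.config contents of two CD servers returns an empty list when the compared attributes match, and otherwise the names of the attributes that need attention.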

Scenario 3: 

In our case the fixes above only partially resolved the issue: CPU was still running high, and sessions were locked and ended up in a deadlock state.

After thorough research and help from Sitecore, we found the following:

  • SQL sessions on the server were blocked. We had to end the sessions and recycle the App Pool.
  • An orphaned Contact with a high number of Interactions had to be removed, which we did using the Sitecore ADM Module. This module comes in very handy for maintaining analytics data, rebuilding indexes and purging Contact data when it is no longer needed. Here is a link to the module - Sitecore ADM Module

Hope this helps. Feel free to share any similar experiences and resolution paths.