Some troubleshooting tips for Microsoft Cluster (MSCS)


Cluster Admin cannot connect to the Cluster
  1. Instead of connecting to your cluster using VIRTUAL_HOSTNAME, try connecting using "." (a single dot), which means "itself", that is, the node you are logged on to.

Forcing Quorum (What to Do If You Lose Quorum) 
“the secondary site can be forced to continue even though the cluster software believes it does not have quorum. This is known as forcing quorum.”
1. Stop the Cluster service on ALL nodes using Cluster Admin.
2. Decide which node will be the active node. Let’s assume Node X as an example.
3. Set up the ForceQuorum registry value on Node X:
   a. Open HKLM\SYSTEM\CurrentControlSet\Services\ClusSvc\Parameters.
   b. Add a new REG_SZ value named “ForceQuorum” and set its data to “NodeX” (without quotes).
4. Start the Cluster service on Node X.
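
Steps 3 and 4 above can also be scripted. The following Python sketch only builds the command lines and prints them; it does not run anything. It assumes the standard Windows `reg add` and `net start` commands and that the service name is `clussvc`, as suggested by the registry path above. The helper name `force_quorum_commands` is purely illustrative, and the commands are meant to be run locally on Node X after the Cluster service has been stopped on all nodes.

```python
# Sketch only: builds (does not execute) the commands for steps 3 and 4.
# Assumption: the Cluster service is named "clussvc" (from the registry path).
KEY = r"HKLM\SYSTEM\CurrentControlSet\Services\ClusSvc\Parameters"


def force_quorum_commands(node_name):
    """Return the command lines that set ForceQuorum and start the service."""
    if not node_name or any(ch.isspace() for ch in node_name):
        raise ValueError("node name must be non-empty and contain no whitespace")
    return [
        # Step 3: create the REG_SZ value ForceQuorum with the node name as data.
        ["reg", "add", KEY, "/v", "ForceQuorum", "/t", "REG_SZ",
         "/d", node_name, "/f"],
        # Step 4: start the Cluster service on this node.
        ["net", "start", "clussvc"],
    ]


if __name__ == "__main__":
    for cmd in force_quorum_commands("NodeX"):
        print(" ".join(cmd))
```

Review the printed commands before running them by hand in an elevated command prompt on Node X.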

Back to Normal
1. Turn on the other node.
2. Stop the Cluster service on all nodes.
3. Remove the ForceQuorum registry value added earlier on Node X.
4. Start the Cluster service on Node X again.
5. Boot the other node.
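
The part of this procedure that happens on Node X (steps 2 to 4) could be sketched as follows. As before, this Python snippet only assembles the command lines; it assumes the standard `net stop`, `reg delete` and `net start` commands and the service name `clussvc`. The function name `back_to_normal_commands` is made up for this example.

```python
# Sketch only: builds (does not execute) the Node X commands for steps 2 to 4.
# Assumption: the Cluster service is named "clussvc" (from the registry path).
KEY = r"HKLM\SYSTEM\CurrentControlSet\Services\ClusSvc\Parameters"


def back_to_normal_commands():
    """Return the ordered command lines that undo the ForceQuorum setting."""
    return [
        # Step 2 (local part): stop the Cluster service on this node.
        ["net", "stop", "clussvc"],
        # Step 3: delete only the ForceQuorum value, not the whole key.
        ["reg", "delete", KEY, "/v", "ForceQuorum", "/f"],
        # Step 4: start the Cluster service again.
        ["net", "start", "clussvc"],
    ]


if __name__ == "__main__":
    for cmd in back_to_normal_commands():
        print(" ".join(cmd))
```

Note that `/v ForceQuorum` limits the deletion to that single value; leaving it out would target the entire Parameters key.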


How to Change Quorum Disk Designation (KB 280353)
Split-brain scenario – loss of communication between the nodes (RVD case)
To designate another quorum disk:
1. Start Cluster Administrator (CluAdmin.exe).
2. Right-click the cluster name in the upper-left corner, and then click Properties.
3. Click the Quorum tab.
4. In the Quorum resource box, click a different disk resource.
5. If the disk has more than one partition, click the partition where you want the cluster-specific data to be kept, and then click OK.
NOTE: If you cannot start Cluster service because the quorum disk is unavailable, use the /FIXQUORUM switch to start Cluster service. You are then able to change the quorum disk designation.

When you change the quorum disk designation, the Cluster service does not remove the /Mscs directory from the old drive. For administrative purposes, you may want to delete this old directory or keep it as a backup. Do not continue running the Cluster service with the /FIXQUORUM switch enabled. Once the new quorum disk is established, stop the service and restart it without any switch. Only then is it safe to bring the other nodes online.
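
The order of operations in the note above can be summarized in a small Python sketch. It assumes the switch is passed as `net start clussvc /fixquorum` and that the service name is `clussvc`; the function name `fixquorum_sequence` is hypothetical, and nothing is executed. The manual step in Cluster Administrator is represented by `None` in place of a command.

```python
# Sketch only: lists the /FIXQUORUM recovery sequence; nothing is executed.
# Assumption: the switch is passed as "net start clussvc /fixquorum".


def fixquorum_sequence():
    """Return (description, command) pairs; command is None for manual steps."""
    return [
        ("start the service without the quorum disk",
         ["net", "start", "clussvc", "/fixquorum"]),
        ("designate the new quorum disk in Cluster Administrator", None),
        ("stop the service", ["net", "stop", "clussvc"]),
        ("restart the service without any switch",
         ["net", "start", "clussvc"]),
    ]


if __name__ == "__main__":
    for number, (description, cmd) in enumerate(fixquorum_sequence(), start=1):
        shown = " ".join(cmd) if cmd else "(manual step)"
        print(f"{number}. {description}: {shown}")
```

The last entry deliberately carries no switch, matching the advice not to keep running with /FIXQUORUM enabled.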
