Supermicro bug when running nested VMs

One of our internal cloud environments, which runs ESXi hosts on SuperMicro (SYS-1027GR) servers, failed with a PSOD referencing a fatal (unrecoverable) MCE on one of the physical CPUs. It happened frequently on almost all of the ESXi hosts in the cluster, though not always right after a restart.

Sample from the log:

cpu2:33455)@BlueScreen: Machine Check Exception: Fatal (unrecoverable) MCE on PCPU2 in world 33455:vmnic1-pollW System has encountered a Hardware Error – Please contact the hardware vendor
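When several hosts hit the same PSOD, it can help to pull the PCPU and world out of each log line for comparison. Below is a minimal Python sketch that parses the sample line above. The regular expression and the name `parse_mce_line` are assumptions based on that single sample, not a documented ESXi log format.

```python
import re

# Pattern derived from the one sample line above (assumption: other
# MCE PSOD lines use the same layout).
MCE_PATTERN = re.compile(
    r"cpu(?P<cpu>\d+):(?P<world_id>\d+)\)@BlueScreen: "
    r"Machine Check Exception: (?P<severity>.+?) MCE on PCPU(?P<pcpu>\d+) "
    r"in world (?P<world>\d+):(?P<world_name>\S+)"
)


def parse_mce_line(line):
    """Return the PCPU, world ID, world name and severity, or None if no match."""
    match = MCE_PATTERN.search(line)
    if match is None:
        return None
    return {
        "pcpu": int(match.group("pcpu")),
        "world": int(match.group("world")),
        "world_name": match.group("world_name"),
        "severity": match.group("severity"),
    }


SAMPLE = (
    "cpu2:33455)@BlueScreen: Machine Check Exception: Fatal (unrecoverable) "
    "MCE on PCPU2 in world 33455:vmnic1-pollW System has encountered a "
    "Hardware Error"
)
info = parse_mce_line(SAMPLE)
print(info["pcpu"], info["world_name"])
```

Grouping the parsed results by PCPU or world name across hosts may show whether the crashes cluster around one component, such as the NIC polling world in this sample.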

We raised support cases with SuperMicro and VMware. After a long investigation, the VMware engineer identified it as a known bug on SuperMicro servers when any VM is run in a nested manner. The issue is due to an Intel erratum, which Intel has to acknowledge and fix with a microcode/BIOS release; that is the long-term solution. It seems that nested virtualization combined with PCI passthrough triggers an erratum in the Intel CPUs that makes the hosts crash. The related bug number mentioned by VMware is 1603071.

As a temporary fix for this PSOD, we have disabled the Hardware Virtualization option on the VMs. We are working with SuperMicro and VMware on a workaround and a long-term solution.

Posted in ESXi issue, VMware

vSphere client and PowerCLI fail to connect to vCenter after TLSv1.0 is disabled

As per KB 2148819, we disabled TLSv1.0 on the VC, after which we could no longer connect to the VC from the desktop client or from PowerCLI.

Error from the Desktop client:

Error from PowerCLI:

There was no issue connecting to the vCenter or the ESX hosts with the web client; only the desktop client and PowerCLI were affected. On reading the KB again, we noticed that its notes already mention the issue and point to another KB, 2149000, which describes it and asks for a few changes to the file below, along with a few Microsoft .NET patches.

 C:\Program Files (x86)\VMware\Infrastructure\Virtual Infrastructure Client\Launcher\VpxClient.exe.config

Edit the VpxClient.exe.config file, changing the parameter from

<add key = "EnableTLS12" value =  "false" /> to
<add key = "EnableTLS12" value =  "true" />
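If the same change has to be made on several desktops, it could be scripted. The following is a minimal Python sketch of the edit on an XML string. The surrounding `<configuration>`/`<appSettings>` layout in `SAMPLE` and the function name `enable_tls12` are assumptions for illustration; only the `EnableTLS12` key and its values come from the KB.

```python
import xml.etree.ElementTree as ET


def enable_tls12(config_xml):
    """Set every <add key="EnableTLS12" .../> element to value="true".

    Returns the updated XML text and the number of elements changed.
    """
    root = ET.fromstring(config_xml)
    changed = 0
    for element in root.iter("add"):
        if element.get("key") == "EnableTLS12" and element.get("value") != "true":
            element.set("value", "true")
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


# Hypothetical file content: the real VpxClient.exe.config has more settings.
SAMPLE = """<configuration>
  <appSettings>
    <add key="EnableTLS12" value="false" />
  </appSettings>
</configuration>"""

updated, changed = enable_tls12(SAMPLE)
print(changed)
print(updated)
```

Note that ElementTree drops XML comments when parsing, so it is worth keeping a backup copy of the original file before writing the result back.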

Even after making these changes we had the same issue; it was finally resolved by reinstalling the desktop client.

Connecting to the vCenter using PowerCLI was still not fixed, though. We finally found another KB, 2137109, which asks for the registry changes below, and these fixed the issue.

Must use PowerCLI 6.0 R1 or later. Earlier versions of PowerCLI work with versions of the .NET Framework that cannot use the TLSv1.1 and TLSv1.2 protocols. Enable these protocols by editing the registry:
  • For 32-bit processes, change the following registry key value to 1.

    Key: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\.NETFramework\[.NET_version]
    Value: SchUseStrongCrypto (DWORD)

  • For 64-bit processes, in addition to the above registry key, change the following registry key value to 1.

    Key: HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\.NETFramework\[.NET_version]
    Value: SchUseStrongCrypto (DWORD)
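To apply both values in one step, the two keys can be written to a .reg file and imported. Below is a minimal Python sketch that only generates the file text. The function name `build_reg_file` is hypothetical, and the version string in the usage line is an example value, not something stated in the KB excerpt above; replace it with the .NET version folder actually present on the machine.

```python
# Base keys exactly as listed above; the .NET version is appended as a subkey.
REG_PATHS = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\.NETFramework",
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\.NETFramework",
)


def build_reg_file(net_version):
    """Return .reg file text that sets SchUseStrongCrypto to 1 under both keys."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    for base in REG_PATHS:
        lines.append("[{}\\{}]".format(base, net_version))
        lines.append('"SchUseStrongCrypto"=dword:00000001')
        lines.append("")
    return "\r\n".join(lines)


# Example value only (assumption): use the version folder found on your system.
reg_text = build_reg_file("v4.0.30319")
print(reg_text)
```

The generated text can be saved and imported with regedit, after which PowerCLI has to be restarted to pick up the change.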

Posted in ESXi issue, vCSA 6.0, VMware

Primary\Secondary DNS IP Fail-Over bug in VMware vCenter Server Appliance 6.0 Update 2 (VCSA U2)

We have a production (PRD) setup with an external PSC and VC, configured with a primary and a secondary DNS server. Due to a hardware issue, our primary DNS server went down and we couldn’t connect to the VC.

All other applications in our environment were working fine. We logged in to the PSC and VC on port 5480 ( https://VC:5480 ) and manually changed the primary DNS IP to the working DNS server; within a few seconds, the VC started connecting to the PSC and allowing AD authentication.

In our investigation we couldn’t find any concrete reason for the failure. We also tested in the lab by changing the primary DNS to an unknown IP and didn’t find any connectivity issue.
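When troubleshooting this kind of failure, it helps to confirm which of the configured DNS servers actually answers, independent of the appliance's own resolver. The following is a minimal Python sketch using only the standard library: it builds a DNS A-record query by hand and sends it to one specific server. The function names, the fixed query ID and the addresses in the usage comment are all hypothetical.

```python
import socket
import struct

QUERY_ID = 0x1234  # arbitrary ID used to match the reply to the query


def build_dns_query(hostname, query_id=QUERY_ID):
    """Build a minimal DNS query packet for an A record."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed with its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def server_answers(server_ip, hostname, timeout=2.0):
    """Return True if the given DNS server replies without an error code."""
    query = build_dns_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server_ip, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # timeout or network error: treat as "no answer"
            return False
    if len(reply) < 12:
        return False
    reply_id, flags = struct.unpack("!HH", reply[:4])
    # Same ID, response bit set, and RCODE 0 (no error).
    return reply_id == QUERY_ID and bool(flags & 0x8000) and (flags & 0x000F) == 0


# Usage (hypothetical addresses and name):
#   for dns_ip in ("10.0.0.10", "10.0.0.11"):
#       print(dns_ip, server_answers(dns_ip, "psc.example.com"))
```

Running the check against the primary and the secondary server separately shows whether the secondary is healthy, which narrows the problem down to the fail-over behaviour of the appliance.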

Finally we raised a ticket with VMware. They confirmed that the issue is due to a bug in VCSA Update 2, that they are working on a fix for the next update (Update 3), and that it has already been fixed in VCSA 6.5. There is still no answer, though, for why my lab environment works fine when the primary DNS is changed.

UPDATE 3/16/2017: The VC 6.0 U3 release notes don’t show anything related to this bug fix. When we checked with VMware, they confirmed that the fix is still in the testing stage and is not included in the latest U3 update.

Also, please see the blog that lists all the known issues on the VCSA.


Posted in vCSA 6.0, VCSA6.5

Useful information and links about Microsoft Remote Procedure Call (RPC)

The diagram below shows the RPC workflow starting with the registration of the server application with the RPC Endpoint Mapper (EPM) in step 1 to the passing of data from the RPC client to the client application in step 7.


  1. Server app registers its endpoints with the RPC Endpoint Mapper (EPM)
  2. Client makes an RPC call (on behalf of a user, OS or application initiated operation)
  3. Client side RPC contacts the target computer’s EPM and asks for the endpoint to complete the client call
  4. Server machine’s EPM responds with an endpoint
  5. Client side RPC contacts the server app
  6. Server app executes the call, returns the result to the client RPC
  7. Client side RPC passes the result back to the client app
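The seven steps above can be sketched as a toy, in-memory model. This is not how Windows RPC is implemented; every class, function, interface name and port number below is hypothetical, and the sketch only mirrors the order of the steps.

```python
class EndpointMapper:
    """Toy stand-in for the EPM: maps an interface name to an endpoint."""

    def __init__(self):
        self._endpoints = {}

    def register(self, interface, endpoint):
        # Step 1: the server app registers its endpoint.
        self._endpoints[interface] = endpoint

    def lookup(self, interface):
        # Steps 3-4: the client asks for the endpoint, the EPM responds.
        return self._endpoints[interface]


class ServerApp:
    """Toy server application exposing a single 'add' call."""

    def __init__(self, epm, interface, port):
        self.handlers = {"add": lambda a, b: a + b}
        epm.register(interface, (self, port))

    def execute(self, call, *args):
        # Step 6: the server app executes the call and returns the result.
        return self.handlers[call](*args)


def client_rpc_call(epm, interface, call, *args):
    # Step 2: the client makes an RPC call.
    server, port = epm.lookup(interface)   # steps 3-4
    result = server.execute(call, *args)   # steps 5-6
    return result, port                    # step 7: result goes back to the app


epm = EndpointMapper()
ServerApp(epm, "calculator", 49152)
result, port = client_rpc_call(epm, "calculator", "add", 2, 3)
print(result, port)
```

The point to take from the model is that the client needs two connections: one to the EPM and one to the endpoint the EPM returns, which is why both have to be reachable through a firewall.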

How RPC Works

Troubleshooting “RPC server is unavailable” error, reported in failing AD replication scenario.

Restricting Active Directory RPC traffic to a specific port

How to configure RPC dynamic port allocation to work with firewalls

Have you set a static port on the DC for netlogon or for any other interfaces?

Long logon time after you set a specific static port for NTDS and NETLOGON in a Windows Server 2008 R2-based domain environment

AD replication fails with an RPC issue after you set a static port for NTDS in a Windows-based domain environment

Logon fails after you restrict client RPC to DC traffic in Windows Server 2012 R2 or Windows Server 2008 R2

Use this script to test RPC connectivity via TCP. It tests TCP network connectivity not just to the RPC Endpoint Mapper on port 135, but also to each of the registered endpoints returned by querying the EPM. Many firewall teams have a difficult time with RPC: they end up allowing the Endpoint Mapper on port 135 but forget to also allow the ephemeral ports through the firewall. The script uses localhost by default, but you can specify a remote machine name or IP address to test a server across the network. It works by P/Invoking functions exported from rpcrt4.dll to get an enumeration of registered endpoints from the endpoint mapper, so it’s not just a wrapper around portqry.exe.
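Where that script is not available, a much simpler TCP check can still show whether port 135 and a known set of ports are reachable. Below is a minimal Python sketch. Unlike the script described above, it does not query the EPM for registered endpoints, so the port list has to be supplied by hand; the function names and the host and ports in the usage comment are hypothetical.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or name not resolved
        return False


def check_ports(host, ports, timeout=3.0):
    """Check several ports on one host and return {port: reachable}."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}


# Usage (hypothetical host and ports):
#   print(check_ports("dc01.example.com", [135, 49152, 49153]))
```

If port 135 reports True while the other ports report False, that matches the firewall pattern described above, where only the Endpoint Mapper is allowed through.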

One issue that appears when the ephemeral ports are blocked between the clients and the domain controller is an RPC error while joining a client machine to the domain. The client gets joined to the domain and later fails with the error “Changing the Primary Domain DNS name of this computer to “” failed. The name will remain “ The error was: The RPC server is unavailable”.

Use the link below to make sure the required ports are open for communication between the clients and the DC.

How to configure a firewall for domains and trusts





Posted in Windows

Useful links about Windows Failover Clustering.

Free Ebook


Understanding the Cluster Debug Log in 2008


Troubleshooting Cluster Logs 101 – Why did the resources failover to the other node?


Measuring Disk Latency with Windows Performance Monitor (Perfmon)


Planning Failover Cluster Node Sizing


Configuring Windows Failover Cluster Networks


Windows Server 2008 R2 Failover Clustering – Best Practice Guide


Windows Server 2008 R2 Cluster: List of Hotfixes


What is RHS and what does it do?


Resource Hosting Subsystem (RHS) In Windows Server 2008 Failover Clusters


Understanding how Failover Clustering Recovers from Unresponsive Resources


978527 The Resource Hosting Subsystem (Rhs.exe) process stops unexpectedly when you start a cluster resource in Windows Server 2008 R2


815267 How to enable User Mode Hang Detection on a server cluster in Windows Server 2003 and in Windows 2000 Server SP4


Decoding Bugcheck 0x0000009E


Comparing Hotfixes across Multiple Node Failover Clusters


Keep your Failover Clustering deployment healthy!


Posted in SQL server Failover Cluster, Windows

Steps to upgrade the vCSA (PSC\VC) 6.0 to 6.5.

Read KB 2147548 and KB 2147686 before upgrading the vSphere environment, for compatibility considerations and best practices, and make sure that all the products connected to vSphere support version 6.5.

Update 3/16/2017: As per the VC Update 3 release notes, “Upgrading from vCenter Server 6.0 Update 3 to vCenter Server 6.5 is not supported”.

Update 8/1/2017: We can now migrate VC 6.0 Update 3 to VC 6.5 U1.

Once the ISO is downloaded, open it to see the folder structure. There are several ways to install or upgrade the VCSA 6.5. Basically there are three types of installation, and the Readme gives clear instructions on each method.

a) The UI based installer
b) The command-line installer
c) Migration from vCenter Server Windows to Appliance

The first good thing about 6.5 is that the Client Integration Plugin is no longer needed. The steps for the UI-based installer follow.

Before starting the upgrade, rename the existing PSC\VC VM in the vCenter inventory as old. The upgrade process asks for the target PSC\VC name for the deployment, and if a VM with that name already exists on the same VC or ESX host, it shows an error that the VM name already exists in the vCenter.

Another important point: we need one temporary IP for migrating the data from the old version to the new PSC\VC 6.5.

We also need to make sure DRS is not set to fully automated during the upgrade process.

Upgrading the appliance is a two-stage process. The first stage deploys the new VM to the ESX\VC, and the second stage copies the data from the source appliance to the destination appliance.

Open Windows Explorer and, in the vCenter Server Appliance installation directory, navigate to the ‘vcsa-ui-installer\win32’ folder. Click installer.exe.

It opens the console, where we select a new installation or an upgrade.

Next it shows the options and the process for the upgrade.


Accept the license agreement


Next, provide the source PSC\VC IP and select the ESX\VC on which the source appliance is running.

Next, provide the target ESXi\VC on which the appliance is to be deployed.

Next, provide the settings for the target VM. As mentioned, if the existing VM has not been renamed in the inventory, it shows an error that the VM name already exists.

So make sure the current appliance is renamed in the inventory.


Select the datastore


Provide the temporary IP used to copy the data, and make sure it is in the same VLAN.
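A quick sanity check on the temporary IP can be done before starting the wizard. The following is a minimal Python sketch that tests whether two addresses fall in the same IP subnet. It assumes the VLAN maps to a single subnet, which is common but not guaranteed; the function name, addresses and prefix length are hypothetical.

```python
import ipaddress


def same_subnet(temp_ip, appliance_ip, prefix_length):
    """Return True if temp_ip lies in the subnet of appliance_ip/prefix_length."""
    network = ipaddress.ip_network(
        "{}/{}".format(appliance_ip, prefix_length), strict=False
    )
    return ipaddress.ip_address(temp_ip) in network


# Hypothetical addresses: appliance at 192.168.10.20 with a /24 mask.
print(same_subnet("192.168.10.50", "192.168.10.20", 24))
print(same_subnet("192.168.11.50", "192.168.10.20", 24))
```

This only checks the addressing; it does not confirm that the port group of the target VM is actually on the right VLAN.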


It shows all the configuration details.

It then goes through the deployment process.

It stayed at 99% for almost 10 minutes while configuring the network, and then the first phase completed.

In the second phase it shows a warning to make sure the appliance is not under fully automated DRS, which avoids vMotion of the VM to another host.

If all the configuration is correct, the PSC installation completes.

We can verify the PSC by logging in to https://PSC:5480 and also https://PSC/psc.

We can also see that the old PSC is powered off in vCenter.

vCenter Appliance upgrade

Next is the vCenter upgrade. It follows the same steps as the PSC, so I will cover only the second phase of the installation.

Once the first phase is completed, the second phase starts with the pre-upgrade steps.

As with the PSC, it shows the warning about fully automated DRS, and it also highlights the need to make sure other extensions work with the new version.

Next it shows the size of the data to be copied to the new target server; the size depends on the environment. We can also decide which data to copy to the new target VM.

It shows the source and target vCenter and their configuration.

It shows a warning that the source vCenter will be shut down.

The upgrade process starts.

We can see the status of copying the data and setting up the target VM.




Once the installation is completed, we can access the vCenter using the web client. Please note that from vSphere 6.5 onward we can’t use the desktop C# client to access the vCenter.

Posted in Vcenter Appliance, vCSA 6.0, VCSA6.5, VMware

Using vdcrepadmin to find the replication status and design the Platform Services Controller 6.0 topology

After attending the VMworld PSC session, I wanted to test the vdcrepadmin tool, which helps to find the replication status and to redesign the PSC topology.

Currently we have three PSCs connected to each other in an in-line fashion, with each PSC installed against the previous PSC, rather than in a hub-and-spoke fashion, where all of the PSCs would terminate at a central PSC, or in a mesh topology.

vdcrepadmin showservers displays all of the PSCs in a vSphere domain.

Log in to the appliance and go to the path below:

cd /usr/lib/vmware-vmdir/bin

Run this command to show all PSCs in the vSphere domain:

vdcrepadmin -f showservers -h PSC_FQDN -u administrator -w Administrator_Password


From the output we can see the PSC names, site and domain.

vdcrepadmin showpartners displays the partner PSCs.

vdcrepadmin -f showpartners -h PSC_FQDN -u administrator -w Administrator_Password



From the output we can find the partnerships between the PSCs, which were installed in an in-line fashion, with each PSC installed against the previous PSC:

  • PSC35.* has a replication partnership with PSC36.*
  • PSC36.* has a replication partnership with both PSC35.* and PSC37.*
  • PSC37.* has a replication partnership with PSC36.*

vdcrepadmin showpartnerstatus displays the current replication partners of the PSC and the current replication status between the two nodes.

Please note that you have to run showpartnerstatus against each PSC to get the exact partner list and status.

From the output we can also see whether the PSC is in sync with each replication partner, along with the current update sequence number (USN) value. In case of any failure, check the log /var/log/vmware/vmdird/vmdird-syslog.log.

vdcrepadmin createagreement creates replication agreements between PSCs in the same vSphere domain, not between disparate (separate) vSphere domains.

In our example we are creating an agreement between PSC37 and PSC35, so that if PSC36 fails we still have replication with the other partner in the domain.

Before creating the agreement, check the current partners.

vdcrepadmin -f showpartners -h PSC_FQDN -u administrator -w Administrator_Password


Use the following command to create a new replication agreement between PSCs .

vdcrepadmin -f createagreement -2 -h Source_PSC_FQDN -H New_PSC_FQDN_to_Replicate -u administrator -w Administrator_Password


If there are more PSCs, plan for a mesh topology, which can be built using createagreement. Due to replication time, it may take a few seconds to minutes for a complete mesh topology to be configured.
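With more PSCs it gets harder to see by eye which agreements a full mesh still needs. Below is a minimal Python sketch that computes the missing pairs from a partner map. The map has to be filled in by hand from the showpartners output of each PSC; the function name and the short PSC names are illustrative and taken from the example above.

```python
from itertools import combinations


def missing_mesh_agreements(partners):
    """Return the PSC pairs that have no agreement yet in a full mesh.

    partners maps each PSC name to the set of its replication partners.
    """
    nodes = set(partners)
    existing = set()
    for node, peers in partners.items():
        for peer in peers:
            nodes.add(peer)
            existing.add(frozenset((node, peer)))
    return [
        (a, b)
        for a, b in combinations(sorted(nodes), 2)
        if frozenset((a, b)) not in existing
    ]


# The in-line topology from the example above.
partners = {
    "PSC35": {"PSC36"},
    "PSC36": {"PSC35", "PSC37"},
    "PSC37": {"PSC36"},
}
print(missing_mesh_agreements(partners))  # [('PSC35', 'PSC37')]
```

For the three-node example the only missing pair is PSC35 and PSC37, which is the agreement created above.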

vdcrepadmin removeagreement removes the agreement with a replication partner.

First check the current partnership from the specified PSC:

vdcrepadmin -f showpartners -h PSC_FQDN -u administrator -w Administrator_Password 

Use the following command to remove an existing replication agreement between PSCs:

vdcrepadmin -f removeagreement -2 -h Source_PSC_FQDN -H PSC_FQDN_to_Remove_from_Replication -u administrator -w Administrator_Password
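Since all of the commands above share the same options, a small helper can build the argument list consistently. The following is a minimal Python sketch; the function name `vdcrepadmin_args` and the FQDNs and password in the usage lines are hypothetical, while the flags and the binary path are the ones shown in the commands above. It only builds the list and does not run anything.

```python
VDCREPADMIN = "/usr/lib/vmware-vmdir/bin/vdcrepadmin"


def vdcrepadmin_args(function, host, password, partner=None, user="administrator"):
    """Build the argument list for one vdcrepadmin call.

    function is one of the functions shown above, such as showservers,
    showpartners, createagreement or removeagreement.
    """
    args = [VDCREPADMIN, "-f", function]
    if function in ("createagreement", "removeagreement"):
        if partner is None:
            raise ValueError("{} needs a partner PSC".format(function))
        # Two-way agreement between the source (-h) and the partner (-H).
        args += ["-2", "-h", host, "-H", partner]
    else:
        args += ["-h", host]
    return args + ["-u", user, "-w", password]


# Hypothetical FQDNs and password, for illustration only.
print(vdcrepadmin_args("showpartners", "psc35.example.com", "secret"))
print(vdcrepadmin_args("createagreement", "psc37.example.com", "secret",
                       partner="psc35.example.com"))
```

On the appliance, the resulting list could be passed to `subprocess.run`, which avoids shell quoting problems with special characters in the password.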


Reference :

KB 2127057

INF8225  – VMworld


Posted in Install and Configure VMware vCSA 6.0, Platform Services Controller (PSC ), VC6.0 Appliance Installation Issue, Vcenter Appliance, vCSA 6.0, VMware