We recently upgraded the VMware Tools version from 10.305\10.346 to 11265 (11.0.1). After the upgrade, a few Windows VMs went into a hung state, and we noticed the entries below in their logs.
vmware.log:
2020-02-07T12:50:58.182Z| vcpu-0| I125: Guest: vsep: AUDIT: VFileSocketMgrCloseSocket : Mux is disconnected <-----
2020-02-07T12:50:58.297Z| vmx| I125: VigorTransportProcessClientPayload: opID=3997b233-39-9b26 seq=290: Receiving MKS.IssueTicket request.
2020-02-07T12:50:58.297Z| vmx| I125: SOCKET 5 (129) creating new listening socket on port -1
2020-02-07T12:50:58.297Z| vmx| I125: Issuing new webmks ticket a9161e… (120 seconds)
2020-02-07T12:50:58.297Z| vmx| I125: VigorTransport_ServerSendResponse opID=3997b233-39-9b26 seq=290: Completed MKS request.
2020-02-07T12:50:58.666Z| vcpu-0| I125: Guest: vsep: AUDIT: SetupConsumerContext : Setting event Type as 256 from 0
2020-02-07T12:50:58.667Z| vcpu-1| I125: Guest: vsep: AUDIT: SetupConsumerContext : Setting event Type as 256 from 0
2020-02-07T12:50:58.676Z| vcpu-1| I125: Guest: vsep: AUDIT: SetupConsumerContext : Setting event Type as 256 from 0
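To check many VMs for the same symptom, the vmware.log files can be scanned with a short script. The sketch below is only an illustration: the line layout (timestamp| thread| level: message) is inferred from the excerpt above, and the function names parse_line and find_mux_disconnects are made up for this example.

```python
import re

# Layout assumed from the excerpt above:
#   "<timestamp>| <thread>| <level>: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>[^|\s]+)\|\s*(?P<thread>[^|]+)\|\s*(?P<level>\w+):\s*(?P<msg>.*)$"
)


def parse_line(line):
    """Split one vmware.log line into its fields, or return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return m.groupdict()


def find_mux_disconnects(lines, needle="Mux is disconnected"):
    """Return the parsed entries whose message contains the given text."""
    hits = []
    for line in lines:
        entry = parse_line(line)
        if entry is not None and needle in entry["msg"]:
            hits.append(entry)
    return hits
```

Passing the lines of a vmware.log file (for example from open(path)) to find_mux_disconnects gives back the matching entries with their timestamp and vCPU thread, which makes it easier to compare the time of the disconnect with the time the VM hung.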
We raised a ticket with VMware. They recommended upgrading NSX Manager to 6.4.4 and confirmed the following:
There is an internal bug which confirms that this is a known issue with VMware tools version you are using ( 11.0.1 ) and there is no external documentation available confirming this aspect. We have confirmed based on an engineering ticket that we have referred. As per the engineering ticket, this should be made available in the release notes of 11.0.5 and expected to be fixed in 11.1. There is no ETA mentioned about these releases.
2019 started with a lot of surprises in the company roadmap. One of the main changes was the move from on-prem to the cloud, which means reducing the VMware footprint. The change was tough to accept at first, but management planned it well: they provided enough training on AWS/Azure, and only once the team had sufficient knowledge and confidence did we start migrating workloads to the cloud.
I was put on the AWS training and learned a lot, which let me prepare for the certification. After several months of preparation and hands-on experience, I completed the AWS Solutions Architect certification. In the last three months I started migrating applications to AWS; understanding the current design of an application and planning how to run it in the cloud is very challenging. I also gained some experience with Chef, and in the last two months I started working on CI/CD along with a few Python automation scripts.
For the past four years I had the opportunity to attend the VMware conference, but this year, with the focus on the cloud, I didn't get the chance. Instead I got to attend my first AWS re:Invent, which was awesome for learning and exploring new AWS services.
It was a good year, and I am looking forward to learning more about cloud services in 2020.
One of our vCenter Servers had an issue with logins using AD credentials. We verified DNS and checked the other VCs that connect to the same DNS and AD, and found no issues.
When we checked websso.log, we noticed the error below.
2019-11-25T16:08:43.717Z vsphere.local 8d2b3655-340a-46db-b879-5b680911c743 ERROR] [IdentityManager] Failed to authenticate principal [ADUSER@ADDOMAIN] for tenant [vsphere.local]
com.vmware.identity.interop.idm.IdmNativeException: Native platform error [code: 851968][null][null]
at com.vmware.identity.interop.idm.LinuxIdmNativeAdapter.AuthenticateByPassword(LinuxIdmNativeAdapter.java:180)
at com.vmware.identity.idm.server.provider.activedirectory.ActiveDirectoryProvider.authenticate(ActiveDirectoryProvider.java:279)
at com.vmware.identity.idm.server.IdentityManager.authenticate(IdentityManager.java:2777)
at com.vmware.identity.idm.server.IdentityManager.authenticate(IdentityManager.java:9145)
at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at sun.rmi.server.UnicastServerRef.dispatch(Unknown Source)
at sun.rmi.transport.Transport$2.run(Unknown Source)
at sun.rmi.transport.Transport$2.run(Unknown Source)
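To see how often this error occurs and whether other native error codes show up, websso.log can be summarized with a few lines of Python. This is a rough sketch based only on the message text shown above; the function name count_native_error_codes is made up for this example, and the log path has to be supplied by whoever runs it.

```python
import re
from collections import Counter

# Matches the message text seen in the excerpt above, e.g.
#   "Native platform error [code: 851968][null][null]"
NATIVE_ERR_RE = re.compile(r"Native platform error \[code: (?P<code>\d+)\]")


def count_native_error_codes(lines):
    """Count how many times each native platform error code appears."""
    counts = Counter()
    for line in lines:
        m = NATIVE_ERR_RE.search(line)
        if m:
            counts[int(m.group("code"))] += 1
    return counts
```

Running count_native_error_codes over the lines of websso.log returns a Counter keyed by error code, so a sudden burst of a single code such as 851968 stands out quickly.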
We tried rebooting the VC and also removing and re-adding the AD. Even though we were able to search the AD objects, authentication kept failing. Finally, the steps below fixed the issue.
Removed the VC from the domain.
Deleted the computer account from AD.
Re-added the VC to the domain.
Rebooted the VC and tested the connection, which worked fine.
With the help of the link, we configured AWS Server Migration Service, but at the final stage of the sync it failed with the error "Instance failed to boot and establish network connectivity".
So we stopped all the non-Microsoft services on the Windows instance and retried the sync, and it completed successfully.
As per my previous blog post on the SMB1 AD authentication issue in 6.5 U1, VMware had communicated that it would be fixed after the 6.7 U2 update, but it looks like it was fixed in the recent 6.5 U3 update. The release notes mention a fix related to AD; we tested with a few hosts and are now able to connect to AD without issues.
PR 2268193: Managing the Active Directory lwsmd service from the Host Client or vSphere Web Client might fail
Managing the Active Directory lwsmd service from the Host Client or vSphere Web Client might fail with the following error: Failed – A general system error occurred: Command /etc/init.d/lwsmd timed out after 30 secs.
In the POC for our new application, we noticed significant performance differences between two environments running on the same hardware model. The network layer differs between environments A and B, but the issue shows up even when the data is copied locally.
Later we noticed a good performance improvement when we changed the EVC mode from Intel Nehalem Generation to Intel Ivy Bridge.
The VMware blog mentions that "For most enterprise applications you can see there is no, or an almost immeasurable, performance impact when using EVC. But, there are certain corner cases, like encryption, that are crippled when instructions sets like AES-NI set are not available (Example: Oracle Transparent Data Encryption, OpenSSL)". The data in our POC is also encrypted, so to confirm this we set up a test VM and configured Apache for a speed test with SSL on port 1000.
The htdocs folder was first placed on a disk on the Tintri NFS datastore, and we then repeated the test with a disk on a local SAS datastore.
The VMware blog also reports its test results, and for encryption there is a huge difference. So when setting up a new cluster, we should understand the type of application and its traffic, and select the cluster EVC mode based on that.
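A quick way to see what the EVC mode actually exposes is to look at the CPU flags inside the guest. AES-NI was introduced with the Westmere generation, so a VM in a cluster running the Nehalem EVC baseline should not see it. The sketch below assumes a Linux guest, where /proc/cpuinfo lists the flag as aes; the function names are made up for this example, and on a Windows guest a different tool would be needed.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # The first "flags" line is enough; all vCPUs in a VM report the same set.
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_aes_ni(cpuinfo_text):
    """True if the AES-NI instruction set ("aes" flag) is visible to this machine."""
    return "aes" in cpu_flags(cpuinfo_text)
```

Inside the guest, has_aes_ni(open("/proc/cpuinfo").read()) can be run before and after changing the EVC mode (the VM needs a power cycle to pick up the new baseline) to confirm that the flag appears.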
Note: Anything above Nehalem Generation won't be supported by VMware 7.0 on old HP Gen8 hardware models, so please check the VMware Compatibility Guide before choosing the EVC mode.