Azure DevTest Labs User

One use case is to let QA team users access the approved images and deploy VMs without any permissions on the Azure portal or other resources.

The Azure DevTest Labs User role appears to satisfy the requirements below.

  • Requirement to assign the DevTest Labs User role only to individual QA testers (and not groups), to isolate one user's VMs from another's.
  • Ability to turn off marketplace images so users can deploy virtual machines only from approved images.
  • Ability to restrict virtual machine sizes to only pre-approved sizes.
  • No need to create per-user resource groups to apply RBAC and restrict access to specific users.
  • Ability to create “formulas” that are a preconfigured collection of virtual machine settings alongside an image, to further simplify virtual machine deployment.

Posted in Azure

VMware Licensing for Java integration in VCSA appliance.

Under the new Java license policy, we either need to buy a Java license for any software that uses Java, or the software vendor must already hold the license as part of the product.

We inquired with VMware, and they confirmed that no separate Java licenses are needed for VCSA beyond the VMware licenses.

Posted in VCSA6.5, VCSA6.7, vcsa7.0, VCSA8.0, VMware, vROPs

ESXi host logs to Splunk issue got resolved in the latest VMware release, 7.0 Update 3o

Our ESXi hosts act as syslog clients: they send their logs to our syslog server (port 6514), which then forwards those logs to Splunk.

We noticed that the ESXi hosts frequently stopped reporting to Splunk, and each time we had to restart the syslog server service or reinstall the Splunk VIB on the ESXi host to get it reporting to the Splunk server again.

This issue is fixed in the VMware release 7.0 Update 3o.

Posted in ESXi issue, logs, VMware

Options to decrypt the VMs protected by VMware native key provider.

The vCenter Server (VC) on which the Native Key Provider was enabled was deleted and could not be recovered from backup. Unfortunately, there was no other backup of the primary key.

The only recovery option is to clone the VMs to remove the encryption and then cut production over from the current live VMs to the clones. If this does not succeed, the VMs will need to be rebuilt.

Make sure to back up the VC and keep the primary key in a safe place.

Posted in VMware

NFS 4.1 datastore disconnected after the NetApp storage upgrade/failover.

We have NFS 3 and NFS 4.1 datastores presented to these ESXi hosts, both from the same NetApp storage. As part of the NetApp storage upgrade, a failover was performed on the storage side and the hosts reported APD (All Paths Down) errors. Only the NFS 4.1 datastores were affected; the NFS 3 datastores stayed connected without any issue.

Error on the host side:

cpu0:2098542)StorageApdHandlerEv: 110: Device or filesystem with identifier [] has entered the All Paths Down state.
2023-06-10T10:52:58.189Z cpu0:)StorageApdHandlerEv: 110: Device or filesystem with identifier [] has entered the All Paths Down state.
2023-06-10T10:55:18.193Z cpu0:2098542)StorageApdHandlerEv: 126: Device or filesystem with identifier [] has entered the All Paths Down Timeout state after being in the All Paths Down state for 140 seconds. I/Os will no$

We involved NetApp to check the storage logs; their findings are below.

When a LIF is migrated to another node, the storage sends an RST to close the existing connection and a GARP to advertise the MAC address of the port the LIF moved to, so that the client can initiate a new connection to the LIF over that MAC.

Trace review

    The storage sent an RST to close the connection and a GARP to advertise the new MAC.

    The client used the new MAC and initiated a TCP connection to the storage, which succeeded.

    The client sent a BIND_CONN_TO_SESSION call and the storage responded with NFS4ERR_BADSESSION. Once that error is received, the client should send EXCHANGE_ID and initiate a new session with the storage.

    When the client sent EXCHANGE_ID, the storage responded with NFS4ERR_DELAY. On receiving NFS4ERR_DELAY, the client retries the same operation.

    Later, the storage responded to the EXCHANGE_ID operation successfully and the client acknowledged the response.

    After that, the client did not send any calls to initiate the session and only sent TCP keep-alives. The same behavior was seen when the LIF was reverted to the original node.

    The traces clearly show that after acknowledging the response, the client was not able to send further calls to initiate the session.
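The recovery sequence NetApp describes can be modeled as a small state machine. The sketch below is a toy model in Python, not ESXi or ONTAP code: `scripted_server`, `recover_session`, and the retry limit are hypothetical names and values chosen for illustration. It retries EXCHANGE_ID while the server answers NFS4ERR_DELAY and then moves on to CREATE_SESSION, which is the step the client in the trace never reached.

```python
from typing import Callable, Dict, List

NFS4_OK = "NFS4_OK"
NFS4ERR_DELAY = "NFS4ERR_DELAY"
NFS4ERR_BADSESSION = "NFS4ERR_BADSESSION"


def scripted_server(replies: Dict[str, List[str]]) -> Callable[[str], str]:
    """Return a fake server that answers each operation from a scripted list."""
    queues = {op: list(answers) for op, answers in replies.items()}

    def send(op: str) -> str:
        return queues[op].pop(0)

    return send


def recover_session(send: Callable[[str], str], max_retries: int = 5) -> List[str]:
    """Toy model of NFSv4.1 session recovery; returns the operations sent."""
    sent = ["BIND_CONN_TO_SESSION"]
    if send("BIND_CONN_TO_SESSION") != NFS4ERR_BADSESSION:
        return sent  # the old session is still usable, nothing to recover
    for _ in range(max_retries):
        sent.append("EXCHANGE_ID")
        reply = send("EXCHANGE_ID")
        if reply == NFS4ERR_DELAY:
            continue  # retry the same operation (a real client waits first)
        if reply != NFS4_OK:
            raise RuntimeError(f"EXCHANGE_ID failed: {reply}")
        sent.append("CREATE_SESSION")
        reply = send("CREATE_SESSION")
        if reply != NFS4_OK:
            raise RuntimeError(f"CREATE_SESSION failed: {reply}")
        return sent
    raise RuntimeError("EXCHANGE_ID still delayed after all retries")


# Replies in the order seen in the trace, plus the CREATE_SESSION step
# that a complete recovery would include.
server = scripted_server({
    "BIND_CONN_TO_SESSION": [NFS4ERR_BADSESSION],
    "EXCHANGE_ID": [NFS4ERR_DELAY, NFS4_OK],
    "CREATE_SESSION": [NFS4_OK],
})
print(recover_session(server))
```

In the captured trace, the sequence stopped after the successful EXCHANGE_ID, so the final CREATE_SESSION entry is the part that was missing.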

    VMware was then engaged to check the logs further, based on NetApp's input.

    From the logs, once the NFS41 client gets the EXCHANGE_ID reply from the server, it compares the roles. The NFS41 client was expecting 393 but got 655, so it bailed out there.

    WARNING: NFS41: NFS41ExidNFSProcess:2054: EXCHANGE_ID error: NFS4ERR_DELAY
    2023-06-21T14:27:44.755Z cpu4:2099419)WARNING: NFS41: NFS41ExidNFSProcess:2054: EXCHANGE_ID error: NFS4ERR_DELAY

    According to VMware, the NFS server role appears to have changed from 0x60**** before the upgrade to 0x10*** after the upgrade. vmkernel.log no longer contains the entries that show the role before the upgrade; they may have rolled over. If the server role changes, the ESXi NFS41 client does not re-initiate the sessions; it expects the datastores to be unmounted and remounted. In logs captured prior to the upgrade, the role flags were 0x00060*** (EXCHGID4_FLAG_USE_PNFS_MDS | EXCHGID4_FLAG_USE_PNFS_DS); after the upgrade, the role flags were 0x00010**** (EXCHGID4_FLAG_USE_NON_PNFS).
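To make the role flags easier to read, here is a minimal Python sketch that decodes the pNFS role bits of an EXCHANGE_ID reply. The constant values are the ones defined in RFC 5661; `decode_roles` and `role_changed` are hypothetical helper names. Because the low-order digits are masked in the logs quoted here, the example uses only the role bits and assumes the remaining bits are zero.

```python
from typing import List

# pNFS role flags carried in the EXCHANGE_ID reply (values from RFC 5661)
EXCHGID4_FLAG_USE_NON_PNFS = 0x00010000
EXCHGID4_FLAG_USE_PNFS_MDS = 0x00020000
EXCHGID4_FLAG_USE_PNFS_DS = 0x00040000
EXCHGID4_FLAG_MASK_PNFS = 0x00070000

ROLE_NAMES = {
    EXCHGID4_FLAG_USE_NON_PNFS: "EXCHGID4_FLAG_USE_NON_PNFS",
    EXCHGID4_FLAG_USE_PNFS_MDS: "EXCHGID4_FLAG_USE_PNFS_MDS",
    EXCHGID4_FLAG_USE_PNFS_DS: "EXCHGID4_FLAG_USE_PNFS_DS",
}


def decode_roles(flags: int) -> List[str]:
    """Return the names of the pNFS role bits set in an EXCHANGE_ID flags word."""
    role_bits = flags & EXCHGID4_FLAG_MASK_PNFS
    return [name for bit, name in ROLE_NAMES.items() if role_bits & bit]


def role_changed(before: int, after: int) -> bool:
    """True when the pNFS role bits differ between two EXCHANGE_ID replies."""
    return (before & EXCHGID4_FLAG_MASK_PNFS) != (after & EXCHGID4_FLAG_MASK_PNFS)


print(decode_roles(0x00060000))  # role bits before the upgrade
print(decode_roles(0x00010000))  # role bits after the upgrade
print(role_changed(0x00060000, 0x00010000))
```

As a side note, 0x00060000 is 393216 and 0x00010000 is 65536 in decimal, which would be consistent with the truncated 393 and 655 values mentioned above, although that reading is an assumption.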

    On further investigation, VMware explained the NFS41 client behavior with respect to the server role change:

    – The pNFS setting was changed at the NetApp server/volume level (from _USE_PNFS_MDS|_USE_PNFS_DS to USE_NON_PNFS).
    – Later, during the NetApp server switchover performed as part of the upgrade, some of the ESXi hosts stopped continuing the session after the new connection came up.

    According to RFC 5661, Section 13.1, the server roles are agreed upon during the EXCHANGE_ID operation. If the server changes its roles after this exchange, there is no protocol operation to inform the client of the change. This role mismatch can be a problem because the client assumes the older role while the server assumes the newer one.

    In this case, the connection was reset after the NetApp server upgrade, and when the ESXi client re-established the connection it received a new role in the EXCHANGE_ID reply for the existing datastore. The current ESXi client implementation does not support a role change for already mounted datastores, because that involves releasing the older resources and setting up new ones. The client therefore rejected the new role and stopped progressing with session establishment.

    Since there is no protocol support for a dynamic change of server role, the method VMware recommends is to unmount and remount the datastore after the server role changes. This sets up the appropriate context based on the server role, so that later NetApp server upgrades are seamless and cause no disruption.

    The only workaround is to reboot the host, which re-initiates the connection and reconnects the datastore to the host.

    Posted in Dell, ESXi issue, Storage, VMware

    AzCopy CONCURRENCY VALUE

    According to Microsoft, a normal AzCopy transfer depends on the local CPU, but blob-to-blob copies use server-to-server APIs, so AZCOPY_CONCURRENCY_VALUE can be set to more than 1000.

    https://learn.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-optimize?toc=%2Fazure%2Fstorage%2Fblobs%2Ftoc.json#increase-the-number-of-concurrent-requests

    “If you’re copying blobs between storage accounts, consider setting the value of the AZCOPY_CONCURRENCY_VALUE environment variable to a value greater than 1000. You can set this variable high because AzCopy uses server-to-server APIs, so data is copied directly between storage servers and does not use your machine’s processing power.”

    Previously it took more than 30 minutes to copy the data, but after setting AZCOPY_CONCURRENCY_VALUE=2000 we were able to copy from blob to blob in 3.41 minutes.
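One way to apply the setting from a script is to pass the variable only to the AzCopy process instead of setting it globally. The Python sketch below shows this under a few assumptions: `build_azcopy_copy` is a hypothetical helper, the two URLs are placeholders (real ones would normally carry a SAS token or rely on a prior `azcopy login`), and `--recursive` is included on the assumption that a whole container is being copied.

```python
import os
from typing import Dict, List, Optional, Tuple


def build_azcopy_copy(
    source_url: str,
    dest_url: str,
    concurrency: int = 2000,
    base_env: Optional[Dict[str, str]] = None,
) -> Tuple[List[str], Dict[str, str]]:
    """Build the azcopy command line and an environment with the concurrency set."""
    env = dict(os.environ if base_env is None else base_env)
    # AzCopy reads this environment variable when it starts
    env["AZCOPY_CONCURRENCY_VALUE"] = str(concurrency)
    cmd = ["azcopy", "copy", source_url, dest_url, "--recursive"]
    return cmd, env


# Placeholder URLs, for illustration only
cmd, env = build_azcopy_copy(
    "https://sourceaccount.blob.core.windows.net/container",
    "https://destaccount.blob.core.windows.net/container",
)
print(" ".join(cmd))
print("AZCOPY_CONCURRENCY_VALUE =", env["AZCOPY_CONCURRENCY_VALUE"])

# To actually run the copy (requires azcopy on PATH and valid credentials):
# import subprocess
# subprocess.run(cmd, env=env, check=True)
```

Passing the environment per process keeps the high concurrency value limited to blob-to-blob jobs, so uploads and downloads that do depend on the local CPU keep their default.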

    Posted in AZCopy, Azure, Cloud

    Adding an RDM LUN to the second node in the MSCS cluster fails with the error: Incompatible device backing specified for the device

    We followed the document https://docs.vmware.com/en/VMware-vSphere/6.5/vsphere-esxi-vcenter-server-651-setup-mscs.pdf to configure the Windows cluster, but adding the RDM LUN to the second node failed with the error “Incompatible device backing specified for the device”.

    We tried the options from a few VMware KB articles, such as https://kb.vmware.com/s/article/2054897, but none of them worked. The issue was finally fixed by moving the OS disk to the RDM LUN.

    Posted in SQL server Failover Cluster, Vcenter Appliance, VMware, Windows

    Deploy Diagnostic Settings for Azure Function to Event Hub

    An Azure Policy can set the diagnostic settings of Azure Functions to stream to Event Hub, so that all events are routed to the corresponding event hub.

    Azure Policy automates this for new resources; for resources that already exist, a remediation task has to be created.

    The code below helps to create the Azure Policy that sets the Azure Functions diagnostic settings to Event Hub.

    https://github.com/Ganeshsekarbabu/Azure-Policy

    Posted in Azure, Azure event hub, Azure policy, Cloud

    Deploy Diagnostic Settings for Storage Analytics to Event Hub.

    {
      "properties": {
        "displayName": "Deploy Diagnostic Settings for Storage Analytics to Event Hub",
        "policyType": "Custom",
        "mode": "Indexed",
        "metadata": {
          "createdBy": "9e1a4c2c-4a14-468c-bad0-0ed38afbb990",
          "createdOn": "2023-01-17T02:48:32.7901427Z",
          "updatedBy": null,
          "updatedOn": null,
          "category": "Storage"
        },
        "parameters": {
          "eventHubRuleId": {
            "type": "String",
            "metadata": {
              "displayName": "Event Hub Authorization Rule Id",
              "description": "The Event Hub authorization rule Id for Azure Diagnostics. The authorization rule needs to be at Event Hub namespace level. e.g. /subscriptions/{subscription Id}/resourceGroups/{resource group}/providers/Microsoft.EventHub/namespaces/{Event Hub namespace}/authorizationrules/{authorization rule}",
              "strongType": "Microsoft.EventHub/Namespaces/AuthorizationRules",
              "assignPermissions": true
            }
          },
          "eventHubName": {
            "type": "String",
            "metadata": {
              "displayName": "Event Hub name",
              "assignPermissions": true
            }
          },
          "eventHubLocation": {
            "type": "String",
            "metadata": {
              "displayName": "Event Hub Location",
              "description": "The location the Event Hub resides in. Only storage accounts in this location will be linked to this Event Hub.",
              "strongType": "location"
            },
            "defaultValue": ""
          },
          "servicesToDeploy": {
            "type": "Array",
            "metadata": {
              "displayName": "Storage services to deploy",
              "description": "List of Storage services to deploy"
            },
            "allowedValues": [
              "storageAccounts",
              "blobServices",
              "fileServices",
              "tableServices",
              "queueServices"
            ],
            "defaultValue": [
              "storageAccounts",
              "blobServices",
              "fileServices",
              "tableServices",
              "queueServices"
            ]
          },
          "diagnosticsSettingNameToUse": {
            "type": "String",
            "metadata": {
              "displayName": "Setting name",
              "description": "Name of the diagnostic settings."
            },
            "defaultValue": "storageAccountsDiagnosticsLogsToEventHub"
          },
          "effect": {
            "type": "String",
            "metadata": {
              "displayName": "Effect",
              "description": "Enable or disable the execution of the policy"
            },
            "allowedValues": [
              "DeployIfNotExists",
              "Disabled"
            ],
            "defaultValue": "DeployIfNotExists"
          },
          "StorageDelete": {
            "type": "String",
            "metadata": {
              "displayName": "StorageDelete - Enabled",
              "description": "Whether to stream StorageDelete logs to the Event Hub - True or False"
            },
            "allowedValues": [
              "True",
              "False"
            ],
            "defaultValue": "True"
          },
          "StorageWrite": {
            "type": "String",
            "metadata": {
              "displayName": "StorageWrite - Enabled",
              "description": "Whether to stream StorageWrite logs to the Event Hub - True or False"
            },
            "allowedValues": [
              "True",
              "False"
            ],
            "defaultValue": "True"
          },
          "StorageRead": {
            "type": "String",
            "metadata": {
              "displayName": "StorageRead - Enabled",
              "description": "Whether to stream StorageRead logs to the Event Hub - True or False"
            },
            "allowedValues": [
              "True",
              "False"
            ],
            "defaultValue": "True"
          },
          "Transaction": {
            "type": "String",
            "metadata": {
              "displayName": "Transaction - Enabled",
              "description": "Whether to stream Transaction metrics to the Event Hub - True or False"
            },
            "allowedValues": [
              "True",
              "False"
            ],
            "defaultValue": "True"
          }
        },
        "policyRule": {
          "if": {
            "allOf": [
              {
                "field": "type",
                "equals": "Microsoft.Storage/storageAccounts"
              },
              {
                "anyOf": [
                  {
                    "value": "[parameters('eventHubLocation')]",
                    "equals": ""
                  },
                  {
                    "field": "location",
                    "equals": "[parameters('eventHubLocation')]"
                  }
                ]
              }
            ]
          },
          "then": {
            "effect": "[parameters('effect')]",
            "details": {
              "type": "Microsoft.Insights/diagnosticSettings",
              "roleDefinitionIds": [
                "/providers/Microsoft.Authorization/roleDefinitions/749f88d5-cbae-40b8-bcfc-e573ddc772fa",
                "/providers/Microsoft.Authorization/roleDefinitions/f526a384-b230-433a-b45c-95f59c4a2dec",
                "/providers/Microsoft.Authorization/roleDefinitions/b24988ac-6180-42a0-ab88-20f7382dd24c"
              ],
              "existenceCondition": {
                "allOf": [
                  {
                    "count": {
                      "field": "Microsoft.Insights/diagnosticSettings/metrics[*]",
                      "where": {
                        "allOf": [
                          {
                            "field": "Microsoft.Insights/diagnosticSettings/metrics[*].category",
                            "equals": "Transaction"
                          },
                          {
                            "field": "Microsoft.Insights/diagnosticSettings/metrics[*].enabled",
                            "equals": "True"
                          }
                        ]
                      }
                    },
                    "greater": 0
                  },
                  {
                    "field": "Microsoft.Insights/diagnosticSettings/logs.enabled",
                    "contains": "true"
                  },
                  {
                    "field": "Microsoft.Insights/diagnosticSettings/eventHubAuthorizationRuleId",
                    "equals": "[parameters('eventHubRuleId')]"
                  }
                ]
              },
              "deployment": {
                "properties": {
                  "mode": "incremental",
                  "template": {
                    "$schema": "http://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
                    "contentVersion": "1.0.0.0",
                    "parameters": {
                      "servicesToDeploy": {
                        "type": "array"
                      },
                      "diagnosticsSettingNameToUse": {
                        "type": "string"
                      },
                      "resourceName": {
                        "type": "string"
                      },
                      "eventHubRuleId": {
                        "type": "string"
                      },
                      "eventHubName": {
                        "type": "string"
                      },
                      "location": {
                        "type": "string"
                      },
                      "Transaction": {
                        "type": "string"
                      },
                      "StorageRead": {
                        "type": "string"
                      },
                      "StorageWrite": {
                        "type": "string"
                      },
                      "StorageDelete": {
                        "type": "string"
                      }
                    },
                    "variables": {},
                    "resources": [
                      {
                        "condition": "[contains(parameters('servicesToDeploy'), 'blobServices')]",
                        "type": "Microsoft.Storage/storageAccounts/blobServices/providers/diagnosticSettings",
                        "apiVersion": "2017-05-01-preview",
                        "name": "[concat(parameters('resourceName'), '/default/', 'Microsoft.Insights/', parameters('diagnosticsSettingNameToUse'))]",
                        "location": "[parameters('location')]",
                        "dependsOn": [],
                        "properties": {
                          "eventHubAuthorizationRuleId": "[parameters('eventHubRuleId')]",
                          "eventHubName": "[parameters('eventHubName')]",
                          "metrics": [
                            {
                              "category": "Transaction",
                              "enabled": "[parameters('Transaction')]",
                              "retentionPolicy": {
                                "days": 0,
                                "enabled": false
                              },
                              "timeGrain": null
                            }
                          ],
                          "logs": [
                            {
                              "category": "StorageRead",
                              "enabled": "[parameters('StorageRead')]"
                            },
                            {
                              "category": "StorageWrite",
                              "enabled": "[parameters('StorageWrite')]"
                            },
                            {
                              "category": "StorageDelete",
                              "enabled": "[parameters('StorageDelete')]"
                            }
                          ]
                        }
                      },
                      {
                        "condition": "[contains(parameters('servicesToDeploy'), 'fileServices')]",
                        "type": "Microsoft.Storage/storageAccounts/fileServices/providers/diagnosticSettings",
                        "apiVersion": "2017-05-01-preview",
                        "name": "[concat(parameters('resourceName'), '/default/', 'Microsoft.Insights/', parameters('diagnosticsSettingNameToUse'))]",
                        "location": "[parameters('location')]",
                        "dependsOn": [],
                        "properties": {
                          "eventHubAuthorizationRuleId": "[parameters('eventHubRuleId')]",
                          "eventHubName": "[parameters('eventHubName')]",
                          "metrics": [
                            {
                              "category": "Transaction",
                              "enabled": "[parameters('Transaction')]",
                              "retentionPolicy": {
                                "days": 0,
                                "enabled": false
                              },
                              "timeGrain": null
                            }
                          ],
                          "logs": [
                            {
                              "category": "StorageRead",
                              "enabled": "[parameters('StorageRead')]"
                            },
                            {
                              "category": "StorageWrite",
                              "enabled": "[parameters('StorageWrite')]"
                            },
                            {
                              "category": "StorageDelete",
                              "enabled": "[parameters('StorageDelete')]"
                            }
                          ]
                        }
                      },
                      {
                        "condition": "[contains(parameters('servicesToDeploy'), 'tableServices')]",
                        "type": "Microsoft.Storage/storageAccounts/tableServices/providers/diagnosticSettings",
                        "apiVersion": "2017-05-01-preview",
                        "name": "[concat(parameters('resourceName'), '/default/', 'Microsoft.Insights/', parameters('diagnosticsSettingNameToUse'))]",
                        "location": "[parameters('location')]",
                        "dependsOn": [],
                        "properties": {
                          "eventHubAuthorizationRuleId": "[parameters('eventHubRuleId')]",
                          "eventHubName": "[parameters('eventHubName')]",
                          "metrics": [
                            {
                              "category": "Transaction",
                              "enabled": "[parameters('Transaction')]",
                              "retentionPolicy": {
                                "days": 0,
                                "enabled": false
                              },
                              "timeGrain": null
                            }
                          ],
                          "logs": [
                            {
                              "category": "StorageRead",
                              "enabled": "[parameters('StorageRead')]"
                            },
                            {
                              "category": "StorageWrite",
                              "enabled": "[parameters('StorageWrite')]"
                            },
                            {
                              "category": "StorageDelete",
                              "enabled": "[parameters('StorageDelete')]"
                            }
                          ]
                        }
                      },
                      {
                        "condition": "[contains(parameters('servicesToDeploy'), 'queueServices')]",
                        "type": "Microsoft.Storage/storageAccounts/queueServices/providers/diagnosticSettings",
                        "apiVersion": "2017-05-01-preview",
                        "name": "[concat(parameters('resourceName'), '/default/', 'Microsoft.Insights/', parameters('diagnosticsSettingNameToUse'))]",
                        "location": "[parameters('location')]",
                        "dependsOn": [],
                        "properties": {
                          "eventHubAuthorizationRuleId": "[parameters('eventHubRuleId')]",
                          "eventHubName": "[parameters('eventHubName')]",
                          "metrics": [
                            {
                              "category": "Transaction",
                              "enabled": "[parameters('Transaction')]",
                              "retentionPolicy": {
                                "days": 0,
                                "enabled": false
                              },
                              "timeGrain": null
                            }
                          ],
                          "logs": [
                            {
                              "category": "StorageRead",
                              "enabled": "[parameters('StorageRead')]"
                            },
                            {
                              "category": "StorageWrite",
                              "enabled": "[parameters('StorageWrite')]"
                            },
                            {
                              "category": "StorageDelete",
                              "enabled": "[parameters('StorageDelete')]"
                            }
                          ]
                        }
                      },
                      {
                        "condition": "[contains(parameters('servicesToDeploy'), 'storageAccounts')]",
                        "type": "Microsoft.Storage/storageAccounts/providers/diagnosticSettings",
                        "apiVersion": "2017-05-01-preview",
                        "name": "[concat(parameters('resourceName'), '/', 'Microsoft.Insights/', parameters('diagnosticsSettingNameToUse'))]",
                        "location": "[parameters('location')]",
                        "dependsOn": [],
                        "properties": {
                          "eventHubAuthorizationRuleId": "[parameters('eventHubRuleId')]",
                          "eventHubName": "[parameters('eventHubName')]",
                          "metrics": [
                            {
                              "category": "Transaction",
                              "enabled": "[parameters('Transaction')]",
                              "retentionPolicy": {
                                "days": 0,
                                "enabled": false
                              },
                              "timeGrain": null
                            }
                          ]
                        }
                      }
                    ],
                    "outputs": {}
                  },
                  "parameters": {
                    "diagnosticsSettingNameToUse": {
                      "value": "[parameters('diagnosticsSettingNameToUse')]"
                    },
                    "eventHubRuleId": {
                      "value": "[parameters('eventHubRuleId')]"
                    },
                    "eventHubName": {
                      "value": "[parameters('eventHubName')]"
                    },
                    "location": {
                      "value": "[field('location')]"
                    },
                    "resourceName": {
                      "value": "[field('name')]"
                    },
                    "Transaction": {
                      "value": "[parameters('Transaction')]"
                    },
                    "StorageDelete": {
                      "value": "[parameters('StorageDelete')]"
                    },
                    "StorageWrite": {
                      "value": "[parameters('StorageWrite')]"
                    },
                    "StorageRead": {
                      "value": "[parameters('StorageRead')]"
                    },
                    "servicesToDeploy": {
                      "value": "[parameters('servicesToDeploy')]"
                    }
                  }
                }
              }
            }
          }
        }
      },
      "id": "/subscriptions/6cae13a3-4be0-48e0-9466-d9f2f0f33bc9/providers/Microsoft.Authorization/policyDefinitions/9ecfc4b5-5444-4c6e-832e-c06ac3ef2ecc",
      "type": "Microsoft.Authorization/policyDefinitions",
      "name": "9ecfc4b5-5444-4c6e-832e-c06ac3ef2ecc",
      "systemData": {
        "createdBy": "ganesh.sekarbabu@autodesk.com",
        "createdByType": "User",
        "createdAt": "2023-01-17T02:48:32.7728818Z",
        "lastModifiedBy": "ganesh.sekarbabu@autodesk.com",
        "lastModifiedByType": "User",
        "lastModifiedAt": "2023-01-17T02:48:32.7728818Z"
      }
    }
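To see which diagnostic settings the deployment template in this policy would create for a given storage account, the `condition` and `name` expressions can be mirrored in a few lines of Python. This is only an illustrative sketch: `planned_resources` is a hypothetical helper, `mystorageacct` is a made-up account name, and the function reproduces the string logic of the template rather than calling any Azure API.

```python
from typing import List, Tuple

# Sub-services handled by the template, in the order the resources appear
SUB_SERVICES = ["blobServices", "fileServices", "tableServices", "queueServices"]


def planned_resources(
    resource_name: str,
    services_to_deploy: List[str],
    setting_name: str = "storageAccountsDiagnosticsLogsToEventHub",
) -> List[Tuple[str, str]]:
    """Return (resource type, resource name) pairs the template would deploy."""
    planned = []
    for service in SUB_SERVICES:
        # Mirrors: "condition": "[contains(parameters('servicesToDeploy'), '<service>')]"
        if service in services_to_deploy:
            planned.append((
                f"Microsoft.Storage/storageAccounts/{service}/providers/diagnosticSettings",
                # Mirrors: concat(resourceName, '/default/', 'Microsoft.Insights/', settingName)
                f"{resource_name}/default/Microsoft.Insights/{setting_name}",
            ))
    if "storageAccounts" in services_to_deploy:
        planned.append((
            "Microsoft.Storage/storageAccounts/providers/diagnosticSettings",
            # The account-level setting has no '/default/' segment
            f"{resource_name}/Microsoft.Insights/{setting_name}",
        ))
    return planned


for rtype, rname in planned_resources("mystorageacct", ["blobServices", "storageAccounts"]):
    print(rtype, "->", rname)
```

Listing the planned names this way can help when checking a remediation task, since each selected service ends up with its own diagnostic setting under the same storage account.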
    Posted in Azure, Azure event hub, Azure policy, Cloud

    vCenter CPU Usage reaches 3000 % in VIMTOP

    The issue started after upgrading to vCenter Server 7.0 Update 3h | 13 SEP 2022 | ISO Build 20395099. We noticed that the CPU usage of the VC went very high and we could not log in to the VC. We tried a few options, such as restarting the services and rebooting the VC, but nothing fixed the issue.

    Later we identified that the VPXA logs were rotating very fast, with new logs generated every 5-7 minutes, so we raised a ticket with VMware. They confirmed that it is a known issue in this patch and recommended upgrading the VC to the latest release.

    After upgrading the VC to vCenter Server 7.0 Update 3i | 08 DEC 2022 | ISO Build 20845200, we verified the CPU usage and it is below 10%.

    We didn't find a matching resolved issue in the patch release notes, but the latest patch fixed the issue.

    Note: VMware has since released another patch, so please plan to go with the latest available patch.
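The fast log rotation described above can be spotted by comparing the timestamps at which consecutive log files were rotated. The Python sketch below is one possible way to do that check: `rotation_intervals` and `rotates_too_fast` are hypothetical helpers, the 10-minute threshold is an assumption, and the sample timestamps are made up for illustration (in practice they could come from the modification times of the rotated log files).

```python
from datetime import datetime, timedelta
from typing import List


def rotation_intervals(rotated_at: List[datetime]) -> List[timedelta]:
    """Return the time between consecutive log rotations, oldest first."""
    ordered = sorted(rotated_at)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def rotates_too_fast(
    rotated_at: List[datetime],
    threshold: timedelta = timedelta(minutes=10),  # assumed threshold
) -> bool:
    """True when every observed rotation interval is shorter than the threshold."""
    intervals = rotation_intervals(rotated_at)
    return bool(intervals) and max(intervals) < threshold


# Made-up sample: a new log roughly every 5-7 minutes
sample = [
    datetime(2022, 11, 1, 10, 0),
    datetime(2022, 11, 1, 10, 6),
    datetime(2022, 11, 1, 10, 11),
    datetime(2022, 11, 1, 10, 18),
]
print([str(i) for i in rotation_intervals(sample)])
print(rotates_too_fast(sample))
```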

    Posted in vcsa7.0, VMware