Deploy Diagnostic Settings for Storage Analytics to Event Hub.

{
“properties”: {
“displayName”: “Deploy Diagnostic Settings for Storage Analytics to Event Hub”,
“policyType”: “Custom”,
“mode”: “Indexed”,
“metadata”: {
“createdBy”: “9e1a4c2c-4a14-468c-bad0-0ed38afbb990”,
“createdOn”: “2023-01-17T02:48:32.7901427Z”,
“updatedBy”: null,
“updatedOn”: null,
“category”: “Storage”
},
“parameters”: {
“eventHubRuleId”: {
“type”: “String”,
“metadata”: {
“displayName”: “Event Hub Authorization Rule Id”,
“description”: “The Event Hub authorization rule Id for Azure Diagnostics. The authorization rule needs to be at Event Hub namespace level. e.g. /subscriptions/{subscription Id}/resourceGroups/{resource group}/providers/Microsoft.EventHub/namespaces/{Event Hub namespace}/authorizationrules/{authorization rule}”,
“strongType”: “Microsoft.EventHub/Namespaces/AuthorizationRules”,
“assignPermissions”: true
}
},
“eventHubName”: {
“type”: “String”,
“metadata”: {
“displayName”: “Event Hub name”,
“assignPermissions”: true
}
},
“eventHubLocation”: {
“type”: “String”,
“metadata”: {
“displayName”: “Event Hub Location”,
“description”: “The location the Event Hub resides in. Only storage accounts in this location will be linked to this Event Hub.”,
“strongType”: “location”
},
“defaultValue”: “”
},
“servicesToDeploy”: {
“type”: “Array”,
“metadata”: {
“displayName”: “Storage services to deploy”,
“description”: “List of Storage services to deploy”
},
“allowedValues”: [
“storageAccounts”,
“blobServices”,
“fileServices”,
“tableServices”,
“queueServices”
],
“defaultValue”: [
“storageAccounts”,
“blobServices”,
“fileServices”,
“tableServices”,
“queueServices”
]
},
“diagnosticsSettingNameToUse”: {
“type”: “String”,
“metadata”: {
“displayName”: “Setting name”,
“description”: “Name of the diagnostic settings.”
},
“defaultValue”: “storageAccountsDiagnosticsLogsToEventHub”
},
“effect”: {
“type”: “String”,
“metadata”: {
“displayName”: “Effect”,
“description”: “Enable or disable the execution of the policy”
},
“allowedValues”: [
“DeployIfNotExists”,
“Disabled”
],
“defaultValue”: “DeployIfNotExists”
},
“StorageDelete”: {
“type”: “String”,
“metadata”: {
“displayName”: “StorageDelete – Enabled”,
“description”: “Whether to stream StorageDelete logs to the Event Hub – True or False”
},
“allowedValues”: [
“True”,
“False”
],
“defaultValue”: “True”
},
“StorageWrite”: {
“type”: “String”,
“metadata”: {
“displayName”: “StorageWrite – Enabled”,
“description”: “Whether to stream StorageWrite logs to the Event Hub – True or False”
},
“allowedValues”: [
“True”,
“False”
],
“defaultValue”: “True”
},
“StorageRead”: {
“type”: “String”,
“metadata”: {
“displayName”: “StorageRead – Enabled”,
“description”: “Whether to stream StorageRead logs to the Event Hub – True or False”
},
“allowedValues”: [
“True”,
“False”
],
“defaultValue”: “True”
},
“Transaction”: {
“type”: “String”,
“metadata”: {
“displayName”: “Transaction – Enabled”,
“description”: “Whether to stream Transaction logs to the Event Hub – True or False”
},
“allowedValues”: [
“True”,
“False”
],
“defaultValue”: “True”
}
},
“policyRule”: {
“if”: {
“allOf”: [
{
“field”: “type”,
“equals”: “Microsoft.Storage/storageAccounts”
},
{
“anyOf”: [
{
“value”: “[parameters(‘eventHubLocation’)]”,
“equals”: “”
},
{
“field”: “location”,
“equals”: “[parameters(‘eventHubLocation’)]”
}
]
}
]
},
“then”: {
“effect”: “[parameters(‘effect’)]”,
“details”: {
“type”: “Microsoft.Insights/diagnosticSettings”,
“roleDefinitionIds”: [
“/providers/Microsoft.Authorization/roleDefinitions/749f88d5-cbae-40b8-bcfc-e573ddc772fa”,
“/providers/Microsoft.Authorization/roleDefinitions/f526a384-b230-433a-b45c-95f59c4a2dec”,
“/providers/Microsoft.Authorization/roleDefinitions/b24988ac-6180-42a0-ab88-20f7382dd24c”
],
“existenceCondition”: {
“allOf”: [
{
“count”: {
“field”: “Microsoft.Insights/diagnosticSettings/metrics[*]”,
“where”: {
“allOf”: [
{
“field”: “Microsoft.Insights/diagnosticSettings/metrics[*].category”,
“equals”: “Transaction”
},
{
“field”: “Microsoft.Insights/diagnosticSettings/metrics[*].enabled”,
“equals”: “True”
}
]
}
},
“greater”: 0
},
{
“field”: “Microsoft.Insights/diagnosticSettings/logs.enabled”,
“contains”: “true”
},
{
“field”: “Microsoft.Insights/diagnosticSettings/eventHubAuthorizationRuleId”,
“equals”: “[parameters(‘eventHubRuleId’)]”
}
]
},
“deployment”: {
“properties”: {
“mode”: “incremental”,
“template”: {
“$schema”: “http://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#”,
“contentVersion”: “1.0.0.0”,
“parameters”: {
“servicesToDeploy”: {
“type”: “array”
},
“diagnosticsSettingNameToUse”: {
“type”: “string”
},
“resourceName”: {
“type”: “string”
},
“eventHubRuleId”: {
“type”: “string”
},
“eventHubName”: {
“type”: “string”
},
“location”: {
“type”: “string”
},
“Transaction”: {
“type”: “string”
},
“StorageRead”: {
“type”: “string”
},
“StorageWrite”: {
“type”: “string”
},
“StorageDelete”: {
“type”: “string”
}
},
“variables”: {},
“resources”: [
{
“condition”: “[contains(parameters(‘servicesToDeploy’), ‘blobServices’)]”,
“type”: “Microsoft.Storage/storageAccounts/blobServices/providers/diagnosticSettings”,
“apiVersion”: “2017-05-01-preview”,
“name”: “[concat(parameters(‘resourceName’), ‘/default/’, ‘Microsoft.Insights/’, parameters(‘diagnosticsSettingNameToUse’))]”,
“location”: “[parameters(‘location’)]”,
“dependsOn”: [],
“properties”: {
“eventHubAuthorizationRuleId”: “[parameters(‘eventHubRuleId’)]”,
“eventHubName”: “[parameters(‘eventHubName’)]”,
“metrics”: [
{
“category”: “Transaction”,
“enabled”: “[parameters(‘Transaction’)]”,
“retentionPolicy”: {
“days”: 0,
“enabled”: false
},
“timeGrain”: null
}
],
“logs”: [
{
“category”: “StorageRead”,
“enabled”: “[parameters(‘StorageRead’)]”
},
{
“category”: “StorageWrite”,
“enabled”: “[parameters(‘StorageWrite’)]”
},
{
“category”: “StorageDelete”,
“enabled”: “[parameters(‘StorageDelete’)]”
}
]
}
},
{
“condition”: “[contains(parameters(‘servicesToDeploy’), ‘fileServices’)]”,
“type”: “Microsoft.Storage/storageAccounts/fileServices/providers/diagnosticSettings”,
“apiVersion”: “2017-05-01-preview”,
“name”: “[concat(parameters(‘resourceName’), ‘/default/’, ‘Microsoft.Insights/’, parameters(‘diagnosticsSettingNameToUse’))]”,
“location”: “[parameters(‘location’)]”,
“dependsOn”: [],
“properties”: {
“eventHubAuthorizationRuleId”: “[parameters(‘eventHubRuleId’)]”,
“eventHubName”: “[parameters(‘eventHubName’)]”,
“metrics”: [
{
“category”: “Transaction”,
“enabled”: “[parameters(‘Transaction’)]”,
“retentionPolicy”: {
“days”: 0,
“enabled”: false
},
“timeGrain”: null
}
],
“logs”: [
{
“category”: “StorageRead”,
“enabled”: “[parameters(‘StorageRead’)]”
},
{
“category”: “StorageWrite”,
“enabled”: “[parameters(‘StorageWrite’)]”
},
{
“category”: “StorageDelete”,
“enabled”: “[parameters(‘StorageDelete’)]”
}
]
}
},
{
“condition”: “[contains(parameters(‘servicesToDeploy’), ‘tableServices’)]”,
“type”: “Microsoft.Storage/storageAccounts/tableServices/providers/diagnosticSettings”,
“apiVersion”: “2017-05-01-preview”,
“name”: “[concat(parameters(‘resourceName’), ‘/default/’, ‘Microsoft.Insights/’, parameters(‘diagnosticsSettingNameToUse’))]”,
“location”: “[parameters(‘location’)]”,
“dependsOn”: [],
“properties”: {
“eventHubAuthorizationRuleId”: “[parameters(‘eventHubRuleId’)]”,
“eventHubName”: “[parameters(‘eventHubName’)]”,
“metrics”: [
{
“category”: “Transaction”,
“enabled”: “[parameters(‘Transaction’)]”,
“retentionPolicy”: {
“days”: 0,
“enabled”: false
},
“timeGrain”: null
}
],
“logs”: [
{
“category”: “StorageRead”,
“enabled”: “[parameters(‘StorageRead’)]”
},
{
“category”: “StorageWrite”,
“enabled”: “[parameters(‘StorageWrite’)]”
},
{
“category”: “StorageDelete”,
“enabled”: “[parameters(‘StorageDelete’)]”
}
]
}
},
{
“condition”: “[contains(parameters(‘servicesToDeploy’), ‘queueServices’)]”,
“type”: “Microsoft.Storage/storageAccounts/queueServices/providers/diagnosticSettings”,
“apiVersion”: “2017-05-01-preview”,
“name”: “[concat(parameters(‘resourceName’), ‘/default/’, ‘Microsoft.Insights/’, parameters(‘diagnosticsSettingNameToUse’))]”,
“location”: “[parameters(‘location’)]”,
“dependsOn”: [],
“properties”: {
“eventHubAuthorizationRuleId”: “[parameters(‘eventHubRuleId’)]”,
“eventHubName”: “[parameters(‘eventHubName’)]”,
“metrics”: [
{
“category”: “Transaction”,
“enabled”: “[parameters(‘Transaction’)]”,
“retentionPolicy”: {
“days”: 0,
“enabled”: false
},
“timeGrain”: null
}
],
“logs”: [
{
“category”: “StorageRead”,
“enabled”: “[parameters(‘StorageRead’)]”
},
{
“category”: “StorageWrite”,
“enabled”: “[parameters(‘StorageWrite’)]”
},
{
“category”: “StorageDelete”,
“enabled”: “[parameters(‘StorageDelete’)]”
}
]
}
},
{
“condition”: “[contains(parameters(‘servicesToDeploy’), ‘storageAccounts’)]”,
“type”: “Microsoft.Storage/storageAccounts/providers/diagnosticSettings”,
“apiVersion”: “2017-05-01-preview”,
“name”: “[concat(parameters(‘resourceName’), ‘/’, ‘Microsoft.Insights/’, parameters(‘diagnosticsSettingNameToUse’))]”,
“location”: “[parameters(‘location’)]”,
“dependsOn”: [],
“properties”: {
“eventHubAuthorizationRuleId”: “[parameters(‘eventHubRuleId’)]”,
“eventHubName”: “[parameters(‘eventHubName’)]”,
“metrics”: [
{
“category”: “Transaction”,
“enabled”: “[parameters(‘Transaction’)]”,
“retentionPolicy”: {
“days”: 0,
“enabled”: false
},
“timeGrain”: null
}
]
}
}
],
“outputs”: {}
},
“parameters”: {
“diagnosticsSettingNameToUse”: {
“value”: “[parameters(‘diagnosticsSettingNameToUse’)]”
},
“eventHubRuleId”: {
“value”: “[parameters(‘eventHubRuleId’)]”
},
“eventHubName”: {
“value”: “[parameters(‘eventHubName’)]”
},
“location”: {
“value”: “[field(‘location’)]”
},
“resourceName”: {
“value”: “[field(‘name’)]”
},
“Transaction”: {
“value”: “[parameters(‘Transaction’)]”
},
“StorageDelete”: {
“value”: “[parameters(‘StorageDelete’)]”
},
“StorageWrite”: {
“value”: “[parameters(‘StorageWrite’)]”
},
“StorageRead”: {
“value”: “[parameters(‘StorageRead’)]”
},
“servicesToDeploy”: {
“value”: “[parameters(‘servicesToDeploy’)]”
}
}
}
}
}
}
}
},
“id”: “/subscriptions/6cae13a3-4be0-48e0-9466-d9f2f0f33bc9/providers/Microsoft.Authorization/policyDefinitions/9ecfc4b5-5444-4c6e-832e-c06ac3ef2ecc”,
“type”: “Microsoft.Authorization/policyDefinitions”,
“name”: “9ecfc4b5-5444-4c6e-832e-c06ac3ef2ecc”,
“systemData”: {
“createdBy”: “ganesh.sekarbabu@autodesk.com”,
“createdByType”: “User”,
“createdAt”: “2023-01-17T02:48:32.7728818Z”,
“lastModifiedBy”: “ganesh.sekarbabu@autodesk.com”,
“lastModifiedByType”: “User”,
“lastModifiedAt”: “2023-01-17T02:48:32.7728818Z”
}
}
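
The listing above is shown with typographic quotes (“ ” and ‘ ’), so it will not parse as JSON if pasted as-is. Below is a minimal Python sketch that restores plain quotes before parsing. It assumes no string value in the definition contains a typographic double quote of its own; the function name and the sample string are illustrative only.

```python
import json

# Map the typographic quotes seen in the listing to their plain equivalents.
QUOTE_MAP = str.maketrans({
    "\u201c": '"',  # left double quote
    "\u201d": '"',  # right double quote
    "\u2018": "'",  # left single quote
    "\u2019": "'",  # right single quote
})


def load_policy(text):
    """Replace typographic quotes and parse the result as JSON."""
    return json.loads(text.translate(QUOTE_MAP))


# Small sample in the same style as the listing (not the full definition).
sample = (
    "{ \u201cmode\u201d: \u201cIndexed\u201d, "
    "\u201ceffect\u201d: \u201c[parameters(\u2018effect\u2019)]\u201d }"
)
policy = load_policy(sample)
print(policy["mode"])    # Indexed
print(policy["effect"])  # [parameters('effect')]
```

The same function can be applied to the full text of the definition once it has been saved to a file.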
Posted in Azure, Azure event hub, Azure policy, Cloud

vCenter CPU Usage reaches 3000 % in VIMTOP

The issue started after we upgraded to vCenter Server 7.0 Update 3h | 13 SEP 2022 | ISO Build 20395099. The CPU usage of the vCenter Server (VC) went very high and we could not log in to it. We tried a few options, such as restarting the services and rebooting the VC, but nothing fixed the issue.

Later we found that the VPXA logs were rotating very quickly, with new log files generated every 5 to 7 minutes, so we raised a ticket with VMware. VMware confirmed that this is a known issue in this patch and recommended upgrading the VC to the latest release.

After upgrading the VC to vCenter Server 7.0 Update 3i | 08 DEC 2022 | ISO Build 20845200, we verified the CPU usage and it is below 10%.

We did not find a matching resolved issue in the patch release notes, but the latest patch fixed the problem.

Note: VMware has since released another patch, so plan to go with the latest available patch.

Posted in vcsa7.0, VMware

Azure AZCopy architecture design

Posted in AZCopy, Azure, Cloud

Datastore with NFS4 slowness issue and Netapp\VMware findings.

As mentioned in my previous blog post, NFS 4 is slow compared to NFS 3. We captured packets and worked with NetApp; below are the findings from our test, in which we ran ls -lahR | wc -l

  • No latency was found in the perf-archives.
  • Packet traces of NFSv3 and NFSv4 were taken while listing the directory, 30 minutes apart.
  • There is a huge jump in LOOKUP calls in V4.1. In the V4.1 packet traces, even though the READDIR call returns entries with the file name, file handle (FH), and attributes, the client still sends explicit lookup calls (i.e. a COMPOUND with PUTFH, LOOKUP, GETFH, GETATTR) for the directory entries.

    NFS3 SRTs

    Index  Procedure/Opcodes/Commands  Calls  Min SRT   Max SRT   Avg SRT   Sum SRT
    1      GETATTR                     55     0.000059  0.004604  0.000414  0.022770
    3      LOOKUP                      14363  0.000060  0.471455  0.000318  4.565892    <<<
    4      ACCESS                      2874   0.000048  0.012290  0.000353  1.015239
    17     READDIRPLUS                 6275   0.000072  0.922698  0.001608  10.091936   <<<
    18     FSSTAT                      3      0.000065  0.004150  0.001443  0.004329

    NFS 4.1 SRTs

    Index  Procedure/Opcodes/Commands  Calls  Min SRT   Max SRT   Avg SRT   Sum SRT
    1      COMPOUND (proc #)           95810  0.000020  1.020477  0.001411  135.205752
    3      ACCESS                      2887   0.000068  0.004859  0.000321  0.925366
    9      GETATTR                     95788  0.000068  1.020477  0.001411  135.198447
    10     GETFH                       86659  0.000112  1.020477  0.001164  100.898088
    15     LOOKUP                      80923  0.000094  1.020477  0.001144  92.574459   <<<
    16     LOOKUPP                     5754   0.000123  0.845588  0.001448  8.330835
    22     PUTFH                       95806  0.000068  1.020477  0.001411  135.205653
    26     READDIR                     6233   0.000147  0.900400  0.005354  33.373798   <<<
    53     SEQUENCE                    95810  0.000020  1.020477  0.001411  135.205752
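
To put the LOOKUP rows of the two tables side by side, here is a small worked calculation in Python. The numbers are copied from the tables above; the variable names are only for illustration, and SRT values are used in whatever unit the capture tool reported.

```python
# (calls, sum of SRT) for the LOOKUP rows of the two tables above.
nfs3_lookup = (14363, 4.565892)
nfs41_lookup = (80923, 92.574459)

# How many times more LOOKUP calls, and how much more total LOOKUP
# response time, the NFS 4.1 run needed compared with the NFS 3 run.
call_ratio = nfs41_lookup[0] / nfs3_lookup[0]
srt_ratio = nfs41_lookup[1] / nfs3_lookup[1]

print(f"LOOKUP calls, NFS 4.1 vs NFS 3: {call_ratio:.1f}x")      # 5.6x
print(f"LOOKUP total SRT, NFS 4.1 vs NFS 3: {srt_ratio:.1f}x")   # 20.3x
```

In this capture the NFS 4.1 run issued roughly 5.6 times as many LOOKUP calls and spent roughly 20 times as much total response time on them.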

We ran the test on a single datastore: we mounted it as NFS 3 and captured the data, then unmounted it and mounted it as NFS 4. Both captures therefore used the same datastore with the same size and data. The results are below.

Based on the results, VMware will work on an option to enable the lookup cache, which will most likely be available in a future ESXi patch.

Posted in ESX command, ESXi issue, ESXi Patches, VMware

Useful tips to find the resource types under the azure policy.

In our organization, not all Azure services are allowed by default, and we have to work with different teams to enable resource types in the policy.

Recently I was adding a non-Azure VM to Azure Arc and it failed with an error:

“FATAL   RequestCorrelationId:281e537a-90cfa3a4003 Message: Resource ‘12345’ was disallowed by policy. ”

While troubleshooting, I found the link below, which makes it very easy to search for the policy.

https://www.azadvertizer.net/azpolicyadvertizer_all.html#%7B%7D

For example, to find the appropriate policy, we can search for Azure Arc and it lists all the categories.

For guest configuration, we can see the “Microsoft.HybridCompute/machines” resource type.

Posted in Azure, Cloud

NFSv3 datastore is much faster than NFSv4

Recently we noticed that on a few datastores the search operation takes longer than expected. Our testing showed that NFSv3 performs better than NFSv4.

For the first test, we mounted the same storage to the same host as both NFSv4 and NFSv3 and listed the files from the host shell. We ran the command time ls -lahR | wc -l against both mounts. With NFSv4 the command takes 1 minute and 30 seconds to finish; with NFSv3 it takes only 15 seconds.
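
For repeating the comparison from any client that has both mounts available, here is a rough Python analogue of the timed listing. It is only a sketch: it walks the tree and fetches the attributes of every entry, much as ls -l does, but the entry count is not the same number that wc -l prints. The mount paths are hypothetical placeholders.

```python
import os
import time


def count_entries(root):
    """Walk root recursively, stat every entry, and return how many were seen."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            # Fetch attributes for each entry, as a long listing would.
            os.lstat(os.path.join(dirpath, name))
            total += 1
    return total


def timed_listing(root):
    """Return (entry count, elapsed seconds) for a full recursive listing."""
    start = time.perf_counter()
    count = count_entries(root)
    return count, time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical mount points; replace with the real datastore paths.
    for mount in ("/vmfs/volumes/ds_nfs3", "/vmfs/volumes/ds_nfs41"):
        if os.path.isdir(mount):
            entries, seconds = timed_listing(mount)
            print(f"{mount}: {entries} entries in {seconds:.1f}s")
```

Client-side caching can affect repeated runs, so results from a first run and a second run against the same mount may differ.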

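We then tried PowerShell, searching for files as follows.

# Assumes an existing Connect-VIServer session; 'MyDatastore' is a placeholder name.
$ds = Get-Datastore -Name 'MyDatastore'
$dsBrowser = Get-View $ds.ExtensionData.Browser
$datastorePath2 = "[$($ds.Name)]"

# Search every subfolder of the datastore for .vmx files.
$searchSpec = New-Object VMware.Vim.HostDatastoreBrowserSearchSpec
$searchSpec.MatchPattern = "*.vmx"
$taskMoRef = $dsBrowser.SearchDatastoreSubFolders_Task($datastorePath2, $searchSpec)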

The data transfer rate is pretty much the same; the slowness is in the file list/search operations on NFSv4.

Based on the results, we involved the vendors to investigate the issue. NetApp conducted a thorough investigation and determined there were no performance issues, but VMware acknowledged there was a problem with NFS v4.1. Below is the reply from VMware.

“Based on the analysis, Engineering team has identified that the issue related to slow search on NFSv4.1 is caused because the NFSv4.1 does not support “Directory Name Lookup Cache” (DNLC) yet. However, for NFSv3, most of the LOOKUP calls are served from cache, which avoids sending a LOOKUP instruction to the NFS server. The VMware engineering is working to add this feature for NFSv4.1 however we do not have a version confirmation where this is expected to be included.”

We applied the latest patches, but the issue still exists. Hopefully it will be fixed in a future patch.

Update: OCT2022

For more details check the blog.

Posted in ESX command, ESXi issue, ESXi Patches, logs, VCSA6.7, vcsa7.0, VCSA8.0, VMware

Issue connecting Azure VM using Azure AD from our laptop 

Azure AD login has been configured and we are able to log in to the Azure VM from another Azure VM using the AD credentials, but the connection fails when we try from our local laptop.

One of the prerequisites is that the local laptop shows AzureAdJoined : YES. It did, but the connection still failed with the error “The logon attempt failed”.

dsregcmd /status showed AzureAdJoined : YES.
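
When several laptops need this check, the join state can be pulled out of saved dsregcmd /status output with a few lines of Python. This is a minimal sketch: the sample text is illustrative and far shorter than real output, and the function name is hypothetical.

```python
def parse_dsregcmd(output):
    """Collect the 'Key : Value' lines of dsregcmd /status output into a dict."""
    state = {}
    for line in output.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:  # skip banner lines and blanks that have no separator
            state[key.strip()] = value.strip()
    return state


# Illustrative sample only; real output has many more fields.
sample = """
             AzureAdJoined : YES
          EnterpriseJoined : NO
              DomainJoined : NO
"""
state = parse_dsregcmd(sample)
print(state["AzureAdJoined"])  # YES
```

As this post shows, AzureAdJoined : YES is a prerequisite but does not by itself guarantee that the logon succeeds.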

After some searching, we identified that the issue was caused by a local GPO applied to the laptop.

Specifically, this is called out in the doc for AAD Login to Windows VMs here: https://docs.microsoft.com/en-us/azure/active-directory/devices/howto-vm-sign-in-azure-ad-windows#unauthorized-client

Here’s the doc for that particular setting: https://docs.microsoft.com/en-us/windows/security/threat-protection/security-policy-settings/network-security-allow-pku2u-authentication-requests-to-this-computer-to-use-online-identities

Reference:

https://syfuhs.net/how-authentication-works-when-you-use-remote-desktop

Posted in Azure, Cloud

Reboot issue MCE error on Dell PowerEdge R6525 running ESXi 7.0 Update 3c

We have new hardware, Dell PowerEdge R6525 servers with AMD EPYC 7713 64-Core Processors, running ESXi 7.0 Update 3c (Build 19193900), and the PRD VMs were migrated to the 15-host cluster. After a few weeks, we noticed the ESXi hosts rebooting at random. After further troubleshooting, we upgraded all the hardware firmware and the BIOS (2.5.6 up to 2.6.0), but that did not fix the issue.

After monitoring for several weeks, we found that the hosts running Linux VMs under a DRS rule were affected more than the hosts running Windows VMs. With the help of the vendor, we replaced the CPU and the motherboard on a few hosts, but it didn't help.

All the hosts failed with the error: (Fatal/NonRecoverable) 2 (System Event) 13 Assert + Processor Transition to Non-recoverable

The issue was escalated to the top technical team at Dell. After several months, the vendor asked us to upgrade the BIOS to 2.6.6, and that finally stopped the reboots.

  1. Error from the ESXi logs, showing a memory error:
    2022-04-03T05:34:42 13 – Processor 1 MEMEFGH VDD PG 0 Assert + Processor Transition to Non-recoverable
  2. After the above error, the server kept running until 12 PM UTC:
    2022-04-03T10:00:00.611Z heartbeat[2308383]: up 5d6h22m15s, 94 VMs; [[2103635 vmx 67108864kB] [2114993 vmx 134084608kB] [2105683 vmx 134090752kB]] []
    The reboot might have happened within this window.
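
The uptime in a heartbeat line like the one above can be turned into an approximate boot time, which helps to place a reboot relative to an error. Here is a minimal Python sketch, assuming the uptime always appears in the NdNhNmNs form seen here; the function name is illustrative.

```python
import re
from datetime import datetime, timedelta

# Timestamp, then "heartbeat[pid]: up <d>d<h>h<m>m<s>s" as in the line above.
HEARTBEAT = re.compile(
    r"^(?P<ts>\S+) heartbeat\[\d+\]: "
    r"up (?P<d>\d+)d(?P<h>\d+)h(?P<m>\d+)m(?P<s>\d+)s"
)


def boot_time(line):
    """Return the heartbeat timestamp minus the uptime it reports."""
    match = HEARTBEAT.match(line)
    if match is None:
        raise ValueError("not a heartbeat line")
    stamp = datetime.strptime(match["ts"], "%Y-%m-%dT%H:%M:%S.%fZ")
    uptime = timedelta(
        days=int(match["d"]),
        hours=int(match["h"]),
        minutes=int(match["m"]),
        seconds=int(match["s"]),
    )
    return stamp - uptime


line = (
    "2022-04-03T10:00:00.611Z heartbeat[2308383]: up 5d6h22m15s, 94 VMs; "
    "[[2103635 vmx 67108864kB] [2114993 vmx 134084608kB] "
    "[2105683 vmx 134090752kB]] []"
)
print(boot_time(line).isoformat())  # 2022-03-29T03:37:45.611000
```

For this line the computed boot time is 29 March, which is consistent with the host still being up at 10:00 UTC after the 05:34:42 error.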

Note: We have another environment that runs the same R6525 hardware with ESXi 6.7 U3, and it did not face any issue. After several analyses, we could not find any solid evidence that the issue was caused by the Linux VMs or the applications running on them.

Posted in Dell

NFS 4.1 datastores might become inaccessible after failover or failback operations of storage arrays

When storage array failover or failback operations take place, NFS 4.1 datastores fall into an All-Paths-Down (APD) state. However, after the operations are complete, the datastores might remain in APD state and become inaccessible.


According to VMware, this issue occurs on hosts older than build 16075168 and is resolved in newer builds. We tested it in our environment, and the newer version works fine without any datastore failures.
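
A quick way to flag hosts that still need the update is to compare their build numbers with the build quoted above. Here is a minimal Python sketch; the host names and the inventory dictionary are hypothetical, and in practice the build numbers would come from the vCenter inventory.

```python
FIXED_BUILD = 16075168  # build number quoted in this post


def is_affected(build):
    """True when the host build is older than the build named above."""
    return build < FIXED_BUILD


# Hypothetical inventory: host name -> ESXi build number.
hosts = {"esx01": 15843807, "esx02": 16075168, "esx03": 19193900}
for name, build in hosts.items():
    print(name, "affected" if is_affected(build) else "ok")
```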

Posted in Storage, Storage\Backup, VMware

VCSA upgrade stuck at 88%


VCSA 7.0 U3b upgrade stuck at 88%

Resolution

The VAMI page was stuck at 88% for more than an hour.

We removed the update_config file and restarted the VAMI, but the update still did not complete.

We downloaded the fp.iso patch and patched the VCSA via VAMI successfully.

Posted in VMware