With Storage Spaces Direct, every metadata and write IO is redirected to the node that owns the Cluster Shared Volume. If you’re using NTFS, the volume will be in “Block Redirected Mode”; if you’re using ReFS, it will be in “File System Redirected Mode”. You can see this with the PowerShell cmdlet Get-ClusterSharedVolumeState.
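As a minimal sketch of that check, the snippet below lists the state of every CSV as seen from every node. It assumes it is run on a cluster node with the FailoverClusters module available; the selection of columns is just one convenient view, not the only one.

```powershell
# Show, per CSV and per node, whether IO is direct or redirected and why.
# Assumes the FailoverClusters module is installed and a cluster is reachable.
Get-ClusterSharedVolumeState |
    Sort-Object Name, Node |
    Select-Object Name, Node, StateInfo, FileSystemRedirectedIOReason, BlockRedirectedIOReason |
    Format-Table -AutoSize
```

The StateInfo column shows the redirection mode for each volume on each node.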
More info is available in a 2012 R2 blog post here and in the WSLab scenarios.
Assuming ReFS, the recommended file system for Storage Spaces Direct, is used, the following occurs on a running cluster:
We have a 3-node Storage Spaces Direct cluster with two Cluster Shared Volumes, both using three-way mirror resiliency.
‘Node1’ owns ‘CSV1’ and has two virtual machines running. ‘Node2’ owns ‘CSV2’. As soon as VM2, which has its disks on CSV2, starts writing, all of that IO is redirected to the node that owns the CSV holding the VM’s disks. In short: the write IO of VM2 has to travel to Node2 before it can be committed.
For one virtual machine this is not a very big issue. Imagine you have hundreds of VMs on your cluster where a big portion of the VMs need to make an extra hop (to the owner node of the CSV they are on) over the network for the IO.
IO is latency sensitive, and an extra hop over the network does not make it any better. It also takes up extra network bandwidth and extra CPU cycles.
This is also the reason why VMFleet aligns the virtual machines with the CSVs and hosts they are on: we want the least amount of overhead so we can see what the cluster can do.
I always tell my customers that they cannot expect the performance that VMFleet shows in daily use, because daily use does not reflect a VMFleet scenario where everything is aligned. But we could at least try to get as close as possible in terms of alignment, right?
PowerShell to the rescue!
I wrote a script that takes a complete inventory of a cluster: the location of virtual machine disks, the owners of the CSVs, and the virtual machine hosts.
Based on this information we can now determine which virtual machines should be moved where to align them with their CSV (and get an optimized IO path!). The script can be run locally or can target a remote cluster.
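To illustrate the idea (this is not the actual script), here is a minimal sketch of the alignment check. The function name Get-MisalignedVM and the property names Path, OwnerNode, VMName, HostNode and DiskPath are hypothetical; the sample data mirrors the Node1/Node2 example above.

```powershell
# Hypothetical helper: returns the VMs whose host is not the owner of the CSV
# that holds their disk, together with the node they should move to.
function Get-MisalignedVM {
    param(
        [object[]] $Volume,   # objects with Path and OwnerNode
        [object[]] $VMDisk    # objects with VMName, HostNode and DiskPath
    )
    foreach ($disk in $VMDisk) {
        # Find the CSV whose mount path is a prefix of the disk path.
        $csv = $Volume | Where-Object {
            $disk.DiskPath.StartsWith("$($_.Path)\", [StringComparison]::OrdinalIgnoreCase)
        } | Select-Object -First 1

        if ($csv -and $csv.OwnerNode -ne $disk.HostNode) {
            [pscustomobject]@{
                VMName      = $disk.VMName
                CurrentNode = $disk.HostNode
                TargetNode  = $csv.OwnerNode
                Volume      = $csv.Path
            }
        }
    }
}

# Sample inventory (made-up data matching the example in this post).
$volumes = @(
    [pscustomobject]@{ Path = 'C:\ClusterStorage\CSV1'; OwnerNode = 'Node1' }
    [pscustomobject]@{ Path = 'C:\ClusterStorage\CSV2'; OwnerNode = 'Node2' }
)
$disks = @(
    [pscustomobject]@{ VMName = 'VM1'; HostNode = 'Node1'; DiskPath = 'C:\ClusterStorage\CSV1\VM1\VM1.vhdx' }
    [pscustomobject]@{ VMName = 'VM2'; HostNode = 'Node1'; DiskPath = 'C:\ClusterStorage\CSV2\VM2\VM2.vhdx' }
)

# On a real cluster the inventory could be gathered roughly like this:
# $volumes = Get-ClusterSharedVolume | ForEach-Object {
#     [pscustomobject]@{ Path = $_.SharedVolumeInfo.FriendlyVolumeName; OwnerNode = $_.OwnerNode.Name }
# }
# $disks = Get-ClusterNode | ForEach-Object { Get-VM -ComputerName $_.Name } | ForEach-Object {
#     $vm = $_
#     Get-VMHardDiskDrive -VM $vm | ForEach-Object {
#         [pscustomobject]@{ VMName = $vm.Name; HostNode = $vm.ComputerName; DiskPath = $_.Path }
#     }
# }

$moves = Get-MisalignedVM -Volume $volumes -VMDisk $disks
$moves | Format-Table -AutoSize
```

With the sample data, only VM2 is reported, with Node2 as its target. Note that this sketch emits one row per disk, so a VM with disks spread over several CSVs would need an extra rule to pick a single target.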
The script supports the -WhatIf parameter (thanks to Ben), so you can see what would happen without actually moving anything.
If a VM is off, the script uses quick migration. If you want to use quick migration for all virtual machines, specify the -quickmigration switch.
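The -WhatIf behaviour and the choice of migration type could look roughly like the sketch below. The function Move-VMToCsvOwner and its parameters are hypothetical and are not taken from the script. The sketch also assumes the clustered role has the same name as the VM.

```powershell
# Hypothetical wrapper: SupportsShouldProcess gives the function -WhatIf and -Confirm.
function Move-VMToCsvOwner {
    [CmdletBinding(SupportsShouldProcess)]
    param(
        [Parameter(Mandatory)] [string] $VMName,
        [Parameter(Mandatory)] [string] $TargetNode,
        [ValidateSet('Live', 'Quick')] [string] $MigrationType = 'Live'
    )
    # With -WhatIf, ShouldProcess prints the intended action and returns $false,
    # so the migration below is skipped.
    if ($PSCmdlet.ShouldProcess($VMName, "$MigrationType migration to $TargetNode")) {
        # Assumption: the clustered role name equals the VM name.
        Move-ClusterVirtualMachineRole -Name $VMName -Node $TargetNode -MigrationType $MigrationType
    }
}

# Dry run: nothing is moved, only the intended action is printed.
$result = Move-VMToCsvOwner -VMName 'VM2' -TargetNode 'Node2' -WhatIf
```

Passing -MigrationType Quick would correspond to forcing quick migration for a VM.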
The script checks memory utilization on the host that VMs are moved to and keeps 8GB free by default. It will throw an error when there is not enough free memory.
You can specify your own preference through the parameter ‘NodePhysicalmemorybufferGB’.
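The memory check boils down to simple arithmetic, sketched below. The function Test-NodeMemoryHeadroom and its parameter names are hypothetical; only the 8GB default comes from the description above, and the script itself may calculate this differently.

```powershell
# Hypothetical helper: throws when moving the VM would leave less than the
# buffer free on the target node, otherwise returns $true.
function Test-NodeMemoryHeadroom {
    param(
        [Parameter(Mandatory)] [double] $NodeFreeGB,    # free memory on the target node
        [Parameter(Mandatory)] [double] $VMAssignedGB,  # memory the VM will consume
        [double] $BufferGB = 8                          # memory to keep free on the node
    )
    $remaining = $NodeFreeGB - $VMAssignedGB
    if ($remaining -lt $BufferGB) {
        throw "Not enough free memory: $remaining GB would remain, $BufferGB GB must stay free."
    }
    $true
}

# On a real cluster the inputs could be gathered roughly like this:
# $nodeFreeGB   = (Get-CimInstance Win32_OperatingSystem -ComputerName 'Node2').FreePhysicalMemory / 1MB  # value is in KB
# $vmAssignedGB = (Get-VM -Name 'VM2' -ComputerName 'Node1').MemoryAssigned / 1GB                         # value is in bytes

$fits = Test-NodeMemoryHeadroom -NodeFreeGB 32 -VMAssignedGB 16
```

For a VM that is off, MemoryAssigned is 0, so its startup memory would be the more sensible input.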
Download (or contribute):
Definitely let me know what you see in your monitoring. Do you see your network or CPU utilization go down? What percentage?
Thanks for reading, follow me on Twitter for updates.
If you have any questions or feedback, leave a comment or drop me an email.
Darryl van der Peijl