We have four virtualization hosts running Windows Server 2012 R2 Datacenter with the Hyper-V role. These servers are connected via FCoE links to an external storage system. Logical volumes on that storage system are designated as Cluster Shared Volumes (CSVs) and are used to store the files of highly available virtual machines. VMs stored on the same CSV often run on different hosts.
We have some serious performance issues in the cluster, including complete I/O freezes on some CSVs in specific scenarios. I'm working with a technical support engineer from the vendor, and he has already informed me that the system is completely healthy. However, while analyzing the storage system's performance logs, he discovered a huge number of SCSI persistent reservation errors (over 30,000 in 2 hours, i.e. more than 4 per second) on the LUs in question.
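For reference, the rate quoted above can be checked with a quick calculation. This is only a sketch: the function name is made up for illustration, and the inputs are the lower-bound figures from the log analysis (30,000 errors over a 2-hour window), so the real rate is at least this high.

```python
def events_per_second(event_count: int, window_hours: float) -> float:
    """Average event rate over a window given in hours."""
    return event_count / (window_hours * 3600)


# Lower-bound figures from the storage performance logs
rate = events_per_second(30_000, 2)
print(f"{rate:.2f} reservation errors per second")
```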
Is this normal? In 2012 R2, all Hyper-V hosts can access the same CSVs simultaneously, which might explain the presence of such conflicts per se. However, I suspect that this number is too high.
Evgeniy Lotosh
MCSE: Server Infrastructure, MCSE: Messaging