Cannot add new node to windows hyper-v cluster–SCSI disk validation error

We make extensive use of Hyper-V within Basefarm and I recently encountered a strange problem when doing maintenance on a 5 node windows cluster running the hyper-v role. For reasons I won’t bore you with here (pre-production testing basically), I had been evicting and adding nodes to my host level windows cluster, but when trying to add a node back I encountered errors in the validation tests. This was strange as the node had actually previously been in the cluster and nothing had been changed on it whatsoever, so I already knew that it had previously passed the validation test and been successfully running as a cluster node!

The cluster validation reported an error of this format in the storage tests named

Validate SCSI device Vital Product Data (VPD)

cluster_validate_error

It looks like the above picture when you view the validation report.

The actual errors returned were like this:

Failed to get SCSI page 83h VPD descriptors for cluster disk 1 from node <nodename> status 2

(I’ve removed the node name here obviously but it does say specifically which one in the report):

Fortunately before too long I found that this was due to a bug in hyper-v validation for which a hotfix is available here

http://support.microsoft.com/kb/2531907

Downloading and installing this on all the nodes and potential new nodes resolves the error.

This goes to show the value of pre-production testing as the aim of this cluster is to provide a dynamically expanding virtualisation service for one of our largest customers. If it came to a production situation where I needed to add a node quickly, this is not the type of scenario I would be wanting to troubleshoot live!

1 reply
  1. Rafael
    Rafael says:

    Unfortunately this hotfix did not work in my environment. So now I’m gonna format all the nodes and test them again.

Comments are closed.