I ran into a failover problem on one of my servers.

There are two nodes in the cluster, and both are attached to external storage. When one node goes down, the other should take over automatically, and resources should show on the active node only. When I failed over, Cluster Administrator showed node 2 as the owner of the cluster group under Groups, but the resources still showed node 1. In other words, the cluster group itself was working fine, but the resources could not be brought up on node 2. When I tried a manual failover, it threw the following error messages.

Errors:
Error loading resource DLL ‘SQSRVRES.DLL’
Error loading resource ‘SQAGTRES.DLL’
Cluster service is requesting a bus reset for device \Device\ClusDisk0.
The SQL Server service terminated with service-specific error 17058 (0x42A2)

I tried a few steps to troubleshoot:
1. Cold shutdown, power cycle, and startup. Normally the cluster checkpoint syncs with the resources again after this.
2. Followed this article: http://support.microsoft.com/kb/872931
3. Tried this article: http://support.microsoft.com/kb/923838
4. Copied the two SQL resource DLL files (SQAGTRES.DLL and SQSRVRES.DLL) from node 1 to node 2. Both files are located in the system32 folder.

After all of that, the problem persisted, but there was a slight improvement. There were no more SQL errors, and the resources now showed node 2 as the owner, which they had not done before. However, they were still offline.
When I tried to bring them online, I finally got a clear error message: ‘cluster resources cannot be brought online. The owner node cannot run this resources’

Once I got that error, it was pretty clear what had happened. I checked the owner list and found that node 2 had disappeared from it. With so many errors being thrown, it had been confusing to work out what the actual problem was.

In the end, the problem was resolved by adding node2 to the owner list of both resources with these commands:
cluster res "SQL Server resource name" /addowner:node2
cluster res "SQL Server Agent resource name" /addowner:node2
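
If more resources or nodes are involved, the same check and fix can be scripted. The following is only a minimal sketch, not part of my original fix. It assumes you already have the current owner list for a resource as a plain list of node names, read for example from Cluster Administrator or, as far as I know, from the /listowners option of cluster res. The function names missing_owners and addowner_commands are made up for illustration, and the script only prints the command lines rather than running them.

```python
def missing_owners(expected_nodes, current_owners):
    """Return the expected nodes that are absent from the owner list.

    Node names are compared case-insensitively, since the cluster tools
    may show them in a different case than you typed them.
    """
    current = {name.lower() for name in current_owners}
    return [node for node in expected_nodes if node.lower() not in current]


def addowner_commands(resource_name, nodes):
    """Build one 'cluster res ... /addowner' command line per node."""
    return [f'cluster res "{resource_name}" /addowner:{node}' for node in nodes]


if __name__ == "__main__":
    expected = ["node1", "node2"]
    # Hypothetical state matching this post: node2 is missing from the list.
    current = ["NODE1"]
    for resource in ("SQL Server resource name", "SQL Server Agent resource name"):
        for command in addowner_commands(resource, missing_owners(expected, current)):
            print(command)
```

Review the printed lines before running them on a cluster node, and replace the placeholder resource names with the real ones.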