Migrating an MS SQL cluster with a shared RDM disk in a VMware environment

by David Fong

We needed to migrate an MS SQL cluster with a shared RDM disk in a VMware environment to new storage, for both the OS disks and the RDM.  The two nodes of the cluster are located on different ESXi hosts.  The database files and logs sit on the RDM disk, while the OS is on a VMFS datastore.  It was not a very straightforward migration: it involves un-mapping and re-mapping RDMs, copying the databases and all the related files, and finally migrating the OS drives.

We decided to copy all the data to the new database disk and change the drive letter(s) to match the original ones while MS SQL was offline, and that worked out for us. Here’s a brief description of the steps we took:

Preparation:

  1. Allocate a new raw LUN for the database, visible to both ESXi hosts
  2. Allocate two new 2TB LUNs for the OS disks, formatted as VMFS datastores, one for each ESXi host and visible to both (so that the 2nd node can find the RDM mapping file)
  3. Back up the whole OS
  4. Turn off the related application so there will be no new writes to the databases
  5. Back up all databases (a PowerShell sketch follows this list)
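
A minimal sketch of step 5, assuming the SqlServer PowerShell module is installed; the instance name and backup share below are placeholders for your environment:

    # Back up every user database before touching the storage.
    Import-Module SqlServer

    $instance   = "DB1C"            # hypothetical SQL instance name
    $backupPath = "\\backupsrv\sql" # hypothetical backup share

    Get-SqlDatabase -ServerInstance $instance |
        Where-Object { $_.Name -ne "tempdb" } |
        ForEach-Object {
            Backup-SqlDatabase -ServerInstance $instance `
                -Database $_.Name `
                -BackupFile (Join-Path $backupPath "$($_.Name).bak")
        }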

DB LUN (RDM) migration:

  1. Add the new LUN to cluster node 1 as an RDM
    a. Pick SCSI controller 1, SCSI ID 1:1 (the old one is using SCSI 1:0)
    b. Make a note of the new disk’s path and volume ID
  2. Add the new LUN to cluster node 2 as an existing hard disk
    – Pick SCSI controller 1, SCSI ID 1:1, and the independent, persistent option
  3. In Disk Management, create and format new volumes as needed
  4. In Failover Cluster Manager, add the new disk to the cluster (steps 4-12 are sketched in PowerShell after this list)
  5. Assign the new disk to the MSSQL role
  6. DO NOT STOP the MSSQL role; instead:
    a. Click the MS SQL role, then click the Resources tab at the bottom
    b. Stop the SQL Server/SQL Agent resources and leave the disk resources online
    c. Make sure SQL Server is offline
  7. Copy/restore all the DB files from backups to the new disk
  8. Change the drive letter(s)
  9. Update the SQL Server/Agent dependencies
  10. Restart SQL Server/SQL Agent to test
  11. Test DB failover
  12. Check/update resource dependencies
  13. Remove the old disk from the MSSQL role AND from the cluster disk resources
  14. Remove the old disk in Disk Management
  15. Remove the old RDM from the VMs
  16. Test the DB again
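
Here is roughly what steps 4 through 12 look like in PowerShell, assuming the FailoverClusters and Storage modules on the active node; the resource, role, drive-letter, and node names are hypothetical, so check yours with Get-ClusterResource and Get-ClusterGroup first:

    # Steps 4-5: add the new disk to the cluster and move it into the SQL role
    Get-ClusterAvailableDisk | Add-ClusterDisk
    Move-ClusterResource -Name "Cluster Disk 2" -Group "SQL Server (MSSQLSERVER)"

    # Step 6: stop only the SQL resources, leaving the disk resources online
    Stop-ClusterResource -Name "SQL Server"
    Stop-ClusterResource -Name "SQL Server Agent"

    # Step 7: copy the database files to the new disk (example paths)
    robocopy E:\ F:\ /E /COPYALL /DCOPY:DAT /R:1 /W:1

    # Step 8: swap drive letters so the new disk takes over the old letter
    Set-Partition -DriveLetter E -NewDriveLetter T
    Set-Partition -DriveLetter F -NewDriveLetter E

    # Steps 9 and 12: point the SQL Server resource at the new cluster disk
    Add-ClusterResourceDependency    -Resource "SQL Server" -Provider "Cluster Disk 2"
    Remove-ClusterResourceDependency -Resource "SQL Server" -Provider "Cluster Disk 1"

    # Steps 10-11: bring SQL back and test a failover to the other node
    Start-ClusterResource -Name "SQL Server"
    Start-ClusterResource -Name "SQL Server Agent"
    Move-ClusterGroup -Name "SQL Server (MSSQLSERVER)" -Node db2c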

OS migration:

  1. Shut down both cluster nodes
  2. Remove the DB disk from cluster node 2; DO NOT DELETE THE DATA FILE
  3. Remove the RDM disk from cluster node 1; DO NOT DELETE THE DATA FILE
  4. Migrate both nodes to their corresponding new datastores (a PowerCLI sketch follows this list)
  5. Add the new LUN back on cluster node 1 as an RDM
    – Pick SCSI controller 1, SCSI ID 1:0; note the disk file path and match the volume ID
  6. Add the new LUN as an existing hard disk to cluster node 2
    – Pick SCSI controller 1, SCSI ID 1:0, and the independent, persistent option
  7. Start-VM db1c
  8. Start-VM db2c
  9. Make sure the MS SQL role starts up correctly
  10. Test DB/failover
  11. Test the application
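
The VMware side of this, as a rough PowerCLI sketch; the vCenter address and datastore names are placeholders, while db1c/db2c are the node names from the steps above:

    # Connect to vCenter first (hypothetical address)
    Connect-VIServer vcenter.example.com

    # Step 4: cold-migrate each powered-off node to its new datastore
    Move-VM -VM db1c -Datastore (Get-Datastore "NewDS-Host1")
    Move-VM -VM db2c -Datastore (Get-Datastore "NewDS-Host2")

    # Steps 7-8: power the nodes back on, node 1 first
    Start-VM -VM db1c
    Start-VM -VM db2c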

And that concludes the migration process.  There are other migration options, like using ALTER DATABASE to change the physical location of the DB files, building a new cluster and doing a full restore, vMotioning the VMs along with the DB files, and so on, but since we needed to keep the same drive letter(s), along with other requirements, we eventually decided that the procedure we followed worked best for us.
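
For reference, the ALTER DATABASE route we decided against would look roughly like this (hypothetical database and file names; the database has to be taken offline and the files moved by hand before it is brought back up):

    # Repoint SQL Server's catalog at the new file locations
    Invoke-Sqlcmd -ServerInstance "DB1C" -Query "ALTER DATABASE MyDb MODIFY FILE (NAME = MyDb_Data, FILENAME = 'F:\Data\MyDb.mdf');"
    Invoke-Sqlcmd -ServerInstance "DB1C" -Query "ALTER DATABASE MyDb MODIFY FILE (NAME = MyDb_Log, FILENAME = 'F:\Logs\MyDb_log.ldf');"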

We did a test migration and recorded the whole process here:

https://drive.google.com/open?id=1tpfEnRvrVny3vmEw3E3x94FqFup_9tPy

It took about 45 minutes for the test run to complete, but that’s because the test DBs were very small and the OS drives had only a basic Windows install.  Please take a look if you’re interested, and shoot me an email at davidmfong@stanford.edu if you have any questions or comments.