How to copy a raw partition over the net

Earlier today I worked on migrating a D3 database server to a VMware ESX environment. The migration tool we used did a good job of converting the RHEL3 operating system and all the Linux filesystems, but it failed to copy D3’s raw data partition:

old-server:~# fdisk -l /dev/sda
[...]
/dev/sda12        10965     17816  55038658+  d3  Unknown

Now what? I had enough space neither on the source server nor on the destination VM to dump the partition contents to a file, copy the file across and load it back into /dev/sda12 on the VM. It had to be done online.

Luckily, SSH can run a command remotely and feed its standard output to another program running locally. That makes copying the raw partition easy: run dd if=/dev/sda12 on the source server and dd of=/dev/sda12 on the destination VM. The first dd, given no other parameters, writes whatever it reads from /dev/sda12 on the old source server to its standard output. The second dd does the inverse and writes whatever it reads from standard input to /dev/sda12 on the new virtual machine. Glue them together with this ssh command:

new-server:~# ssh old-server "dd if=/dev/sda12" | dd of=/dev/sda12
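Before pointing dd at a real partition it may be worth trying the same plumbing on something harmless. The following is only a sketch of a local dry run: two temporary files stand in for the partitions and sh -c stands in for ssh old-server, neither of which is part of my actual migration.

```shell
#!/bin/sh
# Local dry run of the dd-to-dd pipe. Regular files stand in for the partitions
# and "sh -c" stands in for "ssh old-server" (both are assumptions for the demo).
SRC=$(mktemp)
DST=$(mktemp)

# 64 kB of random data plays the role of the source partition.
dd if=/dev/urandom of="$SRC" bs=1k count=64 2>/dev/null

# The reader writes to stdout and the writer reads from stdin, as in the ssh pipe.
sh -c "dd if='$SRC' 2>/dev/null" | dd of="$DST" 2>/dev/null

# cmp -s is silent and only reports through its exit status.
if cmp -s "$SRC" "$DST"; then
   COPY_RESULT="identical"
else
   COPY_RESULT="different"
fi
echo "Copy is ${COPY_RESULT}"
rm -f "$SRC" "$DST"
```

If the two files compare as identical, the pipe delivers the data byte for byte, which is the property the real transfer relies on.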

That’s all sweet, but dd doesn’t provide any progress tracking. In my case I had to transfer over 50GB of data and had only a vague idea of how fast it was going. Should I wait? Or leave it overnight? Hmm, hard to tell.

Finally I came up with a simple way to check the progress of dd: read a small sample, say 1kB, from the same offset on both the source and the destination partition and compare the checksums:

old-server:~# dd if=/dev/sda12 bs=1k count=1 skip=5M | md5sum
7b10e9e1029c4c0f3901ee13db18a927

new-server:~# dd if=/dev/sda12 bs=1k count=1 skip=5M | md5sum
0f343b0931126a20f133d67c2b018a3b

OK, the checksums didn’t match. Yet. I kept re-running the command on the new server, and as soon as it returned the same checksum as on the old one I knew the copy had just passed the 5GB mark.

A side note here: I used skip=5M and claim it copied 5GB. Why’s that? Because skip= skips the given number of ibs-sized records. In our example ibs=bs=1kB, and dd reads the M suffix as 1024×1024, so skip=5M skips 5,242,880 records of 1kB each, which puts the read offset 5GB into /dev/sda12. dd then reads one kilobyte from there.
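The same arithmetic can be spelled out in the shell. This is only an illustration; the variable names are made up for the example.

```shell
#!/bin/sh
# dd multiplies a trailing "M" by 1024*1024, so skip=5M means 5242880 records.
BLOCK_SIZE=1024                      # bs=1k, in bytes
SKIP_RECORDS=$((5 * 1024 * 1024))    # skip=5M
OFFSET_BYTES=$((SKIP_RECORDS * BLOCK_SIZE))
OFFSET_GIB=$((OFFSET_BYTES / 1024 / 1024 / 1024))
echo "skip=5M with bs=1k starts reading at byte ${OFFSET_BYTES} (${OFFSET_GIB} GiB)"
```

Changing bs changes the offset too: the same skip=5M with bs=4k would land four times further into the partition.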

Re-running the hashing command every minute or so manually is boring. Instead I wrote a little shell script to record the timestamp when the hashes at a given offset match:

#!/bin/sh
## check-progress.sh from http://hintshop.ludvig.co.nz/show/copy-raw-partition-over-net/
## Polls a 1kB sample at the given offset until its MD5 matches the required hash.
DEVICE=$1
OFFSET=$2
REQ_HASH=$3

if [ -z "${REQ_HASH}" ]; then
   echo "Usage: $0 {device} {offset} {required-hash}"
   exit 1
fi

while true; do
   HASH=$(dd if="${DEVICE}" count=1 bs=1k skip="${OFFSET}" 2>/dev/null | md5sum)
   # md5sum prints "<hash>  -"; keep only the hash (POSIX, unlike ${HASH:0:32})
   HASH=${HASH%% *}
   if [ "${HASH}" = "${REQ_HASH}" ]; then
      echo "Hashes match: $(date)"
      exit 0
   fi
   echo "Not yet..."
   sleep 30
done

To use it I took the hash value at a given offset on old-server and then ran the script on new-server:

new-server:~# ./check-progress.sh /dev/sda12 5M 7b10e9e1029c4c0f3901ee13db18a927
Not yet...
Not yet...
...
Hashes match: Sat Feb 14 11:22:33 NZDT 2009

Once it recorded the timestamp I set it up again with a new offset, say 6M, and waited. That way I was able to track the progress and roughly measure the transfer speed. It was precise enough to work out that the copy did about 8GB per 15 minutes. Then I knew I had enough time to go and get some lunch.
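For the curious, here is a back-of-the-envelope estimate built from the figures in this post. It is a sketch under two assumptions of mine: that "8GB" means 8 GiB, and that the Blocks column printed by fdisk -l counts 1 KiB blocks.

```shell
#!/bin/sh
# Rough ETA from the figures in this post. Assumes "8GB" means 8 GiB and that
# the Blocks column of fdisk -l counts 1 KiB blocks.
PARTITION_KIB=55038658               # /dev/sda12 size from fdisk -l
COPIED_KIB=$((8 * 1024 * 1024))      # 8 GiB observed ...
ELAPSED_SECONDS=$((15 * 60))         # ... per 15 minutes

# Integer arithmetic only, so both results are rounded down.
RATE_KIB_PER_SEC=$((COPIED_KIB / ELAPSED_SECONDS))
ETA_SECONDS=$((PARTITION_KIB * ELAPSED_SECONDS / COPIED_KIB))
ETA_MINUTES=$((ETA_SECONDS / 60))

echo "Rate: about ${RATE_KIB_PER_SEC} KiB/s"
echo "Whole partition: about ${ETA_MINUTES} minutes"
```

At that rate the whole partition takes a little over an hour and a half, which is plenty of time for lunch.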
