Kiyoshi recovery 2010-02
This page documents the planned steps for the Kiyoshi disk recovery (see #2828).
Monday Night 9:00 pm
- Boot into the debirf image (which allows us full access to the underlying filesystems)
- Configure debirf for networking (IP: 209.51.171.182/27, Gateway: 209.51.171.161, netmask: 255.255.255.224)
- Prepare the disks so we can access them:
- Initialize the RAID arrays
# incomplete!!
for foo in 0 1 2; do mknod /dev/md$foo b 9 $foo; done
mdadm --assemble /dev/md1 /dev/sda2
mdadm --assemble /dev/md2 /dev/sdb2
- Decrypt the RAID arrays
cryptsetup luksOpen /dev/md1 md1_crypt
cryptsetup luksOpen /dev/md2 md2_crypt
- Scan for logical volumes
vgscan --mknodes
vgchange -aly
- Move all logical volumes to sdc2
- Find logical volumes with:
lvs
- Examine to figure out which ones are on which physical volumes
lvdisplay -m vg_kiyoshi0/<logical-volume-name>
- Move with:
pvmove --verbose --name vg_kiyoshi0/<logical-volume-name> <path/to/old/volume> /dev/sdc2
- Ensure /dev/sdb is no longer in use
pvdisplay /dev/mapper/md2_crypt
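The per-volume move steps above could be scripted as a dry run. This is only a sketch under assumptions, not part of the original plan: plan_pvmoves is a hypothetical helper name, and it expects "lv_name devices" pairs on stdin in the form printed by lvs --noheadings -o lv_name,devices. It prints the pvmove commands for review and runs nothing.

```shell
#!/bin/sh
# Hypothetical helper (not from the original plan): print, but do not run,
# one pvmove command per logical volume that has extents on the old
# physical volume.
# usage: lvs --noheadings -o lv_name,devices VG | plan_pvmoves VG OLD_PV NEW_PV
plan_pvmoves() {
    vg=$1 old_pv=$2 new_pv=$3
    while read -r lv devices; do
        # skip blank lines
        [ -n "$lv" ] || continue
        case "$devices" in
            # the devices column looks like /dev/foo(start_extent)
            *"${old_pv}("*) echo "pvmove --verbose --name $vg/$lv $old_pv $new_pv" ;;
        esac
    done | sort -u   # a volume with several segments is listed once
}
```

For example, lvs --noheadings -o lv_name,devices vg_kiyoshi0 | plan_pvmoves vg_kiyoshi0 /dev/mapper/md2_crypt /dev/sdc2 would list the moves needed to empty the sdb-backed volume.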
Evaluation point
If the transfer to sdc goes quickly and smoothly, then this is an acceptable stopping point. We can restart with all data coming from sdc and expect a stable (although not redundant) system for Tuesday.
If the transfer is going really slowly or we have doubts it can finish by tomorrow, we should stop. The last thing we want is to move all the data to sdc and, when it finally completes at 7:00 am, realize the performance is even worse.
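To judge whether a move can finish in time, a rough estimate can be extrapolated from the percentage pvmove has reported so far. The helper below is a hypothetical sketch (eta_seconds is not an existing tool) and assumes the move proceeds at a constant rate, which may not hold on a failing disk.

```shell
#!/bin/sh
# Hypothetical helper: estimate the remaining seconds of a move.
# $1 = seconds elapsed so far, $2 = percent already moved
# Assumes a constant transfer rate (linear extrapolation).
eta_seconds() {
    awk -v elapsed="$1" -v pct="$2" 'BEGIN {
        if (pct <= 0) { print "unknown"; exit }
        printf "%d\n", elapsed * (100 - pct) / pct
    }'
}
```

For instance, eta_seconds 1800 25 prints 5400: 25% done after 30 minutes suggests roughly another 90 minutes.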
Monday night or Tuesday during the day
- Set up benchmarking to test performance on sdb prior to our change
- Install postmark
aptitude install postmark
- Move testy partition to sdb
pvmove --verbose --name vg_kiyoshi0/testy /dev/sdc2 /dev/mapper/md2_crypt
- Create a file system
mkfs -t ext3 /dev/mapper/vg_kiyoshi0-testy
- Mount it
mount /dev/mapper/vg_kiyoshi0-testy /mnt
- Create a file called postmark.conf:
set location /mnt/
set seed 12345678
set read 1024
set write 1024
set buffering false
set transactions 4096
set size 512 2048
set number 51115
run
quit
- Run postmark:
postmark postmark.conf
- If you get "Error: Cannot open /mnt/123 for writing", lower the value of "set number" in postmark.conf
- It should output something like:
guest@chicken:~$ postmark postmark.conf
PostMark v1.51 : 8/14/01
Reading configuration from file 'postmark.conf'
Creating files...Done
Performing transactions...........Done
Deleting files...Done
Time:
    24 seconds total
    11 seconds of transactions (372 per second)
Files:
    53225 created (2217 per second)
        Creation alone: 51115 files (8519 per second)
        Mixed with transactions: 2110 files (191 per second)
    2061 read (187 per second)
    2035 appended (185 per second)
    53225 deleted (2217 per second)
        Deletion alone: 51239 files (7319 per second)
        Mixed with transactions: 1986 files (180 per second)
Data:
    2.54 megabytes read (108.33 kilobytes per second)
    65.72 megabytes written (2.74 megabytes per second)
guest@chicken:~$
- De-commission /dev/sdb
- Move testy partition back
umount /mnt
pvmove --verbose --name vg_kiyoshi0/testy /dev/mapper/md2_crypt /dev/sdc2
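- Remove as physical volume (it has to leave the volume group first, otherwise pvremove refuses)
vgreduce vg_kiyoshi0 /dev/mapper/md2_crypt
pvremove /dev/mapper/md2_crypt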
- Unmap crypto layer:
cryptsetup luksClose md2_crypt
- Remove from RAID
mdadm --fail /dev/md2 /dev/sdb2
### shouldn't this remove /dev/md2 entirely?
mdadm --fail /dev/md0 /dev/sdb1
- Properly re-partition
- Set up partitions
# following http://article.gmane.org/gmane.linux.utilities.util-linux-ng/2955
parted /dev/sdb
# do we want gpt??
(parted) mklabel gpt
(parted) unit s
### dkg thinks we should not go all the way to the end; rather, we should leave a bit of free space
### this is because we don't know if the other disk is exactly the same size or not.
### so we should change -1 to something several sectors in from the end.
(parted) mkpart primary ext2 40 -1
# parted will complain about end location, ignore
(parted) quit
- Re-add to RAID arrays: this step recreates the RAID array that backs LVM (md2) with sdb2 as its only member for now, and adds sdb1 back to the boot partition RAID array (md0)
mdadm --create /dev/md2 --level=mirror -n 2 /dev/sdb2 missing
mdadm --add /dev/md0 /dev/sdb1
- Add crypto layer to md2
cryptsetup luksFormat /dev/md2
cryptsetup luksOpen /dev/md2 md2_crypt
- Make the new encrypted device a physical volume again and add it to the volume group (needed before anything can be moved onto it)
pvcreate /dev/mapper/md2_crypt
vgextend vg_kiyoshi0 /dev/mapper/md2_crypt
- Test that the new partition layout really does read and write faster.
- Move test partition back and mount
pvmove --verbose --name vg_kiyoshi0/testy /dev/sdc2 /dev/mapper/md2_crypt
mount /dev/mapper/vg_kiyoshi0-testy /mnt
- Run postmark and compare with earlier test results
postmark postmark.conf
- Move logical volumes back from sdc (see above)
- Restart vservers
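Comparing the two postmark runs is easier if each report is saved to a file (for example with postmark postmark.conf | tee before.txt). The sketch below pulls the "transactions per second" figure out of two saved reports. The function names and file names are hypothetical, and transactions per second is only one of several figures in the report that could be compared.

```shell
#!/bin/sh
# Hypothetical helper: extract the transactions-per-second figure from a
# saved postmark report (the "N seconds of transactions (X per second)" line).
postmark_tps() {
    sed -n 's/.*seconds of transactions (\([0-9][0-9]*\) per second).*/\1/p' "$1"
}

# Print both figures; the exit status is 0 only if the second report is faster.
compare_tps() {
    before=$(postmark_tps "$1")
    after=$(postmark_tps "$2")
    echo "before=$before after=$after"
    [ "$after" -gt "$before" ]
}
```

With the sample report shown earlier saved as before.txt, postmark_tps before.txt prints 372.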
This is another stopping place
Tuesday/Wednesday night
- Fail sda on all raids it is a part of.
- Take down the machine
- Replace sda disk with new disk
- Start machine
- Create partition table on sda matching sdb
- Add sda partitions back to RAID
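The sda replacement steps above could be written out as a dry run before the maintenance window. This sketch only prints commands. It assumes sdb ends up with a GPT label (still an open question in the plan) and that sgdisk from the gdisk package is available, and the array:member pairs passed in are placeholders to be checked against /proc/mdstat at the time. plan_fail and plan_rebuild are hypothetical names.

```shell
#!/bin/sh
# Hypothetical dry-run helpers: they print commands and run nothing.

# Print the mdadm commands that fail and then remove each listed member.
# Arguments are array:member pairs, e.g. md0:sda1
plan_fail() {
    for pair in "$@"; do
        array=${pair%%:*} member=${pair#*:}
        echo "mdadm /dev/$array --fail /dev/$member --remove /dev/$member"
    done
}

# Print the commands that copy the partition table from the healthy disk
# to the replacement (assumes GPT and sgdisk) and add the new partitions.
# $1 = healthy disk, $2 = replacement disk, rest = array:member pairs
plan_rebuild() {
    good=$1 new=$2
    shift 2
    echo "sgdisk --replicate=/dev/$new /dev/$good"
    echo "sgdisk --randomize-guids /dev/$new"
    for pair in "$@"; do
        array=${pair%%:*} member=${pair#*:}
        echo "mdadm /dev/$array --add /dev/$member"
    done
}
```

For example, plan_fail md0:sda1 prints the fail-and-remove command for the boot array, and plan_rebuild sdb sda md0:sda1 prints the partition-table copy followed by the re-add.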
Last modified on Feb 9, 2010, 2:58:28 AM