Version 10 (modified by Jamie McClelland, 15 years ago)

--

Kiyoshi recovery 2010-02

This page documents the planned steps for the Kiyoshi disk recovery (see #2828).

Monday Night 9:00 pm

  • Boot into the debirf image (which allows us full access to the underlying filesystems):
    • Configure debirf for networking (IP: 209.51.171.182/27, Gateway: 209.51.171.161, netmask: 255.255.255.224)
    • Prepare the disks so we can access them:
      • Initialize the RAID arrays
        # incomplete!!
        mdadm --assemble --scan
        
      • Decrypt the RAID arrays
        cryptsetup luksOpen /dev/md1 crypt_md1
        cryptsetup luksOpen /dev/md2 crypt_md2
        
      • Scan for logical volumes
        vgscan --mknodes
        vgchange -aly
        
    • Move all logical volumes to sdc2
      • Find logical volumes with:
        lvs
        
      • Examine to figure out which ones are on which physical volumes
        lvdisplay -m vg_kiyoshi0/<logical-volume-name>
        
      • Move with:
        pvmove --verbose --name vg_kiyoshi0/<logical-volume-name> <path/to/old/volume> /dev/sdc
        
    • Ensure /dev/sdb is no longer in use
      pvdisplay /dev/dm-2
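
The find/examine/move steps above could be scripted roughly as follows. This is a minimal sketch, not part of the original plan: plan_moves is a hypothetical helper that only prints the pvmove commands so they can be reviewed before anything is run, and the source and target physical volumes are arguments that still have to be filled in by hand.

```shell
#!/bin/sh
# Sketch: print (do not run) one pvmove command per logical volume.
# The volume group name is taken from this page.
VG=vg_kiyoshi0

# Read logical volume names on stdin, one per line, and print the pvmove
# command that would move each one from $1 (source PV) to $2 (target PV).
plan_moves() {
    src=$1
    dst=$2
    while read -r lv; do
        # lvs pads its output; read strips the padding, skip empty lines
        [ -n "$lv" ] || continue
        echo "pvmove --verbose --name $VG/$lv $src $dst"
    done
}

# Possible usage (source and target PVs are placeholders):
#   lvs --noheadings -o lv_name vg_kiyoshi0 | plan_moves SOURCE_PV TARGET_PV
```

Printing the commands first keeps a human in the loop: lvdisplay -m can still be used to check which physical volume each logical volume actually lives on before any line is pasted into the shell.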
      

Evaluation point

If the transfer to sdc goes quickly and smoothly, then this is an acceptable stopping point. We can restart with all data coming from sdc and expect a stable (although not redundant) system for Tuesday.

If the transfer is going very slowly, or we doubt it can finish by tomorrow, we should stop. The last thing we want is to move all the data to sdc and, when the move finally completes at 7:00 am, realize that performance is even worse.
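
One rough way to judge whether the move can finish in time is to extrapolate from the percentage that pvmove reports as it runs. The sketch below assumes the copy rate stays roughly constant, which may well not hold on a struggling disk; eta_seconds is a hypothetical helper, not part of LVM.

```shell
# Estimate the remaining seconds of a transfer from the elapsed seconds ($1)
# and the percent complete ($2). Assumes a constant copy rate.
eta_seconds() {
    awk -v elapsed="$1" -v pct="$2" 'BEGIN {
        if (pct <= 0 || pct > 100) exit 1   # no estimate without progress
        printf "%d\n", elapsed * (100 - pct) / pct
    }'
}
```

For example, eta_seconds 3600 25 prints 10800: if a quarter of the data moved in the first hour, about three more hours remain at the same rate.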

Monday night or Tuesday during the day

  • Set up benchmarking to test performance on sdb prior to our change
    • Install postmark
      aptitude install postmark
      
    • Move testy partition to sdb
      pvmove --verbose --name vg_kiyoshi0/testy /dev/sdc /dev/dm-2
      
    • Create a file system
      mkfs -t ext3 /dev/mapper/vg_kiyoshi0-testy
      
    • Mount it
      mount /dev/mapper/vg_kiyoshi0-testy /mnt
      
    • Create a file called postmark.conf:
      set location /mnt/
      set seed 12345678
      set read 1024
      set write 1024
      set buffering false
      set transactions 4096
      set size 512 2048
      set number 51115
      run
      quit
      
    • Run postmark:
      postmark postmark.conf
      
  • If you get "Error: Cannot open /mnt/123 for writing", lower the value of set number in postmark.conf
  • It should output something like:
    guest@chicken:~$ postmark postmark.conf 
    PostMark v1.51 : 8/14/01
    Reading configuration from file 'postmark.conf'
    Creating files...Done
    Performing transactions...........Done
    Deleting files...Done
    Time:
    	24 seconds total
    	11 seconds of transactions (372 per second)
    
    Files:
    	53225 created (2217 per second)
    		Creation alone: 51115 files (8519 per second)
    		Mixed with transactions: 2110 files (191 per second)
    	2061 read (187 per second)
    	2035 appended (185 per second)
    	53225 deleted (2217 per second)
    		Deletion alone: 51239 files (7319 per second)
    		Mixed with transactions: 1986 files (180 per second)
    
    Data:
    	2.54 megabytes read (108.33 kilobytes per second)
    	65.72 megabytes written (2.74 megabytes per second)
    guest@chicken:~$
    
  • De-commission /dev/sdb
    • Move testy partition back
      umount /mnt
      pvmove --verbose --name vg_kiyoshi0/testy /dev/dm-2 /dev/sdc
      
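    • Remove as physical volume (drop it from the volume group first, since pvremove refuses a physical volume that still belongs to one)
      vgreduce vg_kiyoshi0 /dev/dm-2
      pvremove /dev/dm-2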
      
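    • Remove from RAID (mark each partition as failed, then remove it from its array)
      mdadm --fail /dev/md2 /dev/sdb2
      mdadm --remove /dev/md2 /dev/sdb2
      mdadm --fail /dev/md0 /dev/sdb1
      mdadm --remove /dev/md0 /dev/sdb1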
      
  • Properly re-partition
    • Set up partitions
      # following http://article.gmane.org/gmane.linux.utilities.util-linux-ng/2955
      parted /dev/sdb
      # do we want gpt??
      (parted) mklabel gpt
      (parted) unit s
      (parted) mkpart primary ext2 40 -1
      # parted will complain about end location, ignore
      (parted) quit
      
    • Re-add to RAID arrays: this re-adds sdb2 to the RAID array that backs LVM, which it was previously part of, and adds sdb1 to the boot partition RAID array
      mdadm --add /dev/md2 /dev/sdb2
      mdadm --add /dev/md0 /dev/sdb1
      
  • Test that the new partitioning really does make reads and writes faster.
    • Move test partition back and mount
      pvmove --verbose --name vg_kiyoshi0/testy /dev/sdc2 /dev/dm-2
      mount /dev/mapper/vg_kiyoshi0-testy /mnt
      
  • Run postmark and compare with earlier test results
    postmark postmark.conf
    
  • Move logical volumes back from sdc (see above)
  • Restart vservers
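
To compare the benchmark runs before and after re-partitioning, the transaction rate can be pulled out of saved postmark output. This is only a sketch, assuming each run's output was saved to a file in the format of the sample run shown earlier on this page; postmark_tps and pct_change are hypothetical helper names, as are the file names in the usage comment.

```shell
# Print the "transactions per second" figure from postmark output on stdin.
postmark_tps() {
    sed -n 's/.*seconds of transactions (\([0-9][0-9]*\) per second).*/\1/p'
}

# Print the whole-number percent change from $1 (before) to $2 (after).
pct_change() {
    awk -v old="$1" -v new="$2" 'BEGIN {
        if (old <= 0) exit 1              # avoid dividing by zero
        printf "%d\n", (new - old) * 100 / old
    }'
}

# Possible usage (file names are placeholders):
#   before=$(postmark_tps < before.txt)
#   after=$(postmark_tps < after.txt)
#   pct_change "$before" "$after"
```

Fed the sample output above, postmark_tps prints 372. A single figure is a crude comparison, so it is probably worth repeating each run a few times before drawing conclusions.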

This is another stopping place

Tuesday/Wednesday night

  • Fail sda on all raids it is a part of.
  • Take down the machine
  • Replace sda disk with new disk
  • Start machine
  • Create partition table on sda matching sdb
  • Add sda partitions back to RAID
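
The first step, failing sda in every array it belongs to, could be prepared by reading /proc/mdstat. The sketch below only prints the mdadm commands for review rather than running them; fail_cmds is a hypothetical helper, and it assumes member names of the form sda1[0], as they usually appear in /proc/mdstat.

```shell
# Read /proc/mdstat-style text on stdin and print an "mdadm --fail" command
# for every partition of disk $1 (e.g. sda) that is a member of an array.
fail_cmds() {
    awk -v disk="$1" '
        $1 ~ /^md[0-9]+$/ && $2 == ":" {
            for (i = 3; i <= NF; i++) {
                member = $i
                sub(/\[.*/, "", member)        # drop the "[0]" style suffix
                rest = substr(member, length(disk) + 1)
                if (index(member, disk) == 1 && rest ~ /^[0-9]+$/)
                    print "mdadm --fail /dev/" $1 " /dev/" member
            }
        }'
}

# Possible usage:
#   fail_cmds sda < /proc/mdstat
```

Generating the list from the kernel's own view of the arrays avoids relying on memory for which partitions of sda are in use.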