
Version 7 (modified by Jamie McClelland, 8 years ago)

--

Solid State Drives

To help alleviate slowness caused by heavy disk reads and writes, we are in the process of adding solid state drives to our existing servers. Solid state drives (SSDs) are significantly faster than the traditional spinning SATA drives we currently have installed.

A major source of disk read/write contention is access to MySQL databases. This contention slows everything down, and it is one of the significant bottlenecks when a web site driven by a MySQL database feels slow to load.

Therefore, our strategy is to move most MySQL database partitions to the newly installed solid state drives. That should allow MySQL-powered web sites to load much faster, while also reducing disk contention on the other partitions.

Technical Overview

Like all disks, solid state drives are installed in a RAID configuration for redundancy and are also encrypted.

They are added to their own dedicated volume group on the host server (typically named vg_HOST1, as opposed to vg_HOST0 for the volume group with SATA disks).

Just as each guest is allocated a logical volume named after the guest, each guest is additionally allowed a logical volume from the SSD volume group (also named after the guest). So, a guest may have both:

  • vg_wiwa0-jacobs
  • vg_wiwa1-jacobs

In the guest, the initial disk shows up as /dev/sda and the second disk shows up as /dev/sdb.

The normal SATA-backed disk is added to a logical volume group on the guest and then further allocated.

The SSD-backed disk is handled differently: a single partition is created on the block device and a filesystem is installed directly on it.

Due to limitations of SSDs, the operating system must tell the disk which sectors are free for writing (trim/discard). We therefore have to configure every disk layer specifically to pass these "discard" messages through.
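
One way to see whether every layer is passing discards is to look at the discard columns that lsblk reports. The following is a minimal sketch, not part of our tooling: discard_report is a hypothetical helper name, and it assumes input rows in the form produced by lsblk with the NAME, DISC-GRAN and DISC-MAX columns printed in bytes.

```shell
# discard_report: read "NAME DISC-GRAN DISC-MAX" rows on stdin and flag each
# block layer. A discard granularity or maximum of 0 bytes means that layer
# is not passing discards down to the device below it.
discard_report() {
    awk '{ print $1, (($2 > 0 && $3 > 0) ? "discard-ok" : "NO-DISCARD") }'
}

# On a real host or guest (not run here):
#   lsblk -b -n -l -o NAME,DISC-GRAN,DISC-MAX | discard_report
```

Any SSD-backed layer (RAID device, crypt mapping, logical volume, guest disk) reported as NO-DISCARD is the first place to look.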

Technical Details

Following are the steps for adding a solid state drive to both a host and a guest.

Adding solid state drives to a host

First, order the disks (see previous SSDs we have purchased) and order the cables and cages needed to install them (see previous cables we have ordered).

Once installed, they should show up via cat /proc/partitions as something like /dev/sda and /dev/sdb.

Next, modify the physical server's .pp file in puppet, changing the call to m_physical to be:

class { "mayfirst::m_physical": 
  ssd => true
}

Once you push, it should change /etc/lvm/lvm.conf to have: issue_discards = 1
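
To confirm that the puppet push took effect, a check along these lines may help. It is a sketch only: lvm_discards_enabled is a hypothetical helper, and it assumes the setting appears on its own uncommented line in the file.

```shell
# lvm_discards_enabled [FILE]: succeed if FILE contains an uncommented
# "issue_discards = 1" setting. FILE defaults to /etc/lvm/lvm.conf.
lvm_discards_enabled() {
    grep -Eq '^[[:space:]]*issue_discards[[:space:]]*=[[:space:]]*1([^0-9]|$)' \
        "${1:-/etc/lvm/lvm.conf}"
}

# On the host (not run here):
#   lvm_discards_enabled && echo "issue_discards is enabled"
```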

Next, create a single partition on each device, e.g.:

0 wiwa:/etc/lvm# parted /dev/sdb
GNU Parted 3.2
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel gpt
(parted) unit s mkpart main 8192 -196608                                       
(parted) p                                                                
Model: ATA SAMSUNG MZ7KM480 (scsi)
Disk /dev/sdb: 937703088s
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start  End         Size        File system  Name  Flags
 1      8192s  937506480s  937498289s               main

(parted) 
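
The numbers in the parted output can be checked by hand. With 512-byte sectors, starting at sector 8192 leaves 4 MiB free at the front of the disk, and ending 196608 sectors before the end leaves 96 MiB free at the back. The sketch below reproduces the End and Size columns; part_geometry is a hypothetical helper written for illustration.

```shell
# part_geometry TOTAL START TAIL: print "END SIZE" in sectors for a partition
# that starts at sector START and ends TAIL sectors before the end of a disk
# with TOTAL sectors (what parted reports for "mkpart main START -TAIL").
part_geometry() {
    total=$1 start=$2 tail=$3
    end=$((total - tail))
    size=$((end - start + 1))   # both the start and end sectors are included
    echo "$end $size"
}

part_geometry 937703088 8192 196608   # prints: 937506480 937498289
```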

Then create a RAID 1 array:

mdadm --create --raid-devices=2 --level=1 --metadata=1.0 --verbose /dev/md2 /dev/sda1 /dev/sdb1

Scan your RAID devices:

mdadm --examine --scan

And based on the output, edit /etc/mdadm/mdadm.conf to add a line for your new RAID (something like: ARRAY /dev/md/2 metadata=1.0 UUID=70345fc6:661e67ce:715261cf:84112f10 name=wiwa:2).

Now, set up encryption and be sure to enable --allow-discards:
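
If you prefer not to edit the file by hand, the line can be appended in a way that is safe to repeat. This is a sketch under assumptions: add_array_line is a hypothetical helper, and the grep pattern in the usage comment assumes the array name shown in the example above.

```shell
# add_array_line CONF LINE: append LINE to CONF unless an identical line is
# already present, so repeating the step does not duplicate the entry.
add_array_line() {
    conf=$1 line=$2
    grep -Fxq -- "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
}

# On the host (not run here):
#   add_array_line /etc/mdadm/mdadm.conf \
#       "$(mdadm --examine --scan | grep 'name=wiwa:2')"
```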

0 wiwa:/etc/lvm# cryptsetup luksFormat /dev/md2 

WARNING!
========
This will overwrite data on /dev/md2 irrevocably.

Are you sure? (Type uppercase yes): YES
Enter passphrase: 
Verify passphrase: 
0 wiwa:/etc/lvm# cryptsetup --allow-discards luksOpen /dev/md2 md2_crypt
Enter passphrase for /dev/md2: 
0 wiwa:/etc/lvm# blkid /dev/md2 
/dev/md2: UUID="b360ef4f-84ea-457c-8f86-a7f94b1f9277" TYPE="crypto_LUKS"
0 wiwa:/etc/lvm# echo md2_crypt UUID=b360ef4f-84ea-457c-8f86-a7f94b1f9277 none luks,discard >> /etc/crypttab 
0 wiwa:/etc/lvm# cat /etc/crypttab 
# <target name> <source device>         <key file>      <options>
md1_crypt UUID=ae7a55a5-cc91-4064-8e5f-06eb293188a2 none luks
md2_crypt UUID=b360ef4f-84ea-457c-8f86-a7f94b1f9277 none luks,discard
0 wiwa:/etc/lvm#  pvcreate /dev/mapper/md2_crypt 
  Physical volume "/dev/mapper/md2_crypt" successfully created
0 wiwa:/etc/lvm# vgcreate vg_wiwa1 /dev/mapper/md2_crypt 
  Volume group "vg_wiwa1" successfully created
0 wiwa:/etc/lvm#
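
Because a missing discard option in /etc/crypttab silently disables trim after the next reboot, it can be worth listing which entries lack it. The following is a sketch; crypttab_missing_discard is a hypothetical helper, and it only inspects the file, not the live mapping.

```shell
# crypttab_missing_discard [FILE]: print the target name of every crypttab
# entry whose options (4th field) do not include "discard".
# FILE defaults to /etc/crypttab.
crypttab_missing_discard() {
    awk '
        /^[ \t]*(#|$)/ { next }    # skip comments and blank lines
        {
            found = 0
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == "discard") found = 1
            if (!found) print $1
        }
    ' "${1:-/etc/crypttab}"
}

# On the host (not run here):
#   crypttab_missing_discard
```

With the crypttab shown above, this prints only md1_crypt, which is expected because that device holds the SATA disks.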

Adding solid state drives to a guest

First, enable a weekly systemd timer that will run the fstrim command to communicate to the device which blocks are not being used.

In puppet, edit the guest's .pp file, adding:

 class { "mayfirst::m_ssd::fstrim": }

On the host, create a logical volume for your given guest, e.g.:

0 wiwa:/etc/lvm# lvcreate --size 10GB --name jacobs vg_wiwa1
  Logical volume "jacobs" created
0 wiwa:/etc/lvm#

Change the group ownership to match the user the guest runs as:

1 wiwa:/etc/sv/kvm/jacobs# ls -l /dev/mapper/vg_wiwa1-jacobs 
lrwxrwxrwx 1 root root 8 Sep 23 15:16 /dev/mapper/vg_wiwa1-jacobs -> ../dm-28
0 wiwa:/etc/sv/kvm/jacobs# chgrp jacobs /dev/dm-28
0 wiwa:/etc/sv/kvm/jacobs#

And edit the environment variables for the given guest in /etc/sv/kvm/HOST/env (where HOST is the guest's name).

You must indicate to kvm-manager that the scsi driver should be used instead of the default virtio driver.

  • HDA will probably be set to the path to the regular block device (e.g. /dev/mapper/vg_wiwa0-jacobs). Change it to /dev/mapper/vg_wiwa0-jacobs scsi.
  • Add a new variable: HDB set to the path to the ssd block device followed by scsi (e.g. /dev/mapper/vg_wiwa1-jacobs scsi).

Reboot the guest.

From the guest...

Look for the new block device (/dev/sdb below):

0 jacobs:~# cat /proc/partitions 
major minor  #blocks  name

   8       16   10485760 sdb
   8        0  524288000 sda
   8        1     248832 sda1
   8        2  524037120 sda2
 254        0    3997696 dm-0
 254        1     499712 dm-1
 254        2     999424 dm-2
 254        3    4997120 dm-3
 254        4    4997120 dm-4
 254        5   20971520 dm-5
0 jacobs:~#
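
The #blocks column is in 1 KiB units, so the new device can be matched against the size of the logical volume created on the host. A small sketch, with partition_sizes as a hypothetical helper name:

```shell
# partition_sizes: read /proc/partitions-style rows on stdin and print
# "NAME SIZE_GIB" for each device. The header and blank lines are skipped
# because their first field is not a number.
partition_sizes() {
    awk '$1 ~ /^[0-9]+$/ && NF == 4 { printf "%s %.1f\n", $4, $3 / 1048576 }'
}

# On the guest (not run here):
#   partition_sizes < /proc/partitions
```

For the listing above, sdb comes out as 10.0, matching the 10GB logical volume, while sda comes out as 500.0.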

Create a single partition:

0 jacobs:~# parted /dev/sdb
GNU Parted 3.2
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel gpt                                                      
(parted) unit s mkpart main 8192 -196608                                  
(parted) p                                                                
Model: QEMU QEMU HARDDISK (scsi)
Disk /dev/sdb: 20971520s
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start  End        Size       File system  Name  Flags
 1      8192s  20774912s  20766721s               main

(parted) quit                                                             
Information: You may need to update /etc/fstab.

0 jacobs:~#

Create a filesystem:

mkfs -t ext4 /dev/sdb1

Create a fstrim cron job (TBD on jacobs, needs puppet change).

Replacing MySQL partition

If you are using this partition to replace MySQL...

Stop mysql:

systemctl stop mysql

Unmount /var/lib/mysql:

umount /var/lib/mysql

Edit /etc/fstab, change the line for mounting /var/lib/mysql to match /dev/sdb1 instead of the existing logical volume.
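
The fstab change can also be previewed before touching the real file. This is a sketch only: fstab_swap_device is a hypothetical helper, it writes the result to standard output rather than editing in place, and it collapses the whitespace on the line it changes. The filesystem type and options on that line still need a manual check, since the new partition is ext4.

```shell
# fstab_swap_device FILE OLD NEW: print FILE with the device field changed
# from OLD to NEW on any line whose first field is exactly OLD.
# All other lines are printed unchanged.
fstab_swap_device() {
    awk -v old="$2" -v new="$3" '$1 == old { $1 = new } { print }' "$1"
}

# On the guest (not run here):
#   fstab_swap_device /etc/fstab /dev/mapper/vg_jacobs0-var+lib+mysql /dev/sdb1 > /tmp/fstab.new
#   diff -u /etc/fstab /tmp/fstab.new
```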

Mount the old logical volume on /mnt:

mount /dev/mapper/vg_jacobs0-var+lib+mysql /mnt

Mount the new partition:

mount /var/lib/mysql

Move the data:

mv /mnt/* /var/lib/mysql
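
Note that the shell glob /mnt/* does not match names beginning with a dot. Whether the old MySQL partition holds any such entries is an assumption to verify; a quick check before removing the old volume could look like this (hidden_entries is a hypothetical helper):

```shell
# hidden_entries DIR: list top-level entries in DIR whose names start with a
# dot, which the shell glob DIR/* does not match and mv would leave behind.
hidden_entries() {
    find "$1" -mindepth 1 -maxdepth 1 -name '.*'
}

# On the guest, after the move (not run here):
#   hidden_entries /mnt
```

If this prints nothing, no dotfiles were left behind on the old volume.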

Unmount /mnt and remove the logical volume:

umount /mnt
lvremove /dev/mapper/vg_jacobs0-var+lib+mysql

Restart mysql:

systemctl start mysql