Solid State Drives

To help alleviate slowness caused by heavy disk reads and writes, we are in the process of adding solid state drives to our existing servers. Solid state drives (SSDs) are significantly faster than the traditional spinning SATA drives we currently have installed.

A major source of the disk read/write contention is access to MySQL databases. This contention slows everything down, and it is one of the significant bottlenecks when a web site driven by a MySQL database feels slow to load.

Therefore, our strategy is to move most MySQL database partitions to the newly installed solid state drives. That should allow MySQL powered web sites to load much faster, while also reducing the disk contention on the other partitions.

Technical Overview

Like all disks, solid state drives are installed in a RAID configuration for redundancy and are also encrypted.

They are added to their own dedicated volume group on the host server (typically named: vg_HOST1 - as opposed to vg_HOST0 for the volume group with SATA disks).

Just as each guest is allocated a logical volume named after the guest, each guest is additionally allowed a logical volume from the SSD volume group (also named after the guest). So, a guest may have both:

  • vg_wiwa0-jacobs
  • vg_wiwa1-jacobs
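
On the host, these logical volumes appear under /dev/mapper. As a minimal sketch, the helper below builds that path from a volume group and a guest name. The function name mapper_path and the guest name my-guest are made up for illustration; the doubling of hyphens is how device-mapper names LVM volumes whose names contain a hyphen.

```shell
# Print the /dev/mapper path for a volume group ($1) and logical volume ($2).
# device-mapper joins the two names with a single hyphen and doubles any
# hyphen that is part of either name.
mapper_path() {
    vg=$(printf '%s' "$1" | sed 's/-/--/g')
    lv=$(printf '%s' "$2" | sed 's/-/--/g')
    printf '/dev/mapper/%s-%s\n' "$vg" "$lv"
}

mapper_path vg_wiwa1 jacobs      # prints /dev/mapper/vg_wiwa1-jacobs
mapper_path vg_wiwa1 my-guest    # prints /dev/mapper/vg_wiwa1-my--guest
```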

In the guest, the initial disk shows up as /dev/sda and the second disk shows up as /dev/sdb.

The normal SATA disk is added to an LVM volume group on the guest and then further allocated into logical volumes.

The SSD is handled differently: a single partition is created on the block device and a filesystem is created directly on it.

Because of how SSDs work, the operating system must tell the disk which sectors are free for writing (trim/discard). We therefore have to configure every disk layer specifically to pass these "discard" messages down to the device.
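
One way to check whether a given layer is passing discards is to look at what the kernel reports for the block device in sysfs. The sketch below assumes that a discard_max_bytes value of 0 means discards are not supported at that layer. The function name supports_discard and its optional second argument are our own, not part of any tool.

```shell
# Succeed if the block device reports discard support in sysfs.
# $1 is a device name such as sdb or dm-0.
# $2 is the sysfs root; it defaults to /sys and is only overridable so the
# function can be tried against a fake directory tree.
supports_discard() {
    f="${2:-/sys}/block/$1/queue/discard_max_bytes"
    [ -r "$f" ] || return 1
    [ "$(cat "$f")" -gt 0 ]
}
```

For example, supports_discard sdb && echo "sdb accepts discards". Running it for each layer (the raw disk, the RAID device, the encrypted device) shows where discards stop being passed along.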

Technical Details

Following are the steps for adding a solid state drive to both a host and a guest.

Adding solid state drives to a host

First, order the disks (see previous SSDs we have purchased) and order the cables and cages needed to install them (see previous cables we have ordered).

Once installed, they should show up via cat /proc/partitions as something like /dev/sda and /dev/sdb.

Next, modify the physical server's .pp file in puppet, changing the declaration of the mayfirst::m_physical class to be:

class { "mayfirst::m_physical": 
  ssd => true
}

Once you push, it should change /etc/lvm/lvm.conf to have: issue_discards = 1
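
To confirm the push took effect, something like the following sketch can be used. It only looks for an uncommented issue_discards = 1 line in the file; the function name issue_discards_enabled is our own.

```shell
# Succeed if the given lvm.conf (default /etc/lvm/lvm.conf) contains an
# uncommented "issue_discards = 1" setting.
issue_discards_enabled() {
    grep -Eq '^[[:space:]]*issue_discards[[:space:]]*=[[:space:]]*1([[:space:]]|$)' \
        "${1:-/etc/lvm/lvm.conf}"
}
```

For example: issue_discards_enabled || echo "issue_discards is not set, check the puppet run".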

Next, create a single partition on each device, e.g.:

0 wiwa:/etc/lvm# parted /dev/sdb
GNU Parted 3.2
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel gpt
(parted) unit s mkpart main 8192 -196608                                       
(parted) p                                                                
Model: ATA SAMSUNG MZ7KM480 (scsi)
Disk /dev/sdb: 937703088s
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start  End         Size        File system  Name  Flags
 1      8192s  937506480s  937498289s               main

(parted) 
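
The numbers in the parted output can be checked by hand. A negative end value counts back from the end of the disk, so the sketch below reproduces the End and Size columns from the total sector count. With 512 byte sectors, the 8192 sector start is a 4 MiB offset and the 196608 sectors left at the end are 96 MiB. The function name partition_geometry is our own.

```shell
# Reproduce the sector arithmetic for "unit s mkpart main 8192 -196608".
# $1 = total sectors on the disk
# $2 = start sector
# $3 = number of sectors counted back from the end of the disk
partition_geometry() {
    end=$(( $1 - $3 ))
    size=$(( end - $2 + 1 ))    # start and end sectors are both included
    echo "start=${2}s end=${end}s size=${size}s"
}

partition_geometry 937703088 8192 196608
# prints: start=8192s end=937506480s size=937498289s
```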

Then create a RAID 1 array:

mdadm --create --raid-devices=2 --level=1 --metadata=1.0 --verbose /dev/md2 /dev/sda1 /dev/sdb1

Scan your RAID devices:

mdadm --examine --scan

And based on the output, edit /etc/mdadm/mdadm.conf to add a line for your new RAID (something like: ARRAY /dev/md/2 metadata=1.0 UUID=70345fc6:661e67ce:715261cf:84112f10 name=wiwa:2).
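
If you would rather not edit the file by hand, the following sketch appends only the ARRAY lines whose UUID is not already in the config file. The function name add_missing_arrays is our own, and it assumes each scanned line carries a UUID= field as in the example above. Review the file afterwards either way.

```shell
# Read ARRAY lines on stdin (as printed by "mdadm --examine --scan") and
# append to the config file ($1) only those whose UUID is not already there.
add_missing_arrays() {
    conf=$1
    while IFS= read -r line; do
        uuid=$(printf '%s\n' "$line" | sed -n 's/.*UUID=\([^ ]*\).*/\1/p')
        [ -n "$uuid" ] || continue          # skip lines without a UUID
        grep -q "UUID=$uuid" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
    done
}
```

Usage would be along the lines of: mdadm --examine --scan | add_missing_arrays /etc/mdadm/mdadm.conf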

Now, set up encryption and be sure to enable --allow-discards:

0 wiwa:/etc/lvm# cryptsetup luksFormat /dev/md2 

WARNING!
========
This will overwrite data on /dev/md2 irrevocably.

Are you sure? (Type uppercase yes): YES
Enter passphrase: 
Verify passphrase: 
0 wiwa:/etc/lvm# cryptsetup --allow-discards luksOpen /dev/md2 md2_crypt
Enter passphrase for /dev/md2: 
0 wiwa:/etc/lvm# blkid /dev/md2 
/dev/md2: UUID="b360ef4f-84ea-457c-8f86-a7f94b1f9277" TYPE="crypto_LUKS"
0 wiwa:/etc/lvm# echo md2_crypt UUID=b360ef4f-84ea-457c-8f86-a7f94b1f9277 none luks,discard >> /etc/crypttab 
0 wiwa:/etc/lvm# cat /etc/crypttab 
# <target name> <source device>         <key file>      <options>
md1_crypt UUID=ae7a55a5-cc91-4064-8e5f-06eb293188a2 none luks
md2_crypt UUID=b360ef4f-84ea-457c-8f86-a7f94b1f9277 none luks,discard
0 wiwa:/etc/lvm#  pvcreate /dev/mapper/md2_crypt 
  Physical volume "/dev/mapper/md2_crypt" successfully created
0 wiwa:/etc/lvm# vgcreate vg_wiwa1 /dev/mapper/md2_crypt 
  Volume group "vg_wiwa1" successfully created
0 wiwa:/etc/lvm#
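
Since a missing discard option in /etc/crypttab would silently drop discards after the next reboot, it is worth double checking the entry. The sketch below does that by looking at the fourth field of the matching line; the function name crypttab_has_discard is our own.

```shell
# Succeed if the mapping named $1 lists the "discard" option in the
# crypttab file $2 (default /etc/crypttab).
crypttab_has_discard() {
    awk -v name="$1" '
        $1 == name {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == "discard") found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "${2:-/etc/crypttab}"
}
```

With the crypttab shown above, crypttab_has_discard md2_crypt succeeds and crypttab_has_discard md1_crypt fails.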

Adding solid state drives to a guest

In Puppet

First, enable a weekly systemd timer that will run the fstrim command to communicate to the device which blocks are not being used.

In puppet, edit the guest's .pp file, adding:

 class { "mayfirst::m_ssd::fstrim": }

On the host machine

On the host, create a logical volume for the given guest, e.g.:

0 wiwa:/etc/lvm# lvcreate --size 10GB --name jacobs vg_wiwa1
  Logical volume "jacobs" created
0 wiwa:/etc/lvm#

Change the group ownership to match the user the guest runs as:

0 wiwa:/etc# chgrp jacobs /dev/mapper/vg_wiwa1-jacobs 
0 wiwa:/etc#

Make this change permanent. Edit: /etc/udev/rules.d/92-kvm_creator-jacobs.rules and add a new line:

ACTION=="change", SUBSYSTEM=="block", ATTR{dm/name}=="vg_wiwa1-jacobs", GROUP="jacobs"
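
The rule follows the same pattern for every guest, so it can be generated. This is only a sketch: the function name udev_rule is our own, and it assumes the guest name contains no hyphens (a hyphen would be doubled in the dm name).

```shell
# Print the udev rule that gives a guest's group ownership of its SSD volume.
# $1 = SSD volume group on the host
# $2 = guest name (also used as the group and the logical volume name)
udev_rule() {
    printf 'ACTION=="change", SUBSYSTEM=="block", ATTR{dm/name}=="%s-%s", GROUP="%s"\n' \
        "$1" "$2" "$2"
}

udev_rule vg_wiwa1 jacobs
```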

And edit the environment variables for the given guest in /etc/kvm-manager/jacobs/env.

You must indicate to kvm-manager that the scsi driver should be used instead of the default virtio driver.

  • HDA will probably be set to the path to the regular block device (e.g. /dev/mapper/vg_wiwa0-jacobs). Leave it as is (it will default to the virtio driver).
  • Add a new variable: HDB set to the path to the ssd block device (e.g. /dev/mapper/vg_wiwa1-jacobs).
  • Add another new environment variable named after the disk, e.g. HDB_DRIVER with the content: scsi.
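
As a sketch of the two additions, and assuming (we have not confirmed this here) that the env path is a directory holding one file per variable, with the file name as the variable name and the file content as its value, the variables could be written like this. The function name set_env_var is our own.

```shell
# ASSUMPTION: $1 is an envdir-style directory, one file per variable.
# Write the value $3 into a file named after the variable $2.
set_env_var() {
    mkdir -p "$1" && printf '%s\n' "$3" > "$1/$2"
}
```

For example: set_env_var /etc/kvm-manager/jacobs/env HDB /dev/mapper/vg_wiwa1-jacobs, followed by set_env_var /etc/kvm-manager/jacobs/env HDB_DRIVER scsi. If your env is laid out differently, set the same two variables in whatever form it uses.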

Reboot the guest.

On the guest

Look for the new block device (/dev/sdb below):

0 jacobs:~# cat /proc/partitions 
major minor  #blocks  name

   8       16   10485760 sdb
   8        0  524288000 sda
   8        1     248832 sda1
   8        2  524037120 sda2
 254        0    3997696 dm-0
 254        1     499712 dm-1
 254        2     999424 dm-2
 254        3    4997120 dm-3
 254        4    4997120 dm-4
 254        5   20971520 dm-5
0 jacobs:~#

CAREFUL! Don't assume the new device is sdb. The examples below assume the device appears as sdb; if it appears under a different name, running these commands as written can destroy data on the wrong disk.
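
One way to reduce the guesswork is to list the disks that have no partitions yet, which is what a freshly attached SSD volume looks like. The sketch below reads /proc/partitions and only considers sdX names; the function name unpartitioned_disks is our own. A disk used whole, without a partition table, would also be listed, so still compare the size against the logical volume you created.

```shell
# Read /proc/partitions-style input ($1, default /proc/partitions) and print
# whole disks named sdX that have no numbered partitions.
unpartitioned_disks() {
    awk '
        $4 ~ /^sd[a-z]+$/       { disk[$4] = 1 }
        $4 ~ /^sd[a-z]+[0-9]+$/ { p = $4; sub(/[0-9]+$/, "", p); used[p] = 1 }
        END { for (d in disk) if (!(d in used)) print d }
    ' "${1:-/proc/partitions}" | sort
}
```

With the listing above, this prints only sdb, because sda already has sda1 and sda2.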

Create a single partition:

0 jacobs:~# parted /dev/sdb
GNU Parted 3.2
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel gpt                                                      
(parted) unit s mkpart main 8192 -196608                                  
(parted) p                                                                
Model: QEMU QEMU HARDDISK (scsi)
Disk /dev/sdb: 20971520s
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start  End        Size       File system  Name  Flags
 1      8192s  20774912s  20766721s               main

(parted) quit                                                             
Information: You may need to update /etc/fstab.

0 jacobs:~#

Create a filesystem:

mkfs -t ext4 /dev/sdb1

Replacing MySQL partition

If you are using this partition to replace MySQL...

Stop mysql:

systemctl stop mysql

Unmount /var/lib/mysql:

umount /var/lib/mysql

Edit /etc/fstab and change the line that mounts /var/lib/mysql. It currently starts with something like /dev/mapper/vg_jacobs0-var+lib+mysql. Assuming your SSD appeared as /dev/sdb, find the UUID of the partition with blkid /dev/sdb1 and then replace the device path with UUID=? (replace the question mark with the actual UUID).
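
As a sketch of that edit, the function below prints a copy of an fstab with only the device field of the chosen mount point replaced, so the result can be reviewed before it is copied into place. The function name fstab_with_device is our own. Note that awk rewrites the changed line with single spaces between fields, and that the filesystem type on that line still has to match the new partition (ext4 here).

```shell
# Print the fstab file $1 with the device field of the entry mounted at $2
# replaced by $3. Comments and all other entries are printed unchanged.
fstab_with_device() {
    awk -v mp="$2" -v dev="$3" '
        $0 !~ /^[[:space:]]*#/ && $2 == mp { $1 = dev }
        { print }
    ' "$1"
}
```

For example: fstab_with_device /etc/fstab /var/lib/mysql "UUID=$(blkid -s UUID -o value /dev/sdb1)" > /tmp/fstab.new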

Mount the old logical volume on /mnt:

mount /dev/mapper/vg_jacobs0-var+lib+mysql /mnt

Mount the new partition:

mount /var/lib/mysql

Move the data:

rsync -a /mnt/ /var/lib/mysql/
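
Before restarting MySQL (after which the files start changing again), it can be reassuring to confirm that the copy is complete. The sketch below compares file names and contents of the two trees with diff; it does not compare ownership or permissions. The function name trees_match is our own.

```shell
# Succeed only if the two directory trees contain the same files with the
# same contents. Ownership and permissions are not compared.
trees_match() {
    diff -r "$1" "$2" > /dev/null
}
```

For example: trees_match /mnt /var/lib/mysql && echo "copy verified"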

Restart mysql:

systemctl start mysql

If everything goes well, unmount /mnt and remove the logical volume:

umount /mnt
lvremove /dev/mapper/vg_jacobs0-var+lib+mysql
Last modified on Mar 14, 2018, 1:05:46 PM