Solid State Drives
To help alleviate slowness caused by heavy disk reads and writes, we are in the process of adding solid state drives to our existing servers. Solid state drives (SSDs) are significantly faster than the traditional spinning SATA drives we currently have installed.
A major source of disk read/write contention is access to MySQL databases. This contention slows everything down, and it is one of the significant bottlenecks when a web site that is driven by a MySQL database feels slow to load.
Therefore, our strategy is to move most MySQL database partitions to the newly installed solid state drives. That should allow MySQL powered web sites to load much faster, while also reducing the disk contention on the other partitions.
Technical Overview
Like all disks, solid state drives are installed in a RAID configuration for redundancy and are also encrypted.
They are added to their own dedicated volume group on the host server (typically named: vg_HOST1 - as opposed to vg_HOST0 for the volume group with SATA disks).
Just as each guest is allocated a logical volume named after the guest, each guest is additionally allowed a logical volume from the SSD volume group (also named after the guest). So, a guest may have both:
- vg_wiwa0-jacobs
- vg_wiwa1-jacobs
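The naming convention above can be sketched as a small shell helper. This is only an illustration: the function name guest_volumes is hypothetical, and it assumes host and guest names without hyphens, since device-mapper doubles any hyphen that appears inside a volume group or logical volume name.

```shell
# Print the device-mapper paths of a guest's two logical volumes.
# Hypothetical helper; assumes host and guest names contain no hyphens.
guest_volumes() {
  host="$1"
  guest="$2"
  # Volume group 0 holds the SATA disks, volume group 1 the SSDs.
  printf '/dev/mapper/vg_%s0-%s\n' "$host" "$guest"
  printf '/dev/mapper/vg_%s1-%s\n' "$host" "$guest"
}

guest_volumes wiwa jacobs
```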
In the guest, the initial disk shows up as /dev/sda and the second disk shows up as /dev/sdb.
The normal SATA disk is added to a logical volume group on the guest and then further allocated.
With the SSD disk, however, a single partition is created from the block device and a filesystem is installed directly on it.
Due to limitations of SSDs, the operating system must tell the disk which sectors are free for writing (trim/discards), so we have to configure every disk layer specifically to pass these "discard" messages.
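One way to check whether a given layer passes discards is to read the discard_max_bytes value the kernel exposes in sysfs for each block device; a value of 0 means the device does not accept discards. The sketch below is an illustration rather than part of our tooling. The function name supports_discard is hypothetical, and it takes kernel device names such as sdb, md2 or dm-0, not /dev/mapper paths.

```shell
# Succeed if a block device advertises discard (TRIM) support.
# A discard_max_bytes value of 0 means discards are not accepted.
# The optional second argument (sysfs root) exists only to ease testing.
supports_discard() {
  dev="$1"
  sysroot="${2:-/sys}"
  f="$sysroot/block/$dev/queue/discard_max_bytes"
  [ -r "$f" ] && [ "$(cat "$f")" -gt 0 ]
}

if supports_discard sdb; then
  echo "sdb accepts discards"
else
  echo "sdb does not accept discards (or does not exist)"
fi
```

Running the check on each layer in turn (the disk, the RAID device, the encrypted device and the logical volume) shows where discards stop being passed along.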
Technical Details
Following are the steps for adding a solid state drive to both a host and a guest.
Adding solid state drives to a host
First, order the disks (see previous SSDs we have purchased) and order the cables and cages needed to install them (see previous cables we have ordered).
Once installed, they should show up via cat /proc/partitions as something like /dev/sda and /dev/sdb.
Next, modify the physical server's .pp file in puppet, changing the call to m_physical to be:
class { "mayfirst::m_physical": ssd => true }
Once you push, it should change /etc/lvm/lvm.conf to have: issue_discards = 1
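To confirm that the push took effect, a quick check of the file can help. This is a minimal sketch; the function name lvm_issue_discards_enabled is hypothetical, and it only looks for an uncommented issue_discards = 1 line in the file it is given.

```shell
# Succeed if the given lvm.conf has an uncommented "issue_discards = 1".
# Defaults to /etc/lvm/lvm.conf; a missing file counts as "not enabled".
lvm_issue_discards_enabled() {
  conf="${1:-/etc/lvm/lvm.conf}"
  grep -Eq '^[[:space:]]*issue_discards[[:space:]]*=[[:space:]]*1([^0-9]|$)' "$conf" 2>/dev/null
}

if lvm_issue_discards_enabled; then
  echo "issue_discards is enabled"
else
  echo "issue_discards is NOT enabled"
fi
```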
Next, create a single partition on each device, e.g.:
0 wiwa:/etc/lvm# parted /dev/sdb
GNU Parted 3.2
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel gpt
(parted) unit s mkpart main 8192 -196608
(parted) p
Model: ATA SAMSUNG MZ7KM480 (scsi)
Disk /dev/sdb: 937703088s
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:
Number  Start  End         Size        File system  Name  Flags
 1      8192s  937506480s  937498289s               main
(parted)
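The numbers given to mkpart are 512-byte sectors, and the negative end value counts back from the end of the disk. The arithmetic below reproduces the Start, End and Size values that parted printed above; the variable names are only for illustration.

```shell
sector_bytes=512
disk_sectors=937703088   # from "Disk /dev/sdb: 937703088s"
start=8192               # from "mkpart main 8192 -196608"
end_offset=196608

# Convert the sector counts to MiB.
start_mib=$(( start * sector_bytes / 1024 / 1024 ))
tail_mib=$(( end_offset * sector_bytes / 1024 / 1024 ))

# A negative end is measured from the end of the disk.
end=$(( disk_sectors - end_offset ))
# Start and end sectors are both inclusive.
size=$(( end - start + 1 ))

echo "partition starts ${start_mib} MiB into the disk"
echo "${tail_mib} MiB are left unused at the end"
echo "end sector: ${end}s, size: ${size}s"
```

So the partition starts 4 MiB into the disk and leaves 96 MiB unused at the end.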
Then a RAID 1 array:
mdadm --create --raid-devices=2 --level=1 --metadata=1.0 --verbose /dev/md2 /dev/sda1 /dev/sdb1
Scan your RAID devices:
mdadm --examine --scan
And based on the output, edit /etc/mdadm/mdadm.conf to add a line for your new RAID (something like: ARRAY /dev/md/2 metadata=1.0 UUID=70345fc6:661e67ce:715261cf:84112f10 name=wiwa:2).
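If the scan lists several arrays, the line for the new one can be picked out by its name. The helper below is a hypothetical sketch (array_line_for is not an mdadm command); it reads scan output on stdin and prints only the ARRAY line whose name= field matches exactly.

```shell
# Print the ARRAY line for one array name from
# `mdadm --examine --scan` output read on stdin.
array_line_for() {
  awk -v want="name=$1" '
    $1 == "ARRAY" {
      for (i = 2; i <= NF; i++) if ($i == want) { print; break }
    }'
}

# Demonstration with the example line from the text instead of live output:
echo 'ARRAY /dev/md/2 metadata=1.0 UUID=70345fc6:661e67ce:715261cf:84112f10 name=wiwa:2' |
  array_line_for wiwa:2
```

On a real host the input would come from mdadm --examine --scan piped into array_line_for, and the result should be reviewed before it is added to the configuration file.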
Now, set up encryption and be sure to enable --allow-discards:
0 wiwa:/etc/lvm# cryptsetup luksFormat /dev/md2
WARNING!
========
This will overwrite data on /dev/md2 irrevocably.
Are you sure? (Type uppercase yes): YES
Enter passphrase:
Verify passphrase:
0 wiwa:/etc/lvm# cryptsetup --allow-discards luksOpen /dev/md2 md2_crypt
Enter passphrase for /dev/md2:
0 wiwa:/etc/lvm# blkid /dev/md2
/dev/md2: UUID="b360ef4f-84ea-457c-8f86-a7f94b1f9277" TYPE="crypto_LUKS"
0 wiwa:/etc/lvm# echo md2_crypt UUID=b360ef4f-84ea-457c-8f86-a7f94b1f9277 none luks,discard >> /etc/crypttab
0 wiwa:/etc/lvm# cat /etc/crypttab
# <target name> <source device> <key file> <options>
md1_crypt UUID=ae7a55a5-cc91-4064-8e5f-06eb293188a2 none luks
md2_crypt UUID=b360ef4f-84ea-457c-8f86-a7f94b1f9277 none luks,discard
0 wiwa:/etc/lvm# pvcreate /dev/mapper/md2_crypt
  Physical volume "/dev/mapper/md2_crypt" successfully created
0 wiwa:/etc/lvm# vgcreate vg_wiwa1 /dev/mapper/md2_crypt
  Volume group "vg_wiwa1" successfully created
0 wiwa:/etc/lvm#
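Appending to /etc/crypttab with echo adds a duplicate line if the command is run twice. A slightly more careful variant is sketched below. The function name add_crypttab_entry is hypothetical, and the demonstration writes to a scratch file rather than the real /etc/crypttab.

```shell
# Append a crypttab entry with the discard option, unless the target
# name is already listed in the file (default: /etc/crypttab).
add_crypttab_entry() {
  name="$1"
  uuid="$2"
  file="${3:-/etc/crypttab}"
  if awk -v n="$name" '$1 == n { found = 1 } END { exit !found }' "$file" 2>/dev/null; then
    echo "$name already present in $file" >&2
    return 0
  fi
  printf '%s UUID=%s none luks,discard\n' "$name" "$uuid" >> "$file"
}

# Demonstration against a scratch file; the second call changes nothing.
tmp=$(mktemp)
add_crypttab_entry md2_crypt b360ef4f-84ea-457c-8f86-a7f94b1f9277 "$tmp"
add_crypttab_entry md2_crypt b360ef4f-84ea-457c-8f86-a7f94b1f9277 "$tmp"
cat "$tmp"
```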
Adding solid state drives to a guest
In Puppet
First, enable a weekly systemd timer that will run the fstrim command to communicate to the device which blocks are not being used.
In puppet, edit the guest's .pp file, adding:
class { "mayfirst::m_ssd::fstrim": }
On the host machine
On the host, create a logical volume for the given guest, e.g.:
0 wiwa:/etc/lvm# lvcreate --size 10GB --name jacobs vg_wiwa1
  Logical volume "jacobs" created
0 wiwa:/etc/lvm#
Change the group ownership to match the user the guest runs as:
0 wiwa:/etc# chgrp jacobs /dev/mapper/vg_wiwa1-jacobs
0 wiwa:/etc#
Make this change permanent. Edit: /etc/udev/rules.d/92-kvm_creator-jacobs.rules and add a new line:
ACTION=="change", SUBSYSTEM=="block", ATTR{dm/name}=="vg_wiwa1-jacobs", GROUP="jacobs"
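The rule follows a fixed pattern, so it can be generated for any guest. The helper below is a hypothetical sketch (kvm_udev_rule is not part of kvm-manager) that prints the rule for a given volume group and guest name.

```shell
# Print the udev rule that gives a guest's group ownership of its
# SSD logical volume. Arguments: volume group, guest name.
kvm_udev_rule() {
  vg="$1"
  guest="$2"
  printf 'ACTION=="change", SUBSYSTEM=="block", ATTR{dm/name}=="%s-%s", GROUP="%s"\n' \
    "$vg" "$guest" "$guest"
}

kvm_udev_rule vg_wiwa1 jacobs
```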
And edit the environment variables for the given guest in /etc/kvm-manager/jacobs/env.
You must indicate to kvm-manager that the scsi driver should be used instead of the default virtio driver.
- HDA will probably be set to the path to the regular block device (e.g. /dev/mapper/vg_wiwa0-jacobs). Leave it as is (it will default to the virtio driver).
- Add a new variable: HDB set to the path to the ssd block device (e.g. /dev/mapper/vg_wiwa1-jacobs).
- Add another new environment variable named after the disk, e.g. HDB_DRIVER with the content: scsi.
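The two new variables could be set as sketched below. This rests on an assumption that is not confirmed here: that env is an envdir-style directory holding one file per variable, with the value as the file's content. If your kvm-manager layout differs, adapt accordingly. The function name set_guest_ssd_env is hypothetical, and the demonstration uses a scratch directory rather than /etc/kvm-manager/jacobs/env.

```shell
# ASSUMPTION: the env location is a directory with one file per
# variable (envdir style), each file containing the variable's value.
set_guest_ssd_env() {
  envdir="$1"
  ssd_dev="$2"
  printf '%s\n' "$ssd_dev" > "$envdir/HDB"
  printf '%s\n' scsi > "$envdir/HDB_DRIVER"
}

# Demonstration against a scratch directory:
scratch=$(mktemp -d)
set_guest_ssd_env "$scratch" /dev/mapper/vg_wiwa1-jacobs
grep . "$scratch"/HDB "$scratch"/HDB_DRIVER
```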
Reboot the guest.
On the guest
Look for the new block device (/dev/sdb below):
0 jacobs:~# cat /proc/partitions
major minor  #blocks  name
   8       16   10485760 sdb
   8        0  524288000 sda
   8        1     248832 sda1
   8        2  524037120 sda2
 254        0    3997696 dm-0
 254        1     499712 dm-1
 254        2     999424 dm-2
 254        3    4997120 dm-3
 254        4    4997120 dm-4
 254        5   20971520 dm-5
0 jacobs:~#
Create a filesystem
CAREFUL! Don't assume the new device is sdb. The examples below are based on the device appearing as sdb, but if it appears differently, you can create some bad news with these commands.
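One way to avoid guessing is to look the device up by its size, since the new logical volume was created with a known size (10GB in the example, which /proc/partitions reports as 10485760 one-KiB blocks). The helper below is a hypothetical sketch that reads /proc/partitions-style input and prints the name of every entry with exactly that block count. If it prints more than one name, or none, stop and investigate before going further.

```shell
# Print the names of entries in /proc/partitions-style input (stdin)
# whose size in 1 KiB blocks equals the given value.
devices_with_blocks() {
  awk -v want="$1" 'NR > 1 && $3 == want { print $4 }'
}

# A 10 GiB volume is 10 * 1024 * 1024 one-KiB blocks.
want=$(( 10 * 1024 * 1024 ))
if [ -r /proc/partitions ]; then
  devices_with_blocks "$want" < /proc/partitions
fi
```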
Create a single partition:
0 jacobs:~# parted /dev/sdb
GNU Parted 3.2
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel gpt
(parted) unit s mkpart main 8192 -196608
(parted) p
Model: QEMU QEMU HARDDISK (scsi)
Disk /dev/sdb: 20971520s
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:
Number  Start  End        Size       File system  Name  Flags
 1      8192s  20774912s  20766721s               main
(parted) quit
Information: You may need to update /etc/fstab.
0 jacobs:~#
Create a filesystem:
mkfs -t ext4 /dev/sdb1
Replacing MySQL partition
If you are using this partition to replace MySQL...
Stop mysql:
systemctl stop mysql
Unmount /var/lib/mysql:
umount /var/lib/mysql
Edit /etc/fstab and change the line for mounting /var/lib/mysql. It currently starts with something like /dev/mapper/vg_jacobs0-var+lib+mysql. Assuming your SSD appeared as /dev/sdb, find the UUID of the partition with blkid /dev/sdb1 and then replace the device path with UUID=? (replace the question mark with the actual UUID).
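The fstab edit can also be previewed with a short script before touching the real file. This is a sketch under stated assumptions: the function name fstab_with_uuid is hypothetical, the sample fstab line (ext4, defaults, 0 2) and the all-zero UUID are made up for the demonstration, and the output goes to stdout so it can be reviewed rather than written in place. Note that awk rewrites the changed line with single spaces between fields.

```shell
# Print the given fstab with the device field of the /var/lib/mysql
# line replaced by UUID=<uuid>. Nothing is modified in place.
fstab_with_uuid() {
  uuid="$1"
  file="${2:-/etc/fstab}"
  awk -v dev="UUID=$uuid" '
    $1 !~ /^#/ && $2 == "/var/lib/mysql" { $1 = dev }
    { print }' "$file"
}

# Demonstration with a made-up fstab and a placeholder UUID:
sample=$(mktemp)
cat > "$sample" <<'EOF'
# <file system> <mount point> <type> <options> <dump> <pass>
/dev/mapper/vg_jacobs0-var+lib+mysql /var/lib/mysql ext4 defaults 0 2
EOF
fstab_with_uuid 00000000-0000-0000-0000-000000000000 "$sample"
```

On the guest, the real UUID can be read with blkid -s UUID -o value /dev/sdb1.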
Mount the old logical volume on /mnt:
mount /dev/mapper/vg_jacobs0-var+lib+mysql /mnt
Mount the new partition:
mount /var/lib/mysql
Move the data:
rsync -a /mnt/ /var/lib/mysql/
Restart mysql:
systemctl start mysql
If everything goes well, unmount /mnt and remove the logical volume:
umount /mnt
lvremove /dev/mapper/vg_jacobs0-var+lib+mysql