#11870 closed Bug/Something is broken (fixed)
Add solid state drives to some servers
Reported by: | Jamie McClelland | Owned by: | Jamie McClelland |
---|---|---|---|
Priority: | Medium | Component: | Tech |
Keywords: | Cc: | ||
Sensitive: | no |
Description
We are in the process of investigating how to restrict disk i/o on a per-kvm basis in #11856.
While this ability is important if we want to ensure more consistent disk i/o speeds, I think it won't actually speed up disk i/o for guests that really need it.
So, I think we should also investigate whether we can speed up disk i/o by adding SSDs to our existing servers.
Attachments (1)
Change History (22)
comment:2 Changed 5 years ago by
comment:3 follow-up: 5 Changed 5 years ago by
that post is confusing. why would you try to use fstrim on an encrypted block device?
Also, the resource hog scripts are themselves potential resource hogs, since they write to disk. Have you considered trying to minimize their use of disk throughput?
comment:4 Changed 5 years ago by
I didn't post that link because of the fstrim part - just as an example of someone using a raid/crypt/lvm approach similar to ours with an SSD.
The resource hog scripts read entirely from proc and only write 4 very small files once a minute (and then another 4 files once an hour to consolidate) - so I don't expect them to have a big impact on disk i/o.
There are a few other scripts that measure disk usage by members on mosh's - which are more io intensive - but they are all running via ionice.
Changed 5 years ago by
Attachment: | install-ssd-trays.pdf added |
---|
comment:6 Changed 5 years ago by
I just attached instructions for how to install the SSD trays I ordered into wiwa. It appears that it will require removing the cover.
comment:7 Changed 5 years ago by
The pricing for SSD cards is all over the place, ranging from as low as $150 for about 500GB to nearly $1,000.
I've been narrowing the search by focusing on ssd drives designed for data center/enterprise use and for write capacity.
Although it is a bit more expensive - I'm considering the Samsung SM863 for $289. It's 480GB and has a good review. Here is the manufacturer's page.
I plan to get two and put them in a RAID.
comment:8 Changed 5 years ago by
Unfortunately, my latest attempt to install the drives has failed.
But, I do know what we need.
There is one power source available. I had a molex cable that fit and provided two connectors that fit the SSD drives but... instead of a 12 inch cable I need a 24 inch cable. Also, the power connectors for the SSD drives need to be flat, not L shaped.
The sata cable connectors are in the middle of the server, which means the two cables I brought were long enough but they were flat on one side and L shaped on the other. We need flat on both sides.
Lastly, the server manufacturer sent us two trays - one to fit in the DVD slot and one to fit in the back. Neither one really works (we don't have a DVD slot). However, if we had two of the back trays, we could stack them in the back and that should do the trick.
comment:9 Changed 5 years ago by
I don't know if this would help at all, but there are adapters that allow you to put a pair of 2.5" drives in a 3.5" tray.
A few examples (which just happen to be the first ones I found on the fine web)
comment:10 Changed 5 years ago by
Yeah, those would be perfect but... all of our 3.5" trays are taken :(.
So... I've just ordered:
- straight sata cables
- one molex extension cable
- one molex 4 pin to 15 pin splitter with straight connectors
- A second tray that should fit
comment:11 Changed 5 years ago by
Owner: | set to Jamie McClelland |
---|---|
Status: | new → assigned |
What is the latest update on this, Jamie?
comment:12 Changed 5 years ago by
I just released a service advisory for attempt number 3 tomorrow night. Let's hope that three's a charm.
comment:13 Changed 5 years ago by
Success!
I'm reading up on a few guides.
General Guides:
- https://wiki.archlinux.org/index.php/Solid_State_Drives
- https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Storage_Administration_Guide/ch-ssd.html
Risks of using Trim:
http://asalor.blogspot.de/2011/08/trim-dm-crypt-problems.html
How to properly use TRIM:
comment:14 Changed 5 years ago by
So far, my conclusions based on reading all of this are:
- The security problems with TRIM (ability to identify disk as encrypted and identify the filesystem type) are not serious enough to outweigh the benefits.
- We should enable TRIM on all layers:
  - Via the discard option in /etc/crypttab
  - Via issue_discards = 1 in /etc/lvm/lvm.conf
  - Seems to be supported by RAID automatically on the kernels we are running (3.16).
  - We need to ensure TRIM is passed through to KVM guests (#12096).
- We should run fstrim regularly rather than rely on the discard fstab option
Nonetheless, there is still some risk (https://www.archlinux.org/news/data-corruption-on-software-raid-0-when-discard-is-used/).
Informative quote from asalor blog:
How to active TRIM on Linux? The first thing to know is that TRIM should be enabled on all I/O abstraction layers. This means that if you have an ext4 partition on top of LVM, which in turn is on top of an encrypted volume with LUKS/dm-crypt, then you must enable support for TRIM in these three layers: The filesystem, LVM and dm-crypt. There is no point in enabling it at the filesystem level if you don’t enable it also on the other layers. The TRIM command should be translated from one layer to another until reaching the SSD.
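One way to check the point made in the quote, that every layer has to pass TRIM through, is to look at the discard limits the kernel reports for each block device. The sketch below is not from the ticket. It assumes lsblk from util-linux is available and that a device reporting a DISC-MAX of 0 is not passing discards down; devices_without_discard is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read "NAME DISC-GRAN DISC-MAX" lines on stdin and
# print the name of every device whose maximum discard size is zero.
# Assumption: a zero DISC-MAX means that layer does not pass TRIM through.
devices_without_discard() {
    while read -r name gran max; do
        case "$max" in
            0|0B) echo "$name" ;;
        esac
    done
}

# Intended use on a host (run by hand, not executed here):
#   lsblk --raw --noheadings --output NAME,DISC-GRAN,DISC-MAX | devices_without_discard
```

If the RAID, dm-crypt and LVM layers are all passing discards, none of the devices stacked on the SSDs should appear in the output.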
comment:15 Changed 5 years ago by
I updated puppet to have an $ssd variable when defining a physical server that is false by default. When true, it sets issue_discards = 1 in /etc/lvm/lvm.conf. I set this for wiwa and ran it.
I created a single partition on each disk with:
    0 wiwa:/etc/lvm# parted /dev/sdb
    GNU Parted 3.2
    Using /dev/sdb
    Welcome to GNU Parted! Type 'help' to view a list of commands.
    (parted) mklabel gpt
    (parted) unit s mkpart main 8192 -196608
    (parted) p
    Model: ATA SAMSUNG MZ7KM480 (scsi)
    Disk /dev/sdb: 937703088s
    Sector size (logical/physical): 512B/512B
    Partition Table: gpt
    Disk Flags:

    Number  Start  End         Size        File system  Name  Flags
     1      8192s  937506480s  937498289s               main

    (parted)
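The sector values passed to mkpart are easier to read once converted. The sketch below only redoes the arithmetic from the parted output above, assuming the 512-byte logical sector size that parted reports; sectors_to_mib and partition_end are hypothetical helper names, not part of any tool.

```shell
#!/bin/sh
# Assumption: 512-byte logical sectors, as shown in the parted output.
SECTOR_BYTES=512

# Convert a sector count to whole MiB.
sectors_to_mib() {
    echo $(( $1 * SECTOR_BYTES / 1048576 ))
}

# End sector reported by parted when a negative end offset is given:
# total sectors on the disk minus the gap left at the end.
partition_end() {
    echo $(( $1 - $2 ))
}

# Examples (run by hand):
#   sectors_to_mib 8192                # start offset in MiB
#   sectors_to_mib 196608              # gap left at the end of the disk in MiB
#   partition_end 937703088 196608     # end sector shown by parted
```

With these numbers the partition starts 4 MiB into the disk and leaves 96 MiB unused at the end, and the computed end sector matches the 937506480s that parted prints.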
Then, created a RAID array:
mdadm --create --raid-devices=2 --level=1 --metadata=1.0 --verbose /dev/md2 /dev/sda1 /dev/sdb1
Then, based on the output of:
mdadm --examine --scan
I pasted the following into /etc/mdadm/mdadm.conf:
ARRAY /dev/md/2 metadata=1.0 UUID=70345fc6:661e67ce:715261cf:84112f10 name=wiwa:2
The next steps are:
- Setup encryption
- Add encrypted device as physical volume for LVM
- Update initramfs so device will be decrypted on boot
- Create volume group: vg_wiwa1
- Create logical volume for first guest server
- Update kvm-manager with latest patch from #12096
- Allocate logical volume to first test guest server
- Reboot guest
- Format device
- Ensure we can extend it at a later time
- Start using it
comment:16 Changed 5 years ago by
I completed these steps with:
    0 wiwa:/etc/lvm# cryptsetup luksFormat /dev/md2

    WARNING!
    ========
    This will overwrite data on /dev/md2 irrevocably.

    Are you sure? (Type uppercase yes): YES
    Enter passphrase:
    Verify passphrase:
    0 wiwa:/etc/lvm# cryptsetup --allow-discards luksOpen /dev/md2 md2_crypt
    Enter passphrase for /dev/md2:
    0 wiwa:/etc/lvm# blkid /dev/md2
    /dev/md2: UUID="b360ef4f-84ea-457c-8f86-a7f94b1f9277" TYPE="crypto_LUKS"
    0 wiwa:/etc/lvm# echo md2_crypt UUID=b360ef4f-84ea-457c-8f86-a7f94b1f9277 none luks,discard >> /etc/crypttab
    0 wiwa:/etc/lvm# cat /etc/crypttab
    # <target name> <source device> <key file> <options>
    md1_crypt UUID=ae7a55a5-cc91-4064-8e5f-06eb293188a2 none luks
    md2_crypt UUID=b360ef4f-84ea-457c-8f86-a7f94b1f9277 none luks,discard
    0 wiwa:/etc/lvm# pvcreate /dev/mapper/md2_crypt
      Physical volume "/dev/mapper/md2_crypt" successfully created
    0 wiwa:/etc/lvm# vgcreate vg_wiwa1 /dev/mapper/md2_crypt
      Volume group "vg_wiwa1" successfully created
    0 wiwa:/etc/lvm#
comment:17 Changed 5 years ago by
Create logical volume for jacobs:
    0 wiwa:/etc/lvm# lvcreate --size 10GB --name jacobs vg_wiwa1
      Logical volume "jacobs" created
    0 wiwa:/etc/lvm#
comment:18 Changed 5 years ago by
I had forgotten to grant access to jacobs on the host:
    1 wiwa:/etc/sv/kvm/jacobs# ls -l /dev/mapper/vg_wiwa1-jacobs
    lrwxrwxrwx 1 root root 8 Sep 23 15:16 /dev/mapper/vg_wiwa1-jacobs -> ../dm-28
    0 wiwa:/etc/sv/kvm/jacobs# chgrp jacobs /dev/dm-28
    0 wiwa:/etc/sv/kvm/jacobs#
I then rebooted jacobs, and it now has a new disk (sdb):
    0 jacobs:~# cat /proc/partitions
    major minor  #blocks  name

       8       16   10485760 sdb
       8        0  524288000 sda
       8        1     248832 sda1
       8        2  524037120 sda2
     254        0    3997696 dm-0
     254        1     499712 dm-1
     254        2     999424 dm-2
     254        3    4997120 dm-3
     254        4    4997120 dm-4
     254        5   20971520 dm-5
    0 jacobs:~#
First, I created a single partition:
    0 jacobs:~# parted /dev/sdb
    GNU Parted 3.2
    Using /dev/sdb
    Welcome to GNU Parted! Type 'help' to view a list of commands.
    (parted) mklabel gpt
    (parted) unit s mkpart main 8192 -196608
    (parted) p
    Model: QEMU QEMU HARDDISK (scsi)
    Disk /dev/sdb: 20971520s
    Sector size (logical/physical): 512B/512B
    Partition Table: gpt
    Disk Flags:

    Number  Start  End        Size       File system  Name  Flags
     1      8192s  20774912s  20766721s               main

    (parted) quit
    Information: You may need to update /etc/fstab.

    0 jacobs:~#
I'm not sure this step is absolutely necessary. I could have installed a filesystem directly on the device. However, by adding a single partition I am sure we have a good, even sector boundary, and it seems to leave more options for future uses of the device.
However, I'm intentionally not adding this to a logical volume group because we have a very specific use for this partition, we can always extend it via the host, and I don't want to add any unnecessary disk layers that could slow things down.
comment:19 Changed 5 years ago by
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
jacobs is rebooted and running with the new solid state devices as the mysql partition.
I just added a wiki page documenting how to do this so I'm closing this ticket.
We may want to open new tickets to run this process on existing wiwa guests and also to repeat on a different host.
comment:20 Changed 2 years ago by
Sensitive: | set |
---|
Changed to sensitive as part of leadership decision to make all tickets sensitive.
comment:21 Changed 17 months ago by
Sensitive: | unset |
---|
I'm first working on wiwa - to see if the motherboard will support it.
We still have to answer the questions:
I've added "tps" to our resource hog scripts so we can count transactions per second on physical servers (so far, just wiwa has the code) and on guests (so far malcolm and june).
I'm hoping to figure out both:
We'll see...
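The "tps" figure mentioned above can be derived from two readings of /proc/diskstats. The sketch below is not the actual resource hog code, which is not shown in this ticket. It assumes tps means completed reads plus completed writes per second (fields 4 and 8 of each /proc/diskstats line), and completed_ios and disk_tps are hypothetical names.

```shell
#!/bin/sh
# Print reads completed + writes completed for device $1, taken from a
# saved copy of /proc/diskstats in file $2.
completed_ios() {
    while read -r major minor name reads rmerged rsect rms writes rest; do
        if [ "$name" = "$1" ]; then
            echo $(( reads + writes ))
            return 0
        fi
    done < "$2"
    return 1
}

# Usage: disk_tps DEVICE BEFORE_FILE AFTER_FILE INTERVAL_SECONDS
# Prints the whole number of transfers per second between the two snapshots.
disk_tps() {
    start=$(completed_ios "$1" "$2") || return 1
    end=$(completed_ios "$1" "$3") || return 1
    echo $(( (end - start) / $4 ))
}

# Example (run by hand):
#   cp /proc/diskstats /tmp/before; sleep 60; cp /proc/diskstats /tmp/after
#   disk_tps sda /tmp/before /tmp/after 60
```

Because it only reads from proc and does integer arithmetic in the shell, a check like this adds very little disk i/o of its own.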