Version 4 (modified by Daniel Kahn Gillmor, 13 years ago)

--

Disk Alignment

We use a complex stack of block device tools. These tools perform slightly better if they all agree on block boundaries. Common block boundaries are powers of two, and the largest block boundaries I've heard of on modern equipment are 4MiB in size (in particular, this is the eraseblock size on certain large flash devices). So the suggestion is to align on 4MiB boundaries, which also guarantees alignment on every smaller power-of-two boundary. With 512-byte sectors, 4MiB is 8192 sectors, which is where the figure 8192 used below comes from.
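The arithmetic behind the 8192 figure can be sketched in a few lines of shell. This is only an illustration, assuming 512-byte sectors; is_aligned is a hypothetical helper, not part of any of the tools discussed here.

```shell
#!/bin/sh
# Assumption: 512-byte sectors, as used throughout this page.
SECTOR_BYTES=512
ALIGN_BYTES=$((4 * 1024 * 1024))               # 4MiB
ALIGN_SECTORS=$((ALIGN_BYTES / SECTOR_BYTES))  # sectors per 4MiB

echo "alignment in sectors: $ALIGN_SECTORS"

# is_aligned SECTOR (hypothetical helper): succeeds when SECTOR is a
# multiple of the 4MiB alignment.
is_aligned() {
    [ $(( $1 % ALIGN_SECTORS )) -eq 0 ]
}

# 2048s (1MiB) is a multiple of every smaller power of two,
# but not of 4MiB.
for s in 2048 8192 16384; do
    if is_aligned "$s"; then
        echo "$s: aligned"
    else
        echo "$s: not aligned"
    fi
done
```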

setting up

here are the things to think about when setting up a disk:

  • physical disk partitions -- should start at a multiple of 8192s (where s means "512-byte sector")
  • RAID -- mdadm should use 0.90 or 1.0 superblocks, which are written at the end of the device. If metadata version 1.1 or 1.2 is used, then the data_offset field needs to be a multiple of 8192s. There does not seem to be a way to force a custom data_offset with metadata 1.1 or 1.2 (I filed 614841 as a wishlist). In squeeze, 1.2 is the default, so use --metadata 1.0 during mdadm --create :(
  • dm_crypt/LUKS -- cryptsetup luksFormat should use --align-payload=8192 (the value is in 512-byte sectors)
  • LVM
    • pvcreate should use --dataalignment 8192s
    • vgcreate should use --physicalextentsize 8192s (this is currently the default)
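Put together, the setup steps above might look roughly like the following. This is a dry-run sketch that only prints the commands. The device names, the mapping name, the volume group name, the GPT label and the choice of RAID1 are all assumptions made for illustration; only the alignment-related options come from the list above.

```shell
#!/bin/sh
# Dry run: this only prints the commands, it does not touch any disk.
# All device, mapping and volume group names are hypothetical placeholders.
DISK_A=/dev/sdX
DISK_B=/dev/sdY
MD=/dev/md0
CRYPT=md0_crypt
VG=vg0

plan_stack() {
    # partitions: start at sector 8192 (4MiB); GPT is an assumption
    for disk in "$DISK_A" "$DISK_B"; do
        echo "parted -s $disk mklabel gpt mkpart primary 8192s 100%"
    done
    # RAID: metadata 1.0 keeps the superblock at the end of the device
    # (RAID1 over two components is an assumption)
    echo "mdadm --create $MD --metadata=1.0 --level=1 --raid-devices=2 ${DISK_A}1 ${DISK_B}1"
    # dm_crypt/LUKS: payload offset given in 512-byte sectors
    echo "cryptsetup luksFormat --align-payload=8192 $MD"
    echo "cryptsetup luksOpen $MD $CRYPT"
    # LVM: align the first physical extent and use 4MiB extents
    echo "pvcreate --dataalignment 8192s /dev/mapper/$CRYPT"
    echo "vgcreate --physicalextentsize 8192s $VG /dev/mapper/$CRYPT"
}

plan_stack
```

Review the printed commands and adjust the placeholders before running any of them by hand; several of them destroy existing data.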

checking

  • Partitions: (look at Start column -- you want multiples of 8192) parted /dev/sda unit s print
  • RAID: mdadm --examine $COMPONENTDEVICE | grep Version
    • if the version is 1.1 or 1.2, then look at the data offset: mdadm --examine $COMPONENTDEVICE | grep 'Data Offset'
  • dm_crypt/LUKS: cryptsetup luksDump $BASEDEVICE | grep '^Payload offset:'
  • LVM
    • Physical volumes: (look at "1st PE"): pvs --units s -o +pe_start
    • Volume groups: (look at "Ext"): vgs --units s -o +vg_extent_size
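Once the numbers from the commands above are in hand, checking them against the 8192-sector boundary is simple modular arithmetic. A minimal sketch follows; check_sectors is a hypothetical helper, and it assumes values are given in 512-byte sectors with an optional trailing s or S unit suffix.

```shell
#!/bin/sh
ALIGN=8192   # 4MiB expressed in 512-byte sectors

# check_sectors LABEL VALUE (hypothetical helper): report whether VALUE,
# a sector count such as "8192" or "8192s", is a multiple of ALIGN.
# Returns 0 if aligned, 1 if misaligned, 2 if VALUE is not a number.
check_sectors() {
    label=$1
    value=${2%[sS]}          # strip a trailing unit suffix, if any
    case $value in
        ''|*[!0-9]*)
            echo "$label: cannot parse '$2'"
            return 2
            ;;
    esac
    if [ $((value % ALIGN)) -eq 0 ]; then
        echo "$label: $value OK"
    else
        echo "$label: $value MISALIGNED"
        return 1
    fi
}

# Example with literal values; in practice, paste in the numbers reported
# by parted, mdadm, cryptsetup, pvs and vgs.
check_sectors "partition start" 8192s
check_sectors "LUKS payload offset" 4096
check_sectors "1st PE" 16384S
```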

fixing

if any of these are wrong, what do we do?
