Version 6 (modified by https://id.mayfirst.org/jamie, 4 months ago)

Red/Mosh Re-organization

The Red control panel and the filesystem layout of our MOSH servers limit us in several critical ways.

This proposal for re-organization has the following goals:

  • Take immediate steps to relieve disk i/o contention
  • Prepare MOSH'es to handle a network file system as an intermediate step in our infrastructure project
  • Make it easier to move user accounts and web sites from one membership to another
  • Eliminate the confusing "hosting order" - so user accounts, DNS records, email lists, etc are linked directly to a membership, rather than being part of a hosting order collection

Phase zero: all databases, all solid state drives

The fastest way to alleviate both the greatest cause and the most visible symptom of disk I/O contention is to place all databases on solid state drives.

Currently, only three of our ten physical servers have solid state drives. On those three, we have been allocating SSD space for MySQL partitions; on the remaining seven, MySQL databases are stored on regular spinning disks.

To get all databases on SSD, we should create two virtual machines on each of our three SSD-equipped physical hosts and allocate all remaining SSD space to these machines. They will become the MySQL servers for all members.

We will set up an internal network (e.g. 10.12.53.0/24) so all database ports are shielded from direct access over the Internet.

We will also set up several MySQL proxies (answering via round-robin DNS as mysql.mayfirst.org). Each proxy will route every database request to the proper back end host, so members can all simply configure their web apps to use the host mysql.mayfirst.org.
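As an illustration, if we used ProxySQL for the proxies, the per-member routing could live entirely in its admin interface. The hostgroup ids, backend addresses, usernames, and passwords below are hypothetical; this is a sketch of the idea, not a finished configuration:

```sql
-- Hypothetical sketch, assuming ProxySQL: each member's MySQL username is
-- mapped to the hostgroup of the backend VM that holds their databases,
-- so every member can point their web apps at mysql.mayfirst.org.
INSERT INTO mysql_servers (hostgroup_id, hostname, port)
  VALUES (1, '10.12.53.21', 3306),
         (2, '10.12.53.22', 3306);
INSERT INTO mysql_users (username, password, default_hostgroup)
  VALUES ('member_a', 'secret-a', 1),
         ('member_b', 'secret-b', 2);
LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;
LOAD MYSQL USERS TO RUNTIME; SAVE MYSQL USERS TO DISK;
```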

Phase one: backup changes

Our current backup strategy places an enormous disk i/o strain on our servers. While this strain is mostly confined to off hours (for US and Mexico members), it often continues for some servers past 10 or 11 am America/New_York time, thus contributing to the disk i/o strain during our peak time.

The problem is that we run an onsite rdiff-backup keeping 10 days of increments, followed immediately by an rsync off-site backup. That means every file on every server is read twice.

We could significantly reduce this load by instead running an rsync backup to our onsite backup server, then running an incremental backup from the onsite backup server to the offsite server during the day. We might also switch from rdiff-backup to borg-backup to see if we get better performance (see [ticket/12677#comment:15]).

Goals

Immediately relieve disk i/o contention.

Phase two: File system re-organization

The first step is to change the file system layout on MOSH'es.

Currently, user home directories are stored in /home/members/<member-short-name>/sites/<hosting-order-identifier>/users/<username> and web sites are stored in /home/members/<member-short-name>/sites/<hosting-order-identifier>/{web,include,logs}.

Since the membership name is part of the home directory and web paths, moving a web site or username to a different membership requires a tedious and error-prone process of updating the user account's home directory, the web site's DocumentRoot, and so on.

In addition, web sites and user paths are grouped together, making it harder to separate user accounts used for email from those used to secure FTP to update a web site.

New Layout

The proposed new layout would be:

  • Home directories: /home/members/users/<username>
  • Web directories: /home/members/sites/<hosting-order-id>/{web,include,logs,conf/apache,conf/php}

A web site's apache configuration would live in conf/apache and /etc/apache2/sites-enabled/<hosting-order-id>.conf would be a symlink to it. A web site's php pool configuration file would live in conf/php and /etc/php/7.0/fpm/pool.d/<hosting-order-id>.conf would be a symlink to it.
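A sketch of the proposed symlink scheme, using a hypothetical hosting-order id (1234) and a scratch prefix so it can be run anywhere; on a real MOSH the prefix would be empty:

```shell
#!/bin/sh
set -e
prefix=/tmp/layout-demo
id=1234  # hypothetical hosting-order id
site=$prefix/home/members/sites/$id

mkdir -p $site/conf/apache $site/conf/php \
         $prefix/etc/apache2/sites-enabled $prefix/etc/php/7.0/fpm/pool.d
touch $site/conf/apache/$id.conf $site/conf/php/$id.conf

# The live apache and php-fpm configs are symlinks into the site's own
# conf/ directory, so the configuration travels with the site.
ln -sf $site/conf/apache/$id.conf $prefix/etc/apache2/sites-enabled/$id.conf
ln -sf $site/conf/php/$id.conf    $prefix/etc/php/7.0/fpm/pool.d/$id.conf
```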

Transition

Since all MOSH servers have /home as a single partition, moving web site and user directories around should take minimal disk i/o.

As part of the transition, we could both:

  • perform a sed replacement for any instance of the old path with the new path (so php configuration files are properly updated)
  • leave a symlink from the old location to the new location in case anything was missed
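The two steps above might look like this for a single site (the member name mfpl and hosting-order id 1234 are hypothetical, and a scratch prefix stands in for the real /home):

```shell
#!/bin/sh
set -e
rm -rf /tmp/transition-demo
base=/tmp/transition-demo/home   # stands in for /home
old=$base/members/mfpl/sites/1234
new=$base/members/sites/1234
mkdir -p "$old" "$(dirname "$new")"
echo "root = $old/web" > "$old/settings.php"

# Move the site, rewrite any stored references to the old path (e.g. in
# php configuration files), and leave a symlink in case anything was missed.
mv "$old" "$new"
grep -rl "$old" "$new" | xargs -r sed -i "s|$old|$new|g"
ln -s "$new" "$old"
```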

Goals

Once completed, this step would achieve the goal of making it easier to move users and web sites between memberships, since the only record of which membership a user or web site belongs to would be in the control panel database.

Phase three: eliminate the hosting order

The hosting order would be renamed "web site" (and the hosting-order-id in the path above would simply be renamed web-site-id in the database). We would still keep the following items grouped together:

  • web configuration
  • web application
  • scheduled job (cron job)

However, the following items would be moved up to the member level:

  • Email address
  • Email list
  • High volume email list
  • DNS
  • mysql database (databases would move to a network database server model)
  • mysql username

We would eliminate Hosting order access - we would no longer support access control based on a web site.
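In control panel database terms, the change might look roughly like this (the table and column names are hypothetical; the real schema may differ):

```sql
-- Hosting orders become web sites.
ALTER TABLE hosting_order RENAME TO web_site;

-- Items that move to the member level get a direct member id,
-- copied from the hosting order they used to belong to.
ALTER TABLE email_address ADD COLUMN member_id INT;
UPDATE email_address e
  JOIN web_site w ON e.hosting_order_id = w.id
   SET e.member_id = w.member_id;
ALTER TABLE email_address DROP COLUMN hosting_order_id;
```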

Goals

With this step done, we would no longer have hosting orders.

Phase four: central authentication and authorization

The last phase before deploying containers is to replace /etc/passwd, /etc/shadow, and /etc/group with a central authentication/authorization system (NIS with ldap or mysql).

With this step in place, every MOSH would have every login available on it.
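For example, with an LDAP-backed directory, each MOSH's name service switch would consult the central service after the local files (the choice of ldap below is illustrative; an NIS or MySQL backend would be configured analogously):

```
# /etc/nsswitch.conf (relevant lines only)
passwd: files ldap
group:  files ldap
shadow: files ldap
```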

Now, moving a web site between MOSH'es would require:

  • Create symlinks on the target MOSH (home directories, web config, php config, databases) and reload the necessary services, so everything is available on both servers at the same time. NOTE: the web files can live on both servers simultaneously, but the database has to be removed from one server before it is started on the other to avoid data corruption.
  • Update DNS (or, preferably, the internal routing mechanism, so the switch can be instant).
  • Remove the symlinks from the old MOSH.

Phase five: network file system

Once we have a network file system available, the various network drives would be mounted in /media, e.g. /media/fs/<id>/{users,sites,databases}.

We could transition each MOSH on a case-by-case basis by simply moving all data from /home/members to their respective locations on the network file system and then replacing the directories in /home/members with symlinks to /media/fs.
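The per-MOSH switch is essentially a move plus a symlink. The sketch below uses a scratch prefix and a hypothetical file system id (7) so it can be run anywhere; on a real MOSH the prefix would be empty and /media/fs/7 would be a network mount:

```shell
#!/bin/sh
set -e
rm -rf /tmp/nfs-demo
p=/tmp/nfs-demo   # stands in for /
mkdir -p $p/home/members/users $p/media/fs/7
echo "hello" > $p/home/members/users/alice

# Move the data onto the (mounted) network file system, then replace the
# old directory with a symlink so existing paths keep working.
mv $p/home/members/users $p/media/fs/7/users
ln -s $p/media/fs/7/users $p/home/members/users
```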

Depending on the network file system chosen, we could store MySQL databases on it as well, maintaining symlinks in /var/lib/mysql.

Phase six: containers and beyond

Once these pieces are in place, switching to a container-based system is relatively simple, as user and web directories can easily be mounted inside a container.