= Understanding why a MOSH is under heavy load =

When a MOSH is under heavy load, the cause is often a single web site or user getting slammed.

Here are some tricks to find that user.

== Resource Hog ==

The `mf-resource-hog` script outputs the top CPU users per minute, going back between one and two days.

You can modify its behavior in a number of ways:

* `mf-resource-hog cpu`: outputs CPU usage. This is the default.
* `mf-resource-hog disk-read`: outputs disk reads in kB per user.
* `mf-resource-hog disk-write`: outputs disk writes in kB per user.

The second argument specifies an interval:

* `mf-resource-hog cpu minute`: outputs usage on a per-minute basis. This is the default.
* `mf-resource-hog cpu hour`: outputs usage on a per-hour basis.
* `mf-resource-hog cpu day`: outputs usage on a per-day basis.
* `mf-resource-hog cpu month`: outputs usage on a per-month basis.

If you want real-time continuous output, you can run the collector script:

`mf-resource-hog-collector [cpu|disk-read|disk-write]`

You can also read `man pidstat` to learn about the underlying tool that generates this data.

=== How does it work? ===

There are three scripts in all:

==== mf-resource-hog-collector ====

When run from a cron job (''not'' a terminal), `mf-resource-hog-collector` runs `pidstat` or `pidstat -d`, depending on whether `cpu` or `disk-read`/`disk-write` is passed. It runs for 60 seconds, then totals the usage statistics for each user in the output and writes the result to a file in `/var/log/mfpl/resource-hog/minute`.

When run from a terminal, it writes the data to standard output instead (and limits output to the top 4 users).

This script runs every minute.
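
The per-user totaling step can be illustrated with a minimal sketch. It is not the collector's actual code: the sample data is made up, and the two-column `user value` layout is a simplifying assumption (real `pidstat` output has a header and more columns).

```shell
#!/bin/sh
# Use the C locale so number formatting and sorting are predictable.
export LC_ALL=C

# Hypothetical, simplified sample: one line per process, "user %CPU".
sample='alice 12.5
bob 3.0
alice 7.5
carol 40.0
dave 1.0
bob 2.0
erin 0.5'

# Sum the usage column per user, sort by total (highest first),
# and keep only the top 4 users.
top_users=$(printf '%s\n' "$sample" |
    awk '{ total[$1] += $2 }
         END { for (u in total) printf "%s %.1f\n", u, total[u] }' |
    sort -k2,2nr | head -n 4)

printf '%s\n' "$top_users"
```

With this sample, the users are listed in the order carol, alice, bob, dave; erin falls outside the top 4.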

==== mf-resource-hog-consolidator ====

`mf-resource-hog-consolidator` runs once an hour. On every run, it averages the totals from `/var/log/mfpl/resource-hog/minute` and places them in `/var/log/mfpl/resource-hog/hour`.

At midnight, it averages the totals from `/var/log/mfpl/resource-hog/hour` and places them in `/var/log/mfpl/resource-hog/day`. It also deletes old files that have already been averaged.

On the first of the month, it averages the totals from `/var/log/mfpl/resource-hog/day` and places them in `/var/log/mfpl/resource-hog/month`.
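
The averaging step might look roughly like the sketch below. It rests on several assumptions that are not documented above: each sample is stored in its own file, each file holds `user total` lines, and the file names (`cpu-0901` and so on) are invented for the demo. A temporary directory stands in for `/var/log/mfpl/resource-hog`.

```shell
#!/bin/sh
export LC_ALL=C

# Temporary stand-in for the real log directory (assumption for the demo).
base=$(mktemp -d)
mkdir -p "$base/minute" "$base/hour"

# Three made-up per-minute files, one "user total" pair per line.
printf 'alice 30\nbob 6\n' > "$base/minute/cpu-0901"
printf 'alice 60\nbob 3\n' > "$base/minute/cpu-0902"
printf 'alice 30\n'        > "$base/minute/cpu-0903"

# Average each user's totals over the number of files read.
# A user missing from a file counts as zero for that minute.
awk 'FNR == 1 { files++ }
     { sum[$1] += $2 }
     END { for (u in sum) printf "%s %.1f\n", u, sum[u] / files }' \
    "$base"/minute/cpu-* | sort -k2,2nr > "$base/hour/cpu"

hour_avg=$(cat "$base/hour/cpu")
printf '%s\n' "$hour_avg"

# Remove the demo directory.
rm -rf "$base"
```

Dividing by the number of files rather than by the number of lines seen per user means that a user who was active for only one minute is not reported as if they were busy for the whole hour.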

==== mf-resource-hog ====

The `mf-resource-hog` script is the main user script. It simply displays the output of `head -n4 /var/log/mfpl/resource-hog/[interval]/[resource]`.
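
As an illustration only, the behavior described above could be sketched as a small shell function. The function name `show_hog`, the `HOG_BASE` override, and the error message are hypothetical and exist only so the sketch can run against made-up data; the defaults (`cpu`, `minute`) follow the argument descriptions earlier on this page.

```shell
#!/bin/sh
# show_hog [RESOURCE] [INTERVAL]: hypothetical stand-in for mf-resource-hog.
# HOG_BASE is an assumed override so the sketch works without the real logs.
show_hog() {
    resource=${1:-cpu}        # default resource
    interval=${2:-minute}     # default interval
    file="${HOG_BASE:-/var/log/mfpl/resource-hog}/$interval/$resource"
    if [ ! -r "$file" ]; then
        echo "no data for $resource/$interval" >&2
        return 1
    fi
    head -n4 "$file"
}

# Demo against made-up data in a temporary directory.
HOG_BASE=$(mktemp -d)
mkdir -p "$HOG_BASE/minute"
printf 'carol 40.0\nalice 20.0\nbob 5.0\ndave 1.0\nerin 0.5\n' \
    > "$HOG_BASE/minute/cpu"

demo_out=$(show_hog)
printf '%s\n' "$demo_out"

# Remove the demo directory.
rm -rf "$HOG_BASE"
```

Called with no arguments, the function reads the `minute/cpu` file and prints its first four lines; `show_hog disk-write hour` would read `hour/disk-write` instead.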

== sysstat ==

Another useful tool enabled on all MOSHes is `sysstat`. It provides a recent history of resource usage on the server, which helps you determine whether a resource constraint is recent or has been ongoing for some time.

It collects data via a cron job that runs `sa1`. To view today's data, run:

{{{
sar
}}}

To see the data from yesterday:

{{{
sar -1
}}}
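
To narrow the history to a particular resource or time window, `sar` accepts further options. The sketch below wraps a few of them in a function; the function name `sar_yesterday` and the 09:00 to 10:00 window are arbitrary choices for illustration, and the exact columns shown depend on the installed `sysstat` version.

```shell
#!/bin/sh
# sar_yesterday: print yesterday's history for a few resources.
# The function name and the time window are illustrative assumptions.
sar_yesterday() {
    if ! command -v sar >/dev/null 2>&1; then
        echo "sar not found; is sysstat installed?" >&2
        return 1
    fi
    sar -u -1    # CPU utilization
    sar -r -1    # memory utilization
    sar -q -1    # run queue length and load averages
    # The same CPU report, limited to a time window (hh:mm:ss).
    sar -u -1 -s 09:00:00 -e 10:00:00
}
```

Run `sar_yesterday` after defining it. Comparing the window of a known slowdown against the rest of the day is a quick way to see whether the load is a spike or the normal state of the server.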

The `sysstat` commands won't break down resource usage on a per-user basis.