Changes between Version 5 and Version 6 of how-to/servers/puppet/setup-nagios-monitor
Timestamp: Dec 15, 2016, 2:29:05 PM
how-to/servers/puppet/setup-nagios-monitor
--- v5
+++ v6
 == Creating a nagios monitor ==
-This page explains how to create a nagios monitor configuration in [wiki:how-to/puppet puppet]. You will need to make changes to get the correct configuration for the specific type of monitoring needed.

-=== Set up an executable ===
-These files are stored in puppet/modules/mayfirst/files/monitor-utils/ you can
-find examples of different versions of monitoring scripts there.
+=== Overview ===
+
+Our nagios server is jojobe.mayfirst.org (which is available via https as https://monitor.mayfirst.org/).
+
+We monitor network accessible services (like https, smtp, imap, etc) the standard way: Our nagios server periodically tries to connect over the network to these services on each server to see if they are still running properly.
+
+In addition, it's useful to check services that are not publicly accessible over the network - such as disk usage, or whether MySQL (which only listens on a local port) is still running.
+
+The standard way to setup nagios alerts for local services like these is to have every server run a nagios service open on a public IP address. Then, the nagios server connects on this port and queries all the local services.
+
+We do it differently because we want to avoid running another publicly accessible service on every server.
+
+We run local scripts on each server via cron jobs and then use SCP to copy the output to our nagios server every hour. Then the nagios server checks the output of the scripts to decide whether to throw an alert or not.
+
+=== Set up a script to run locally ===
+
+These files are stored in puppet/modules/mayfirst/files/monitor-utils/ you can find examples of different versions of monitoring scripts there. Each script should generate output that starts with either OK:, WARNING: or CRITICAL:. It then sends this output via standard in to the script mf-monitor-output, with the '''type''' of check as the first argument, for example:
+
+{{{
+echo "Warning: /root partition at 80%" | mf-monitor-output df
+}}}
+
+In this example "df" is the '''type''' of check being run.
+
+The `mf-monitor-output` script is smart enough to detect if it is being run via a terminal (e.g. by an admin) and if so, it prints the output to standard out so you can read it. On the other hand, if there is no terminal (e.g. cron job), it writes the output to /var/log/mfpl/monitor/$(hostname).$(type).txt, which then gets copied to jojobe.
+
+If you want to add a check, review the existing files for an example.

 === Set up a cronjob ===
-projects/puppet/modules/mayfirst/templates/monitor-utils/cron.d/mf-monitor

-You'll need to add your script to this directory.
+Your script won't get called unless it is included in the cron job. You'll need to edit:
+
+{{{
+puppet/modules/mayfirst/templates/monitor-utils/cron.d/mf-monitor
+}}}
+
+And add your script.

 === Add to utils.pp ===
+
+You also need to ensure your script gets copied.
+
+Modify:
+
+{{{
 puppet/modules/mayfirst/manifests/utils.pp
+}}}

 The code should look something like this, with the correct file from the executable specified.:

 {{{
-file { "/usr/local/sbin/mf-monitor-mailq":
-  source => "puppet:///modules/mayfirst/monitor-utils/mf-monitor-mailq",
-  ensure => present,
-  mode => 755,
-  owner => "root",
-  group => "root"
+file { "/usr/local/sbin/mf-monitor-df":
+  source => "puppet:///modules/mayfirst/monitor-utils/mf-monitor-df",
 }
 }}}

 === Define hostgroup ===
+
+Next, we have to define a host group - this is a group of servers that will use this check.
+
+See:
+
+{{{
 projects/puppet/modules/mayfirst/files/nagios/nagios3/conf.d/
+}}}

 This code section should look something like this:
 …
 {{{
 define hostgroup {
-  hostgroup_name mailq-servers
-  alias MailCheck Servers
+  hostgroup_name df-servers
+  alias File System Check Servers
 }
 }}}

-=== Define nagios command ===
-puppet/modules/mayfirst/files/nagios/nagios3/commands.cfg
+=== Add the check as a service ===

-Should look like this:
-
-{{{
-define command{
-  command_name check-upgrade
-  command_line /usr/local/share/nagios/plugins/mf-nagios-check-upgrade '$HOSTNAME$'
-}
-}}}
-=== Create parsing script ===
-You will also need to create a script that parses the output of the monitoring
-files.
-
-'''puppet/modules/mayfirst/files/nagios/nagios-plugins/plugins/mf-SCRIPT-NAME'''
-
-You can model scripts that already exist to check this.
-
-=== Add the check as a service ===
 The service part of the infrastructure is the display component for nagios.

-'''puppet/modules/mayfirst/files/nagios/nagios3/conf.d/services_nagios2.cfg'''
+See:
+{{{
+puppet/modules/mayfirst/files/nagios/nagios3/conf.d/services_nagios2.cfg
+}}}

 Copy a pre-existing stanza and make the necessary changes. It will look something like this:
 …
 {{{
 define service{
-  hostgroup_name upgrade-servers
-  service_description Upgrade
-  check_command check-upgrade
+  hostgroup_name df-servers
+  service_description DF
+  check_command mf-checker!df
   notification_interval 0
   use generic-service
 …
 }}}

+NOTE: the ! separates the command from the first argument ('''type''' of check). In this example "df" is the service being checked. Replace "df" with the service you created.
+
 === Finally add the hostgroup to nagios manifest ===
-This is not a mandatory step. If the monitor should be run on all servers, then add the hostgroup/service here. Otherwise leave it out, but be sure to include in the monitor script a line that specifies under what context the script should be run. For example (from mf-monitor-fcgid):
+
+This is not a mandatory step. If the monitor should be run on all servers, then add the hostgroup/service here.
+
+Otherwise leave it out and instead add it to the individual server's .pp file.

 {{{
-# Only run if fcgid is installed
-[ ! -e "/etc/apache2/mods-enabled/fcgid.conf" ] && exit 0
+projects/puppet/modules/mayfirst/manifests/nagios.pp
 }}}

-'''projects/puppet/modules/mayfirst/manifests/nagios.pp'''
-
-One example for standard_hostgroups is:
 {{{
 if ( $include_standard_hostgroups == true ) {
 …
 }}}

-This is from 'define m_nagios_host'.
-
-'''Make sure all executable scripts have execute permissions'''
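
As a rough illustration of the "Set up a script to run locally" step in Version 6, a df-style check might be sketched as below. This is not one of the scripts in monitor-utils: the function names `df_status` and `current_pct` and the 80/90 thresholds are assumptions made for the example. Only the OK:/WARNING:/CRITICAL: prefixes and the pipe to `mf-monitor-output` with the check type come from the page itself.

```shell
#!/bin/sh
# Hypothetical sketch of a local monitoring check; not an actual
# monitor-utils script.

# Turn a disk-usage percentage into a status line with one of the
# three prefixes the page requires.
# The 80 and 90 thresholds are assumed values for illustration only.
df_status() {
    pct="$1"
    if [ "$pct" -ge 90 ]; then
        echo "CRITICAL: / partition at ${pct}%"
    elif [ "$pct" -ge 80 ]; then
        echo "WARNING: / partition at ${pct}%"
    else
        echo "OK: / partition at ${pct}%"
    fi
}

# Read the current usage of / from POSIX-format df output
# (second row, fifth column, with the trailing % removed).
current_pct() {
    df -P / | awk 'NR == 2 { sub("%", "", $5); print $5 }'
}

# On a server, the status line would be handed to mf-monitor-output
# with the check type as the first argument, as the page describes:
#   df_status "$(current_pct)" | mf-monitor-output df
```

The threshold logic lives in its own function so it can be exercised by hand with a fixed number (for example `df_status 85`) without touching the real filesystem or needing `mf-monitor-output` installed.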