Version 11 (modified by Daniel Kahn Gillmor, 9 years ago)


Puppet layout and deployment


Traditionally, puppet uses a puppet master: a dedicated server that holds a copy of the puppet configuration files. Each node runs a puppet process as root that contacts the master, receives the catalog specific to that node, and then applies the catalog.

MF/PL does not use a puppet master. Instead of having a single server with the puppet configuration files, we have distributed the configuration files to every server (aka node) on our network via git, placing the files in /etc/puppet.

Each node runs a cron job that does a git remote update against git:// It checks for the most recent git tag that is signed by a member of the MFPL support team, merges that tag, and then runs puppet.

Alternatively, MFPL admins can push changes into a puppet bare repo on each node, which will run a post-update hook that pulls the changes into /etc/puppet and runs puppet.
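As an illustration of the cron-driven update described above, the job could itself be expressed as a puppet cron resource. This is only a sketch: the resource title, the script path and the schedule are hypothetical, and nothing here says whether MFPL actually manages the job this way.

```puppet
# Hypothetical sketch: the script path and the 30-minute schedule are
# assumptions, not taken from the MFPL repository.
cron { "freepuppet-update":
  # Assumed wrapper script that fetches from the remote, finds the newest
  # signed tag, merges it and then runs puppet.
  command => "/usr/local/sbin/freepuppet-update",
  user    => "root",
  minute  => "*/30",
}
```

Keeping the fetch, signature check and puppet run in one script means the cron entry stays a single line and the verification logic lives in one place.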


Below is the layout of our puppet git repository.

helper/: holds code for running operations on our servers from the command line using the freepuppet command. For a list of options run:
helper/freepuppet-helper -l
manifests/: this is where information about each MFPL server is stored, plus some general information specific to our servers
  • site.pp: this is the file that bootstraps all other files
  • globals.pp: global variables specific to May First/People Link, used throughout
  • modules.pp: one line to import every puppet module that we use. We only use one module: mayfirst
  • nodes/: this directory contains the files for each server in our network
    • production/: one file for every production server in our network
    • dev/: one file for every test/dev server
modules/: modules are abstracted collections of manifests, templates and files. All of our puppet code is in the mayfirst module.
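To make the role of site.pp more concrete, here is a rough sketch of how a bootstrap file could pull in the other files listed above. It assumes the import keyword found in older Puppet releases (it was deprecated and later removed in newer ones). The glob patterns are an assumption; the real site.pp may differ.

```puppet
# site.pp (hypothetical sketch, not the actual MFPL file)

# Global variables used throughout the manifests.
import "globals.pp"

# Module imports (per the layout above, only the mayfirst module).
import "modules.pp"

# One file per server; the glob patterns are assumed.
import "nodes/production/*.pp"
import "nodes/dev/*.pp"
```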

Puppet Resources

Puppet is fully documented.

However, we only use a small subset of puppet features. Below is a very brief overview.


A file resource defines a file that should be created on the server:

file { "/etc/aliases":
  ensure => present,
  content => template("mayfirst/postfix/aliases.erb"),
}

This file resource ensures that a file located at /etc/aliases on the target server will be created. The content of the file will be based on its template (this template contains variables that will be dynamically filled in).
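A file resource can also manage ownership and permissions, and can take literal content instead of a template. The sketch below uses a hypothetical path and content purely for illustration; it is not taken from the mayfirst module.

```puppet
# Hypothetical example: the path and content are made up for illustration.
file { "/etc/example.conf":
  ensure  => present,
  owner   => "root",
  group   => "root",
  mode    => "0644",
  # Literal content rather than a template.
  content => "managed by puppet\n",
}
```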


An exec resource defines a command that will be run every time puppet is run:

exec { "gpg-ssh-genkey-$user":
  command => "ssh-keygen  -t rsa -b '$length' -f '$homedir/.ssh/id_rsa' -N ''",
  environment => "HOME=$homedir",
  require =>  [ Package["gnupg"], Package["openssh-client"] ],
  user => $user,
  unless => "test -f '$homedir/.ssh/id_rsa'"
}

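The unless clause provides a test to see if the command should be run: the command runs only when the test fails (exits non-zero). Here, the key is generated only if it does not already exist.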
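When the test is simply "does this file exist", the exec type's creates attribute expresses the same guard without a shell test. The sketch below is a variation on the example above, not the code the mayfirst module uses; the three variable values are hypothetical placeholders.

```puppet
# Hypothetical values so the sketch stands on its own.
$user    = "example"
$homedir = "/home/example"
$length  = "4096"

exec { "ssh-genkey-$user":
  # Full path to the binary, so no path attribute is needed.
  command => "/usr/bin/ssh-keygen -t rsa -b '$length' -f '$homedir/.ssh/id_rsa' -N ''",
  user    => $user,
  # The command is skipped whenever this file already exists.
  creates => "$homedir/.ssh/id_rsa",
}
```

unless remains the more general tool, since it accepts any command as the test, while creates only checks for a file.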


A define statement is like a function. It allows you to define a set of things to happen every time it is called.

Here's how a define is, well, defined:

define m_gpg::publish_user_key ( $keyserver = '' ){
  $user = $title

  $keyserver_arg = $keyserver ? {
    '' => '',
    default => "--keyserver $keyserver"
  }

  exec { "gpg-send-key-$user":
    command => "gpg $keyserver_arg --send-key $(gpg --list-secret-key --with-colons | grep ^sec | cut -d: -f5)",
    require => [ Package["gnupg"], Exec["gpg-pem2openpgp-$user" ] ],
    user => $user,
  }
}

And here's how it is called:

m_gpg::publish_user_key { "root": keyserver => $mfpl_keyserver }

You can call the same define as many times as you want, as long as each call uses a different title.
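A minimal sketch of a define being called more than once is shown below. The define name, its parameter and the file paths are all hypothetical and exist only to illustrate the pattern; each call needs its own title.

```puppet
# Hypothetical define: writes one small file per call.
define example_note ( $text = "" ) {
  # $title is whatever appears before the colon in each call.
  file { "/tmp/example-note-$title":
    ensure  => present,
    content => "$text\n",
  }
}

# Two calls with different titles create two separate resources.
example_note { "first":  text => "first note" }
example_note { "second": text => "second note" }
```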


A class is a bigger collection of resources that can only be declared once per node.

Here's an example of a simple class:

class m_resolv ( $caching_dns_ips = "", $location ) {
  $ips = $caching_dns_ips ? {
    "" => $location ? {
      telehouse => [ "", "" ],
      xo => [ "", "" ],
      sunsetpark => [ "", "" ],
      default => [ "", "" ]
    },
    default => $caching_dns_ips
  }
  file { "/etc/resolv.conf":
    ensure => present,
    content => template("mayfirst/resolv/resolv.conf.erb")
  }
}

And here is the class being called:

class { "m_resolv":
    caching_dns_ips => [ "", "" ],
    location => "telehouse"
}
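The m_resolv class picks its value with a selector (the ? { ... } construct), which returns the value of the first matching case. The standalone sketch below shows the same construct in isolation; the variable names and strings are hypothetical.

```puppet
# Hypothetical input value.
$location = "xo"

# The selector returns the value of the matching case, or the default.
$resolver_label = $location ? {
  "telehouse" => "telehouse resolvers",
  "xo"        => "xo resolvers",
  default     => "fallback resolvers",
}

# Prints a message during the puppet run showing which case matched.
notify { "using $resolver_label": }
```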


The node is the basic building block for a server. Each individual server has a corresponding node. The node invokes everything, pulling it all together:

node ""  {

  $namesake = ""
  $purpose = "Offsite backup server for both MFPL rack machines and members who backup their office server; web interface to backup files"

  class { "m_sshd::default": }
  class { "m_minimal":
    location => "sunsetpark",
    backup_includes => [ "/var/log", "/etc", "/root", "/usr/local"  ],
    backup_rsync_target => "",
    caching_dns_ips => [ "", "" ]
  }
  class { "m_esmtp": }
  m_monkeysphere::publish_server_keys { ali: }
  m_gpg::publish_user_key { "root": keyserver => $mfpl_keyserver }

  class { "m_backupninja::server": }
}

$namesake and $purpose are not used by puppet - they are there for human readability.

Variable Scoping

Global variables are defined in puppet/manifests/globals.pp and have an mfpl_ prefix. They are accessible anywhere in the code, provided you use the top-scope double colon syntax ($::), e.g. $::mfpl_admin_user_ids.

All other variables should be properly scoped.
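The difference between an unqualified name and the $:: form can be seen in a small sketch. The variable, its values and the class name are hypothetical stand-ins for what globals.pp defines.

```puppet
# Hypothetical global, standing in for a variable from globals.pp.
$mfpl_example_contact = "support@example.org"

class m_example_scope {
  # A local variable with the same name shadows the global in this class.
  $mfpl_example_contact = "local@example.org"

  notify { "local value":
    message => "unqualified: ${mfpl_example_contact}",
  }
  notify { "global value":
    message => "top scope: ${::mfpl_example_contact}",
  }
}

include m_example_scope
```

Using $:: everywhere for the mfpl_ globals avoids surprises when a class or define happens to reuse one of those names locally.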