The async Puppet pattern

Mattias Geniar, May 17, 2016

I’m pretty sure this isn’t tied to Puppet and is probably widely used by everyone else, but it only occurred to me recently what the structural benefits of this pattern are.

Async Puppet: stop fixing things in one Puppet run

This has always been a bit of a debated topic, both for me internally as well as in the Puppet community at large: should a Puppet run be 100% complete after the first run?

I’m starting to back away from that idea, having spent countless hours optimising my Puppet code to have the “one-puppet-run-to-rule-them-all” scenario. It’s much easier to gradually build your Puppet logic in steps, each step activating once the previous one has caused its final state to be set.

Where I see this scenario shine most is in automatically adding monitoring from within your Puppet code. There’s support for Nagios out of the box and I contributed to the zabbixapi Ruby gem to facilitate managing Zabbix hosts and templates from within Puppet.

Monitoring should only be added to a server when there’s something to monitor. And there’s only something to monitor once Puppet has done its thing and caused state on the server to be as expected.

Custom facts for async behaviour

So here’s a pattern I particularly like. There are many alternatives to this one, but it’s simple, straightforward and super easy to understand – even for beginning Puppeteers.

  1. A first Puppet run starts and installs Apache with all its vhosts
  2. The second Puppet run starts and gets a fact called “apache_vhost_count”, a simple integer that counts the number of vhosts configured
  3. When that fact is a positive integer (aka: there are vhosts configured), monitoring is added

This pattern takes 2 Puppet runs to be completely done: the first gets everything up-and-running, the second detects that there are things up-and-running and adds the monitoring.
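
The timing above can be modelled in a few lines of plain Ruby. This is a toy simulation, not Puppet code: the `puppet_run` method, the `server` hash and the vhost filename are all made up for illustration. The only thing it relies on is that facts are gathered before the catalog is applied, so a run only sees the state the previous run left behind.

```ruby
# Toy model of consecutive Puppet runs; nothing here talks to Puppet itself.
def puppet_run(server)
  # Fact gathering happens first and reflects the state BEFORE this run.
  facts = { apache_vhost_count: server[:vhosts].size }
  applied = []

  # "Install Apache with all its vhosts" when nothing is there yet.
  if server[:vhosts].empty?
    server[:vhosts] = ['vhost-example.conf'] # hypothetical vhost file
    applied << :apache
  end

  # Monitoring is only added once the fact reports at least one vhost.
  if facts[:apache_vhost_count] > 0 && !server[:monitored]
    server[:monitored] = true
    applied << :monitoring
  end

  applied
end

server = { vhosts: [], monitored: false }
first  = puppet_run(server) # => [:apache]
second = puppet_run(server) # => [:monitoring]
third  = puppet_run(server) # => []
```

The third run changes nothing, which is the state you want to end up in: two runs to converge, and every run after that is a no-op.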

Monitoring wrappers around existing Puppet modules

You’ve probably done this: you get a cool module from the Forge (Apache, MySQL, Redis, …), you implement it and want to add your monitoring to it. But how? It’s not cool to hack away in the modules themselves, since those come in via r10k or librarian-puppet and your changes would be overwritten.

Here’s my take on it:

  1. Create a new module, call it “monitoring”
  2. Add custom facts in there, called has_mysql, has_apache, … for all the services you want
  3. If you want to go further, create facts like apache_vhost_count, mysql_databases_count, … to count the specific instances of each service, to determine if it’s being used or not.
  4. Use those facts to determine whether to add monitoring or not:

<pre>if $::has_apache and ($::apache_vhost_count > 0) {
  @@zabbix_template_link { "zbx_application_apache_${::fqdn}":
    ensure   => present,
    template => 'Application - PHP-FPM',
    host     => $::fqdn,
    require  => Zabbix_host[$::fqdn],
  }
}</pre>

Is this perfect? Far from it. But it's pragmatic and it gets the job done.

The facts are easy to write and understand, too.

<pre>Facter.add(:apache_vhost_count) do
  confine :kernel => :linux
  setcode do
    if File.exist? "/etc/httpd/conf.d/"
      # Count the directory entries whose listing contains "vhost-"
      Facter::Util::Resolution.exec("ls -l /etc/httpd/conf.d | grep 'vhost-' | wc -l")
    else
      nil
    end
  end
end</pre>

It's mostly bash (which most sysadmins understand) -- and very little Ruby (which few sysadmins understand).
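
If you'd rather not shell out at all, the same count can be done in plain Ruby. This is a minimal sketch, not a replacement I've battle-tested: the helper name `count_vhost_files` is made up for illustration, and it assumes your vhost files carry a `vhost-` prefix in their filename (the `grep` version matches that substring anywhere in the `ls -l` output). It also returns an integer rather than the string the shell pipeline produces.

```ruby
# Hypothetical helper (not part of Facter): count the regular files in
# conf_dir whose name starts with "vhost-". Returns nil when the directory
# is missing, so the fact stays unset on servers without Apache.
def count_vhost_files(conf_dir)
  return nil unless File.directory?(conf_dir)

  Dir.glob(File.join(conf_dir, 'vhost-*')).count { |path| File.file?(path) }
end

# Only register the fact when this file is actually loaded by Facter.
if defined?(Facter)
  Facter.add(:apache_vhost_count) do
    confine :kernel => :linux
    setcode { count_vhost_files('/etc/httpd/conf.d') }
  end
end
```

Keeping the counting logic in a separate method means you can try it against any directory from `irb` without having Facter around.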

The biggest benefit I see to it is that whoever implements the modules and creates the server manifests doesn't have to toggle a parameter called `enable_monitoring` (been there, done that) to decide whether or not that particular service should be monitored. Puppet can now figure that out on its own.

## Detecting Puppet-managed services

Since some services get installed purely as dependencies, the custom facts need to be clever enough to understand when a service is actually being managed by Puppet. For instance, when you install the package "httpd-tools" because it contains the useful `htpasswd` tool, most package managers will automatically install the "httpd" (Apache) package, too.

Having that package present shouldn't trigger your custom facts to automatically enable monitoring. That should probably only happen when the service is being managed by Puppet.

A very simple workaround (up for debate whether it's a good one) is to have each Puppet module write a simple marker file, named after the service, into the `/etc/puppet-managed` directory.

<pre>$ ls /etc/puppet-managed
apache  mysql  php  postfix  …</pre>

Now you can extend your custom facts to check for the presence of that file and determine A) whether a service is Puppet managed and B) whether monitoring should be added.

<pre>Facter.add(:has_apache) do
  confine :kernel => :linux
  setcode do
    if File.exist? "/sbin/httpd"
      if File.exist? "/etc/puppet-managed/apache"
        # Apache installed and Puppet managed
        true
      else
        # Apache is installed, but isn't Puppet managed
        nil
      end
    else
      # Apache isn't installed
      nil
    end
  end
end</pre>

(example explicitly split up in order to add comments)
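
Once you have more than a couple of services, the same check can be generated in a loop instead of copy-pasted per fact. The sketch below is one way to do that, under a few assumptions: the `puppet_managed?` helper is a name I made up, and the `mysqld` path is only a guess at a typical location that you should verify for your distribution. Only the Apache path comes from the fact above.

```ruby
# Hypothetical helper: a service counts as "Puppet managed" only when its
# binary exists AND a marker file with its name sits in marker_dir.
def puppet_managed?(name, binary, marker_dir = '/etc/puppet-managed')
  File.exist?(binary) && File.exist?(File.join(marker_dir, name))
end

# Only register facts when this file is actually loaded by Facter.
if defined?(Facter)
  services = {
    'apache' => '/sbin/httpd',
    'mysql'  => '/usr/sbin/mysqld', # assumed path, check your distribution
  }

  services.each do |name, binary|
    Facter.add("has_#{name}") do
      confine :kernel => :linux
      # Return nil instead of false so the fact is simply absent when
      # the service isn't there or isn't Puppet managed.
      setcode { puppet_managed?(name, binary) ? true : nil }
    end
  end
end
```

Adding monitoring for a new service then comes down to one extra line in the hash, plus the marker file written by that service's module.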

You may also be tempted to use the `defined()` ([manual][2]) function, to check if Apache has been defined in your Puppet code and then add monitoring. However, that's dependent on the resource order in which it's evaluated.

Your code may look like this:

<pre>if defined(Service['httpd']) {
  # Apache is managed by Puppet, add monitoring?
}</pre>

Puppet's [manual][2] explains the big caveat though:

> Puppet depends on the configuration’s evaluation order when checking whether a resource is declared.

In other words: if your monitoring code is evaluated _before_ your Apache code, that `defined()` will always return `false`.

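Working with Facter circumvents this: facts are collected on the node before the catalog is compiled, so they don't depend on the order in which your Puppet code is evaluated.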

Again, this pattern isn't perfect, but it allows for a clean separation of logic and -- if your team grows -- an easier way to separate responsibilities for the monitoring team and the implementation team to each have their own modules with their own responsibilities.

