cron.weekly issue #39: Spark, PHP-FPM, Riemann, Ansible, Varnish, PostgreSQL & more!

cron.weekly is a newsletter about Linux, open source & web development. Want to get it in your inbox every Sunday? Subscribe below!

I respect your privacy and you won't get spam. Ever. Just a weekly-ish newsletter about Linux and open source.

Mattias Geniar, July 31, 2016

Welcome to cron.weekly issue #39 for Sunday, July 31st, 2016.

Lots of guides this time with really practical hands-on tips for running your Linux servers.

I also added a new section this time called “videos”. There are many interesting ones, but they take a bit more time to digest, so I decided to list them separately.

You’ll also notice this edition is quite long. I feel it’s a bit too long. Starting next week, I’ll work on keeping only the most interesting articles, making each issue slightly shorter and easier to read.


“Serverless” is just a name. We could have called it “Jeff”

This post explains what “serverless” means and argues that the name chosen to represent that ideology is just that: a name.

Why Uber engineering switched from PostgreSQL to MySQL

Every organisation has different needs, and in Uber’s case PostgreSQL wasn’t cutting it. Mind you, neither was MySQL on its own, so they built a tool on top of MySQL called “Schemaless” that handles the database sharding.

Why do we automate?

This post explains the reasons why we as sysadmins or developers automate certain tasks. It isn’t just to save money, but to gain consistency, reduce friction and have the ability to schedule at will.

Kernel 4.7 released

Linus announced the 4.7 kernel last week. As usual, the KernelNewbies site has all the changes listed nicely.

Debian websites available as Onion services

A nice move from the Debian folks: all their online services are now available over the Tor network as onion addresses.

Tools & Projects

Apache Mesos 1.0

A big release for the Apache Mesos project! Apache Mesos abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to easily be built and run effectively.

Apache Spark 2.0

A whole new internal API, many performance improvements and a lot of bugfixes. Apache Spark is a fast, in-memory data processing engine to efficiently execute streaming, machine learning or SQL workloads, often on top of Hadoop.


Molecule

Molecule is designed to aid in the development and testing of Ansible roles. It supports multiple instances, operating system distributions, virtualization providers and test frameworks by using Vagrant or Docker instances.

Flynn 1.0

Flynn is a “next generation” open source platform as a service (PaaS). Flynn is designed to run anything that can run on Linux, not just stateless web apps. Flynn comes with highly available database appliances, including PostgreSQL, MySQL, and MongoDB.


DebOps

DebOps is a collection of Ansible playbooks, scalable from one container to an entire data center.


DebOps for WordPress

Following the same trend, “DebOps for WordPress” is a tool that gives anyone in the WordPress community access to a fast and secure WordPress server. It’s meant to be easy to use and to require little to no system administrator knowledge.


Ack

Ack calls itself “beyond grep”: it’s a blazing fast alternative to grep with better search possibilities, “designed for code”.

OpenVZ 7.0

The OpenVZ project announced a big new release last week. The biggest changes are the replacement of their own hypervisor with KVM and support for kernel 3.10 (CentOS/RHEL 7).

Guides & Tutorials

What you have to know about Consul and how to beat the outage problem

If you’re using Consul for service discovery, this is an interesting read. It shows how Consul deals with failure, explains the terminology and describes how to recover a cluster in which no leader can be elected.

Bash Keyboard Shortcuts

Did you know there are a ton of shortcuts you can use in Bash to make you more efficient? From quickly moving the cursor to editing commands to manipulating your Bash history!

A better way to run PHP-FPM

This post explains 2 new concepts for running a PHP application as a PHP-FPM pool: the on-demand process manager, and running multiple PHP-FPM masters so that each application gets its own APC or OPcache.

Sed command examples in Linux and Unix – How to use

The ‘sed’ command isn’t the easiest one to get started with, but this post has a lot of good examples on how to use it.

The Art of Monitoring: Introducing Riemann

Riemann is a network monitoring system. This post explains the concepts and terminology and gives a quick overview of what “stream processing” means in terms of monitoring.

Tips for PostgreSQL

For those of us running PostgreSQL instances and running the occasional SQL query, this post can come in handy. A good list of tips for manipulating the psql shell, searching through history, quick copy-shortcuts and more.

A PHP and Docker Workflow

A practical guide on how you can set up a Docker container workflow for a PHP application, with some useful wrappers to make working with the Docker instances easier.

Varnish Agent: an HTML frontend to manage & monitor your varnish installation

If you’ve ever used Varnish – the reverse caching proxy – you’ll know the CLI tools can be complicated. That’s mainly because Varnish is complicated. The Varnish Agent is a useful web frontend you can install to help manipulate your Varnish instance, monitor stats and write custom VCL logic on the fly.

How does MySQL Replication really work?

A slightly older post, but still very relevant: it explains the different MySQL replication formats (statement vs. row based), the I/O thread and the SQL thread, and the concept of replication lag. A must-read for everyone who has ever installed or is thinking of installing a MySQL replication setup.
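To make the statement vs. row based distinction a bit more concrete, here is a toy model in Python. It is not taken from the linked post and it does not reflect MySQL internals (real MySQL has special handling for many non-deterministic cases); the function names and the dictionary "tables" are made up for illustration. It only shows the general idea: a statement-based replica re-executes the statement, while a row-based replica copies the resulting row.

```python
import random


def apply_statement(table, key, rng):
    """Toy 'statement': store a non-deterministic value under key.

    Returns the row image (key, value) that the statement produced.
    """
    table[key] = rng.random()
    return key, table[key]


def replicate_once():
    """Run one statement on a primary and replicate it in both styles."""
    primary, statement_replica, row_replica = {}, {}, {}
    primary_rng = random.Random(1)
    replica_rng = random.Random(2)  # the replica has its own randomness

    # The primary executes the statement and records the row it produced.
    key, value = apply_statement(primary, "row-1", primary_rng)

    # Statement-based style: the replica re-executes the same statement.
    apply_statement(statement_replica, "row-1", replica_rng)

    # Row-based style: the replica applies the row image as-is.
    row_replica[key] = value

    return primary, statement_replica, row_replica


primary, statement_replica, row_replica = replicate_once()
print("row-based replica matches primary:", row_replica == primary)
print("statement-based replica matches primary:", statement_replica == primary)
```

In this simplified model the row-based replica always ends up identical to the primary, while the statement-based one can drift whenever a statement is not deterministic.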

Making your own web debugging proxy

Nginx has a lot of features, and one of the most widely used is its proxy support. This post explains how you can use Nginx to create an open proxy, which is useful in debugging situations. It’s also a good reminder to be careful how you configure your Nginx instances, because a simple misconfiguration can turn them into a web proxy that anyone can use.

Load Balancing Websocket Connections

A really good post if you’re challenged by Websockets: because of their long-lived nature, traditional HTTP load balancing usually doesn’t work very well. It has some good pointers on the Websocket protocol and tips on which layer of the OSI model is best suited for load balancing these connections.
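As a rough illustration of why long-lived connections are awkward for a simple balancer, here is a small Python simulation. It is not from the linked post: the two backends, the traffic pattern and the function names are all assumptions chosen to keep the example tiny. It compares plain round robin with a "fewest active connections" choice when every other connection stays open.

```python
from itertools import cycle


def simulate(choose, backends, connections):
    """Assign each connection with choose(active); return final active counts.

    connections is a list of booleans: True means the connection stays open
    (Websocket-like), False means it closes before the next one arrives.
    """
    active = {name: 0 for name in backends}
    for long_lived in connections:
        backend = choose(active)
        active[backend] += 1
        if not long_lived:
            active[backend] -= 1  # short request: finished right away
    return active


def make_round_robin(backends):
    """Return a chooser that ignores load and just rotates through backends."""
    order = cycle(backends)
    return lambda active: next(order)


def least_connections(active):
    """Pick the backend with the fewest open connections (first one on ties)."""
    return min(active, key=active.get)


backends = ["a", "b"]
pattern = [True, False] * 3  # long-lived, short, long-lived, short, ...

rr = simulate(make_round_robin(backends), backends, pattern)
lc = simulate(least_connections, backends, pattern)
print("round robin:      ", rr)
print("least connections:", lc)
```

With this particular pattern, round robin piles every long-lived connection onto the same backend, while the connection-aware choice spreads them out. Real traffic is messier, so treat this as a sketch of the problem rather than a recommendation.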

Bash Script Templates

A useful set of boilerplate bash scripts you can use as a starting point for creating your own scripts.

Videos


Lesson learnt the hard way: Postgres in Production (video)

Some really good tips given at a PostgreSQL user conference about running it in production. The speaker introduces 5 “failures” they experienced and how each was handled. Very honest talk.

How Netflix Gives all its Engineers SSH Access to Instances Running in Production (video)

Some cool insights into how Netflix operates and handles security, in this case for SSH. Many tips on SSH key management, reducing friction in security, automated scanning, using “bastions” (jump-hosts) for logging and auditing, and more.

The Black Magic Of SSH / SSH Can Do That? (video)

This presentation explains the port forwarding feature of SSH (remote & local), dynamic port forwarding, remote commands and many more cool things you can do with SSH.
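For reference, the forwarding modes mentioned above map onto the standard OpenSSH client options -L (local), -R (remote) and -D (dynamic). The Python helpers below only build the command lines and do not run anything. The helper names, hosts and ports are hypothetical examples, not something taken from the talk.

```python
def local_forward(local_port, target_host, target_port, gateway):
    """ssh -L: listen locally, reach target_host:target_port via gateway."""
    return ["ssh", "-N", "-L", f"{local_port}:{target_host}:{target_port}", gateway]


def remote_forward(remote_port, target_host, target_port, gateway):
    """ssh -R: listen on the gateway, reach target_host:target_port from here."""
    return ["ssh", "-N", "-R", f"{remote_port}:{target_host}:{target_port}", gateway]


def dynamic_forward(local_port, gateway):
    """ssh -D: open a local SOCKS proxy that tunnels through gateway."""
    return ["ssh", "-N", "-D", str(local_port), gateway]


def remote_command(gateway, command):
    """Run a single command on the remote host instead of opening a shell."""
    return ["ssh", gateway, command]


# Hypothetical hosts and ports, purely for illustration.
gateway = "user@bastion.example.com"
for argv in (
    local_forward(8080, "intranet.example.com", 80, gateway),
    remote_forward(9000, "localhost", 3000, gateway),
    dynamic_forward(1080, gateway),
    remote_command(gateway, "uptime"),
):
    print(" ".join(argv))
```

The -N option tells ssh not to run a remote command, which is a common choice when the session exists only to carry forwarded ports.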


LinuxCon Europe

The schedule for LinuxCon Europe is available online and it’s massive. The event takes place in Berlin, Germany, from October 4th to October 6th.
