cron.weekly issue #119: dino, search engines, btrfs, Alpine, json & more


cron.weekly is a newsletter about Linux, open source & webdevelopment. Want to get it in your inbox every Sunday? Subscribe below!

I respect your privacy and you won't get spam. Ever. Just a weekly-ish newsletter about Linux and open source.

Mattias Geniar, February 02, 2020

Hi everyone! 👋

Welcome to cron.weekly issue #119.

Please sit back, grab a coffee or tea and enjoy a good lengthy issue. Plenty of reading material on Intel vulnerabilities, Btrfs vs. ZFS, a truckload of new tools and guides to learn from.

I hope you all have a killer week ahead of you! 💪

News & general 🗞

CacheOut - A new Intel CPU vulnerability

A name, a logo and a website: this month’s security issue is called CacheOut. It’s another vulnerability in Intel CPUs (similar to the MDS attacks, Meltdown and Spectre).

What stands out in the CacheOut vulnerability is this bit:

Unlike previous MDS issues, we show in our work how an attacker can exploit the CPU’s caching mechanisms to select what data to leak, as opposed to waiting for the data to be available.

It sounds like the MDS vulnerability on steroids.

For some more reading on the topic: here’s Intel’s INTEL-SA-00329 disclosure.

Five Years of Btrfs

“In 2015, I decided to use the Btrfs file system to store all my data. Its flexibility turned out to be more valuable than I expected. This article assumes you have some knowledge of file systems.”

Why ZFS is not good at growing and reshaping pools (or shrinking them)

In response to the Btrfs article above (which looks at ZFS critically), a long-term ZFS user chimes in: “ZFS is not a good choice if you want to modify your pool disk layout significantly over time. ZFS works best if the only change in your pools that you do is replacing drives with bigger drives.”

Linus Torvalds pulled WireGuard VPN into the 5.6 kernel source tree

Earlier this week, Linus Torvalds merged David Miller’s net-next into his source tree for the Linux 5.6 kernel. This merger added plenty of new network-related drivers and features to the upcoming 5.6 kernel, with No.1 on the list being simply “Add WireGuard.”

#Wireguard and Multipath TCP/#MPTCP will be part of kernel 5.6

In addition to WireGuard being pulled in, the 5.6 kernel release also includes the last parts of the Multipath TCP branch. In other words: the 5.6 kernel is networking-feature heavy!

Alpine makes Python Docker builds 50× slower, and images 2× larger

When you’re choosing a base image for your Docker image, Alpine Linux is often recommended. Using Alpine, you’re told, will make your images smaller and speed up your builds. And if you’re using Go that’s reasonable advice. But if you’re using Python … then it’s a whole different story.

Curl to shell isn’t so bad

Piping curl to s(hell) claims that using curl example/install | sh to install software is a “glaring security vulnerability”. This post looks at that claim critically with some thought-provoking arguments.

Ambitions for a Unix Shell (Oil shell)

This post describes the goals the Oil Shell wants to achieve. Its main target is to be able to replace Bash, but it doesn’t stop there. I like how there’s a clear goal & roadmap. I hope the author achieves it!

(A few) Ops Lessons We All Learn The Hard Way

So. Many. Truths.

Tools & Projects 🛠

smicallef/spiderfoot

SpiderFoot is an open source intelligence automation tool. Its goal is to automate the process of gathering intelligence about a given target, which may be an IP address, domain name, hostname or network subnet.

fabiolb/fabio

fabio is a fast, modern, zero-conf load balancing HTTP(S) and TCP router for deploying applications managed by consul. Register your services in consul, provide a health check and fabio will start routing traffic to them. No configuration required.

paybase/qp

query-pipe: command-line Newline Delimited JSON (NDJSON) querying tool for filtering and transforming JSON.
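For readers new to NDJSON: it is simply one JSON document per line, which makes it easy to filter and reshape as a stream. The sketch below illustrates that idea in plain Python. It does not use qp or its query syntax; the function name `filter_ndjson` and the sample records are made up for illustration.

```python
import json
from typing import Callable, Iterable, Iterator, Sequence


def filter_ndjson(
    lines: Iterable[str],
    keep: Callable[[dict], bool],
    fields: Sequence[str],
) -> Iterator[str]:
    """Yield one JSON line per matching record, reduced to the given fields."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        if keep(record):
            # Transform step: keep only the requested fields
            yield json.dumps({name: record.get(name) for name in fields})


# Hypothetical sample data, one JSON object per line
sample = [
    '{"name": "web-1", "status": 200, "ms": 12}',
    '{"name": "web-2", "status": 500, "ms": 340}',
    '{"name": "web-3", "status": 200, "ms": 18}',
]

errors = list(filter_ndjson(sample, lambda r: r["status"] >= 500, ["name", "ms"]))
for line in errors:
    print(line)
```

Because each record is handled on its own, the same function works on `sys.stdin` or an open file without loading everything into memory.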

dalance/procs

procs is a replacement for ps written in Rust. Its features include colored output, keyword search, TCP/UDP port display, read/write throughput and more.

dino

Dino is a secure and open-source application for decentralized messaging. It uses the XMPP (“Jabber”) protocol and is interoperable with other XMPP clients and servers.

typesense/typesense

Typesense is a fast, typo-tolerant search engine for building delightful search experiences.

slashbeast/better-initramfs

Small and reliable initramfs solution supporting (remote) rescue shell, lvm, dmcrypt luks, software raid, tuxonice, uswsusp and more.

ShellHub

ShellHub is a modern SSH server for remotely accessing Linux devices via command line (using any SSH client) or web-based user interface. It is intended to be used instead of sshd. ShellHub enables teams to easily access any Linux device behind firewall and NAT.

sampointer/dy

Dy allows you to construct YAML from a directory tree. This can be useful if you have large Kubernetes YAML files that you want to split into more logical directory structures.

pigz

Pigz, which stands for parallel implementation of gzip, is a fully functional replacement for gzip that uses multiple processors and multiple cores to the hilt when compressing data. (I learned that gzip by default is single-threaded.)
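To get a feel for why gzip compression can be parallelised at all, here is a minimal Python sketch. It relies on the fact that a gzip stream may consist of several concatenated members, so chunks can be compressed independently and joined afterwards. This is an illustration of the general idea only, not a description of how pigz works internally; `parallel_gzip`, the chunk size and the worker count are assumptions chosen for the example.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor


def parallel_gzip(data: bytes, chunk_size: int = 1 << 20, workers: int = 4) -> bytes:
    """Compress fixed-size chunks concurrently and join them as gzip members."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    # Threads are used on the assumption that CPython's zlib bindings release
    # the GIL while compressing, so chunks can be processed on several cores.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        members = list(pool.map(gzip.compress, chunks))  # map() preserves order
    return b"".join(members)


payload = b"cron.weekly issue #119\n" * 200_000
packed = parallel_gzip(payload)

# A multi-member stream decompresses back to the original bytes
assert gzip.decompress(packed) == payload
```

The trade-off in this sketch is a slightly worse compression ratio, since each chunk starts without knowledge of the data that came before it.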

whalebrew/whalebrew

Whalebrew creates aliases for Docker images so you can run them as if they were native commands. It’s like Homebrew, but with Docker images.

mono: a new typeface

A free & open source, developer-focussed typeface by JetBrains called “Mono”.

zeitgeist: dependency management for DevOps

Zeitgeist is an ops-focussed dependency manager. It will let you define your dependencies in a YAML file, dependencies.yaml, and help you ensure these dependencies’ versions are consistent within your project and up-to-date.

postgresqlco.nf

PostgresqlCO.NF (CONF for short) is your postgresql.conf documentation and ultimate recommendations’ source. Our mission is to help you tune and optimize all of your PostgreSQL configuration. With around 290 configuration parameters in postgresql.conf (and counting), it is definitely a difficult task!

Guides & Tutorials 🎓

How to build a Search-Engine with Common Unix-Tools (PDF)

This is a slightly older presentation that resurfaced with a good summary of creating “pipelines” (in Bash-sense) to query & filter data using native Linux tooling like tr, sed etc.
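The core data structure behind most search engines, whatever tooling builds it, is an inverted index: a mapping from each word to the documents that contain it. The Python sketch below shows that idea in a few lines. It is not taken from the presentation; the document names and the functions `tokenize`, `build_index` and `search` are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map every word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)


# Hypothetical mini corpus
docs = {
    "a.txt": "Btrfs and ZFS compared",
    "b.txt": "WireGuard lands in the Linux kernel",
    "c.txt": "ZFS pools and the Linux kernel",
}
index = build_index(docs)

print(search(index, "linux kernel"))
print(search(index, "zfs linux"))
```

The tokenising step is roughly what a `tr` and `sort` pipeline would do on the command line; the index lookup replaces a `grep` over every file.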

binhnguyennus/awesome-scalability

An updated and organized reading list for illustrating the patterns of scalable, reliable, and performant large-scale systems. Concepts are explained in the articles of prominent engineers and credible references.

The Architecture of a Large-Scale Web Search Engine, circa 2019

I found this to be a really good write-up on modern search architecture, introducing Kafka, Cassandra, Granne, Keyvi and several other tools. At the very least, you get a good sense of what each open source tool has to offer and what use cases it serves.

Building a simple VPN with WireGuard with a Raspberry Pi as Server

Now that WireGuard will be part of the upcoming Linux 5.6 kernel, it’s time to see how to best integrate it with my Raspberry Pi based LTE-Router/Access Point setup.

Sysadmin tools: How to use iptables

If you want to fully manage network traffic to and from your Linux system, the iptables command is what you need to learn. This article provides general advice on creating iptables entries and several generic examples to get you started.

distri: 20x faster initramfs (initrd) from scratch

This was a really interesting read on how to make the initramfs generation a lot faster. It looks at why it’s slow in the first place and how the Go programming language can help speed up the build steps.

Using GNU Recutils

There are hundreds of cool command line tools that have been made over the years built on the unix philosophy. One such package is GNU Recutils, a set of tools and libraries to access human-editable, plain text databases called recfiles.

Building containers without Docker

This post outlines several ways to build containers without the need for Docker itself. It uses OpenFaaS as the case study, which uses OCI-format container images for its workloads.

The Grymoire’s tutorial on AWK

This is a huge and in-depth tutorial on mastering the awk tool.

Writing a polyglot script

This is actually a fun exercise: can you write a script that’s valid in both Python and Ruby?

PostgreSQL user management

This post digs deeper into user management and permissions in PostgreSQL, which uses roles for authentication. There are two different kinds of roles: groups and users.


