Why do we automate?
Mattias Geniar, Tuesday, July 26, 2016 - last modified: Tuesday, August 2, 2016
Well, that's a stupid question: to save time, obviously!
I know, it sounds obvious. But there's more to it.
For the last few years I've been co-responsible for determining our priorities at Nucleus: deciding what to automate and where to focus our development and sysadmin efforts. Which tasks do we automate first?
That turns out to be a rather complicated question with many ifs and buts.
For a long time, I only looked at the time-saved metric to determine what we should do next. I should've been looking at many more criteria.
To save time
This is the most common reason to automate and it's usually the only factor that helps decide whether there should be an effort to automate a certain task.
Example: time consuming capacity planning
Task: every week someone has to gather statistics about the running infrastructure to calculate free capacity in order to purchase new capacity in time. This task takes an hour, every week.
Efforts to automate: it takes a developer 2 days' work to gather info via APIs and create a weekly report to management.
Gain: at 8 hours per working day, the 16 hours of development effort pay for themselves in about 16 weeks. Whether this is worth it or not depends on your organisation.
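The payback figure in this example can be checked with a small sketch. The function name and the 8-hour working day are assumptions made for illustration; they are not part of the original example.

```python
def break_even_weeks(dev_days, hours_per_day, manual_hours_per_week):
    """Weeks until the hours spent automating equal the manual hours saved."""
    if manual_hours_per_week <= 0:
        raise ValueError("manual_hours_per_week must be positive")
    # Total one-off investment in hours, divided by the hours saved each week.
    return (dev_days * hours_per_day) / manual_hours_per_week


# The capacity planning example: 2 days of development (assumed 8 hours each)
# versus a manual task of 1 hour per week.
print(break_even_weeks(dev_days=2, hours_per_day=8, manual_hours_per_week=1))  # 16.0
```

Changing the assumed hours per day or the weekly manual effort shifts the break-even point accordingly.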
Source: XKCD: Automation
It's an image usually referenced when talking about automation, but it holds a lot of truth.
The "time gained" metric is multiplied by the people affected by it. If you can save 10 people 5 minutes every day, you've gained more than four hours every week, or roughly half a workday.
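As a rough sketch of that multiplication, assuming a 5-day working week (the function name and that default are illustrative assumptions):

```python
def weekly_minutes_saved(people, minutes_per_day, workdays_per_week=5):
    """Total minutes saved per week across everyone affected."""
    return people * minutes_per_day * workdays_per_week


# 10 people each saving 5 minutes a day over an assumed 5-day week.
minutes = weekly_minutes_saved(people=10, minutes_per_day=5)
print(minutes, round(minutes / 60, 1))  # 250 4.2
```

That is 250 minutes, a little over four hours per week for the team as a whole.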
To gain consistency
Sometimes a task is very complicated but doesn't need to happen very often. There are checklists and procedures to follow, but it's always a human (manual) action.
Example: complicated migrations
Task: an engineer sometimes has to move e-mail accounts from one server to another. This doesn't happen very often but consists of a large number of steps where human error is easily introduced.
Efforts to automate: it may take a sysadmin a couple of hours to create a series of scripts to help automate this task.
Gain: the value in automating this is in the quality of the work. It guarantees a consistent method of migrations that everyone can follow and creates a common baseline for clients. They know what to expect and the quality of the results is the same every time.
At the same time, this kind of automation reduces human-made mistakes and leads to a combined knowledge set. If everyone who is an expert in their own domain contributes to the automation, it can bring together the skill sets of very different people to create a much bigger whole: a collection of experiences, knowledge and opinions that ultimately lead to better execution and higher quality.
To gain speed, momentum and velocity
There are times when things just take a really long time in between tasks. It's very easy to lose focus or forget about follow-up tasks because you're distracted in the meantime.
Example: faster server setups and deliveries
Task: An engineer needs to install a new Windows server. Traditionally, this takes many rounds of Windows Updates and reboots. Most software installations require even more reboots.
Efforts to automate: a combination of PXE servers or golden templates and a series of scripts or config management to get the software stack to a reasonable state. A sysadmin (or a team of them) can spend several days automating this.
Gain: the immediate gain is in peace of mind and speed of operations. It reduces the time of go-live from several hours to mere minutes. It allows an operation to move much faster and consider new installations trivial.
This same logic applies to automating deployments of code or applications. By taking away the burden of performing deploys, it becomes much cheaper and easier to deploy very simple changes instead of postponing deploys and going for big waterfall-like go-lives with lots of changes at once.
To schedule tasks
Some things need to happen at ungodly hours or at such a rapid interval that it's either impossible or impractical for a human to do.
Example: nightly maintenance
Task: Either as a one-time task or a recurring event, a set of MySQL tables needs to be altered. Given the application impact, this needs to happen outside office hours.
Efforts to automate: It will depend on the task at hand, but it's usually more work to automate than it is to do manually.
Gain: No one has to look at the task anymore. The fact that the maintenance can now be scheduled during off hours without human intervention makes it so that all preparations can be done during office hours -- well in advance -- and won't cause anyone to lose sleep over it.
It's quite common to spend more time making the script or automation than the time you would spend on it manually. The benefit, however, is that you no longer need to do things at night and you can prepare things, ask colleagues for feedback and take your time to think about the best possible way to handle it.
There is an additional benefit too: you automate to make things happen when they should, not when you remember they should.
To reduce boring or less fun tasks
If there's a recurring task that no one likes to do but is crucial to the organisation, it's probably worth automating.
Example: combining and merging related support tickets
Task: In a support department, someone is tasked with categorising incoming support tickets, merging duplicate tickets, linking related ones and distributing tasks.
Efforts to automate: A developer may spend several days writing the logic and algorithms to find and merge tickets automatically, based on pre-defined criteria.
Gain: A task that may be put on hold for too long because no one likes to do it suddenly happens automatically. While it may not have been time consuming, the fact that it was put on hold too often impacts the organisation.
The actual improvement is to reduce the mental burden of having to perform those tasks in the first place. If your job consists of a thousand little tasks every day, it becomes easy to lose track of priorities.
To keep sysadmins and developers happy
Sometimes you automate things, not necessarily for any of the reasons above, but because your colleagues have signalled that it would be fun to automate it.
The tricky part here is assessing the value for the business. In the end, there should be value for the company.
Example: creating a dashboard with status reports
Task: Create a set of dashboards to be shown on monitors and TVs in the office.
Efforts to automate: Some hardware hacking with Raspberry Pis, scripts to gather and display data and visualise the metrics and graphs.
Gain: More visibility in open alerts and overall status of the technology in the company.
Everyone who has dashboards knows the value they bring, but assessing whether it's worth the time and energy put into creating them is a very hard thing to do. How much time can you afford to spend creating them?
Improvements like these often come from colleagues. Listen to them and give them the time and resources to help implement them.
Validate your automation
The risk with putting so much faith and trust in your automation is that you may become blind to mistakes.
For instance, if you don't occasionally re-evaluate the rules or parameters on which you based your automation, you may well be doing the wrong things. It could be even worse: because automation usually happens behind the scenes, mistakes like these can go on for weeks or months without anyone noticing.
Imagine writing a set of scripts to calculate margins or stock supplies, only to have the business demands shift without updating any of the parameters that are responsible for those decisions.
That'll quickly turn into a ticking time bomb.
Maybe we need to automate the validation & checking of our automation?
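One possible starting point, sketched under assumptions: record when each parameter that drives an automated decision was last reviewed, and flag the ones nobody has looked at recently. The parameter names, dates and the 90-day threshold below are all hypothetical.

```python
from datetime import date, timedelta


def stale_parameters(last_reviewed, today, max_age_days=90):
    """Return the names of parameters not reviewed within max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(
        name for name, reviewed in last_reviewed.items() if reviewed < cutoff
    )


# Hypothetical parameters behind a margin or stock calculation,
# each with the date a human last confirmed it still matches the business.
params = {
    "target_margin": date(2016, 1, 10),
    "min_stock_level": date(2016, 7, 1),
}

print(stale_parameters(params, today=date(2016, 7, 26)))  # ['target_margin']
```

A check like this does not tell you whether a parameter is wrong, only that it is overdue for a human to look at it again.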
When to automate?
Given all these reasons to automate, this leaves the most difficult question of all: when to automate?
How and when do you decide whether something is worth automating? The time spent vs. time gained metric is easy to calculate, but how do you define the happiness of colleagues? How much is speed worth in your organisation?
Those are the questions that keep me up.