Functional DevOps in a Dysfunctional World
This is a pseudo-transcript of a presentation I gave at the linux.conf.au 2018 Real World Functional Programming Miniconf.
What is DevOps about? For me it’s about my relationship to the phrase
It works on my machine.
I’ve been guilty of saying this in the past, and quite frankly, it isn’t good enough. After the development team has written their last line of code, some amount of work still needs to happen in order for the software to deliver value.
A few jobs ago I was at a small web development shop, and my deployment workflow was as follows:
- Log on to the development server and take careful notes on how it had diverged from the production server.
- Carefully set aside some time to ‘do the deploy’.
- Log on to the production server and do a
git pull
to get the latest code changes. - Perform database migrations according to the notes you made earlier.
- Manually make any other required changes.
Despite my best efforts, I would inevitably run into issues whenever I did this, resulting in site outages and frustrated clients. This was far from ideal, but I wasn’t able to articulate why at the time.
I posit that a better deployment process has the following properties:
Automatic: instead of a manual multi-step process, it has a single step, which can be performed automatically.
Repeatable: instead of only being able to deploy to one lovingly hand-maintained server, it can deploy reliably multiple times to multiple servers.
Idempotent: if the target is already in the desired state, no extra work needs to be done.
Reversible: if it turns out I made a mistake, I can go back to the previous state.
Atomic: an external observer can only see the new state or the old state, not any intermediate state.
I hope to demonstrate how the Nix suite of tools (Nix, NixOS, and NixOps) fulfill these properties and provide a better DevOps experience.
To make things easier, I’m not assuming that you already run NixOS. Any Linux distro should do, as long as you’ve installed Nix. macOS users will be able to follow along until I get to the NixOps section.
Shipping it
Packaging
Suppose we have been given a small Haskell app to get up and running:
Main.hs
{-# LANGUAGE OverloadedStrings #-}
import Web.Scotty
import Data.Monoid (mconcat)
main = scotty 3000 $ do
get "/:word" $ do
beam <- param "word"
html $ mconcat ["<h1>Scotty, ", beam, " me up!</h1>"]
blank-me-up.cabal
name: blank-me-up
version: 0.1.0.0
license: BSD3
build-type: Simple
cabal-version: >=1.10
executable blank-me-up
main-is: Main.hs
build-depends: base >=4.9 && <5
, scotty
default-language: Haskell2010
(This example is taken straight from Scotty’s README.)
Our first step is to build this app and quickly check that it works. We’ll need
Nix and cabal2nix
, which turns .cabal
files into configuration for the Nix
package manager. Assuming we’ve installed cabal2nix
:
$ nix-env -i cabal2nix
<a lot of output>
created <number> symlinks in user environment
How do we know it worked? Try nix-env -q
(short for --query
):
$ nix-env -q
cabal2nix
Okay, assuming the app is in the app
subdirectory, let’s create a directory
called nix
to store our .nix
files and begin:
$ cd nix
$ cabal2nix ../app/ --shell > default.nix
default.nix
might look something like
{ nixpkgs ? import <nixpkgs> {}, compiler ? "default" }:
let
inherit (nixpkgs) pkgs;
f = { mkDerivation, base, scotty, stdenv }:
{
mkDerivation pname = "blank-me-up";
version = "0.1.0.0";
src = ../app;
isLibrary = false;
isExecutable = true;
executableHaskellDepends = [ base scotty ];
license = stdenv.lib.licenses.bsd3;
};
haskellPackages = if compiler == "default"
then pkgs.haskellPackages
else pkgs.haskell.packages.${compiler};
drv = haskellPackages.callPackage f {};
in
if pkgs.lib.inNixShell then drv.env else drv
Now we can build our project by running nix-build
, which tries to build
default.nix
in the current directory if no arguments are provided:
$ nix-build
<lots of output>
/nix/store/<hash>-blank-me-up-0.1.0.0
There should also be a new result
symlink in the current directory, which
points to the path above:
$ readlink result
/nix/store/<hash>-blank-me-up-0.1.0.0
Notice that we’ve built a Haskell executable without having to directly deal
with any Haskell-specific tooling (unless you count cabal2nix
). Nix works
best if you allow it full control over builds, as we do here.
What happens if we run nix-build
again without changing anything?
$ nix-build
/nix/store/<hash>-blank-me-up-0.1.0.0
It should be nearly instantaneous and not require rebuilding anything. Nix tries to think of build outputs as a pure function of its inputs, and since our inputs are unchanged, it is able to give us back the same path that it did before. This is what I mean when I say Nix is declarative.
What if we break our app:
--- a/functional-devops/app/Main.hs
+++ b/functional-devops/app/Main.hs
@@ -4,6 +4,8 @@ import Web.Scotty
import Data.Monoid (mconcat)
+broken
+
main = scotty 3000 $ do
get "/:word" $ do beam <- param "word"
and try to build again?
$ nix-build
<...>
Building executable 'blank-me-up' for blank-me-up-0.1.0.0..
[1 of 1] Compiling Main ( Main.hs, dist/build/blank-me-up/blank-me-up-tmp/Main.o )
Main.hs:7:1: error:
Parse error: module header, import declaration
or top-level declaration expected.
|
7 | broken
| ^^^^^^
builder for '/nix/store/<hash>-blank-me-up-0.1.0.0.drv' failed with exit code 1
error: build of '/nix/store/<hash>-blank-me-up-0.1.0.0.drv' failed
<...>
It fails, as one would hope, but more importantly the previous symlink at
result
is still available! This is because nix-build
completes the build
before atomically updating the symlink at result
to point to the new
artifact. This way, we can move from one known working state to another,
without exposing our users to any intermediate brokenness.
Service Configuration
Okay, now that we’re able to successfully build the app, let’s configure a
service file so that systemd
can manage our app. I don’t know of any tools
that automatically generate this so I always find myself copying and pasting
from an existing service file. Here’s one I prepared earlier.
nix/service.nix
{ config, lib, pkgs, ... }: #1
let #2
blank-me-up = pkgs.callPackage ./default.nix {}; #3
in {
options.services.blank-me-up.enable = lib.mkEnableOption "Blank Me Up"; #4
config = lib.mkIf config.services.blank-me-up.enable { #5
networking.firewall.allowedTCPPorts = [ 3000 ]; #6
systemd.services.blank-me-up = { #7
description = "Blank Me Up";
after = [ "network.target" ];
wantedBy = [ "multi-user.target" ];
serviceConfig = {
ExecStart = "${blank-me-up}/bin/blank-me-up";
Restart = "always";
KillMode = "process";
};
};
};
}
This isn’t intended to be a Nix language tutorial, but there are a few interesting things that I want to point out. For a more comprehensive overview of the language, see here or here.
- These are the arguments to this expression that the caller will pass. Another way to think of this is as a form of dependency injection.
let
expressions work similarly to Haskell.- This is the equivalent of our
nix-build
from before. - We define a single option that enables our service.
- The
config
attribute contains service configuration. - We expose port 3000.
- If you squint this looks a lot like a regular unit file. More on this below.
It would be useful to look at the systemd service file that gets generated from this configuration. To do this, we’ll need one more file:
ops/webserver.nix
{ ... }: {
imports = [ ../nix/service.nix ];
services.blank-me-up.enable = true;
}
This is a function that imports the above configuration and enables the
blank-me-up
service. With this in place, we can do
$ nix-instantiate --eval -E '(import <nixpkgs/nixos/lib/eval-config.nix> { modules = [./ops/webserver.nix]; }).config.systemd.units."blank-me-up.service".text'
We’re using nix-instantiate
to evaluate (--eval
) an expression (-E
) that
uses eval-config.nix
from the library to import the file we created and
output the text of the final unit file. The output of this is pretty messy, but
we can use jq
to clean it up:
$ nix-instantiate --eval -E '(import <nixpkgs/nixos/lib/eval-config.nix> { modules = [./ops/webserver.nix]; }).config.systemd.units."blank-me-up.service".text' | jq -r
Here’s what that looks like on my machine:
Generated systemd
service
[Unit]
After=network.target
Description=Blank Me Up
[Service]
Environment="LOCALE_ARCHIVE=/nix/store/<hash>-glibc-locales-2.27/lib/locale/locale-archive"
Environment="PATH=/nix/store/<hash>-coreutils-8.30/bin:/nix/store/<hash>-findutils-4.6.0/bin:/nix/store/<hash>-gnugrep-3.3/bin:/nix/store/<hash>-gnused-4.7/bin:/nix/store/<hash>-systemd-239.20190219/bin:/nix/store/<hash>-coreutils-8.30/sbin:/nix/store/<hash>-findutils-4.6.0/sbin:/nix/store/<hash>-gnugrep-3.3/sbin:/nix/store/<hash>-gnused-4.7/sbin:/nix/store/<hash>-systemd-239.20190219/sbin"
Environment="TZDIR=/nix/store/<hash>-tzdata-2019a/share/zoneinfo"
ExecStart=/nix/store/<hash>-blank-me-up-0.1.0.0/bin/blank-me-up
KillMode=process Restart=always
Hopefully at this point you’re convinced that Nix can take some quasi-JSON and
turn it into a binary and a systemd
service file. Let’s deploy this!
Deploying
First, we install NixOps:
$ nix-env -i nixops
We also have to set up VirtualBox, which I’ll be using as my deploy target. If
you’re using NixOS this is as simple as adding the following line to
configuration.nix
:
true; virtualisation.virtualbox.host.enable =
and running sudo nixos-rebuild switch
. If you’re using another Linux distro,
install VirtualBox and set up a host-only network called vboxnet0
.
We’ll be using the instructions from the manual as our starting point. Create two files:
ops/trivial.nix
{
network.description = "Web server";
network.enableRollback = true;
webserver = import ./webserver.nix;
}
ops/trivial-vbox.nix
{
webserver =
{ config, pkgs, ... }:
{ deployment.targetEnv = "virtualbox";
deployment.virtualbox.headless = true; # don't show a display
deployment.virtualbox.memorySize = 1024; # megabytes
deployment.virtualbox.vcpu = 2; # number of cpus
};
}
We should now be able to create a new deployment:
$ cd ops
$ nixops create trivial.nix trivial-vbox.nix -d trivial
and deploy it:
$ nixops deploy -d trivial
and assuming that everything goes well, we should see a lot of terminal output
and at least one mention of ssh://root@<ip>
, which is the IP of our target.
We should then be able to go to http://<ip>:3000
and see our web app in
action!
$ curl http://<ip>:3000/help
<h1>Scotty, help me up!</h1>
NixOps also allows us to SSH in for troubleshooting purposes or to view logs:
$ nixops ssh -d trivial webserver
<...>
[root@webserver:~]# systemctl status blank-me-up
Responding to change
This is fantastic, but deployments are rarely fire-and-forget. What happens when our requirements change? In fact, there’s a serious issue with our application, which is that it hardcodes the port that it listens on. If we wanted it to listen on a different port, or to run more than one instance of it on the same machine, we’d need to do something differently.
The correct solution would be to talk to the developers and have them implement support, but in the meantime, how should we proceed?
Patching
Nix gives us full control over each part of the build and deployment process, and we can patch the software as a stopgap measure. Although this scenario is somewhat contrived, I have in fact had to take matters into my own hands like this in the past when the development team hasn’t been able to prioritise fixing a production issue.
Our new expression looks like this:
nix/patched.nix
{ nixpkgs ? import <nixpkgs> {}, compiler ? "default" }:
args@
(import ./default.nix args).overrideAttrs (old: {
postPatch = let
oldImport = ''
import Web.Scotty
'';
newImport = ''
import Web.Scotty
import System.Environment (getArgs)
'';
oldMain = ''
main = scotty 3000 $ do
'';
newMain = ''
main = getArgs >>= \(port:_) -> scotty (read port) $ do
'';
in ''
substituteInPlace Main.hs --replace '${oldImport}' '${newImport}'
substituteInPlace Main.hs --replace '${oldMain}' '${newMain}'
cat Main.hs
'';
})
I’ve added that cat Main.hs
at the end to
- confirm that the file was correctly patched
- emphasise that arbitrary shell commands can be executed
We can create a new service definition to use this expression:
nix/service-patched.nix
{ config, lib, pkgs, ... }:
let
blank-me-up = pkgs.callPackage ./patched.nix { nixpkgs = pkgs; };
cfg = config.services.blank-me-up;
in {
options.services.blank-me-up.enable = lib.mkEnableOption "Blank Me Up";
options.services.blank-me-up.port = lib.mkOption {
default = 3000;
type = lib.types.int;
};
config = lib.mkIf cfg.enable {
networking.firewall.allowedTCPPorts = [ cfg.port ];
systemd.services.blank-me-up = {
description = "Blank Me Up";
after = [ "network.target" ];
wantedBy = [ "multi-user.target" ];
serviceConfig = {
ExecStart = "${blank-me-up}/bin/blank-me-up ${toString cfg.port}";
Restart = "always";
KillMode = "process";
};
};
};
}
We make sure to pass the configured port in on startup and open the firewall appropriately.
Deploying (Again)
We update webserver.nix
to use the patched service and specify a different
port:
{ ... }: {
imports = [ ../nix/service-patched.nix ];
services.blank-me-up.enable = true;
services.blank-me-up.port = 3001;
}
And we can deploy again!
$ nixops deploy -d trivial
The service should now be running on http://<ip>:3001
instead of
http://<ip>:3000
.
$ curl http://<ip>:3001/pull
<h1>Scotty, pull me up!</h1>
If we made a mistake, rolling back is easy:
$ nixops list-generations -d trivial
1 <timestamp>
2 <timestamp> (current)
$ nixops rollback -d trivial 1
switching from generation 2 to 1
webserver> copying closure...
trivial> closures copied successfully
<more output>
and in fact nothing needs to be copied to the target, because the previous deployment is still there.
Conclusion
As demonstrated, the Nix ecosystem allows us to impose order on the usually messy and ad-hoc practice of packaging and deploying software at scale. I’m satisfied that this is the way forward and hope that you will consider using these tools to tackle problems of your own!