In my Jekyll to Unikernel post, I described an automated workflow that would take your static website, turn it into a MirageOS unikernel, and then store that unikernel in a git repo for later deployment. Although it was written from the perspective of a static website, the process was applicable to any MirageOS project. This post covers how things have progressed since then and the kind of automated, end-to-end deployments that we can achieve with unikernels.
If you’re already familiar with the above-linked post then it should be clear that this will involve writing a few more scripts and ensuring they’re in the right place. The rest of this post will go through a real world example of such an automated system, which we’ve set up for building and deploying the unikernel that serves our slide decks, mirage-decks. Once you’ve gone through this post, you should be able to recreate such a workflow for your own needs. In Part 2 of this series I’ll build on this post and consider what the possibilities could be if we extended the system using some of our other tools, thus arriving at something very much like our own Heroku for Unikernels.
Almost all of our OCaml projects now use Travis CI for building and testing (and deployment). In fact, there are so many libraries now that we recently put together an OCaml Travis Skeleton, which means we don’t have to manually keep the scripts in sync across all our repos, and fewer copy/paste/edits means fewer mistakes.
If you’re familiar with the build scripts from last time, then
you can browse the new scripts and you’ll see that they’re broadly similar.
In many cases you may well be able to depend on one or other of the scripts
directly and for a handful of scenarios, you can fork and patch them to
suit you (i.e. for MirageOS unikernels). We can do this because we’ve made it
quick to set up an OCaml environment using an Ubuntu PPA. The rest
of the work is done by the
mirage tool itself so once that’s installed, the
build process becomes fairly straightforward. The complexity around secure
keys was also covered last time, which allowed us to commit the
final unikernel to a deployment repo. That means the remaining step is
to automate the deployment itself.
Committing the unikernel to a deployment repo is where the previous post ended
and a number of people forged ahead and wrote about their
experiences deploying onto AWS and Linode. Many of these deployments
(understandably) involve a number of quite manual steps. It would be
particularly useful to construct a set of scripts that can be fully automated,
such that a
git push to a repo will automatically run through the cycle of
building, testing, storing and activating a new unikernel. We’ve done
exactly this with some of our repos and this post will talk through those scripts.
MirageOS unikernels can currently be built for Xen and Unix backends. This is a straightforward step and typically the build matrix is already set up to test that both of them build as expected. For this post, I’ve only considered the Xen backend as that’s our chosen deployment method but it would be equally feasible to deploy the unix-based unikernels onto a *nix machine in much the same way. In this sense, you get to choose whether you want to deploy the unikernels onto a Hypervisor (for isolation and security) or whether running them as unix-processes better suits your needs. The unikernel approach means that both options are open to you, with little more than a command-line flag between them.
In terms of the deployment machines there are several options to consider. The most obvious is to set up a dedicated host, where you have full access to the machine and can install Xen. Another is to have a machine running on EC2 and create scripts to deal with unikernels. You could also build and deploy onto Xen on the Cubieboard2. If you’d rather test out the complete system first, you could set up an appropriate machine in Virtualbox to work with.
For our workflow, we use Xen unikernels which we deploy to a dedicated host. For the sake of brevity, I won’t go into the details of how to set up the machine but you can follow the instructions linked above.
Decks is the source repo that holds many of our slides, which we’ve presented at conferences and events over the years (I admit that I have yet to add mine). The repo compiles to a unikernel that can then serve those slides, as you see at decks.openmirage.org. For maximum fun-factor, we usually run that unikernel from a Cubieboard2 when giving talks.
The toolchain for this unikernel includes build, store and deploy. We’ll recap the first two steps before going through the final one.
Build — In the root of the decks source repo, you’ll notice the
.travis.yml file, which fetches the standard build script mentioned earlier.
Building the unikernel proceeds according to the options in the build matrix.
In this case, two builds occur for Unix and one for Xen with different parameters being used for each. If you look at the actual travis file, you’ll notice there are 26 lines of encrypted data. This is how we pass the deployment key to Travis CI, so that it has push access to the separate mirage-decks-deployment repo. You can read the section in the previous post to see how we send Travis a private key.
Store — One of the combinations in the build matrix (configured for Xen), is intended for deployment. When that unikernel is completed, an additional part of the script is triggered that pushes it into the deployment repo.
After the ‘build’ and ‘store’ steps above, we have a
deployment repository with a collection of Xen unikernels. For
this stage, we have a new set of scripts that live in this repo alongside those
unikernels. Specifically, you’ll notice a folder called
contains four files.
A quick summary of the setup is that we clone the repo onto our deployment
machine and install some hooks there. Then a simple cronjob will perform
git pull at regular intervals. If a merge event occurs, then it means the
repo has been updated and another script is triggered. That script removes the
currently running unikernel and boots the latest version from the repo. It’s
fairly straightforward and I’ll explain what each of the files does below.
Makefile - After cloning the repo, run
make install. This will trigger
install-hooks.sh to set things up appropriately. It’s worth remembering that
from this point on, the git repo on the deployment machine will not be
identical to the deployment repo on GitHub.
install-hooks.sh — The first two lines ensure that the commands
will be run from the root of the git repo. The third line symlinks the
post-merge.hook file into the appropriate place within the
This is the folder where customized git hooks need to be placed in
order to work. The final line adds the file
scripts/crontab to the
deployment machine’s list of cron jobs.
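To make those steps concrete, here is a minimal sketch of a hook installer along the lines described above. It is not the script from the repo: the function name `install_hooks` is made up for illustration, and I'm assuming that the tracked hook lives at `scripts/post-merge.hook` and that git's default `.git/hooks` directory is in use.

```shell
#!/bin/sh
# Sketch of a hook installer. The function name and the hook path are
# assumptions for illustration, not the contents of the real script.
install_hooks() {
    # Work from the root of the git repo, wherever we were called from.
    root=$(git rev-parse --show-toplevel) || return 1
    cd "$root" || return 1

    # Git runs .git/hooks/post-merge after a successful merge, so link the
    # version-controlled hook into that location.
    ln -sf "$root/scripts/post-merge.hook" .git/hooks/post-merge

    # Install the tracked crontab file for the current user.
    crontab scripts/crontab
}
```

Calling `install_hooks` from anywhere inside the clone would set things up. Two things to keep in mind: `crontab FILE` replaces the user's existing crontab rather than appending to it, and git only runs a hook if the file is executable.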
crontab — This file is a cronjob that sets up the deployment machine to
git pull on the deployment repo at regular intervals. Changing the
file in the repo will ultimately cause it to be updated on the deployment
deploy.sh). At the moment, it’s set to run every 11 minutes.
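As a rough illustration of what such a schedule looks like, the sketch below prints a crontab line for a periodic `git pull` and lists the minutes at which it would fire. Both helper names and the repo path are hypothetical, and I'm assuming the schedule is written with cron's step syntax (`*/11`), which may not be how the real file expresses it.

```shell
#!/bin/sh
# Hypothetical helper: print a crontab line that pulls a repo on a step
# schedule. $1 is the step in minutes, $2 the path to the clone (assumed).
cron_pull_entry() {
    printf '*/%s * * * * cd %s && git pull\n' "$1" "$2"
}

# List the minutes of each hour at which a */STEP minute field matches.
cron_step_minutes() {
    m=0
    out=""
    while [ "$m" -le 59 ]; do
        out="${out:+$out }$m"
        m=$((m + $1))
    done
    printf '%s\n' "$out"
}

cron_pull_entry 11 /srv/deployment     # path is a placeholder
cron_step_minutes 11                   # prints: 0 11 22 33 44 55
```

Note that a step of 11 does not divide the hour evenly, so with this syntax the gap between the pull at minute 55 and the next one at minute 0 is only five minutes.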
post-merge.hook — Since we’ve already run the Makefile, this file is
symlinked from the appropriate place on the deployment machine’s copy of the
repo. When a
git pull results in new commits being downloaded and merged,
then this script is triggered immediately afterwards. In this case, it just
deploy.sh — This is where the work actually happens and you’ll notice that there really isn’t much to do! I’ve commented in the code below to explain what’s going on.
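As a rough sketch of the shape such a deploy step can take (this is not the actual deploy.sh from the repo), the function below stops a running Xen domain and boots a new one with the `xl` toolstack. The function name, the domain name and the config path are all assumptions for illustration, and it assumes the caller has permission to run `xl`.

```shell
#!/bin/sh
# Sketch only: function name, domain name and config path are hypothetical.
deploy_unikernel() {
    domain="$1"   # name of the Xen domain to replace, e.g. "decks"
    config="$2"   # xl domain config file that points at the new kernel

    # Remove the currently running unikernel. Ignore the error that xl
    # reports when no domain with that name exists yet.
    xl destroy "$domain" 2>/dev/null || true

    # Boot the latest version from its config file.
    xl create "$config"
}
```

A call such as `deploy_unikernel decks ./decks.xl` would then replace the running instance. Because a unikernel boots quickly, a plain destroy followed by a create keeps the downtime short, although a more careful script might start the new domain before removing the old one.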
At this point, we now have a complete system! Of course, this arrangement isn’t perfect and there are a number of things we could improve. For example, it depends on a cron job, which means it may take a while before a new unikernel is live. Replacing this with something triggered on a webhook could be an improvement, but it does mean exposing an end-point to the internet. The scripts will also redeploy the current unikernel, even if the only change is to the crontab schedule. Some extra work in the deploy script, using some git tools, might work around this.
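One possible way to avoid the unnecessary redeploys is to look at which files the merge actually changed before doing anything. The helper below is a minimal sketch of that idea: it reads file names on stdin and succeeds only if one of them sits under a given directory. The function name and the `xen/` directory are assumptions for illustration.

```shell
#!/bin/sh
# Succeed if any path read from stdin starts with the given prefix.
# Hypothetical helper, intended to be called from the post-merge hook.
touches_prefix() {
    prefix="$1"
    while IFS= read -r path; do
        case "$path" in
            "$prefix"*) return 0 ;;
        esac
    done
    return 1
}
```

In the hook, this could be combined with git along the lines of `git diff --name-only ORIG_HEAD HEAD | touches_prefix xen/ && ./scripts/deploy.sh`. Git sets `ORIG_HEAD` to the previous tip when it merges, so the diff lists only the files that the pull brought in, and a change that touches nothing but the crontab file would skip the deploy.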
Despite these minor issues, we do have a completely end-to-end workflow that takes us all the way from pushing some new changes to deploying a new unikernel! An additional feature is that everything is checked into version control, right from the scripts to the completed artefacts (including a method of transmitting secure keys/data over public systems).
There is minimal work done outside the code you’ve already seen, though there is obviously some effort involved in setting up the deployment machine. However, as mentioned earlier, you could either use the unix-based unikernels or experiment with a VirtualBox VM running Xen, just to test out this entire toolchain.
Overall, we’ve only added around 20 lines of code to the initial 50 or so that
we use for the Travis CI build. So for less than 100 lines of code, we have
a complete end-to-end system that can take a MirageOS project from a
git push, all the way through to a live deployment.
In our current system, if the unikernel builds appropriately then we just assume it’s ok to deploy to production. Fire and forget! What could possibly go wrong! Of course, this is a somewhat naive approach and for any critical system it would be better to hook in some additional things.
One obvious improvement would be to introduce a more thorough testing regimen, which would include running unit tests as part of the build. Indeed, various libraries in the MirageOS project are already moving towards this model (e.g. see the notes for links).
It’s even possible to go beyond unit tests and introduce more functional/systems/stress testing on the complete unikernel before permitting deployment. This would help to surface any wider issues as services interact and we could even simulate network conditions — achieving something like ‘staging on steroids’.
The scenario we have above also assumes that things work smoothly and nobody needs to know anything. It would be useful to hook in some form of logging and reporting, such that when a new unikernel is deployed a notification can be sent/stored somewhere. In the short term, there are likely existing tools and ways of doing this so it would be a matter of putting them together.
Overall, with the above model, we can easily set up a system where we go from writing code, to testing it via CI, to deploying it to a staging server for functional tests, and finally pushing it out into live deployment. All of this can be done with a few additional scripts and minimal interaction from the developer. We can achieve this because we don’t have to concern ourselves with large blobs of code, multiple different systems and keeping environments in sync. Once we’ve built the unikernel, the rest almost becomes trivial.
This is close enough for me to declare it as a ‘Heroku for unikernels’ but obviously, there’s much more we can (and should) do with such a system. If we extrapolate just a little from where we are now, there are a range of exciting possibilities to consider in terms of automation, scalability and distributed systems. Especially if we incorporate other aspects of the toolstack we’re working towards.
Part 2 of this series is where I’ll consider these possibilities, which will be more speculative and less constrained. It will cover the kinds of systems we can create once the tools are more mature and will touch on ideas around hyper-elastic clouds, embedded systems and what this means for the concept of immutable infrastructure.
Since we already have the ‘backbone’ of the toolchain in place, it’s easier to see where it can be extended and how.
Edit: The second part of this series is now up - “Self Scaling Systems”
Edit2: discuss this post on devel.unikernel.org
Thanks to Anil Madhavapeddy and Thomas Leonard for comments on an earlier draft and Richard Mortier for his work on the deployment toolchain.