The right way to set up a VPN on Windows

Remember that clicking orgy from the last blog post? It turns out it is possible to script all of it. No more clicking... (And we can remove the Task Scheduler hack altogether :) )

First, run PowerShell as Administrator and temporarily change the script execution policy so that scripts are allowed to run:

Get-ExecutionPolicy -List

Execution policy list

If every scope shows Undefined, change the policy so that unsigned local scripts are allowed to run:

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned

Execution policy change

Now collect the machine certificate (p12 file) and the CA root and add the following script:

You have to configure the variables on top before running the script.

If you have finished the setup you may reset the execution policy to the value it was before: Set-ExecutionPolicy -ExecutionPolicy Undefined

Thanks @_Tomalak for pointing me in the right direction :)

IKEv2 VPN with StrongSWAN

Wow, this was harder than I thought.

I just wanted to get a modern VPN on all my devices without the hassle of installing third-party VPN clients on all of them (hello OpenVPN o/). The protocol of choice seems to be IKEv2, as all devices that I own seem to support it and it is more secure than the old PPTP or L2TP protocols the devices could support natively.

But let's just jump directly into it.

IPsec config

On the server we will be using StrongSWAN. All configuration is for Ubuntu 15.10 but should work on any distribution that has StrongSWAN as the configuration did not really change in the last few years.

Install StrongSWAN

First we need to install StrongSWAN (all steps from here on should be done as the root user; switch to root by issuing sudo su - and typing your password):

apt install strongswan strongswan-plugin-af-alg strongswan-plugin-agent strongswan-plugin-certexpire strongswan-plugin-coupling strongswan-plugin-curl strongswan-plugin-dhcp strongswan-plugin-duplicheck strongswan-plugin-eap-aka strongswan-plugin-eap-aka-3gpp2 strongswan-plugin-eap-dynamic strongswan-plugin-eap-gtc strongswan-plugin-eap-mschapv2 strongswan-plugin-eap-peap strongswan-plugin-eap-radius strongswan-plugin-eap-tls strongswan-plugin-eap-ttls strongswan-plugin-error-notify strongswan-plugin-farp strongswan-plugin-fips-prf strongswan-plugin-gcrypt strongswan-plugin-gmp strongswan-plugin-ipseckey strongswan-plugin-kernel-libipsec strongswan-plugin-ldap strongswan-plugin-led strongswan-plugin-load-tester strongswan-plugin-lookip strongswan-plugin-ntru strongswan-plugin-pgp strongswan-plugin-pkcs11 strongswan-plugin-pubkey strongswan-plugin-radattr strongswan-plugin-sshkey strongswan-plugin-systime-fix strongswan-plugin-whitelist strongswan-plugin-xauth-eap strongswan-plugin-xauth-generic strongswan-plugin-xauth-noauth strongswan-plugin-xauth-pam strongswan-pt-tls-client

This one seems a bit excessive but I just installed everything I could find for StrongSWAN as I am lazy.

Configuration

The next step is to get rid of the default configuration and supply our own:

The best bet here is to just move away the default config in /etc/ipsec.conf (or delete it as it does not contain anything of any value) and copy and paste the config above into it.

You will have to modify some values:

  • yourhostname.net should be the hostname of the box you connect to.
  • rightsourceip should be a private IPv4 network and a subnet of the IPv6 subnet of your server (if your server got a /64 probably add another address part and use a /112 here)
  • rightdns is the DNS server that will be sent to the client; I just used Google's free DNS servers here.

If you only want to use IPv4 just remove the v6 addresses.

Packet forwarding

To allow the connected VPN clients to actually talk to each other you'll have to enable packet forwarding. If you don't do that, the clients will only be able to speak with the server.

Create a new file in /etc/sysctl.d named 99-vpn.conf:

Reload the settings with sysctl --system
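A minimal sketch of what such a 99-vpn.conf might contain, assuming the two standard Linux forwarding keys are all you need. The helper name write_vpn_sysctl and its optional path argument are made up for illustration; nothing is written until you call it.

```shell
# Hypothetical helper: writes the forwarding settings to the given file.
# Defaults to the path used in this post if no argument is passed.
write_vpn_sysctl() {
    local target="${1:-/etc/sysctl.d/99-vpn.conf}"
    printf '%s\n' \
        'net.ipv4.ip_forward = 1' \
        'net.ipv6.conf.all.forwarding = 1' > "$target"
}

# Usage (as root):
#   write_vpn_sysctl && sysctl --system
```

If you only use IPv4 you can drop the second line.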

If you want to give the VPN clients Internet access you'll have to enable NAT for the interfaces and routing for IPv6. I just added these lines to /etc/rc.local, you probably want to use the default facility for iptables rules of your distribution though:

Don't forget to actually run the script afterwards to enable the rules without rebooting. (D'oh!)
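As a rough sketch of the IPv4 part only: a masquerading rule for the VPN subnet could look like the one printed below. The subnet 10.0.0.0/24 and the interface name eth0 are assumptions, not values from my setup, so replace them with your rightsourceip range and your public interface. The function only prints the command so you can review it before putting it into rc.local.

```shell
# Assumed example values, adjust both to your own setup.
VPN_SUBNET4="10.0.0.0/24"   # IPv4 range handed out to VPN clients
WAN_IF="eth0"               # interface that faces the Internet

# Print (do not execute) the NAT rule for review.
print_nat_rules() {
    echo "iptables -t nat -A POSTROUTING -s $VPN_SUBNET4 -o $WAN_IF -j MASQUERADE"
}

print_nat_rules
```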

Generating certificates

To be on the really safe side and to make device provisioning as easy as possible we use X.509 certificates to connect to the VPN.
There are 3 sets of certificates:

  • The root CA
  • The VPN server certificate
  • The client certificates

Switch to /etc/ipsec.d and run all the following in that directory.

Root CA

For a CA we need a key first (we pick a 4096 bit long RSA key here):

ipsec pki --gen --type rsa --size 4096 --outform der > private/strongswanKey.der
chmod 600 private/strongswanKey.der

So let's create a root CA:

ipsec pki --self --ca --lifetime 3650 --in private/strongswanKey.der --type rsa --dn "C=DE, O=Dunkelstern, CN=Dunkelstern VPN Root CA" --outform der > cacerts/strongswanCert.der

So what's all that stuff?

  • First we tell the ipsec tool to create a self-signed CA with roughly 10 years of lifetime
  • Use the key we generated
  • Tell the pki tool some settings: The country (DE), the Organisation (Dunkelstern) and the Common Name (Dunkelstern VPN Root CA)

You should probably move all the root CA private files (the key!) off the machine after you're done with them and put them on a disk into a safe or something.

VPN Certificates

So we have a CA, but we definitely do not want to use that directly for the VPN server, so we create a derived certificate that has the root CA as its parent. First we need a key again, this time for the VPN host:

ipsec pki --gen --type rsa --size 4096 --outform der > private/vpnHostKey.der
chmod 600 private/vpnHostKey.der

And now comes the interesting part:

export vpn_host="vpn.example.org"
export vpn_ipv4="10.0.0.1"
export vpn_ipv6="::1"

ipsec pki --pub --in private/vpnHostKey.der --type rsa | ipsec pki --issue --lifetime 730 --cacert cacerts/strongswanCert.der --cakey private/strongswanKey.der --dn "C=DE, O=Dunkelstern, CN=$vpn_host" --san $vpn_ipv4 --san @$vpn_ipv4 --san $vpn_ipv6 --san @$vpn_ipv6 --flag serverAuth --flag ikeIntermediate --outform der > certs/vpnHostCert.der

You'll have to replace some values here:

  • vpn_host the hostname of the VPN server
  • vpn_ipv4 the public IPv4 address
  • vpn_ipv6 the public IPv6 address

Now it is really time to move the CA root private key ;)

Client certificates

To create client certificates I made a small script as you'll probably do this often:

Usage is something like create_vpn_user.sh kopernikus which will drop a p12 file in /etc/ipsec.d/p12/ with the name you supplied. The p12 file is encrypted with a pass-phrase you'll have to supply.

iOS

To get a working VPN config onto an iOS device you'll have to use a *.mobileconfig configuration profile as the VPN GUI of the iPhone and iPad has a bug that prevents valid connections as of iOS 9.3.

Create mobile config profile

Fetch the Apple Configurator 2 from the AppStore on a Mac (it's free but sadly there is no configurator for Windows)

After starting the configurator choose File->New Profile from the menu and fill the generic info field as you want:

Apple Configurator generic info

For the next step you'll need the p12 file and the vpnHostCert.der file to add to the certificate store:

Apple Configurator certificate store

And the last step: Configure the VPN

Apple Configurator VPN

Make sure you select the certificate auth method and the correct certificate. The Remote Identifier is what's in leftid in the ipsec.conf, the Local Identifier is what's in the Certificate's common name (machine@host).

Attention: Set the Encryption algorithm for IKE SA Params and Child SA Params to something sensible, do not use the 3DES default. 3DES is inherently unsafe!

If you're ready, save the config somewhere. So there's another step to get it running: Another Bug workaround!

Open the generated file in some text-editor and search for OverridePrimary and set it to zero!

Install on iOS device

The easiest way to install the configuration profile is just sending it to yourself as an email and then tap the attachment and allow it to install the VPN. If your device is enrolled in MDM (Mobile device management) you can send the profile over the air.

Connect

To connect to the VPN go into Settings->General->VPN and turn on the switch. All traffic will now be sent through the VPN. If the switch turns off immediately you either forgot to set OverridePrimary to zero or you chose an encryption that the server does not understand. Look into the server logfile for more information.

Mac

Just create a config file like you do for an iOS device and double click it. It will open in the System Preferences Profiles pane (which is not visible until you import a profile)

OSX System Prefs

I had to click the plus button and add the profile again as the preference pane did open but it did not automatically import the profile.

If you want to have an icon in the menu bar, switch to the Network pref-pane and tick the checkbox to show that icon:

Show in menu bar

Windows

(UPDATE: Wait, stop right here! Read the follow-up: The right way to set up a VPN on Windows)

Oh my... Microsoft! To get it running on Windows you will have to jump over some obstacles. I am no Windows guy, so perhaps there's an easier way to do it. If so, please mail me: hallo@dunkelstern.de

Import Certificates

Run the Management Console from the Win+R box with mmc:

Management Console

Now add the Certificates snap-in (File -> Add/Remove Snap in...):

Certificates Snap in

Switch to Console Root -> Certificates -> Personal -> Certificates and import the p12 file by clicking through the import wizard at Action -> All tasks -> Import.

Now you should have two new entries:

  • machinename@host
  • XY VPN Root CA

Move the Root CA to Trusted Root Certificate Authorities -> Certificates to trust it.

Setup VPN

First open the Network and Sharing Center (best done by right-clicking on the network icon in the task bar)

Network and Sharing Center

Now set up a new Network connection:

Network connection Workplace

VPN, not dialup

VPN step 1

Now go back to the Network and Sharing Center and click on Change Adapter Settings on the left. Select your VPN connection and open the properties window.

VPN Properties

Switch to the Security pane and set the VPN type to IKEv2 and the authentication to Use machine certificates.

IKEv2

On the Networking pane open the properties dialog for both Internet Protocol versions and set up the DNS servers, as Windows does not automatically take those from the VPN connection.

DNS here

The connection is now ready to be activated, but there are some bugs hidden.

Public network? Why?

If you double click to connect now (which throws you over to the modern control panel)

Connect finally

You may notice that the network is classified as a public network. If you don't want it to be public follow the next steps. Skip them if public is ok with you.

First open the policy editor by running gpedit.msc from the Win+R window. Now navigate to Local Computer Policy -> Computer Configuration -> Windows Settings -> Security Settings -> Network List Manager Policies (Why is everything so wordy?), right click the VPN connection and set the Network Location

Network Location type

Getting IPv6 to work

If you look into the VPN connection status you may notice that IPv4 Connectivity says Internet and IPv6 Connectivity tells you No Network access. This is because Windows does not set up a default route through the VPN tunnel for IPv6 but depends on the router on the other end to respond to router queries or send router advertisements. This could be done on the server, but no one but Windows needs it, and the error can be fixed by just running a single command on the Windows box:

route -6 add ::0/0 2a01:4f8:190:2012:3::2

Where the IP on the end is the IP of your tunnel endpoint (look it up in the properties).

So how can we tell Windows to run that command for us... let's say it's a bit hacky what we are about to do now:

  1. Write a small netsh script to run
  2. Find the "VPN connection established"-Event in the event log
  3. Attach a scheduled task to that event type to run the script.

Hacky enough? Ok let's go.

Create a new text-file with the following content:

interface ipv6
add route ::0/0 "VPN Connection" 2a01:4f8:190:2012:3::2
exit

Replace VPN Connection with the name of your connection and the IP address with the IP address you get from the VPN server. Move that text file somewhere where you will not delete it by accident.

Now open the event viewer (Win+R run eventvwr) and navigate to
Event Viewer (Local) -> Applications and Service Logs -> Microsoft -> Windows -> NetworkProfile -> Operational (wow!)

Event viewer path

Now find the event that signifies that the VPN connection was established. It usually has the Event ID 10000 and we need an entry with the State: Connected flag set.

Log entry

Finally we attach a task to that event, which will be called every time the VPN connects.

Task

My script was called route.nsh and I dropped it into C:\ directly.

Attention: Do not finish the wizard without telling it to open the task properties at the end, we have something to do there!

Set the Run with highest privileges checkbox.

High privilege

... turn off the energy save mode and set the task to run only if the VPN connection is available:

On battery too

Finished! If you now disconnect and then reconnect the VPN, IPv6 will work through the tunnel!

Linux

coming soon.

Bash on Windows

So with the Windows 10 Anniversary update came the "Bash on Windows" feature. It is still beta and this really shows. So let me tell you of my experience getting it to work.

How to setup?

There are many blog posts for this but it all boils down to one single command on the Windows command prompt:

 LxRun.exe /install /y

It will then download a very minimal version of Ubuntu from the Windows Store (yes it really displays this). After this you can run bash on the command prompt or search the start menu for "Bash on Ubuntu on Windows" (wow what a name!)

What will work and what will not

So the complete subsystem starts and stops with the bash.exe running which literally means if you close the bash window everything is killed with it. So forget running any daemons in the background.

Microsoft itself said that running server software is not supported and is currently not planned to be supported. If you run your server daemon in the foreground in a bash window it will be accessible on the localhost network though. Part of this is probably because neither Upstart nor systemd can run on the kernel interface that is currently available. There is no process 1, there is no init. Imagine that bash is running as an emergency system repair console on a failed Linux boot and you'll understand what is available right now.

You can see that the whole Linux kernel interface is intended for developers. You will have access to the local disk and to the local network, but that's mostly it.

It's a bit sad that Microsoft decided to use Ubuntu 14.04 instead of the 2016 version, but I suppose this is just a symptom of "We need to get it running first" and will be updated at some later date.

How to get a decent terminal emulator

The first thing after getting it to work for me was to get rid of the Windows terminal emulator (if you want to call it that). Don't get me wrong they really put effort into upgrading it to support at least part of the VT100 spec, but copy and paste and mouse support aren't there yet, which is a no-go for me. So how can we get a better emulator on top of it?

Of course you could use what everyone on Windows seems to use: ConEmu. You'll have to put some time into configuring it correctly (get my config here: Github gist) but it even has a bash profile (which is intended for cygwin bash but works for the native bash too ;) )

ConEmu Screenshot

On the other hand you could do it the Unix way: Install an X-Server and run let's say lxterminal which is rather easy to do too. I used lxterminal as an example because I confirmed it works, urxvt for example does not as it requests a PTY from the kernel and this is currently not supported on Windows.

  1. Get an X-Server (like VCXsrv) installed

  2. Run it ;)

  3. Install lxterminal on the Linux subsystem:

    sudo apt-get install lxterminal
    
  4. Find the Bash on Ubuntu on Windows executable in C:\Windows\System32\bash.exe and create a shortcut to that

  5. Open the properties window of that shortcut and add -c "DISPLAY=:0 lxterminal" to the Target

Properties Window

  6. Start the Shortcut and do not close the cmd window (the one with the Ubuntu logo) but instead minimize it.

  7. Use the lxterminal window

LX Terminal

If you install an X-Server this way you can even use most GUI programs from the Ubuntu repositories, just remember to export DISPLAY=:0 before running any GUI tool.

Beware: There is no hardware acceleration, as all drawing commands go through the loopback network interface (it should be faster than RDP though), so you'd better not run something like Firefox or Chromium on that.

How to get rid of it again

This is as easy as installation. To remove everything, including your home dir:

lxrun /uninstall /full

To just remove the Ubuntu installation:

lxrun /uninstall

Bash tricks

A small list of tricks to make bash better and more like z-shell without actually switching over. You may want these instead of switching because, for example, z-shell is not installed and you do not have admin privileges on a server.

Editing

Some things to make editing better and faster.

Better history search

Make your up/down-arrow keys more awesome with the following lines in your ~/.inputrc:

"\e[A": history-search-backward
"\e[B": history-search-forward

With those lines your up and down keys just work like you know on empty lines, but if you start typing and touch an arrow key you will search your history with the prefix you just entered.

Better prompt

As you are probably aware, the variable PS1 defines your prompt, but you probably don't know about PROMPT_COMMAND. This command is executed each time a new prompt is shown, and yes, it can call a bash function () {}, so have fun building your prompt.
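To make that a bit more concrete, here is a small sketch of a prompt function that colors the prompt character green or red depending on whether the last command succeeded. The function name build_prompt and the prompt layout are just my example, not anything bash prescribes; the optional argument exists only so the function can be tried out by hand.

```shell
# Example prompt builder: green "$" after success, red "$" after failure.
build_prompt() {
    # Exit status of the previous command, unless one is passed explicitly.
    local last_status="${1:-$?}"
    local marker
    if [ "$last_status" -eq 0 ]; then
        marker='\[\033[0;32m\]\$\[\033[0m\]'
    else
        marker='\[\033[0;31m\]\$\[\033[0m\]'
    fi
    # \u = user, \h = host, \w = working directory
    PS1="\u@\h:\w ${marker} "
}

# Let bash call the function before every prompt.
PROMPT_COMMAND=build_prompt
```

The \[ and \] around the color codes tell bash that those characters take up no space on screen, which keeps line editing from getting confused on long commands.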

ANSI colors

    COLOR_WHITE='\033[1;37m'
    COLOR_LIGHTGRAY='\033[0;37m'
    COLOR_GRAY='\033[1;30m'
    COLOR_BLACK='\033[0;30m'
    COLOR_RED='\033[0;31m'
    COLOR_LIGHTRED='\033[1;31m'
    COLOR_GREEN='\033[0;32m'
    COLOR_LIGHTGREEN='\033[1;32m'
    COLOR_BROWN='\033[0;33m'
    COLOR_YELLOW='\033[1;33m'
    COLOR_BLUE='\033[0;34m'
    COLOR_LIGHTBLUE='\033[1;34m'
    COLOR_PURPLE='\033[0;35m'
    COLOR_PINK='\033[1;35m'
    COLOR_CYAN='\033[0;36m'
    COLOR_LIGHTCYAN='\033[1;36m'
    COLOR_DEFAULT='\033[0m'

Add these to ~/.bashrc to be able to colorize your output. Probably useful for PS1.
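A short sketch of how these variables can be used outside of the prompt. The helper name colorize is made up for this example, and only two of the colors are repeated so the snippet works on its own.

```shell
COLOR_RED='\033[0;31m'
COLOR_DEFAULT='\033[0m'

# Print a message in the given color and reset the color afterwards.
# %b makes printf interpret the \033 escape stored in the variables.
colorize() {
    printf '%b%s%b\n' "$1" "$2" "$COLOR_DEFAULT"
}

colorize "$COLOR_RED" "something went wrong"
```

When you use the colors inside PS1, wrap each one in \[ and \] so bash does not count the escape sequence as visible characters.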

Useful aliases

Actually some aliases and single line functions, because I can't remember everything.

Serve current directory as a website

alias serve='python -m SimpleHTTPServer 8008'

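This needs Python installed (it seems to be installed everywhere nowadays) and opens an HTTP server that serves the current directory on port 8008. It honours index.html files but does no dynamic processing, etc. The SimpleHTTPServer module exists only in Python 2; on Python 3 the equivalent is python3 -m http.server 8008.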

Quit with CTRL+C

Hexdump a file

alias hexhead='xxd -g 1 -l 1024'
function hex() { xxd -g 1 "$1" | ${PAGER:-less}; }
function hextail() { tail -c 1024 "$1" | xxd -g 1; }

hexhead displays the first kilobyte of the file as a hex dump, just hex displays all of it in your default pager (or less if not set, Beware: Don't do this to Big files, it takes ages ;) ). hextail displays the last kilobyte.

Switch a python virtualenv

function venv() { . "$HOME/.virtualenvs/$1/bin/activate"; }

Switch to a python virtual env by name (assuming you store your virtual python installations in $HOME/.virtualenvs)

xhyve: a quick how to

So here we are, xhyve finally usable, so how to use it?

Fetching the source and compiling it

To compile xhyve you need either Xcode or at least the Xcode command line tools from Apple. If you do not already have those installed, run the following in a Terminal window:

xcode-select --install  

Now you're ready to check out the source code and compile it:

git clone https://github.com/dunkelstern/xhyve.git xhyve  
cd xhyve  
git checkout sparse-disk-image  
make  

If everything worked the last lines of the make output should look something like this:

ld xhyve.sym  
dsym xhyve.dSYM  
strip xhyve  

If you want you could install the xhyve binary to /usr/local/bin now or just copy it somewhere where you'll find it:

sudo mkdir -p /usr/local/{bin,share/man/man1}  
sudo cp build/xhyve /usr/local/bin/  
sudo cp xhyve.1 /usr/local/share/man/man1/  

(It will ask for your password when using sudo, if you have Homebrew installed you might skip the sudo)

The first config file

An xhyve config file is a simple shell script, as xhyve takes all configuration in its command line parameters.

What's possible

The best option to see what xhyve actually can do is the man page, but I will describe the basics to you nonetheless.

Open the man page with man ./xhyve.1 while you're in the source directory or just with man xhyve if you copied the binary and man-page into /usr/local.

So what can xhyve emulate:

ACPI

Command line parameter: -A

This is an option that should be set all the time except if you get weird ACPI related crashes.

Number of CPUs

Command line parameter: -c <number>

How many CPU cores you want to share with the guest operating system, defaults to one, maximum is 16.

Allocated RAM

Command line parameter: -m <number>

Amount of RAM to give to the guest operating system. You may use suffixes like k, m or g for Kilobytes, Megabytes or Gigabytes.

Virtual COM ports

Command line parameter: -l <source-device>,<destination-device>

This maps a virtual COM port (serial port) to a host device.
<source-device> may be either com1 or com2 and <destination-device> may be either stdio (to connect the serial port to the terminal window's console) or any character device in /dev/.

Virtual PCI bus

Command line parameter: -s <slot>,<emulation>,[config]

This is a bit more complex as it defines which hardware is connected to your VM.

<slot> is a number from 0 to 31 that defines the PCI slot.

<emulation> is the device to emulate. xhyve knows the following devices:

  • hostbridge, this is usually needed at slot 0 for most guests
  • virtio-net, a virtual network card
  • virtio-blk, a virtual block device (a disk)
  • ahci-cd, a virtual CD-ROM/DVD-ROM drive
  • ahci-hd, a virtual disk for guests that have no virtio driver
  • uart, a virtual serial port (COM3+)
  • lpc, a PCI to ISA bridge, used for COM1 and COM2

Each of these has a specific configuration that may be set, see the man page for further instructions, an example of the usage follows.

Linux kernel to boot

Command line parameter: -f kexec,<kernel>,<initrd>,<kernel command line>

This essentially specifies which Linux kernel to load. xhyve has no BIOS or EFI emulation, so it loads the Linux kernel directly. This leads to a small inconvenience: we need the kernel outside of the disk image.

What's needed

So this is all nice and good but which of these command line flags do we really need to get a standard Ubuntu running?

  • -A, for ACPI mode
  • -m 1G, 1GB RAM
  • -s 0,hostbridge -s 31,lpc, basically the minimum PCI config that works
  • -l com1,stdio, map the first serial port to the Terminal window
  • -s 1,ahci-cd,ubuntu-15.10-server-amd64.iso, the CD-ROM with the Ubuntu ISO inserted
  • -s 2,virtio-blk,hdd.img,sectorsize=4096,size=20G,split=1G,sparse, the main disk, 20GB max size, split into 1GB parts, use a sector size of 4096 (best choice for SSDs) and do not eat my harddisk space (sparse)
  • -f kexec,vmlinuz,initrd.gz,"earlyprintk=serial console=ttyS0", load a Linux kernel from the files vmlinuz and initrd.gz; the kernel parameters tell Ubuntu to map the console to the first serial port so we can see the boot process in the Terminal.

All in one (save as install.sh):

# Linux
KERNEL="vmlinuz"  
INITRD="initrd.gz"  
CMDLINE="earlyprintk=serial console=ttyS0"

# Guest Config
MEM="-m 1G"  
IMG_CD="-s 1,ahci-cd,ubuntu-15.10-server-amd64.iso"  
IMG_HDD="-s 2,virtio-blk,hdd.img,size=20G,split=1G,sparse"  
NET="-s 3,virtio-net,vmnet0"  
PCI_DEV="-s 0:0,hostbridge -s 31,lpc"  
LPC_DEV="-l com1,stdio"  
ACPI="-A"

# and now run
sudo xhyve $ACPI $MEM $PCI_DEV $LPC_DEV $NET $IMG_CD $IMG_HDD -f kexec,$KERNEL,$INITRD,"$CMDLINE"  

As you might have noticed, we run the xhyve binary as root. This is because the xhyve binary has to be either code signed with an Apple Developer Key or run as root to use the networking infrastructure. Let's just install Ubuntu and think about that later.

Installing ubuntu

As there is no graphical output we have to install Ubuntu Server (as only the Server version has the text-mode installer).

Go get it here: http://www.ubuntu.com/download/server

As you have read in the What's possible section, we need a Linux Kernel outside of our disk images to boot. So let's get one.

Extracting the setup kernel

As the Ubuntu install CD contains all files we need, why not just extract it from the disk image? Well there's one catch: Ubuntu uses a so called Mixed-Mode CD image and Mac OS X doesn't really like to mount such a disk so we have to resort to a small trick, execute the following on a Terminal to get the Kernel:

# create a 2k temporary file filled with zeroes
dd if=/dev/zero bs=2k count=1 of=tmp.iso

# append the ubuntu server image to that
dd if=ubuntu-15.10-server-amd64.iso bs=2k skip=1 >> tmp.iso

# mount it
hdiutil attach tmp.iso

# copy needed files
cp /Volumes/Ubuntu-Server\ 15/install/vmlinuz .  
cp /Volumes/Ubuntu-Server\ 15/install/initrd.gz .

# unmount it
hdiutil detach /Volumes/Ubuntu-Server\ 15  

You can keep tmp.iso or delete it, as we now have what we want. (Thanks to the people at pagetable for this neat trick)

Running the setup

If you saved the All in one script from above, make it executable with chmod a+x install.sh and run it.

If everything worked Ubuntu should boot in your Terminal.
Do not finish the setup yet, as you'll need to do something before:

Extracting the system kernel

When you have finished the setup you'll need to change which kernel is loaded. The easiest way to do that is by opening a console in the Ubuntu installer. When it asks whether you want to finish the installation and reboot, don't do that but return to the main installer menu; there's an option Open a console that we need now.

After being dumped into the console make the installed target your change root:

chroot /target  
bash  
cd /boot  

Now open a second Terminal window on your Mac and run the following:

nc -l -p 1234 | tar x  

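This starts a netcat server that listens on port 1234 for a connection from the Ubuntu guest and expects to receive a tar file. (Some netcat variants, the BSD one among them, do not accept -p together with -l; if yours complains, use nc -l 1234 instead.)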

Now on the Ubuntu guest execute the following:

tar cv vmlinuz* initrd.img* | nc <ip_of_mac> 1234  

If everything worked you should have two files on your Mac:

  • vmlinuz-4.2.0-16-generic
  • initrd.img-4.2.0-16-generic

(The version number may differ)

Now finish the setup and let Ubuntu reboot. Rebooting should exit xhyve and dump you to your Mac console. This is correct!

Boot config

Now copy install.sh to boot.sh and make the following changes:

  • replace the files for INITRD and KERNEL with those you extracted from the VM
  • add root=/dev/vda1 to the Kernel command line after console=ttyS0
  • You may comment out the IMG_CD line

In future, to boot Ubuntu, you'll only need to run boot.sh and it will just work.

xhyve: lightweight vm for Mac OS X

xhyve is a port of bhyve, a qemu equivalent for FreeBSD, to the Mac OS X Hypervisor framework. This means it has zero dependencies that are not already installed on every Mac that runs at least OS X Yosemite (10.10). The cool thing though is that Mac OS X has full control of the system all the time, as no third party kernel driver hijacks the CPU when a VM is running, so the full power management that OS X provides is always in charge of everything. No more battery draining VMs \o/.

xhyve logo

xhyve is Open Source

This is really cool as everyone is able to hack it, so did I. The code is, like every bit of lowlevel C code I saw in my life, a bit complex to read and has not many comments but is very structured so you can easily find what you want to modify and do that.

The project is quite young so don't expect miracles. It has for example no graphics emulation. Running Ubuntu or FreeBSD is reduced to a serial tty and networking. If you want to run a virtual server on your Mac for development purposes it's quite perfect though.

There was one downer that got me: A virtual disk of say 30 GB has a backing file that is exactly 30 GB big even if you only store 400 MB on it. That's bad for everyone running on a SSD where space is limited as of now.

Introducing: Sparse HDD-Images for xhyve

Because the VM code is pretty small (the compiled executable of xhyve is about 230 KB) I thought it might be possible for me to change this one last thing that prevented me from using xhyve on my Macbook. It turns out it is really easy to hack the virtual block device subsystem. All disk access code is neatly contained in one file: blockif.c. It is cleanly separated from the virtio-block and ahci drivers.

So what I went out to do was three things:

  • Split the big disk image file into multiple segments (as for why read on)
  • Make the disk image segments only store blocks that have actual content in them (vs. storing only zeroes)
  • Make xhyve create the backing image files if they do not exist.

Splitting the disk image into segments

You may ask why. This started as an optimization for maintaining speed and aiding debugging, but turned out to have the following advantages:

  • Some file systems only allow files up to a maximum size (prime example: FAT32 only allows just under 4 GB per file)
  • Sparse image lookup tables can be filled with 32 bit values instead of defaulting to 64 bit (which saves 50% space in the lookup tables)
  • Debugging is easier as you may hexdump those smaller files on the terminal instead of loading a multi gigabyte file into the hex editor of your choice
  • Fragmentation of sparse images is reduced somewhat (probably not an issue for SSD backed files)
  • Growing disks is easy: just append a segment
  • Shrinking disks should be possible with help of the guest operating system, if it is able to clear the end of the disk of any data you could just delete a segment.

So splitting was implemented and was rather easy to think of: just divide the disk read offset by the segment size to get the segment and use a modulo operation to get the in-segment address. There's one catch: I had to revert from using preadv and pwritev to regular reads and writes. Usually you really want those v functions as they allow executing multiple read and write calls in one system call, thus being atomic. But these functions only work with one file descriptor and our reads probably span multiple segments and thus multiple file descriptors.
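The address calculation is easier to see with numbers. This is only an illustration in shell arithmetic, not the C code from blockif.c, and the function name locate_in_segment is invented for the example.

```shell
# Map an absolute disk offset (in bytes) to
# "<segment index> <offset inside that segment>".
locate_in_segment() {
    local offset="$1"
    local segment_size="$2"
    echo "$(( offset / segment_size )) $(( offset % segment_size ))"
}

# With 1 GiB segments, byte 2684354560 (2.5 GiB into the disk)
# lands half a GiB into the third segment (index 2).
locate_in_segment 2684354560 1073741824   # prints "2 536870912"
```

A read that starts near the end of one segment and runs past it has to be cut at the segment boundary and continued in the next file, which is exactly why a single preadv call no longer works.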

To make the thing easier and configurable I introduced two additional parameters for the block device configuration:

  • size the size of the backing file for the virtual disk
  • split the segment size. size should be a multiple of split to avoid wasting space.

You may use suffixes like k, m, g like on the RAM settings to avoid calculating the real byte sizes in your head ;)

Be aware: You may convert a disk from plain to split image either by using dd and splitting the image file (exact commands are left as an exercise to the reader) or by setting split to the old size of the image and size to a multiple of split effectively increasing the size of the disk by a multiple of the old size. New segments will be created automatically on next VM start.

Example config

Implementing sparse images

So the last step for making xhyve usable to me: Don't waste my disk space.

I think there are multiple methods for implementing efficient sparse images. I went for the following:

  • Only save sectors that contain actual data and not only zeroes
  • Minimum allocation size is one sector
  • Maintain a lookup table that references where each sector is saved in the image
  • Deallocation of a sector (e.g. overwriting with zeroes) is only handled by a shrink disk tool offline

So how does such a lookup table look?

A sparse disk lookup table is just an array of 32 bit unsigned integers, one for each sector. If you want to read sector 45 you just take the value of array position 45, multiply it by sector size and seek into the image segment to read from that address. Simple, isn't it?

In the current implementation the lookup table is written to a separate file with the extension .lut, and all writes to this file are synchronous. The other backing files are initially created as zero byte length files. When the guest OS starts writing data, the new sector is appended to the respective segment file and its offset is written to the lookup table.

The lookup table starts as an array full of UINT32_MAX values (0xffffffff) as this is the marker used to describe that this sector is not yet in the image and thus should be returned as a series of zero values. If a read finds an entry other than that marker the corresponding data is read from the segment file.
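The read path described above can be sketched in Python like this (the real code is C inside xhyve; read_sector and its parameters are names invented for this example, and the table is simply a list of integers for one segment):

```python
UNALLOCATED = 0xFFFFFFFF  # UINT32_MAX: sector was never written


def read_sector(lut, segment, sector, sector_size):
    """Return the contents of one sector of a sparse segment.

    lut     -- lookup table of this segment, one integer per sector
    segment -- the segment file, opened in binary mode
    sector  -- sector number relative to the start of the segment
    """
    entry = lut[sector]
    if entry == UNALLOCATED:
        # Not in the image yet: the guest sees a sector full of zeroes.
        return bytes(sector_size)
    segment.seek(entry * sector_size)
    return segment.read(sector_size)
```

Note that the table entry counts sectors, not bytes, so it has to be multiplied by the sector size before seeking.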

All lookup tables for all segment files are appended to the .lut file, so it contains multiple tables, not just one. The positive side of this is that the 32 bit entries only have to address sectors within one segment, so the maximum segment size is just under 2^32 (about 4 billion) sectors multiplied by the sector size. If you use an SSD as backing storage you probably should configure your sector size to 4KB as that is the sector size of most SSDs and will result in additional performance. So this will result in a maximum segment size of about 16 TB per segment, and I never heard of a Mac that has this much storage. (If yours has please send me a photo)
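As a quick sanity check of these numbers, here is a small Python calculation. It follows the lookup described earlier (table entry times sector size) and assumes the value UINT32_MAX is reserved as the "not allocated" marker; the function names are made up for this example:

```python
ENTRY_BYTES = 4            # one 32 bit unsigned integer per sector
MAX_ENTRY = 0xFFFFFFFF     # reserved as the "not allocated" marker


def max_segment_bytes(sector_size):
    """Largest segment that 32 bit sector indices can address."""
    # Usable entries are 0 .. MAX_ENTRY - 1, which is MAX_ENTRY sectors.
    return MAX_ENTRY * sector_size


def lut_bytes(disk_size, sector_size):
    """Size of the .lut file: one entry per sector of the whole disk."""
    return (disk_size // sector_size) * ENTRY_BYTES
```

For 4KB sectors max_segment_bytes(4096) is just under 2^44 bytes (16 TiB), and for a 20 GiB disk lut_bytes(20 * 1024**3, 4096) is exactly 20 MiB.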

Writes of new sectors (those appended to the segment file) are synchronous to avoid two sectors with the same address. Other writes are as the user configured them on the command line.
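The write path can be sketched in Python as follows. This is a simplified illustration, not the xhyve code: write_sector is a made-up name, the table is kept as a list in memory, and the sketch leaves out the synchronous flushing and the update of the .lut file. Skipping all-zero writes to unallocated sectors is my reading of "only save sectors that contain actual data":

```python
import io

UNALLOCATED = 0xFFFFFFFF  # UINT32_MAX: sector was never written


def write_sector(lut, segment, sector, data):
    """Write one sector (len(data) == sector size) to a sparse segment."""
    sector_size = len(data)
    entry = lut[sector]
    if entry == UNALLOCATED:
        if not any(data):
            return  # all zeroes: reads return zeroes anyway, save the space
        # New sector: append to the segment file and remember where it went.
        segment.seek(0, io.SEEK_END)
        lut[sector] = segment.tell() // sector_size
        segment.write(data)
    else:
        # Known sector: overwrite it in place, the file does not grow.
        segment.seek(entry * sector_size)
        segment.write(data)
```

Because new sectors are always appended, two appends must never run at the same time, otherwise both would pick the same position in the segment file.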

To enable sparse disk images just add sparse as a parameter to your configuration.

Sparse config example

Be aware: You'll have to recreate your disk image to benefit from this setting, as sparse disks are not compatible with normal disks.

Conclusion

I used this configuration:

-s 4,virtio-blk,test/hdd/hdd.img,sectorsize=4096,size=20G,split=1G,sparse

So this is how it looks on disk:

$ ls -lah
total 2785264  
drwxr-xr-x  23 dark  staff   782B Jan 16 01:27 .  
drwxr-xr-x   9 dark  staff   306B Jan 16 01:27 ..  
-rw-rw----   1 dark  staff   531M Jan 16 02:50 hdd.img.0000
-rw-rw----   1 dark  staff    12K Jan 16 01:18 hdd.img.0001
-rw-rw----   1 dark  staff   5.9M Jan 16 02:50 hdd.img.0002
-rw-rw----   1 dark  staff    24K Jan 16 01:18 hdd.img.0003
-rw-rw----   1 dark  staff   110M Jan 16 02:48 hdd.img.0004
-rw-rw----   1 dark  staff     0B Jan 16 01:11 hdd.img.0005
-rw-rw----   1 dark  staff   172M Jan 16 02:50 hdd.img.0006
-rw-rw----   1 dark  staff     0B Jan 16 01:11 hdd.img.0007
-rw-rw----   1 dark  staff   207M Jan 16 02:50 hdd.img.0008
-rw-rw----   1 dark  staff     0B Jan 16 01:11 hdd.img.0009
-rw-rw----   1 dark  staff    15M Jan 16 02:48 hdd.img.0010
-rw-rw----   1 dark  staff     0B Jan 16 01:11 hdd.img.0011
-rw-rw----   1 dark  staff    46M Jan 16 01:24 hdd.img.0012
-rw-rw----   1 dark  staff     0B Jan 16 01:11 hdd.img.0013
-rw-rw----   1 dark  staff   151M Jan 16 02:50 hdd.img.0014
-rw-rw----   1 dark  staff    12K Jan 16 01:18 hdd.img.0015
-rw-rw----   1 dark  staff    97M Jan 16 02:50 hdd.img.0016
-rw-rw----   1 dark  staff   5.3M Jan 16 02:48 hdd.img.0017
-rw-rw----   1 dark  staff     0B Jan 16 01:11 hdd.img.0018
-rw-rw----   1 dark  staff   4.0K Jan 16 01:15 hdd.img.0019
-rw-rw----   1 dark  staff    20M Jan 16 02:48 hdd.img.lut

The dump you see here is of an image where I just installed Ubuntu Server 15.10.
Instead of wasting 20GB of space this one only needs about 1.3GB. Speed is about the same as before (with an SSD as backing storage) but may suffer severely on a spinning rust disk as there are way more seeks once the sectors become fragmented.

Where to get it

Currently you will have to compile it yourself: just fetch the sparse-disk-image branch from https://github.com/dunkelstern/xhyve and execute make.