headscale & DERP
I was really happy with Tailscale, but I saw a potential need for dozens of subnet routers. That is possible with Tailscale, but it would most likely require an enterprise license, and you are not in control of the control servers or the DERP (relay) servers. I decided to set up a headscale server as a proof of concept and, depending on how that went, either move in that direction or stick with commercial Tailscale.
https://github.com/juanfont/headscale
Since we have projects in motion to start putting services in the public cloud (AWS, Azure, etc.), I decided to build the headscale server in the public cloud as well. After some research, it looked like Oracle Cloud had a very generous “always free” option where you could have two VMs entirely free.
I deployed an Ubuntu VM, got everything up to date, and then installed headscale. There isn’t a lot of documentation, so I struggled with it a bit, but I was able to get it working after some trial and error. Once it was installed, it took more trial and error to get my config right, but since then it has been rock solid.
I found some of the common commands a bit cumbersome to type, so I created a couple of aliases for them on the headscale server:
$ cat .bash_aliases
alias nodes="sudo headscale nodes list"
alias routes="sudo headscale routes list"
DERP (Designated Encrypted Relay for Packets)
With all my nodes moved to my headscale server and working great, I decided to test the embedded DERP relay server. DERP relays are used when a direct point-to-point WireGuard tunnel cannot be established, generally due to restrictive firewalls or CGN (carrier-grade NAT). The relays let the nodes establish encrypted TLS tunnels over TCP 443. To the network this is essentially just an encrypted session to a web server, so it is generally permitted through firewalls and CGN.
To force a node onto the relay, I set up an ACL to block UDP 41641 (Tailscale's default WireGuard port) from one of my nodes:
ip access-list extended force_derp
10 deny udp host 192.168.250.12 eq 41641 any
20 permit ip any any
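With an ACL like this in place, one way to confirm that a node has actually fallen back to a relay is to look at `tailscale ping` replies. The sketch below classifies a single reply line. It assumes relayed replies contain `via DERP(<region>)` and direct replies contain `via <ip>:<port>`. The function name `path_kind`, the peer name `nodeb`, and the sample lines are made up for illustration.

```shell
# Classify one line of `tailscale ping` output as relayed or direct.
# Assumption: relayed replies contain "via DERP(<region>)" and direct
# replies contain "via <ip>:<port>". The sample lines are invented.
path_kind() {
  case "$1" in
    *"via DERP("*) echo "relayed" ;;
    *"via "*)      echo "direct" ;;
    *)             echo "unknown" ;;
  esac
}

path_kind "pong from nodeb (100.64.0.2) via DERP(mad) in 31ms"
path_kind "pong from nodeb (100.64.0.2) via 203.0.113.7:41641 in 9ms"

# On a live node, real output could be fed through the same function:
#   tailscale ping nodeb | while read -r line; do path_kind "$line"; done
```

If the blocked node still reports a direct path, the ACL is probably not matching the traffic you expect.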
At this point, I discovered that headscale's default behavior is to use the embedded DERP relay in addition to the official Tailscale DERP servers. Because the Tailscale servers were closer to my nodes than my headscale server, the clients used them instead. I disabled that functionality, forcing the clients to use the headscale relay, and then finally set up two dedicated DERP servers.
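For reference, here is a rough sketch of what the DERP part of a headscale config might look like with the embedded relay enabled and the public Tailscale DERP map turned off. It is not the exact config used here. The key names follow headscale's example config as I understand it and may differ between versions, and the region ID, code, and name are placeholders. The snippet only writes a scratch file for review and does not touch a live config.

```shell
# Write a DERP config fragment to a scratch file for review.
# Assumption: key names match headscale's example config; verify them
# against the version you run. All region values are placeholders.
fragment="${TMPDIR:-/tmp}/headscale-derp-fragment.yaml"
cat > "$fragment" <<'EOF'
derp:
  server:
    enabled: true                 # turn on the embedded DERP relay
    region_id: 999                # placeholder region ID
    region_code: "headscale"      # placeholder region code
    region_name: "Headscale Embedded DERP"
    stun_listen_addr: "0.0.0.0:3478"
  urls: []                        # empty: do not fetch Tailscale's public DERP map
  paths: []                       # local DERP map files for dedicated relays go here
EOF
cat "$fragment"
```

Dedicated relays would then be described in a separate DERP map file listed under `paths`.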
As you can see in the output above, the node tests each of my three DERP servers and determines it has the lowest latency to the Madison server. However, relayed traffic works the other way around: Tailscale routes asymmetrically, sending traffic through the DERP relay nearest to each recipient.
The “word” node mostly uses the Madison relay server because most of the nodes are in the Madison area. However, you can see it uses the Ashburn relay to reach the tailscalecloud node, as that node is also in Ashburn.
Put it all together and you’ve got a hardware-agnostic DIY SD-WAN solution that runs entirely on open source software and just works.
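The relay selection described above can be sketched in a few lines of shell. This is a toy model and not how the Tailscale client is implemented: each node picks the relay with the lowest measured latency as its home, and a sender reaches a peer through that peer's home relay. The function names, the region codes (`mad`, `ash`, `hs`), and all latency numbers are invented for illustration.

```shell
# home_relay: read "region latency_ms" lines on stdin and print the
# region with the lowest latency.
home_relay() {
  awk 'NR == 1 || $2 + 0 < min { min = $2 + 0; best = $1 } END { print best }'
}

# relay_for: given a recipient name ($1) and a "node home_region" table
# on stdin, print the relay a sender would use to reach that recipient.
relay_for() {
  awk -v node="$1" '$1 == node { print $2 }'
}

# Invented latencies for one node; region codes are placeholders.
printf '%s\n' "mad 12" "ash 38" "hs 55" | home_relay       # prints mad

# Invented home-relay table: traffic to a node goes via that node's home.
homes='word mad
tailscalecloud ash'
printf '%s\n' "$homes" | relay_for tailscalecloud          # prints ash
printf '%s\n' "$homes" | relay_for word                    # prints mad
```

This matches the behavior in the text: a node in Madison still sends through Ashburn when the recipient's home relay is Ashburn.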