As a packet slinger by trade, a network admin, and also a bit of a programmer, I recently embarked on a quest for better throughput: to find a way to get more packets through a small-footprint, low-cost device running a VPN tunnel. I will review my findings here over the next few weeks.
It’s been a quest that has taken me deep into the guts of Linux and even into kernel space. A device capable of routing IP packets at near line rate on a gigabit connection drops to less than a hundred megabits per second when we use a tunnel or VPN based on the tun/tap driver. As we shift into a world that requires VPNs to the office while working from home, performance on these low-cost, small-footprint devices becomes more and more important.
I discovered that part of the problem is that when using tun/tap we rely very heavily on read(), write() and memcpy() to move small pieces of data from the kernel to the user-space application, manipulate them there, and then move them back again for further processing by the network stack. Essentially, we’re relying not only on the speed of the CPU, but also on the speed of the memory and even the system bus. To get the most out of these small devices we need to move the data as few times as possible, minimising the amount of read/write/copy, in order to maximise the packet throughput.
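To make that copy path concrete, here is a minimal user-space sketch in C, assuming a Linux system with the tun driver and kernel headers available. The helper names `open_tun` and `relay_one`, the 2048-byte buffer and the XOR "processing" step are all hypothetical, made up for illustration and not taken from any real VPN. The point is only to show that every packet crosses the kernel/user boundary twice and gets touched once more in between.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Hypothetical helper: attach to a tun interface called `name`.
 * Usually needs root (CAP_NET_ADMIN). Returns a file descriptor or -1. */
int open_tun(const char *name)
{
    struct ifreq ifr;
    int fd = open("/dev/net/tun", O_RDWR);

    if (fd < 0)
        return -1;

    memset(&ifr, 0, sizeof(ifr));
    ifr.ifr_flags = IFF_TUN | IFF_NO_PI;          /* raw IP packets, no extra header */
    strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);

    if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: move one packet from in_fd to out_fd.
 * Adds the number of bytes that crossed the kernel/user boundary
 * to *bytes_copied. Returns bytes written, 0 on EOF, -1 on error. */
ssize_t relay_one(int in_fd, int out_fd, size_t *bytes_copied)
{
    unsigned char pkt[2048];
    ssize_t n, w, i;

    n = read(in_fd, pkt, sizeof(pkt));            /* copy 1: kernel -> user space */
    if (n <= 0)
        return n;

    for (i = 0; i < n; i++)                       /* stand-in for encrypt/encapsulate: */
        pkt[i] ^= 0x5a;                           /* touches every byte once more      */

    w = write(out_fd, pkt, (size_t)n);            /* copy 2: user space -> kernel */
    if (w < 0)
        return -1;

    *bytes_copied += (size_t)n + (size_t)w;
    return w;
}
```

In a real tunnel, `in_fd` would be the descriptor returned by `open_tun()` and `out_fd` a UDP socket (and the reverse for the return direction). Because `relay_one` only needs file descriptors, you can also try it with a pair of pipes without needing root.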
WireGuard achieves amazing performance because of its in-kernel implementation (unlike OpenVPN, for example). Similarly, I discovered that “Foo over UDP” (FoU) is an in-kernel technology also capable of tunnelling packets at very high efficiency over a simple UDP tunnel. Looking at these two examples, WireGuard and FoU, and comparing them to their user-land equivalents like strongSwan and OpenVPN, I thought I’d see what would be involved in implementing a custom solution using an in-kernel module for a certain client project we’ve been working on.
Let us start our journey down the rabbit hole with a simple “Hello, World!” example. The goal is a kernel module that creates a /proc file which, when read, gives us a response back from the kernel. As a first step, the module below simply writes a message to the kernel log when it is loaded and again when it is removed.
So, get set up with a new project with all your usual tricks (like Git for versioning, BackupPC for backups, etc). I like “Visual Studio Code” as an IDE as it runs natively on Linux now.
In your new project, start with the following hello.c file.
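    #include <linux/kernel.h>   /* We're doing Kernel work */
    #include <linux/module.h>   /* Specifically a Kernel Module */
    #include <linux/init.h>     /* Needed for the module_init/exit() macros */

    /* Runs once when the module is loaded (insmod). */
    static int __init hello_init(void)
    {
        printk(KERN_INFO "Hello, world!\n");
        return 0;   /* 0 means the module loaded successfully */
    }

    /* Runs once when the module is removed (rmmod). */
    static void __exit hello_exit(void)
    {
        printk(KERN_INFO "Goodbye, world!\n");
    }

    module_init(hello_init);
    module_exit(hello_exit);

    MODULE_LICENSE("GPL");
    MODULE_AUTHOR("Rob Hartzenberg ");
    MODULE_DESCRIPTION("Hello world example.");
    /* Note: newer kernels (around 5.12 onwards) no longer provide this macro;
     * drop the next line if the build complains about it. */
    MODULE_SUPPORTED_DEVICE("hello");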
Once compiled, you can use “sudo insmod ./hello.ko” to load the module. You should see “Hello, world!” near the end of the output of “sudo dmesg | tail”. Then you can “sudo rmmod hello” to remove it, and check “sudo dmesg | tail” again for the “Goodbye, world!” message.
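If you would rather confirm the load from code than by eyeballing dmesg, the kernel lists loaded modules in /proc/modules, one per line, with the module name as the first space-separated field. The following is a small user-space sketch in C under that assumption; the function name `module_listed` is hypothetical and chosen just for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `name` is the first field of any
 * line read from `fp` (e.g. an open /proc/modules), otherwise 0. */
int module_listed(FILE *fp, const char *name)
{
    char line[512];
    size_t len = strlen(name);
    int at_line_start = 1;

    while (fgets(line, sizeof(line), fp)) {
        /* Only compare at the start of a real line, and require a space
         * after the name so "hello" does not match "hello_world". */
        if (at_line_start && strncmp(line, name, len) == 0 && line[len] == ' ')
            return 1;

        /* If no newline was read, the next chunk continues this line. */
        at_line_start = (strchr(line, '\n') != NULL);
    }
    return 0;
}
```

To use it, open the file with `fopen("/proc/modules", "r")`, pass the handle together with `"hello"`, and close it afterwards. It should report 1 after insmod and 0 again after rmmod.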
Try it, and let us know how it goes!
See you in part 2…