Advanced Testing using UserModeLinux - Richard Weinberger, sigma star gmbh [Open Source Summit EU 2019]

UserModeLinux is not really a hot topic, though the use cases still exist. Its main advantage is that you can use it without any support from the host kernel (unlike e.g. KVM). It also makes some interesting features possible, like time travel.

UserModeLinux has been superseded by KVM for most use cases, and new developers hardly know it. There are still hacks and projects for which it is useful, for example testing.

For advanced (special-purpose) testing, there is not a single tool that fits all. UML can be one of these tools.

UML is Linux ported to the Linux syscall API, which takes the place of the “hardware”. The kernel runs as a regular process, and no root rights are required. Everything that normally depends on hardware is provided by the host’s syscall API. For example, a SIGSEGV delivered by the host is handled as a page fault.

The first implementation of UML was called LinuxOnLinux. UML only works on x86. It is not fast: KVM is faster, but UML is still faster than full system emulation (qemu without KVM). It has been in mainline since 2.5. Unlike a normal kernel build, you need a toolchain with libc. You build with ARCH=um, typically without CROSS_COMPILE. To build a 32-bit UML, use SUBARCH=i386.

The build creates a file called linux, a dynamically linked ELF file that depends on librt, libpthread, etc. Its command-line arguments are the Linux kernel command line. rootfstype=hostfs uses the host’s root as the internal root. Disk images can be passed with ubda (a userspace block device). With con=xterm, a new xterm is launched for each virtual console that is opened.
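The build and run options mentioned above can be collected in a small helper. This is only a sketch: the function names, the image name rootfs.img and the default path ./linux are made up for illustration, and nothing is executed here. The lists are only assembled and printed.

```python
import shlex


def uml_build_command(bits32=False):
    """Return the make invocation for a UML build (no CROSS_COMPILE)."""
    argv = ["make", "ARCH=um"]
    if bits32:
        # 32-bit build, as described in the notes.
        argv.append("SUBARCH=i386")
    return argv


def uml_run_command(binary="./linux", disk_image=None,
                    host_root=False, xterm_consoles=False):
    """Return the argv for starting the resulting UML binary.

    Everything after the binary name is the kernel command line.
    """
    argv = [binary]
    if disk_image is not None:
        # ubda: first userspace block device, backed by an image file.
        argv.append(f"ubda={disk_image}")
    if host_root:
        # Use the host's root filesystem as the internal root.
        argv.append("rootfstype=hostfs")
    if xterm_consoles:
        # One xterm per virtual console that gets opened.
        argv.append("con=xterm")
    return argv


build = uml_build_command(bits32=True)
run = uml_run_command(disk_image="rootfs.img", xterm_consoles=True)  # hypothetical image
print(shlex.join(build))
print(shlex.join(run))
```

Passing either list to subprocess.run would actually start the build or the UML instance.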

UML is a kernel and hypervisor in one. There is no emulation of low-level CPU instructions. To emulate various things to userspace, you can just modify UML. For example, you can change some code in UML to give the functionality of faketime.

You can glue UML to other userspace libraries to emulate any kind of functionality. This can be an easy way to test userspace binaries (within UML) by fudging syscall results. For example, you can make it load a pcap file and use that as input on a socket (see the UML_NET_PCAP driver).
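The idea of feeding recorded traffic to the code under test can be shown in plain userspace, without UML. The sketch below is not the UML driver. It assumes the classic little-endian pcap file layout (24-byte global header, 16-byte record headers), builds a tiny capture in memory, and replays its frames into a Unix datagram socket pair. All function and variable names are hypothetical.

```python
import socket
import struct

# Classic pcap layout, little-endian, microsecond timestamps.
PCAP_GLOBAL_HDR = struct.Struct("<IHHiIII")  # magic, major, minor, zone, sigfigs, snaplen, linktype
PCAP_RECORD_HDR = struct.Struct("<IIII")     # ts_sec, ts_usec, incl_len, orig_len
PCAP_MAGIC = 0xA1B2C3D4


def write_pcap(frames):
    """Serialize frames into an in-memory pcap file (stand-in for a real capture)."""
    out = [PCAP_GLOBAL_HDR.pack(PCAP_MAGIC, 2, 4, 0, 0, 65535, 1)]
    for i, frame in enumerate(frames):
        out.append(PCAP_RECORD_HDR.pack(i, 0, len(frame), len(frame)))
        out.append(frame)
    return b"".join(out)


def read_pcap(data):
    """Yield the captured bytes of each record in a pcap byte string."""
    magic = struct.unpack_from("<I", data, 0)[0]
    if magic != PCAP_MAGIC:
        raise ValueError("only little-endian microsecond pcap is handled here")
    offset = PCAP_GLOBAL_HDR.size
    while offset < len(data):
        _sec, _usec, incl_len, _orig_len = PCAP_RECORD_HDR.unpack_from(data, offset)
        offset += PCAP_RECORD_HDR.size
        yield data[offset:offset + incl_len]
        offset += incl_len


def replay(data, sock):
    """Send every captured frame into sock; return the number of frames sent."""
    count = 0
    for frame in read_pcap(data):
        sock.send(frame)
        count += 1
    return count


# A datagram socket pair keeps frame boundaries intact.
injector, under_test = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
capture = write_pcap([b"first frame", b"second frame"])
sent = replay(capture, injector)
received = [under_test.recv(65535) for _ in range(sent)]
injector.close()
under_test.close()
print(received)
```

Inside UML the same replay would happen below the syscall boundary, so the program under test needs no changes.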

To debug UML, you can use earlyprintk, which calls fprintf directly.

For testing timeouts in userspace, you can use CONFIG_UML_TIME_TRAVEL_SUPPORT. With it, UML does not actually sleep when something tries to sleep; it just advances its internal clock instead. Infinite CPU power mode is an alternative in which time doesn’t advance at all unless something is sleeping. The patch implementing this has a tiny diffstat (+233-7), which shows how easy it is to add stuff like this to UML. This patch is a nice introduction to how to write features in UML.
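The two behaviours can be modelled with a toy clock. This is a sketch of the concept only, not of the kernel patch. The class TimeTravelClock, the helper wait_with_timeout and the inf_cpu flag are names invented for this example.

```python
import time


class TimeTravelClock:
    """Toy clock whose sleeps complete instantly.

    inf_cpu=False: real time still passes while code runs, and each sleep
                   additionally jumps the clock forward.
    inf_cpu=True:  time stands still while code runs and moves only on sleep.
    """

    def __init__(self, inf_cpu=False):
        self.inf_cpu = inf_cpu
        self._start = time.monotonic()
        self._skipped = 0.0  # total seconds jumped over by sleep()

    def now(self):
        if self.inf_cpu:
            return self._skipped
        return (time.monotonic() - self._start) + self._skipped

    def sleep(self, seconds):
        # Do not block: account for the requested delay and return at once.
        self._skipped += seconds


def wait_with_timeout(clock, poll, timeout, interval=1.0):
    """Poll until poll() is true or `timeout` clock seconds have elapsed."""
    deadline = clock.now() + timeout
    while clock.now() < deadline:
        if poll():
            return True
        clock.sleep(interval)
    return False


# A one-hour timeout that never succeeds, exercised without waiting an hour.
clock = TimeTravelClock(inf_cpu=True)
real_start = time.monotonic()
timed_out = not wait_with_timeout(clock, poll=lambda: False, timeout=3600)
real_elapsed = time.monotonic() - real_start
print(timed_out, clock.now(), real_elapsed < 1.0)
```

The code under test only ever sees the clock object, which is the role UML’s internal clock plays for processes running inside it.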

There are quite a few interesting features upstream, but you may need to modify UML code for your specific use case.

UML builds the kernel as a normal C program, so no -ffreestanding etc. The userspace programs are just execve’d, but their syscalls are intercepted using ptrace. That is what makes UML architecture-specific, because it needs to understand the details of the syscall ABI. Via ptrace, UML turns every syscall of the traced process into a no-op on the host and executes its own syscall handler instead. For page faults, it uses SIGSEGV. The memory is modeled as a large tempfile that is mmapped. When you access outside of that mmap, a SIGSEGV is raised and UML handles it as a page fault. The UML process and the userspace processes are really separate processes for the host. So copy_from_user does actual copying between the two processes.
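The host mechanism behind the mmapped tempfile can be tried out in a few lines. The sketch below shows only that two shared mappings of the same temporary file see the same bytes. It does not reproduce UML’s ptrace or SIGSEGV handling, and the sizes and variable names are arbitrary choices for the example.

```python
import mmap
import tempfile

PAGE = mmap.PAGESIZE
MEM_SIZE = 16 * PAGE  # arbitrary size for the pretend "physical memory"

with tempfile.TemporaryFile() as backing:
    # Grow the (sparse) file to the desired memory size.
    backing.truncate(MEM_SIZE)

    # First mapping: the whole file, as the "kernel" would see it.
    kernel_view = mmap.mmap(backing.fileno(), MEM_SIZE)
    # Second mapping: only page 3, as it might appear in a "process".
    process_view = mmap.mmap(backing.fileno(), PAGE, offset=3 * PAGE)

    # A write through one mapping is visible through the other,
    # because both are shared mappings of the same file.
    kernel_view[3 * PAGE:3 * PAGE + 5] = b"hello"
    seen = process_view[:5]

    process_view[5:11] = b" world"
    echoed = kernel_view[3 * PAGE:3 * PAGE + 11]

    process_view.close()
    kernel_view.close()

print(seen, echoed)
```

Which page of the file shows up at which address is decided per mapping, which is the same kind of decision a page table encodes.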

UML has no IOMEM or DMA support, so drivers only work if they don’t depend on that. For example, Richard modified the nandsim driver to also work without IOMEM so it could be used in UML.

Linux has a huge number of symbols. Since it is normally not linked with libc, some of them clash with libc symbols. Therefore, UML has to rename some of them. Calls to the host stall the UML process, so they’re like an NMI.

Missing at the moment:

  • support for non-x86
  • SMP
  • documentation