Saturday, July 30, 2016

Enable PXE netboot in KVM guests for testing UEFI

Many sysadmins and infrastructure engineers use Oracle VirtualBox with Vagrant to quickly test new system configurations without having to test on bare metal. VirtualBox is very convenient and its host bridge networking works out of the box, but unfortunately VirtualBox doesn't support UEFI PXE netboot (it only supports legacy BIOS PXE).

On the other hand, the virt-manager GUI front-end for libvirt does support UEFI through TianoCore, the open-source UEFI implementation originally developed by Intel. But when I created a VM using virt-manager's default macvtap host bridge to my wired interface, the VM could not reach my DHCP server to get an IP address for PXE netboot.

It turns out that this is a very common problem for libvirt users. The solution is to create a new bridge interface, add your wired interface as a slave of the new bridge, and set the forwarding delay on the bridge to 0. The default forwarding delay is quite high (15 seconds), so the bridge was not yet forwarding packets when my KVM VMs sent their DHCP requests, which is why they couldn't reach my DHCP server. Optionally, you can also disable Spanning Tree Protocol on the bridge.

Here is a simple script I wrote to set up a bridge so that KVM guests can communicate with the host:

#!/bin/bash
# setup-bridge.sh
# Last Updated: 2016-07-28
# Jun Go gojun077

# This script creates a linux bridge with a single ethernet
# iface as a slave. It takes 2 arguments:
# (1) Name of bridge
# (2) wired interface name (to be slave of bridge)
# The ip address for the bridge will be assigned by the script
# 'setup_ether.sh'

# This script must be executed as root and requires the
# packages bridge-utils and iproute2 to be installed.

# USAGE: sudo ./setup-bridge.sh <bridge name> <wired iface>
# (ex) sudo ./setup-bridge.sh br0 enp1s0

if [ -z "$1" ]; then
  echo "Must enter bridge name"
  exit 1
elif [ -z "$2" ]; then
  echo "Must enter name of wired iface to be used as slave for bridge"
  exit 1
else
  # create bridge and change its state to UP
  ip l add name "$1" type bridge
  ip l set "$1" up
  # set bridge forwarding delay to 0
  brctl setfd "$1" 0
  # Add wired interface to the bridge
  ip l set "$2" up
  ip l set "$2" master "$1"
  # Show active interfaces
  ip a show up
fi
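
To confirm that the script took effect, the bridge settings can be read back from sysfs. The following is a minimal sketch and not part of the original script: the function name bridge_status and the SYSFS_NET variable are hypothetical additions, and SYSFS_NET exists only so the function can be pointed at a scratch directory instead of the real /sys/class/net.

```shell
#!/bin/bash
# Sketch: print the forwarding delay and STP state of a Linux bridge.
# SYSFS_NET is a hypothetical override; it defaults to the real sysfs path.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

bridge_status() {
  local br="$1"
  local dir="$SYSFS_NET/$br/bridge"
  # Only bridge devices have a "bridge" subdirectory in sysfs
  if [ ! -d "$dir" ]; then
    echo "$br is not a bridge" >&2
    return 1
  fi
  # forward_delay is reported in hundredths of a second;
  # stp_state 0 means Spanning Tree Protocol is disabled
  echo "$br forward_delay=$(cat "$dir/forward_delay") stp_state=$(cat "$dir/stp_state")"
}
```

After running setup-bridge.sh, calling bridge_status br0 should report forward_delay=0. If stp_state is not 0 and you want STP off, brctl stp br0 off disables it.
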
==========================================
Update 2016-10-8
You can also create the bridge interface statically using systemd-networkd conf files. Here is a sample br0.netdev which should be placed in /etc/systemd/network/ :

[NetDev]
Name=br0
Kind=bridge

[Bridge]
ForwardDelaySec=0

You would then add the IP address, Gateway, DNS, etc. info to br0.network in the same path as br0.netdev:

[Match]
Name=br0

[Network]
DNS=168.126.63.1

[Address]
Address=192.168.95.95/24
Broadcast=192.168.95.255

[Route]
Gateway=192.168.95.97

Next, you need to make your wired interface a slave of the bridge device. Assuming your wired interface is named enp2s0, here is a sample systemd-networkd conf file, enp2s0.network, with the required settings:

[Match]
Name=enp2s0

[Network]
Bridge=br0

Finally, to apply these settings, run systemctl start systemd-networkd, and run systemctl enable systemd-networkd so that the bridge is also created at boot.
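
The two files that do not contain site-specific addresses can also be generated by a small script. This is only a sketch: the function name write_bridge_conf is hypothetical, and the target directory, bridge name and interface name are passed as arguments so the function can be tried on a scratch directory before writing to /etc/systemd/network/.

```shell
#!/bin/bash
# Sketch: write <bridge>.netdev and <iface>.network into a directory.
# write_bridge_conf is a hypothetical helper name, not a systemd tool.
write_bridge_conf() {
  local dir="$1" br="$2" iface="$3"
  mkdir -p "$dir" || return 1

  # <bridge>.netdev: defines the bridge with no forwarding delay
  cat > "$dir/$br.netdev" <<EOF
[NetDev]
Name=$br
Kind=bridge

[Bridge]
ForwardDelaySec=0
EOF

  # <iface>.network: attaches the wired interface to the bridge
  cat > "$dir/$iface.network" <<EOF
[Match]
Name=$iface

[Network]
Bridge=$br
EOF
}
```

For example, running write_bridge_conf /etc/systemd/network br0 enp2s0 as root would produce br0.netdev and enp2s0.network. The br0.network file with your address, gateway and DNS still has to be written by hand. Once systemd-networkd is running, networkctl status br0 shows whether the bridge came up.
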
==========================================

If you do not set the forwarding delay to 0, the bridge will not forward your KVM guest's packets right away, and the guest's DHCP requests during PXE boot will time out before they reach your DHCP server.

Now you can assign an IP address to your new bridge.

The next step is to install the OVMF UEFI firmware for QEMU / KVM if you don't have it yet. On Fedora 22+, OVMF is a dependency of libvirtd, so you probably already have it installed. It should just work. The default nvram setting for OVMF firmware in /etc/libvirt/qemu.conf for Fedora is as follows:

nvram = [
   "/usr/share/OVMF/OVMF_CODE.fd:/usr/share/OVMF/OVMF_VARS.fd"
]

This will normally be commented out as it is the default setting.

The name of the OVMF package in Fedora is edk2-ovmf and it contains the following files:

$ rpm -ql edk2-ovmf
/usr/share/doc/edk2-ovmf
/usr/share/doc/edk2-ovmf/README
/usr/share/edk2
/usr/share/edk2/ovmf
/usr/share/edk2/ovmf/EnrollDefaultKeys.efi
/usr/share/edk2/ovmf/OVMF_CODE.fd
/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd
/usr/share/edk2/ovmf/OVMF_VARS.fd
/usr/share/edk2/ovmf/Shell.efi
/usr/share/edk2/ovmf/UefiShell.iso

On Arch Linux, there is also an ovmf package in the default repos, but it doesn't contain all the image files necessary for UEFI. Instead, you need to install the ovmf-git package from the Arch User Repository. ovmf-git contains the following files:

[archjun@pinkS310 bin]$ pacman -Ql ovmf-git
ovmf-git /usr/
ovmf-git /usr/share/
ovmf-git /usr/share/ovmf/
ovmf-git /usr/share/ovmf/ia32/
ovmf-git /usr/share/ovmf/ia32/ovmf_code_ia32.bin
ovmf-git /usr/share/ovmf/ia32/ovmf_ia32.bin
ovmf-git /usr/share/ovmf/ia32/ovmf_vars_ia32.bin
ovmf-git /usr/share/ovmf/x64/
ovmf-git /usr/share/ovmf/x64/ovmf_code_x64.bin
ovmf-git /usr/share/ovmf/x64/ovmf_vars_x64.bin
ovmf-git /usr/share/ovmf/x64/ovmf_x64.bin

You probably noticed that the filenames are slightly different from those on Fedora. Likewise, the nvram setting in /etc/libvirt/qemu.conf is also different for Arch Linux:

# uefi nvram files from ovmf-git AUR
nvram = [
   "/usr/share/ovmf/x64/ovmf_x64.bin:/usr/share/ovmf/x64/ovmf_vars_x64.bin"
]

After changing this setting you need to restart the libvirtd systemd service with sudo systemctl restart libvirtd.
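
Because the firmware paths differ between distributions, it can help to check which pair of files is actually present before editing qemu.conf. The sketch below tries only the three code:vars pairs that appear in this post and prints the first one whose files both exist. The function name find_ovmf_pair and its optional root-prefix argument are hypothetical; the prefix is there only so the function can be tried against a scratch directory.

```shell
#!/bin/bash
# Sketch: print the first OVMF code:vars pair found on this system, in the
# format used by the nvram list in /etc/libvirt/qemu.conf.
# The optional first argument is a hypothetical root prefix (default: none).
find_ovmf_pair() {
  local root="${1:-}"
  local pair code vars
  for pair in \
    "/usr/share/OVMF/OVMF_CODE.fd:/usr/share/OVMF/OVMF_VARS.fd" \
    "/usr/share/edk2/ovmf/OVMF_CODE.fd:/usr/share/edk2/ovmf/OVMF_VARS.fd" \
    "/usr/share/ovmf/x64/ovmf_x64.bin:/usr/share/ovmf/x64/ovmf_vars_x64.bin"
  do
    code="${pair%%:*}"   # part before the colon
    vars="${pair##*:}"   # part after the colon
    if [ -f "$root$code" ] && [ -f "$root$vars" ]; then
      echo "$pair"
      return 0
    fi
  done
  echo "no OVMF firmware pair found" >&2
  return 1
}
```

Calling find_ovmf_pair with no argument checks the real filesystem. If it prints a pair, that string is a candidate for the nvram list; if it prints nothing, the OVMF package is probably missing or installs its files somewhere else.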

You should now be able to install Linux over UEFI PXE netboot into KVM guests. Note that after completing your PXE install and the first reboot, you will have to enter the UEFI firmware menu and select your new Linux boot partition in the boot menu. To access the Tianocore UEFI firmware menu, press ESC when you see the Tianocore logo at POST. I have attached screenshots below.

1. Tianocore UEFI boot screen

2. Grub2 boot menu for UEFI PXE

3. Start UEFI PXE installation for RHEL 7.2

4. VNC server on the KVM guest trying to reverse-connect to the listening VNC client on my host

5. VNC reverse-connect succeeded; now viewing the automated installation in a GUI over VNC


6. After reboot, press ESC at POST to enter Tianocore UEFI Firmware menu

7. Enter the Boot Manager menu (through OVMF Platform Config, if I recall)
The first highlighted entry in the boot manager is the bridge device.

8. Go down to UEFI Misc Device to select your newly-installed Linux EFI boot partition.

9. You can also specify that the UEFI misc device be used on the next reboot
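
The guest in the screenshots was created with virt-manager, but a similar guest could probably be created from the command line with virt-install. This is a rough sketch under several assumptions: the helper name create_uefi_pxe_guest is hypothetical, and the memory size, vCPU count, disk size and os-variant value are placeholders rather than the settings I actually used. Setting DRY_RUN=1 prints the command instead of running it.

```shell
#!/bin/bash
# Sketch: create a UEFI guest attached to a bridge and boot it from PXE.
# create_uefi_pxe_guest is a hypothetical helper; sizes are placeholders.
create_uefi_pxe_guest() {
  local name="$1" bridge="$2"
  local cmd=(virt-install
    --name "$name"
    --memory 2048 --vcpus 2
    --disk size=20
    --network "bridge=$bridge"
    --os-variant generic
    --boot uefi
    --pxe)
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command line without executing it
    echo "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}
```

For example, DRY_RUN=1 create_uefi_pxe_guest testvm br0 prints the full virt-install command line so that you can review it first. The --boot uefi option relies on libvirt being able to find the OVMF firmware described above.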


References:

http://www.tianocore.org/ovmf/
http://wiki.libvirt.org/page/PXE_boot_(or_dhcp)_on_guest_failed
https://bugzilla.redhat.com/show_bug.cgi?id=533684
https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF#Configuring_libvirt