Transcript
Advanced Computer Networks 263-3501-00
Network I/O Virtualization
Patrick Stuedi, Spring Semester 2014
© Oriana Riva, Department of Computer Science | ETH Zürich
Tuesday, 13 May 2014
Outline
Last week: Software Defined Networking, OpenFlow
Today: Network I/O Virtualization, Paravirtualization, SR-IOV
Data Transfer, Non-Virtualized
[Diagram: the application issues a syscall into the OS (TCP/IP stack and driver); the driver uses privileged instructions to control the NIC]
1) Application: syscall, e.g., socket.write() (see the sketch below)
2) OS driver: issue PCI commands, set up the DMA operation
3) NIC: transmit the data, raise an interrupt when done
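As a minimal illustration of step 1, the sketch below sends one UDP datagram from user space; the destination address and port are placeholders, and sendto() stands in for the slide's socket.write(). The sendto() call is the syscall boundary where the kernel's TCP/IP stack and NIC driver take over (steps 2 and 3).

```c
/* Minimal sketch of step 1: the application-level syscall that starts the
 * transmit path. Address and port are placeholders (TEST-NET range). */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);         /* create a socket           */

    struct sockaddr_in dst = {0};
    dst.sin_family = AF_INET;
    dst.sin_port = htons(9999);                      /* placeholder port          */
    inet_pton(AF_INET, "192.0.2.1", &dst.sin_addr);  /* placeholder destination   */

    const char msg[] = "hello";
    /* From here on the OS driver sets up DMA and the NIC transmits the data
     * and raises an interrupt when done.                                    */
    sendto(fd, msg, sizeof msg, 0, (struct sockaddr *)&dst, sizeof dst);

    close(fd);
    return 0;
}
```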
Virtualization and Hypervisors
[Diagram: three VMs (VM1-VM3), each running a guest application on a guest operating system, on top of a hypervisor and the hardware]
How does network access through the hypervisor work?
Option 1: Full Device Emulation
[Diagram: the guest application and guest OS run unmodified; privileged instructions from the guest driver trap into the hypervisor's device-emulation layer, which accesses the hardware]
Guest OS is unaware that it is being virtualized
Hypervisor emulates the device at the lowest level
Privileged instructions from the guest driver trap into the hypervisor (see the sketch below)
Advantage: no changes to the guest OS required
Disadvantage: inefficient, complex
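The fragment below is a purely hypothetical sketch of the trap-and-emulate flow: the hypervisor catches a privileged port write from the guest driver and emulates the device register in software before resuming the guest. None of the names (emulated_nic, NIC_TX_DOORBELL, handle_pio_exit) correspond to a real hypervisor API; they only illustrate the idea.

```c
/* Hypothetical trap-and-emulate sketch for a port-mapped NIC register.
 * All names are made up for illustration. */
#include <stdint.h>
#include <stdio.h>

#define NIC_TX_DOORBELL 0xC000     /* made-up I/O port of the emulated NIC */

struct emulated_nic {
    uint32_t tx_ring_head;         /* emulated device state kept by the VMM */
};

/* Called by the hypervisor after a privileged OUT instruction in the guest
 * driver trapped (VM exit). The register write is emulated in software and
 * the guest is resumed afterwards. */
static void handle_pio_exit(struct emulated_nic *nic, uint16_t port, uint32_t value)
{
    if (port == NIC_TX_DOORBELL) {
        nic->tx_ring_head = value;             /* update emulated device state  */
        printf("emulate: transmit descriptors up to %u\n", value);
        /* ...the back end would now copy the packet out and send it on the host. */
    }
}

int main(void) {
    struct emulated_nic nic = {0};
    handle_pio_exit(&nic, NIC_TX_DOORBELL, 4); /* pretend the guest wrote 4 */
    return 0;
}
```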
Option 2: Paravirtualization
[Diagram: the guest OS runs a paravirtual driver that talks to the hypervisor's device emulation through paravirtual interfaces]
Guest OS is aware that it is being virtualized
Runs special paravirtual device drivers
Hypervisor cooperates with the guest OS through paravirtual interfaces
Advantage: better performance, simple
Disadvantage: requires changes to the guest OS
Paravirtualization with VirtIO
[Diagram: front-end drivers in the guest OS talk over virtio to back-end drivers and device emulation in the hypervisor (e.g., KVM or lguest), which drives the hardware]
VirtIO: I/O virtualization framework for Linux
Framework for developing paravirtual drivers
Split driver model: front-end and back-end driver
APIs for the front-end and back-end to communicate (see the front-end sketch below)
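As a sketch of the front-end half of the split driver model, the fragment below uses the Linux virtqueue API to expose one buffer to the back-end and notify ("kick") it, which ultimately traps to the hypervisor. This is kernel code, not a standalone program, and it assumes vq and buf were already set up by the driver's probe routine.

```c
/* Front-end fragment: hand a buffer to the back-end through a virtqueue. */
#include <linux/gfp.h>
#include <linux/scatterlist.h>
#include <linux/virtio.h>

static int send_one_buffer(struct virtqueue *vq, void *buf, unsigned int len)
{
    struct scatterlist sg;
    int err;

    sg_init_one(&sg, buf, len);                               /* describe the buffer        */
    err = virtqueue_add_outbuf(vq, &sg, 1, buf, GFP_ATOMIC);  /* expose it to the back-end  */
    if (err)
        return err;

    virtqueue_kick(vq);                                       /* notify the back-end / host */
    return 0;
}
```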
Example: KVM Hypervisor
[Diagram: VM1 (guest application + guest OS) runs in guest mode as a single Linux process; QEMU runs in user mode alongside other user applications; the KVM kernel module handles VM Enter/VM Exit and exposes /dev/kvm]
Based on Intel VT-x
Each VM is a single Linux process
Additional guest execution mode (guest user, guest kernel)
I/O in the guest OS traps into KVM (VM Exit)
KVM schedules the QEMU process to emulate the I/O operation
Starting a new guest = starting a QEMU process
The QEMU process interacts with KVM through ioctl() on /dev/kvm to (see the sketch below):
  Allocate memory for the guest
  Start the guest
  ...
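To make the /dev/kvm control path concrete, here is a minimal sketch of a user-space VMM (what QEMU does, heavily simplified) creating a VM, registering guest memory, and creating a vCPU through KVM's ioctl interface. Error handling and loading of actual guest code are omitted, and the 2 MiB memory size is an arbitrary choice for the example.

```c
/* Minimal /dev/kvm sketch: create a VM, give it memory, create a vCPU. */
#include <fcntl.h>
#include <linux/kvm.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    int kvm = open("/dev/kvm", O_RDWR);              /* talk to the KVM module       */
    int vm  = ioctl(kvm, KVM_CREATE_VM, 0UL);        /* one VM handle per guest      */

    /* "Allocate memory for the guest": back guest-physical memory with an
     * anonymous mapping in this process and register it with KVM.          */
    size_t mem_size = 2 << 20;
    void *mem = mmap(NULL, mem_size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    struct kvm_userspace_memory_region region = {
        .slot            = 0,
        .guest_phys_addr = 0,
        .memory_size     = mem_size,
        .userspace_addr  = (unsigned long)mem,
    };
    ioctl(vm, KVM_SET_USER_MEMORY_REGION, &region);

    /* "Start the guest": create a vCPU; a real VMM would load guest code,
     * set registers, then loop on ioctl(vcpu, KVM_RUN, 0) and handle the
     * VM exits (e.g., emulated I/O) that come back.                         */
    int vcpu = ioctl(vm, KVM_CREATE_VCPU, 0UL);
    (void)vcpu;

    close(kvm);
    return 0;
}
```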
VirtIO and KVM
[Diagram: the VirtIO-Net front-end driver in the guest and the VirtIO back-end in QEMU share VirtIO memory (tx/rx rings); the KVM module and a tap device connect the back-end to the real NIC]
1) VirtIO-Net driver adds the packet to shared VirtIO memory
2) VirtIO-Net driver causes a trap into KVM
3) KVM schedules the QEMU VirtIO back-end
4) VirtIO back-end gets the packet from shared VirtIO memory and emulates the I/O via a system call (e.g., on the tap device; see the sketch below)
5) KVM resumes the guest
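Seen from the back-end, the "system call" in step 4 can be as simple as a write() on a tap device, which injects the guest's Ethernet frame into the host network stack. The sketch below opens an existing tap interface and writes one dummy frame; the name "tap0" and the zero-filled frame are placeholders, and the tap device is assumed to already exist and be configured on the host.

```c
/* Back-end sketch: emulate the guest's transmit by writing the frame to a
 * tap device ("tap0" is a placeholder for an existing, configured tap). */
#include <fcntl.h>
#include <linux/if.h>
#include <linux/if_tun.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void) {
    int fd = open("/dev/net/tun", O_RDWR);

    struct ifreq ifr;
    memset(&ifr, 0, sizeof ifr);
    ifr.ifr_flags = IFF_TAP | IFF_NO_PI;             /* raw Ethernet frames        */
    strncpy(ifr.ifr_name, "tap0", IFNAMSIZ - 1);     /* attach to the existing tap */
    ioctl(fd, TUNSETIFF, &ifr);

    unsigned char frame[64] = {0};                   /* placeholder Ethernet frame */
    write(fd, frame, sizeof frame);                  /* hand it to the host stack  */

    close(fd);
    return 0;
}
```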
Vhost: Improved VirtIO Back-end
[Diagram: the guest's VirtIO-Net driver exchanges tx/rx buffers with the vhost-net module in the host kernel, which forwards them to the real NIC through a tap device]
Vhost puts the VirtIO emulation code into the kernel
Instead of performing system calls from userspace (QEMU)
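The control-path handoff looks roughly like the fragment below: QEMU opens /dev/vhost-net, claims the in-kernel worker, and points a virtqueue at a tap file descriptor, after which the data path stays in the kernel. This is only a sketch of that handoff; the memory-table and vring setup ioctls (VHOST_SET_MEM_TABLE, VHOST_SET_VRING_*) are omitted, and tap_fd is assumed to be an already-opened tap device as in the previous sketch.

```c
/* Fragment: hand the VirtIO-Net data path to the in-kernel vhost-net worker.
 * Ring and memory-table setup are omitted for brevity. */
#include <fcntl.h>
#include <linux/vhost.h>
#include <sys/ioctl.h>

static int attach_vhost_backend(int tap_fd)
{
    int vhost = open("/dev/vhost-net", O_RDWR);
    if (vhost < 0)
        return -1;

    ioctl(vhost, VHOST_SET_OWNER, NULL);             /* bind the worker to this process */

    struct vhost_vring_file backend = {
        .index = 0,                                  /* virtqueue 0                     */
        .fd    = tap_fd,                             /* kernel forwards to/from the tap */
    };
    ioctl(vhost, VHOST_NET_SET_BACKEND, &backend);   /* data path now runs in-kernel    */

    return vhost;
}
```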
Inter-VM Communication
[Diagram: three VMs (guest application + guest OS each) on a hypervisor over the NIC/hardware]
How does inter-VM communication work?
Switch in Hypervisor
[Diagram: the three guests connect to a bridge/switch inside the hypervisor (HV), which connects to the NIC]
Switched Vhost in KVM
[Diagram: each guest's VirtIO-Net driver is served by its own vhost-net instance and tap device; the tap devices are attached to a bridge in the KVM host, which connects to the real NIC; see the bridge sketch below]
Advantage: low latency (1 software copy)
Disadvantage: uses host CPU cycles
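How the per-guest taps end up on one software switch can be sketched as follows: create a Linux bridge and enslave a tap interface to it, here using the legacy bridge ioctls (the modern route goes through netlink, e.g., `ip link`). The names "br0" and "tap0" are placeholders; both operations require root and assume tap0 already exists.

```c
/* Sketch: create a bridge and add a guest's tap device as a port. */
#include <linux/sockios.h>
#include <net/if.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>

int main(void) {
    int sock = socket(AF_LOCAL, SOCK_STREAM, 0);

    ioctl(sock, SIOCBRADDBR, "br0");                 /* create the bridge            */

    struct ifreq ifr;
    memset(&ifr, 0, sizeof ifr);
    strncpy(ifr.ifr_name, "br0", IFNAMSIZ - 1);      /* bridge to add the port to    */
    ifr.ifr_ifindex = if_nametoindex("tap0");        /* the guest's tap device       */
    ioctl(sock, SIOCBRADDIF, &ifr);                  /* tap0 is now a bridge port    */

    return 0;
}
```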
Switch Externally...
[Diagram: the guests attach directly to the NIC; switching happens either in the NIC itself or in an external switch]
...either in the NIC or in an external switch
External switch:
  Simplifies configuration: all switching is controlled/configured by the network
  Latency = 2x DMA + 2 hops
Switch in the NIC:
  Latency = 2x DMA
Where are we?
Option 1: Full emulation
  No changes to the guest required
  Complex, inefficient
Option 2: Paravirtualization
  Requires special guest drivers
  Enhanced performance
  Not good enough! Still requires hypervisor involvement, e.g., interrupt relaying
Option 3: Passthrough
  Directly assign the NIC to a VM
  No hypervisor involvement: best performance
  Problems: see the challenges on the next slide
Paravirtual vs. Passthrough in KVM
[Diagram: VM1 uses the VirtIO-Net driver, vhost-net and a tap device to reach one real NIC; VM2 runs a physical driver directly against a second real NIC that is exclusively assigned to it]
Challenges with Passthrough / Direct Assignment
VM tied to specific NIC hardware
  Makes VM migration more difficult
VM driver issues DMA requests using VM addresses
  Incorrect: VM physical addresses are host virtual addresses (!)
  Security concern: the addresses may belong to another VM
  Potential solution: let the VM translate its physical addresses to real DMA addresses
    Still a safety problem: exposes driver details to the hypervisor; bugs in the driver could result in incorrect translations
  Solution: use an IOMMU to translate/validate DMA requests from the device
Need a different NIC for each VM
  Solution: SR-IOV, emulate multiple NICs at the hardware level
Memory Address Terminology
Virtual address: an address in some virtual address space of a process running in the guest OS
Physical address: a hardware address as seen by the guest OS, i.e., a physical address inside the virtual machine
Machine address: a real hardware address on the physical machine, as seen by the hypervisor
IOMMU
[Diagram: a PCIe function (e.g., a NIC) issues DMA requests through the IOMMU to the memory controller and main memory]
1) VMM programs the IOMMU with VM-physical to machine address translations (see the sketch below)
2) Guest OS programs the NIC with the VM-physical address of the DMA buffer
3) NIC issues a DMA request to VM-physical memory
4) IOMMU checks the request and translates it to a machine (real) address for the transfer
5) Memory controller accesses memory
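On Linux, step 1 can be done from user space through VFIO: the VMM maps a region of its own memory into the IOMMU at the guest-physical address ("iova") the device will use. The fragment below shows only the mapping step; opening /dev/vfio/vfio, attaching the device's IOMMU group and selecting the type-1 IOMMU (the usual VFIO setup) are assumed to have happened already and produced container_fd.

```c
/* Sketch: program a guest-physical -> host-memory mapping into the IOMMU. */
#include <linux/vfio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

/* Map `size` bytes of our process memory so the assigned device may DMA to
 * it using guest-physical address `guest_phys`. */
static int map_guest_ram(int container_fd, __u64 guest_phys, size_t size)
{
    void *ram = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    struct vfio_iommu_type1_dma_map map;
    memset(&map, 0, sizeof map);
    map.argsz = sizeof map;
    map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
    map.vaddr = (__u64)(unsigned long)ram;   /* machine-backed host memory           */
    map.iova  = guest_phys;                  /* address the guest/device will use    */
    map.size  = size;

    /* From now on, DMA requests from the device that use this guest-physical
     * address are translated and validated by the IOMMU.                     */
    return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);
}
```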
SR-IOV: Single-Root I/O Virtualization
Key idea: dynamically create new "PCI devices"
  Physical Function (PF): the original device, full functionality
  Virtual Function (VF): an extra device, limited functionality
  VFs are created/destroyed via PF registers (see the sketch below)
For networking:
  Partitions a network card's resources
  With direct assignment, can implement passthrough
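From the host's point of view, asking the PF driver to create VFs on Linux amounts to writing the desired count to the PF's sriov_numvfs sysfs attribute; the driver then programs the SR-IOV registers and new PCI devices appear. In the sketch below, "eth0" is a placeholder for an SR-IOV-capable NIC and the write requires root.

```c
/* Sketch: ask the PF driver to create 4 virtual functions via sysfs. */
#include <stdio.h>

int main(void) {
    int num_vfs = 4;   /* number of VFs to create; writing 0 destroys them again */

    FILE *f = fopen("/sys/class/net/eth0/device/sriov_numvfs", "w");
    if (!f)
        return 1;

    fprintf(f, "%d\n", num_vfs);   /* PF driver creates the VFs as new PCI devices */
    fclose(f);
    return 0;
}
```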
SR-IOV in Action
[Diagram: each guest OS runs a physical driver bound directly to its own virtual function of the SR-IOV NIC, with DMA going through the IOMMU; the hypervisor keeps a physical driver for the physical function; a virtual Ethernet bridge/switch inside the NIC connects the virtual and physical functions]
References
I/O Virtualization, Mendel Rosenblum, ACM Queue, January 2012
Kernel-based Virtual Machine Technology, Yasunori Goto, Fujitsu Technical Journal, July 2011
virtio: Towards a De-Facto Standard For Virtual I/O Devices, Rusty Russell, ACM SIGOPS Operating Systems Review, July 2008