Advanced Computer Networks 263‐3501‐00
Network I/O Virtualization
Patrick Stuedi
Spring Semester 2014
© Oriana Riva, Department of Computer Science | ETH Zürich
Tuesday 13 May 2014

Outline
● Last week: Software Defined Networking
  - OpenFlow
● Today: Network I/O Virtualization
  - Paravirtualization
  - SR-IOV

Data Transfer Non-Virtualized
[Diagram: Application, syscall, OS (TCP/IP, Driver), privileged instructions, NIC]
1) Application: syscall, e.g., socket.write()
2) OS driver: issue PCI commands
   ● Set up DMA operation
3) NIC:
   ● transmit data
   ● raise interrupt when done

Virtualization and Hypervisors
[Diagram: VM1, VM2 and VM3, each running a Guest Application on a Guest Operating System, on top of the Hypervisor, on top of the Hardware]
How does hypervisor network access work?

Option 1: Full Device Emulation
[Diagram: Application, Guest OS, Traps, Device Emulation, Hypervisor, Hardware]
● Guest OS unaware that it is being virtualized
● Hypervisor emulates device at the lowest level
● Privileged instructions from guest driver trap into hypervisor
● Advantage: no changes to the guest OS required
● Disadvantage:
  - Inefficient
  - Complex

Option 2: Paravirtualization
[Diagram: Application, Guest OS, Paravirtual Driver, Interfaces, Device Emulation, Hypervisor, Hardware]
● Guest OS aware that it is being virtualized
  - Runs special paravirtual device drivers
● Hypervisor cooperates with guest OS through paravirtual interfaces
● Advantage:
  - Better performance
  - Simple
● Disadvantage:
  - Requires changes to the guest OS

Paravirtualization with VirtIO
[Diagram: Guest OS with front-end drivers; virtio; back-end drivers and device emulation in the KVM hypervisor and in the lguest hypervisor; Hardware]
● VirtIO: I/O virtualization framework for Linux
  - Framework for developing paravirtual drivers
  - Split driver model: front-end and back-end driver
  - APIs for front-end and back-end to communicate

Example: KVM Hypervisor
[Diagram: VM1 (Guest Application, Guest Operating System) as a single Linux process in guest mode (guest user, guest kernel); User Application and QEMU in user mode; KVM module in kernel mode inside the Hypervisor, reached through /dev/kvm; VM Enter & VM Exit between guest and KVM; Hardware]
● Based on Intel VT-x
  - Additional guest execution mode
● I/O at guest OS traps into KVM (VM Exit)
● KVM schedules QEMU process to emulate I/O operation
● Starting new guest = starting QEMU process
● QEMU process interacts with KVM through ioctl on /dev/kvm to
  - Allocate memory for guest
  - Start guest
  - ...
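
The ioctl interface on /dev/kvm mentioned on the KVM slide can be probed from user space. The following is a minimal sketch in Python, not QEMU's actual code. It assumes the x86 Linux ioctl encoding, where a request without an argument is (type << 8) | nr, and the KVM type byte 0xAE from linux/kvm.h. The helper names `_io` and `probe_kvm` are made up for this illustration. A real VMM would continue with KVM_CREATE_VM, guest memory setup and vCPU creation, none of which is shown here.

```python
import fcntl
import os

KVMIO = 0xAE  # ioctl "type" byte used by KVM (from linux/kvm.h)


def _io(ioc_type, nr):
    """Encode a no-argument ioctl request number (assumes x86 Linux layout)."""
    return (ioc_type << 8) | nr


KVM_GET_API_VERSION = _io(KVMIO, 0x00)
KVM_CREATE_VM = _io(KVMIO, 0x01)  # shown for reference only, not called below


def probe_kvm(path="/dev/kvm"):
    """Return the KVM API version, or None if KVM is unavailable."""
    try:
        fd = os.open(path, os.O_RDWR)
    except OSError:
        return None  # KVM module not loaded, or no permission
    try:
        return fcntl.ioctl(fd, KVM_GET_API_VERSION)
    except OSError:
        return None
    finally:
        os.close(fd)


if __name__ == "__main__":
    print("KVM API version:", probe_kvm())
```

On a host without KVM the function simply returns None, so the sketch can be run anywhere; on a KVM-enabled Linux host it reports the API version exposed by the kernel module.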

VirtIO and KVM
[Diagram: Guest OS with VirtIO-Net Driver (front-end) and tx/rx queues in VirtIO shared memory; QEMU with VirtIO back-end; KVM module in the Hypervisor; tap; Real NIC; the numbers 1 to 4 mark the steps below]
1) VirtIO-Net driver adds packet to shared VirtIO memory
2) VirtIO-Net driver causes trap into KVM
3) KVM schedules QEMU VirtIO back-end
4) VirtIO back-end gets packet from shared VirtIO memory and emulates I/O (via system call)
5) KVM resumes guest

Vhost: Improved VirtIO Backend
[Diagram: Guest OS with VirtIO-Net Driver and tx/rx queues; QEMU; KVM module and vhost net in the Hypervisor; tap; Real NIC]
● Vhost puts VirtIO emulation code into the kernel
  - Instead of performing system calls from userspace (QEMU)

Inter-VM communication
[Diagram: VM1, VM2 and VM3, each running a Guest Application on a Guest Operating System, on top of the Hypervisor and the NIC]
How does inter-VM communication work?

Switch in Hypervisor
[Diagram: three guests (Guest Application on Guest Operating System) connected to a bridge/switch in the hypervisor (HV), above the NIC]

Switched Vhost in KVM
[Diagram: three guests (Guest Application, Guest OS, VirtIO-Net Driver), each with its own vhost net and tap in KVM, joined by a bridge in front of the Real NIC]
● Advantage: low latency (1 software copy)
● Disadvantage: uses host CPU cycles

Switch Externally...
[Diagram: three guests (Guest Application on Guest Operating System), Hypervisor, NIC]
...either in
● External switch:
  - Simplifies configuration: all switching controlled/configured by the network
  - Latency = 2xDMA + 2 hops
● NIC:
  - Latency = 2xDMA

Where are we?
● Option 1: Full emulation
  - No changes to guest required
  - Complex
  - Inefficient
● Option 2: Paravirtualization
  - Requires special guest drivers
  - Enhanced performance
  - Not good enough! Still requires hypervisor involvement, e.g., interrupt relaying
● Option 3: Passthrough
  - Directly assign NIC to VM
  - No hypervisor involvement: best performance

Paravirtual vs Passthrough in KVM
[Diagram: VM1 (Application, Guest OS, VirtIO-Net Driver with tx/rx queues) goes through vhost net in the KVM module and a tap to a Real NIC; VM2 (Application, Guest OS, Physical Driver) uses a Real NIC exclusively assigned to VM2]

Challenges with Passthrough / Direct Assignment
● VM tied to specific NIC hardware
  - Makes VM migration more difficult
● VM driver issues DMA requests using VM addresses
  - Incorrect: VM physical addresses are host virtual addresses (!)
  - Security concern: addresses may belong to other VM
  - Potential solution: let VM translate its physical addresses to real DMA addresses
    Still safety problem: exposes driver details to hypervisor, bugs in driver could result in incorrect translations
  - Solution: Use an IOMMU to translate/validate DMA requests from the device
● Need a different NIC for each VM
  - Solution: SR-IOV, emulate multiple NICs at hardware level

Memory Address Terminology
● Virtual address:
  - Address in some virtual address space in a process running in the guest OS
● Physical address:
  - Hardware address as seen by the guest OS, i.e., physical address in the virtual machine
● Machine address:
  - Real hardware address on the physical machine as seen by the hypervisor

IOMMU
[Diagram: Memory controller, IOMMU, Main memory, PCIe function (e.g. NIC)]
1) VMM programs IOMMU with VM-physical to machine address translations
2) Guest OS programs NIC with VM-physical address of DMA
3) NIC issues a DMA request to VM physical memory
4) IOMMU checks and translates to machine (real) address for transfer
5) Memory controller accesses memory

SR-IOV
● Single-Root I/O Virtualization
● Key idea: dynamically create new "PCI devices"
  - Physical Function (PF): original device, full functionality
  - Virtual Function (VF): extra device, limited functionality
  - VFs created/destroyed via PF registers
● For networking:
  - Partitions a network card's resources
  - With direct assignment can implement passthrough

SR-IOV in Action
[Diagram: three guests (Guest Application, Guest OS, Physical Driver); Hypervisor with its own Physical Driver; IOMMU; PCI; SR-IOV NIC exposing three Virtual Functions and one Physical Function, joined by a Virtual Ethernet Bridge/Switch]

References
● I/O Virtualization, Mendel Rosenblum, ACM Queue, January 2012
● Kernel-based Virtual Machine Technology, Yasunori Goto, Fujitsu Technical Journal, July 2011
● VirtIO: Towards a De-Facto Standard For Virtual I/O Devices, Rusty Russell, ACM SIGOPS Operating Systems Review, July 2008
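
As a supplement to the IOMMU slides, the check-and-translate step can be sketched as a small lookup table. This is a toy model in Python, not how real IOMMU hardware or any hypervisor is implemented. The names `ToyIOMMU`, `IOMMUFault`, `map_page` and `translate`, the device label "nic0", the 4 KiB page size and the single-level table keyed by device are all assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class IOMMUFault(Exception):
    """Raised when a device uses an address it has no mapping for."""


class ToyIOMMU:
    def __init__(self):
        # (device id, VM-physical page number) -> machine page number
        self._table = {}

    def map_page(self, device_id, guest_pfn, machine_pfn):
        """VMM side: install a VM-physical to machine translation."""
        self._table[(device_id, guest_pfn)] = machine_pfn

    def translate(self, device_id, guest_addr):
        """Device side: validate a DMA address and return the machine address."""
        guest_pfn, offset = divmod(guest_addr, PAGE_SIZE)
        try:
            machine_pfn = self._table[(device_id, guest_pfn)]
        except KeyError:
            raise IOMMUFault(
                f"device {device_id}: no mapping for {guest_addr:#x}"
            ) from None
        return machine_pfn * PAGE_SIZE + offset


iommu = ToyIOMMU()
iommu.map_page("nic0", guest_pfn=0x10, machine_pfn=0x8A)          # VMM setup
machine_addr = iommu.translate("nic0", 0x10 * PAGE_SIZE + 0x20)   # NIC DMA
print(hex(machine_addr))  # prints 0x8a020
```

A DMA request to a page that was never mapped for that device raises `IOMMUFault` instead of reaching memory, which mirrors the security concern from the passthrough slide: the guest driver can only name addresses the VMM has explicitly granted.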