Advanced Computer Networks (263-3501-00)
Network I/O Virtualization
Patrick Stuedi, Ankit Singla, Desislava Dimitrova
Spring Semester 2016
© Oriana Riva, Department of Computer Science | ETH Zürich

Outline
- Last week: data center routing & addressing (Portland, VL2, TRILL)
- Today: network I/O virtualization
  - Paravirtualization
  - SR-IOV

Processor Clock Frequency Scaling Has Ended
- Three decades of exponential clock rate (and electrical power!) growth have ended
- Yet Moore's Law continues in transistor count
- What do we do with all those transistors to keep performance increasing to meet demand?
- Industry response: multi-core, i.e., double the number of cores every 18 months instead of doubling the clock frequency (and power!)
Source: "The Landscape of Computer Architecture," John Shalf, NERSC/LBNL, presented at ISC07, Dresden, June 25, 2007

Virtualization and Hypervisors
[Figure: VM1, VM2 and VM3, each running a guest application on its guest operating system, on top of a hypervisor that runs on the hardware]

Data Transfer, Non-Virtualized
[Figure: the application issues a syscall to the OS TCP/IP driver, which uses privileged instructions to drive the NIC]
1) Application: syscall, e.g., socket.write()
2) OS driver: issue PCI commands, set up the DMA operation
3) NIC: transmit the data, raise an interrupt when done

Virtualization and Hypervisors
[Figure: the same three-VM stack; question: how does network access through the hypervisor work?]

Option 1: Full Device Emulation
- Guest OS is unaware that it is being virtualized
- Hypervisor emulates the device at the lowest level: privileged instructions from the guest driver trap into the hypervisor
- Advantage: no changes to the guest OS required
- Disadvantages: inefficient, complex

Option 2: Paravirtualization
- Guest OS is aware that it is being virtualized and runs special paravirtual device drivers
- Hypervisor cooperates with the guest OS through paravirtual interfaces
- Advantages: better performance, simpler device emulation
- Disadvantage: requires changes to the guest OS

Paravirtualization with VirtIO
[Figure: front-end drivers in the guest OS talk over virtio to back-end drivers and device emulation in the hypervisor, e.g., the KVM hypervisor or the lguest hypervisor]
- VirtIO: I/O virtualization framework for Linux, i.e., a framework for developing paravirtual drivers
- Split driver model: front-end and back-end driver
- APIs for front-end and back-end to communicate

Example: KVM Hypervisor
[Figure: the guest (guest user and guest kernel) runs in guest mode inside a single Linux process (QEMU) in user mode; QEMU talks to the in-kernel KVM module via /dev/kvm; VM Enter and VM Exit switch between guest mode and the host]
- Based on Intel VT-x: an additional guest execution mode
- Starting a new guest = starting a QEMU process
- The QEMU process interacts with KVM through ioctl() on /dev/kvm to allocate memory for the guest and to start the guest
- I/O in the guest OS traps into KVM (VM Exit); KVM schedules the QEMU process to emulate the I/O operation
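To make the /dev/kvm interaction on the slide above concrete, the following is a minimal sketch of the control loop a userspace VMM such as QEMU runs on top of the KVM ioctl API (KVM_CREATE_VM, KVM_SET_USER_MEMORY_REGION, KVM_CREATE_VCPU, KVM_RUN). Error handling, register setup and the guest code itself are omitted, and the 64 KiB memory size is an arbitrary choice for illustration.

/* Minimal sketch (error handling, register setup and the guest code
 * itself omitted) of the control loop a userspace VMM such as QEMU
 * runs on top of the KVM ioctl API described on the slide above. */
#include <fcntl.h>
#include <linux/kvm.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

int main(void)
{
    int kvm = open("/dev/kvm", O_RDWR);          /* talk to the KVM module */
    int vm  = ioctl(kvm, KVM_CREATE_VM, 0);      /* one fd per virtual machine */

    /* Allocate memory for the guest: back 64 KiB of guest-physical
     * memory (an arbitrary size for this sketch) with host memory. */
    void *mem = mmap(NULL, 0x10000, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    struct kvm_userspace_memory_region region = {
        .slot = 0,
        .guest_phys_addr = 0,
        .memory_size = 0x10000,
        .userspace_addr = (unsigned long)mem,
    };
    ioctl(vm, KVM_SET_USER_MEMORY_REGION, &region);

    /* Create a vCPU; its run state is shared with userspace via mmap. */
    int vcpu = ioctl(vm, KVM_CREATE_VCPU, 0);
    int run_size = ioctl(kvm, KVM_GET_VCPU_MMAP_SIZE, 0);
    struct kvm_run *run = mmap(NULL, run_size, PROT_READ | PROT_WRITE,
                               MAP_SHARED, vcpu, 0);

    /* A real VMM would now load guest code into 'mem' and initialize
     * registers (KVM_SET_SREGS / KVM_SET_REGS) before entering the loop. */
    for (;;) {
        ioctl(vcpu, KVM_RUN, 0);                 /* VM Enter */
        switch (run->exit_reason) {              /* VM Exit: back in userspace */
        case KVM_EXIT_IO:
            /* Guest did port I/O: device emulation (e.g. a VirtIO
             * back-end) would run here before resuming the guest. */
            break;
        case KVM_EXIT_HLT:
            return 0;
        }
    }
}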
VirtIO and KVM
[Figure: the VirtIO-Net front-end driver in the guest and the VirtIO back-end in QEMU (user mode) communicate through shared VirtIO memory; the KVM module runs in kernel mode, and the back-end forwards packets through a tap device to the real NIC]
1) VirtIO-Net driver adds the packet to shared VirtIO memory
2) VirtIO-Net driver causes a trap into KVM
3) KVM schedules the QEMU VirtIO back-end
4) VirtIO back-end gets the packet from shared VirtIO memory and emulates the I/O (via a system call on the tap device)
5) KVM resumes the guest

Vhost: Improved VirtIO Back-end
[Figure: the vhost-net module in the host kernel sits between the KVM module and the tap device, bypassing QEMU on the data path]
- Vhost puts the VirtIO emulation code into the kernel (vhost-net), instead of performing system calls from user space (QEMU)
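As a rough illustration of the split driver model on the two slides above, the sketch below shows what the guest-side (front-end) transmit path looks like, loosely following the Linux in-kernel virtio API (virtqueue_add_outbuf(), virtqueue_kick()). The toy_xmit() function and its arguments are invented for illustration; the real virtio_net driver is considerably more involved (virtio-net headers, multiple queues, completion handling).

/* Simplified sketch of the VirtIO front-end transmit path (steps 1 and 2
 * on the slide above), loosely modelled on the Linux in-kernel virtio API.
 * The guest places a buffer into shared memory (the virtqueue) and then
 * "kicks" the host, which traps into KVM and wakes the back-end. */
#include <linux/errno.h>
#include <linux/gfp.h>
#include <linux/scatterlist.h>
#include <linux/virtio.h>

static int toy_xmit(struct virtqueue *tx_vq, void *pkt, unsigned int len)
{
    struct scatterlist sg;

    sg_init_one(&sg, pkt, len);

    /* Step 1: add the packet to the shared ring (VirtIO shared memory). */
    if (virtqueue_add_outbuf(tx_vq, &sg, 1, pkt, GFP_ATOMIC))
        return -ENOSPC;                 /* ring full, try again later */

    /* Step 2: notify the host. On KVM this notification causes a VM exit;
     * the back-end (QEMU or vhost-net) then pops the buffer from the same
     * ring and pushes it out via the tap device. */
    virtqueue_kick(tx_vq);
    return 0;
}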
Where are we?
- Option 1: Full emulation
  - No changes to the guest required
  - Complex, inefficient
- Option 2: Paravirtualization
  - Requires special guest drivers
  - Enhanced performance, but not good enough: it still requires hypervisor involvement, e.g., for interrupt relaying
- Option 3: Passthrough
  - Directly assign the NIC to a VM
  - No hypervisor involvement: best performance
  - Problems: discussed below

Passthrough / Direct Assignment
[Figure: each VM runs a native device driver and drives its own Ethernet NIC directly through a safe hardware interface; the hypervisor is not on the data path]

Paravirtual vs Passthrough in KVM
[Figure: VM1 uses the VirtIO-Net driver with vhost-net and a tap device in front of the first real NIC; VM2 uses a physical driver for a second real NIC that is exclusively assigned to it]

Challenges with Passthrough / Direct Assignment
- VM is tied to specific NIC hardware: makes VM migration more difficult
- VM driver issues DMA requests using VM addresses
  - Incorrect: VM physical addresses are host virtual addresses (!)
  - Security concern: the addresses may belong to another VM
  - Potential solution: let the VM translate its physical addresses to real DMA addresses
    - Still a safety problem: exposes driver details to the hypervisor, and bugs in the driver could result in incorrect translations
  - Solution: use an IOMMU to translate/validate DMA requests from the device
- A different NIC is needed for each VM
  - Solution: SR-IOV, emulate multiple NICs at the hardware level

Memory Address Terminology
- Virtual address: an address in some virtual address space of a process running in the guest OS
- Physical address: a hardware address as seen by the guest OS, i.e., a physical address in the virtual machine
- Machine address: a real hardware address on the physical machine, as seen by the hypervisor

IOMMU
[Figure: a PCIe function (e.g., a NIC) reaches main memory through the IOMMU and the memory controller]
1) The VMM programs the IOMMU with VM-physical to machine address translations
2) The guest OS programs the NIC with the VM-physical address of the DMA buffer
3) The NIC issues a DMA request to VM physical memory
4) The IOMMU checks the request and translates it to the machine (real) address for the transfer
5) The memory controller accesses memory

SR-IOV
- Single-Root I/O Virtualization
- Key idea: dynamically create new "PCI devices"
  - Physical Function (PF): the original device, full functionality
  - Virtual Function (VF): an extra device, limited functionality
  - VFs are created/destroyed via PF registers
- For networking: partitions a network card's resources; combined with direct assignment it can implement passthrough

SR-IOV in Action
[Figure: each VM's physical driver is directly assigned one virtual function of the SR-IOV NIC through the IOMMU; the hypervisor keeps a physical driver for the physical function; a virtual Ethernet bridge/switch inside the NIC connects the functions]

Example: SolarFlare
[Figure: SolarFlare NIC]
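On Linux, the passthrough path just described (an IOMMU plus a directly assigned function, for instance an SR-IOV VF) is exposed to a VMM through the VFIO interface. The sketch below is only an illustration of the step on the IOMMU slides, "the VMM programs the IOMMU with VM-physical to machine address translations"; VFIO itself is not covered in the lecture, error handling is omitted, and the IOMMU group number (26) and PCI address are placeholders.

/* Sketch (not a complete VMM): a hypervisor uses VFIO to program the
 * IOMMU with guest-physical ("iova") to host-memory mappings before
 * handing a passthrough device (e.g. an SR-IOV VF) to a VM.
 * Group number and PCI address below are placeholders. */
#include <fcntl.h>
#include <linux/vfio.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

int main(void)
{
    /* A container represents one IOMMU address space. */
    int container = open("/dev/vfio/vfio", O_RDWR);

    /* The device's IOMMU group (placeholder group number "26"). */
    int group = open("/dev/vfio/26", O_RDWR);
    ioctl(group, VFIO_GROUP_SET_CONTAINER, &container);
    ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU);

    /* Back 16 MiB of "guest physical" memory with host memory and tell
     * the IOMMU: guest-physical address (iova) 0 maps to this host
     * buffer. From now on the device may only DMA into this region. */
    void *guest_ram = mmap(NULL, 16 << 20, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    struct vfio_iommu_type1_dma_map map = {
        .argsz = sizeof(map),
        .flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
        .vaddr = (uintptr_t)guest_ram,
        .iova  = 0,
        .size  = 16 << 20,
    };
    ioctl(container, VFIO_IOMMU_MAP_DMA, &map);

    /* Finally get a file descriptor for the device itself (placeholder
     * PCI address); its BARs/interrupts would then be wired into the VM. */
    int device = ioctl(group, VFIO_GROUP_GET_DEVICE_FD, "0000:06:0d.0");
    (void)device;
    return 0;
}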
Inter-VM Communication
[Figure: VM1, VM2 and VM3 on the same hypervisor and NIC; how does inter-VM communication work?]

Switch in Hypervisor
[Figure: the guests' traffic is switched by a bridge/switch inside the hypervisor, in front of the NIC]

Switched Vhost in KVM
[Figure: each guest's VirtIO-Net driver connects via vhost-net to a tap device; the tap devices and the real NIC are attached to an in-kernel bridge]
- Advantage: low latency (one software copy)
- Disadvantage: uses host CPU cycles

Switch Externally...
...either in the NIC or in an external switch:
- External switch: simplifies configuration, since all switching is controlled/configured by the network; latency = 2x DMA + 2 hops
- NIC: latency = 2x DMA

Controversial
- External switching (in the NIC or in the switch)
  - Extra latency
  - Reduces CPU requirements
  - Hardware vendors like it
  - Better TCAMs on the switch
  - Integration with network management policies
- Software switching in the hypervisor
  - Lower latency
  - Higher CPU consumption, but software switches have become more efficient over the last years
  - CPU resources are generic and flexible
  - Easy to upgrade
  - Fully supports OpenFlow

Moral
- Network interface cards traditionally are the "end point"
- We now have at least two more hops: a virtual switch in the NIC and a virtual switch in the hypervisor
- The inside of a physical machine increasingly resembles a network (a toy sketch of a hypervisor-side switch follows after the references)

References
- I/O Virtualization, Mendel Rosenblum, ACM Queue, January 2012
- Kernel-based Virtual Machine Technology, Yasunori Goto, Fujitsu Scientific & Technical Journal, July 2011
- virtio: Towards a De-Facto Standard For Virtual I/O Devices, Rusty Russell, ACM SIGOPS Operating Systems Review, July 2008
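To make the "virtual switch in the hypervisor" from the Moral slide concrete, here is a toy sketch, not how the Linux bridge or a production virtual switch such as Open vSwitch is implemented: it opens two tap devices, the kind of host-side endpoints that the vhost/bridge slides attach VMs to, and blindly forwards Ethernet frames between them in user space. The device names and buffer size are arbitrary; a real software switch would learn MAC addresses and run in the kernel.

/* Toy "switch in the hypervisor": open two tap devices (the host-side
 * endpoints that VM back-ends such as vhost-net attach to) and forward
 * every Ethernet frame from one to the other in user space. This only
 * illustrates the extra hop inside the machine. */
#include <fcntl.h>
#include <linux/if.h>
#include <linux/if_tun.h>
#include <poll.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

static int open_tap(const char *name)        /* e.g. "vmtap0" (arbitrary) */
{
    struct ifreq ifr;
    int fd = open("/dev/net/tun", O_RDWR);
    if (fd < 0)
        return -1;
    memset(&ifr, 0, sizeof(ifr));
    ifr.ifr_flags = IFF_TAP | IFF_NO_PI;     /* raw Ethernet frames */
    strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);
    if (ioctl(fd, TUNSETIFF, &ifr) < 0) {    /* create/attach the interface */
        close(fd);
        return -1;
    }
    return fd;
}

int main(void)
{
    char frame[2048];
    int a = open_tap("vmtap0");
    int b = open_tap("vmtap1");
    struct pollfd pfd[2] = { { .fd = a, .events = POLLIN },
                             { .fd = b, .events = POLLIN } };

    for (;;) {                               /* forward frames both ways */
        poll(pfd, 2, -1);
        if (pfd[0].revents & POLLIN) {
            ssize_t n = read(a, frame, sizeof(frame));
            if (n > 0)
                write(b, frame, n);
        }
        if (pfd[1].revents & POLLIN) {
            ssize_t n = read(b, frame, sizeof(frame));
            if (n > 0)
                write(a, frame, n);
        }
    }
}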