
Hollis Blanchard <hollisb@us.ibm.com>
15 Apr 2008

Various notes on the implementation of KVM for PowerPC 440:

To enforce isolation, host userspace, guest kernel, and guest userspace all
run at user privilege level. Only the host kernel runs in supervisor mode.
Executing privileged instructions in the guest traps into KVM (in the host
kernel), where we decode and emulate them. Through this technique, unmodified
440 Linux kernels can be run (slowly) as guests. Future performance work will
focus on reducing the overhead and frequency of these traps.
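
As a rough illustration of the decode-and-emulate step, here is a minimal
sketch that handles only mtspr/mfspr on SPRG0. The struct and function names
are hypothetical and are not taken from the KVM sources; the opcode and SPR
numbers follow the PowerPC instruction encoding. A real emulator covers many
more instructions and registers.

```c
#include <stdint.h>

/* Hypothetical guest CPU state; the real KVM structures differ. */
struct guest_cpu {
	uint32_t gpr[32];	/* general-purpose registers */
	uint32_t sprg0;		/* guest's view of SPRG0 */
	uint32_t pc;		/* guest program counter */
};

#define OP_31		31u	/* primary opcode shared by mtspr/mfspr */
#define XOP_MFSPR	339u	/* extended opcode: move from SPR */
#define XOP_MTSPR	467u	/* extended opcode: move to SPR */
#define SPRN_SPRG0	0x110u

static uint32_t get_op(uint32_t inst)  { return inst >> 26; }
static uint32_t get_xop(uint32_t inst) { return (inst >> 1) & 0x3ffu; }
static uint32_t get_reg(uint32_t inst) { return (inst >> 21) & 0x1fu; }

/* The SPR number is encoded with its two 5-bit halves swapped. */
static uint32_t get_sprn(uint32_t inst)
{
	return ((inst >> 16) & 0x1fu) | ((inst >> 6) & 0x3e0u);
}

/*
 * Emulate one trapped instruction against the guest's register state.
 * Returns 1 if the instruction was emulated, 0 if this sketch does not
 * handle it.
 */
int emulate_privileged(struct guest_cpu *cpu, uint32_t inst)
{
	if (get_op(inst) != OP_31 || get_sprn(inst) != SPRN_SPRG0)
		return 0;

	switch (get_xop(inst)) {
	case XOP_MTSPR:
		cpu->sprg0 = cpu->gpr[get_reg(inst)];
		break;
	case XOP_MFSPR:
		cpu->gpr[get_reg(inst)] = cpu->sprg0;
		break;
	default:
		return 0;
	}

	cpu->pc += 4;	/* step past the emulated instruction */
	return 1;
}
```

For example, the word 0x7C7043A6 encodes "mtspr SPRG0, r3", so passing it to
emulate_privileged() copies gpr[3] into the guest's SPRG0 without the guest
ever touching the real register.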

The usual code flow starts with userspace invoking a "run" ioctl, which
causes KVM to switch into guest context. We use IVPR to hijack the host
interrupt vectors while running the guest, which allows us to direct all
interrupts to kvmppc_handle_interrupt(). At this point, we could either:
- handle the interrupt completely (e.g. emulate "mtspr SPRG0"), or
- let the host interrupt handler run (e.g. when the decrementer fires), or
- return to host userspace (e.g. when the guest performs device MMIO)
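
The three outcomes above can be sketched as a small dispatch loop. This is
only a model of the control flow: the enum values, struct, and function names
are hypothetical, and it assumes the guest is simply resumed after the host
handler has run.

```c
#include <stddef.h>

/* Hypothetical reasons for leaving guest context. */
enum exit_reason {
	EXIT_PRIV_INSTRUCTION,	/* e.g. guest executed "mtspr SPRG0" */
	EXIT_DECREMENTER,	/* host decrementer fired */
	EXIT_DEVICE_MMIO	/* guest touched an emulated device */
};

/* What to do once the interrupt has been classified. */
enum exit_action {
	RESUME_GUEST,		/* handled completely inside KVM */
	RUN_HOST_HANDLER,	/* let the host interrupt handler run */
	RETURN_TO_USERSPACE	/* the "run" ioctl returns */
};

struct exit_stats {
	unsigned emulated;	/* exits handled completely in KVM */
	unsigned host_handled;	/* exits passed to the host handler */
};

enum exit_action classify_exit(enum exit_reason reason)
{
	switch (reason) {
	case EXIT_PRIV_INSTRUCTION:
		return RESUME_GUEST;
	case EXIT_DECREMENTER:
		return RUN_HOST_HANDLER;
	case EXIT_DEVICE_MMIO:
		return RETURN_TO_USERSPACE;
	}
	return RETURN_TO_USERSPACE;	/* unknown reason: let userspace decide */
}

/*
 * Model one "run" ioctl over a recorded sequence of exits. Returns the
 * index of the exit that forced a return to userspace, or n if the whole
 * sequence was handled in the kernel.
 */
size_t run_guest(const enum exit_reason *events, size_t n,
		 struct exit_stats *stats)
{
	for (size_t i = 0; i < n; i++) {
		switch (classify_exit(events[i])) {
		case RESUME_GUEST:
			stats->emulated++;
			break;
		case RUN_HOST_HANDLER:
			stats->host_handled++;
			break;
		case RETURN_TO_USERSPACE:
			return i;
		}
	}
	return n;
}
```

Only the last case is visible to userspace; the first two are resolved
entirely in the host kernel, which is why reducing their cost matters for
guest performance.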

Address spaces: We take advantage of the fact that Linux doesn't use the AS=1
address space (in host or guest), which gives us virtual address space to use
for guest mappings. While the guest is running, the host kernel remains mapped
in AS=0, but the guest can only use AS=1 mappings.
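
To make the AS=0/AS=1 split concrete: on Book E cores such as the 440, the
MSR[IS] and MSR[DS] bits select the address space for instruction fetches and
data accesses, and each TLB entry carries a translation-space (TS) bit that
must match. The sketch below models only that check. The struct layout and
function names are hypothetical, and PID matching and permissions are left
out.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_IS	(1u << 5)	/* instruction address space select */
#define MSR_DS	(1u << 4)	/* data address space select */

/* Hypothetical, simplified TLB entry (no PID, no permission bits). */
struct tlb_entry {
	uint32_t epn;	/* effective page base, aligned to size */
	uint32_t size;	/* page size in bytes, a power of two */
	bool ts;	/* translation space: false = AS0, true = AS1 */
	bool valid;
};

unsigned fetch_address_space(uint32_t msr)
{
	return (msr & MSR_IS) ? 1 : 0;
}

unsigned data_address_space(uint32_t msr)
{
	return (msr & MSR_DS) ? 1 : 0;
}

/* An entry translates an access only if both the space and page match. */
bool tlb_entry_matches(const struct tlb_entry *e, uint32_t ea, unsigned as)
{
	if (!e->valid || e->ts != (as != 0))
		return false;
	return (ea & ~(e->size - 1)) == e->epn;
}
```

With this rule, a host linear-mapping entry (TS=0) and a guest mapping (TS=1)
can cover the same effective address without conflict, since only one of them
matches for any given MSR setting.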

TLB entries: The TLB entries covering the host linear mapping remain
present while running the guest. This reduces the overhead of lightweight
exits, which are handled by KVM running in the host kernel. We keep three
copies of the TLB:
- guest TLB: contents of the TLB as the guest sees it
- shadow TLB: the TLB that is actually in hardware while the guest is running
- host TLB: to restore TLB state when context switching guest -> host

When a TLB miss occurs because a mapping was not present in the shadow TLB,
but was present in the guest TLB, KVM handles the fault without invoking the
guest. Large guest pages are backed by multiple 4KB shadow pages through this
mechanism.
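
A minimal sketch of that miss path follows, under several assumptions: the
struct layout, function names, and round-robin replacement are hypothetical;
the table size of 64 reflects the 440's 64-entry TLB; the host TLB copy is
omitted; and the translation from guest-physical to host-physical addresses
is left out, so raddr is simply carried over.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_ENTRIES		64	/* the 440 has a 64-entry TLB */
#define SHADOW_PAGE_SIZE	4096u	/* shadow mappings are always 4KB */

/* Hypothetical software copy of one TLB entry. */
struct soft_tlb_entry {
	bool valid;
	uint32_t eaddr;	/* effective base address, aligned to size */
	uint32_t raddr;	/* real base address (guest-physical here) */
	uint32_t size;	/* page size in bytes, a power of two */
};

struct vcpu_tlbs {
	struct soft_tlb_entry guest[TLB_ENTRIES];	/* as the guest sees it */
	struct soft_tlb_entry shadow[TLB_ENTRIES];	/* what hardware holds */
	unsigned next_victim;				/* round-robin slot */
};

static const struct soft_tlb_entry *
guest_lookup(const struct vcpu_tlbs *t, uint32_t ea)
{
	for (size_t i = 0; i < TLB_ENTRIES; i++) {
		const struct soft_tlb_entry *e = &t->guest[i];

		if (e->valid && (ea & ~(e->size - 1)) == e->eaddr)
			return e;
	}
	return NULL;
}

/*
 * Returns true if the miss was satisfied from the guest TLB (the guest
 * never sees it), false if the miss must be delivered to the guest.
 */
bool handle_shadow_miss(struct vcpu_tlbs *t, uint32_t ea)
{
	const struct soft_tlb_entry *g = guest_lookup(t, ea);
	struct soft_tlb_entry *s;
	uint32_t offset;

	if (!g)
		return false;

	/* Map only the 4KB piece of the guest page that contains ea. */
	offset = (ea - g->eaddr) & ~(SHADOW_PAGE_SIZE - 1);

	s = &t->shadow[t->next_victim];
	t->next_victim = (t->next_victim + 1) % TLB_ENTRIES;

	s->valid = true;
	s->eaddr = g->eaddr + offset;
	s->raddr = g->raddr + offset;
	s->size = SHADOW_PAGE_SIZE;
	return true;
}
```

Because each miss installs just one 4KB piece, a 1MB guest page ends up
backed by up to 256 separate shadow entries over time, filled in on demand.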

IO: MMIO and DCR accesses are emulated by userspace. We use virtio for network
and block IO, so those drivers must be enabled in the guest. It's possible
that some qemu device emulation (e.g. e1000 or rtl8139) may also work with
little effort.