diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2023-09-07 13:52:20 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2023-09-07 13:52:20 -0700 |
commit | 0c02183427b4d2002992f26d4917c1263c5d4a7f (patch) | |
tree | 426a0b282af3f309934cf0ff813b02c385e7ea04 /virt/kvm/kvm_main.c | |
parent | 4a0fc73da97efd23a383ca839e6fe86410268f6b (diff) | |
parent | d011151616e73de20c139580b73fa4c7042bd861 (diff) |
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"ARM:
- Clean up vCPU targets, always returning generic v8 as the preferred
target
- Trap forwarding infrastructure for nested virtualization (used for
traps that are taken from an L2 guest and are needed by the L1
hypervisor)
- FEAT_TLBIRANGE support to only invalidate specific ranges of
addresses when collapsing a table PTE to a block PTE. This avoids
that the guest refills the TLBs again for addresses that aren't
covered by the table PTE.
- Fix vPMU issues related to handling of PMUver.
- Don't unnecessary align non-stack allocations in the EL2 VA space
- Drop HCR_VIRT_EXCP_MASK, which was never used...
- Don't use smp_processor_id() in kvm_arch_vcpu_load(), but the cpu
parameter instead
- Drop redundant call to kvm_set_pfn_accessed() in user_mem_abort()
- Remove prototypes without implementations
RISC-V:
- Zba, Zbs, Zicntr, Zicsr, Zifencei, and Zihpm support for guest
- Added ONE_REG interface for SATP mode
- Added ONE_REG interface to enable/disable multiple ISA extensions
- Improved error codes returned by ONE_REG interfaces
- Added KVM_GET_REG_LIST ioctl() implementation for KVM RISC-V
- Added get-reg-list selftest for KVM RISC-V
s390:
- PV crypto passthrough enablement (Tony, Steffen, Viktor, Janosch)
Allows a PV guest to use crypto cards. Card access is governed by
the firmware and once a crypto queue is "bound" to a PV VM every
other entity (PV or not) looses access until it is not bound
anymore. Enablement is done via flags when creating the PV VM.
- Guest debug fixes (Ilya)
x86:
- Clean up KVM's handling of Intel architectural events
- Intel bugfixes
- Add support for SEV-ES DebugSwap, allowing SEV-ES guests to use
debug registers and generate/handle #DBs
- Clean up LBR virtualization code
- Fix a bug where KVM fails to set the target pCPU during an IRTE
update
- Fix fatal bugs in SEV-ES intrahost migration
- Fix a bug where the recent (architecturally correct) change to
reinject #BP and skip INT3 broke SEV guests (can't decode INT3 to
skip it)
- Retry APIC map recalculation if a vCPU is added/enabled
- Overhaul emergency reboot code to bring SVM up to par with VMX, tie
the "emergency disabling" behavior to KVM actually being loaded,
and move all of the logic within KVM
- Fix user triggerable WARNs in SVM where KVM incorrectly assumes the
TSC ratio MSR cannot diverge from the default when TSC scaling is
disabled up related code
- Add a framework to allow "caching" feature flags so that KVM can
check if the guest can use a feature without needing to search
guest CPUID
- Rip out the ancient MMU_DEBUG crud and replace the useful bits with
CONFIG_KVM_PROVE_MMU
- Fix KVM's handling of !visible guest roots to avoid premature
triple fault injection
- Overhaul KVM's page-track APIs, and KVMGT's usage, to reduce the
API surface that is needed by external users (currently only
KVMGT), and fix a variety of issues in the process
Generic:
- Wrap kvm_{gfn,hva}_range.pte in a union to allow mmu_notifier
events to pass action specific data without needing to constantly
update the main handlers.
- Drop unused function declarations
Selftests:
- Add testcases to x86's sync_regs_test for detecting KVM TOCTOU bugs
- Add support for printf() in guest code and covert all guest asserts
to use printf-based reporting
- Clean up the PMU event filter test and add new testcases
- Include x86 selftests in the KVM x86 MAINTAINERS entry"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (279 commits)
KVM: x86/mmu: Include mmu.h in spte.h
KVM: x86/mmu: Use dummy root, backed by zero page, for !visible guest roots
KVM: x86/mmu: Disallow guest from using !visible slots for page tables
KVM: x86/mmu: Harden TDP MMU iteration against root w/o shadow page
KVM: x86/mmu: Harden new PGD against roots without shadow pages
KVM: x86/mmu: Add helper to convert root hpa to shadow page
drm/i915/gvt: Drop final dependencies on KVM internal details
KVM: x86/mmu: Handle KVM bookkeeping in page-track APIs, not callers
KVM: x86/mmu: Drop @slot param from exported/external page-track APIs
KVM: x86/mmu: Bug the VM if write-tracking is used but not enabled
KVM: x86/mmu: Assert that correct locks are held for page write-tracking
KVM: x86/mmu: Rename page-track APIs to reflect the new reality
KVM: x86/mmu: Drop infrastructure for multiple page-track modes
KVM: x86/mmu: Use page-track notifiers iff there are external users
KVM: x86/mmu: Move KVM-only page-track declarations to internal header
KVM: x86: Remove the unused page-track hook track_flush_slot()
drm/i915/gvt: switch from ->track_flush_slot() to ->track_remove_region()
KVM: x86: Add a new page-track hook to handle memslot deletion
drm/i915/gvt: Don't bother removing write-protection on to-be-deleted slot
KVM: x86: Reject memslot MOVE operations if KVMGT is attached
...
Diffstat (limited to 'virt/kvm/kvm_main.c')
-rw-r--r-- | virt/kvm/kvm_main.c | 54 |
1 files changed, 40 insertions, 14 deletions
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 2500178cf444..486800a7024b 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -345,7 +345,6 @@ bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned int req) } EXPORT_SYMBOL_GPL(kvm_make_all_cpus_request); -#ifndef CONFIG_HAVE_KVM_ARCH_TLB_FLUSH_ALL void kvm_flush_remote_tlbs(struct kvm *kvm) { ++kvm->stat.generic.remote_tlb_flush_requests; @@ -361,12 +360,38 @@ void kvm_flush_remote_tlbs(struct kvm *kvm) * kvm_make_all_cpus_request() reads vcpu->mode. We reuse that * barrier here. */ - if (!kvm_arch_flush_remote_tlb(kvm) + if (!kvm_arch_flush_remote_tlbs(kvm) || kvm_make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH)) ++kvm->stat.generic.remote_tlb_flush; } EXPORT_SYMBOL_GPL(kvm_flush_remote_tlbs); -#endif + +void kvm_flush_remote_tlbs_range(struct kvm *kvm, gfn_t gfn, u64 nr_pages) +{ + if (!kvm_arch_flush_remote_tlbs_range(kvm, gfn, nr_pages)) + return; + + /* + * Fall back to a flushing entire TLBs if the architecture range-based + * TLB invalidation is unsupported or can't be performed for whatever + * reason. + */ + kvm_flush_remote_tlbs(kvm); +} + +void kvm_flush_remote_tlbs_memslot(struct kvm *kvm, + const struct kvm_memory_slot *memslot) +{ + /* + * All current use cases for flushing the TLBs for a specific memslot + * are related to dirty logging, and many do the TLB flush out of + * mmu_lock. The interaction between the various operations on memslot + * must be serialized by slots_locks to ensure the TLB flush from one + * operation is observed by any other operation on the same memslot. + */ + lockdep_assert_held(&kvm->slots_lock); + kvm_flush_remote_tlbs_range(kvm, memslot->base_gfn, memslot->npages); +} static void kvm_flush_shadow_all(struct kvm *kvm) { @@ -526,7 +551,7 @@ typedef void (*on_unlock_fn_t)(struct kvm *kvm); struct kvm_hva_range { unsigned long start; unsigned long end; - pte_t pte; + union kvm_mmu_notifier_arg arg; hva_handler_t handler; on_lock_fn_t on_lock; on_unlock_fn_t on_unlock; @@ -547,6 +572,8 @@ static void kvm_null_fn(void) } #define IS_KVM_NULL_FN(fn) ((fn) == (void *)kvm_null_fn) +static const union kvm_mmu_notifier_arg KVM_MMU_NOTIFIER_NO_ARG; + /* Iterate over each memslot intersecting [start, last] (inclusive) range */ #define kvm_for_each_memslot_in_hva_range(node, slots, start, last) \ for (node = interval_tree_iter_first(&slots->hva_tree, start, last); \ @@ -591,7 +618,7 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm, * bother making these conditional (to avoid writes on * the second or later invocation of the handler). */ - gfn_range.pte = range->pte; + gfn_range.arg = range->arg; gfn_range.may_block = range->may_block; /* @@ -632,14 +659,14 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm, static __always_inline int kvm_handle_hva_range(struct mmu_notifier *mn, unsigned long start, unsigned long end, - pte_t pte, + union kvm_mmu_notifier_arg arg, hva_handler_t handler) { struct kvm *kvm = mmu_notifier_to_kvm(mn); const struct kvm_hva_range range = { .start = start, .end = end, - .pte = pte, + .arg = arg, .handler = handler, .on_lock = (void *)kvm_null_fn, .on_unlock = (void *)kvm_null_fn, @@ -659,7 +686,6 @@ static __always_inline int kvm_handle_hva_range_no_flush(struct mmu_notifier *mn const struct kvm_hva_range range = { .start = start, .end = end, - .pte = __pte(0), .handler = handler, .on_lock = (void *)kvm_null_fn, .on_unlock = (void *)kvm_null_fn, @@ -693,6 +719,7 @@ static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn, pte_t pte) { struct kvm *kvm = mmu_notifier_to_kvm(mn); + const union kvm_mmu_notifier_arg arg = { .pte = pte }; trace_kvm_set_spte_hva(address); @@ -708,7 +735,7 @@ static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn, if (!READ_ONCE(kvm->mmu_invalidate_in_progress)) return; - kvm_handle_hva_range(mn, address, address + 1, pte, kvm_change_spte_gfn); + kvm_handle_hva_range(mn, address, address + 1, arg, kvm_change_spte_gfn); } void kvm_mmu_invalidate_begin(struct kvm *kvm, unsigned long start, @@ -747,7 +774,6 @@ static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn, const struct kvm_hva_range hva_range = { .start = range->start, .end = range->end, - .pte = __pte(0), .handler = kvm_unmap_gfn_range, .on_lock = kvm_mmu_invalidate_begin, .on_unlock = kvm_arch_guest_memory_reclaimed, @@ -812,7 +838,6 @@ static void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn, const struct kvm_hva_range hva_range = { .start = range->start, .end = range->end, - .pte = __pte(0), .handler = (void *)kvm_null_fn, .on_lock = kvm_mmu_invalidate_end, .on_unlock = (void *)kvm_null_fn, @@ -845,7 +870,8 @@ static int kvm_mmu_notifier_clear_flush_young(struct mmu_notifier *mn, { trace_kvm_age_hva(start, end); - return kvm_handle_hva_range(mn, start, end, __pte(0), kvm_age_gfn); + return kvm_handle_hva_range(mn, start, end, KVM_MMU_NOTIFIER_NO_ARG, + kvm_age_gfn); } static int kvm_mmu_notifier_clear_young(struct mmu_notifier *mn, @@ -2180,7 +2206,7 @@ static int kvm_get_dirty_log_protect(struct kvm *kvm, struct kvm_dirty_log *log) } if (flush) - kvm_arch_flush_remote_tlbs_memslot(kvm, memslot); + kvm_flush_remote_tlbs_memslot(kvm, memslot); if (copy_to_user(log->dirty_bitmap, dirty_bitmap_buffer, n)) return -EFAULT; @@ -2297,7 +2323,7 @@ static int kvm_clear_dirty_log_protect(struct kvm *kvm, KVM_MMU_UNLOCK(kvm); if (flush) - kvm_arch_flush_remote_tlbs_memslot(kvm, memslot); + kvm_flush_remote_tlbs_memslot(kvm, memslot); return 0; } |