summaryrefslogtreecommitdiff
path: root/virt/kvm/arm
AgeCommit message (Collapse)Author
2020-05-16KVM: arm64: Move virt/kvm/arm to arch/arm64Marc Zyngier
Now that the 32bit KVM/arm host is a distant memory, let's move the whole of the KVM/arm64 code into the arm64 tree. As they said in the song: Welcome Home (Sanitarium). Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200513104034.74741-1-maz@kernel.org
2020-05-01KVM: arm64: Fix 32bit PC wrap-aroundMarc Zyngier
In the unlikely event that a 32bit vcpu traps into the hypervisor on an instruction that is located right at the end of the 32bit range, the emulation of that instruction is going to increment PC past the 32bit range. This isn't great, as userspace can then observe this value and get a bit confused. Conversly, userspace can do things like (in the context of a 64bit guest that is capable of 32bit EL0) setting PSTATE to AArch64-EL0, set PC to a 64bit value, change PSTATE to AArch32-USR, and observe that PC hasn't been truncated. More confusion. Fix both by: - truncating PC increments for 32bit guests - sanitizing all 32bit regs every time a core reg is changed by userspace, and that PSTATE indicates a 32bit mode. Cc: stable@vger.kernel.org Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-04-30KVM: arm64: vgic-v4: Initialize GICv4.1 even in the absence of a virtual ITSMarc Zyngier
KVM now expects to be able to use HW-accelerated delivery of vSGIs as soon as the guest has enabled thm. Unfortunately, we only initialize the GICv4 context if we have a virtual ITS exposed to the guest. Fix it by always initializing the GICv4.1 context if it is available on the host. Fixes: 2291ff2f2a56 ("KVM: arm64: GICv4.1: Plumb SGI implementation selection in the distributor") Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-04-23Merge branch 'kvm-arm64/vgic-fixes-5.7' into kvmarm-master/masterMarc Zyngier
2020-04-23KVM: arm64: vgic-its: Fix memory leak on the error path of vgic_add_lpi()Zenghui Yu
If we're going to fail out the vgic_add_lpi(), let's make sure the allocated vgic_irq memory is also freed. Though it seems that both cases are unlikely to fail. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200414030349.625-3-yuzenghui@huawei.com
2020-04-23KVM: arm64: vgic-v3: Retire all pending LPIs on vcpu destroyZenghui Yu
It's likely that the vcpu fails to handle all virtual interrupts if userspace decides to destroy it, leaving the pending ones stay in the ap_list. If the un-handled one is a LPI, its vgic_irq structure will be eventually leaked because of an extra refcount increment in vgic_queue_irq_unlock(). This was detected by kmemleak on almost every guest destroy, the backtrace is as follows: unreferenced object 0xffff80725aed5500 (size 128): comm "CPU 5/KVM", pid 40711, jiffies 4298024754 (age 166366.512s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 08 01 a9 73 6d 80 ff ff ...........sm... c8 61 ee a9 00 20 ff ff 28 1e 55 81 6c 80 ff ff .a... ..(.U.l... backtrace: [<000000004bcaa122>] kmem_cache_alloc_trace+0x2dc/0x418 [<0000000069c7dabb>] vgic_add_lpi+0x88/0x418 [<00000000bfefd5c5>] vgic_its_cmd_handle_mapi+0x4dc/0x588 [<00000000cf993975>] vgic_its_process_commands.part.5+0x484/0x1198 [<000000004bd3f8e3>] vgic_its_process_commands+0x50/0x80 [<00000000b9a65b2b>] vgic_mmio_write_its_cwriter+0xac/0x108 [<0000000009641ebb>] dispatch_mmio_write+0xd0/0x188 [<000000008f79d288>] __kvm_io_bus_write+0x134/0x240 [<00000000882f39ac>] kvm_io_bus_write+0xe0/0x150 [<0000000078197602>] io_mem_abort+0x484/0x7b8 [<0000000060954e3c>] kvm_handle_guest_abort+0x4cc/0xa58 [<00000000e0d0cd65>] handle_exit+0x24c/0x770 [<00000000b44a7fad>] kvm_arch_vcpu_ioctl_run+0x460/0x1988 [<0000000025fb897c>] kvm_vcpu_ioctl+0x4f8/0xee0 [<000000003271e317>] do_vfs_ioctl+0x160/0xcd8 [<00000000e7f39607>] ksys_ioctl+0x98/0xd8 Fix it by retiring all pending LPIs in the ap_list on the destroy path. p.s. I can also reproduce it on a normal guest shutdown. It is because userspace still send LPIs to vcpu (through KVM_SIGNAL_MSI ioctl) while the guest is being shutdown and unable to handle it. A little strange though and haven't dig further... Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> [maz: moved the distributor deallocation down to avoid an UAF splat] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200414030349.625-2-yuzenghui@huawei.com
2020-04-23KVM: arm: vgic-v2: Only use the virtual state when userspace accesses ↵Marc Zyngier
pending bits There is no point in accessing the HW when writing to any of the ISPENDR/ICPENDR registers from userspace, as only the guest should be allowed to change the HW state. Introduce new userspace-specific accessors that deal solely with the virtual state. Note that the API differs from that of GICv3, where userspace exclusively uses ISPENDR to set the state. Too bad we can't reuse it. Fixes: 82e40f558de56 ("KVM: arm/arm64: vgic-v2: Handle SGI bits in GICD_I{S,C}PENDR0 as WI") Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-04-22KVM: arm: vgic: Only use the virtual state when userspace accesses enable bitsMarc Zyngier
There is no point in accessing the HW when writing to any of the ISENABLER/ICENABLER registers from userspace, as only the guest should be allowed to change the HW state. Introduce new userspace-specific accessors that deal solely with the virtual state. Reported-by: James Morse <james.morse@arm.com> Tested-by: James Morse <james.morse@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-04-22KVM: arm: vgic: Synchronize the whole guest on GIC{D,R}_I{S,C}ACTIVER readMarc Zyngier
When a guest tries to read the active state of its interrupts, we currently just return whatever state we have in memory. This means that if such an interrupt lives in a List Register on another CPU, we fail to obsertve the latest active state for this interrupt. In order to remedy this, stop all the other vcpus so that they exit and we can observe the most recent value for the state. This is similar to what we are doing for the write side of the same registers, and results in new MMIO handlers for userspace (which do not need to stop the guest, as it is supposed to be stopped already). Reported-by: Julien Grall <julien@xen.org> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-04-17KVM: arm64: PSCI: Forbid 64bit functions for 32bit guestsMarc Zyngier
Implementing (and even advertising) 64bit PSCI functions to 32bit guests is at least a bit odd, if not altogether violating the spec which says ("5.2.1 Register usage in arguments and return values"): "Adherence to the SMC Calling Conventions implies that any AArch32 caller of an SMC64 function will get a return code of 0xFFFFFFFF(int32). This matches the NOT_SUPPORTED error code used in PSCI" Tighten the implementation by pretending these functions are not there for 32bit guests. Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-04-17KVM: arm64: PSCI: Narrow input registers when using 32bit functionsMarc Zyngier
When a guest delibarately uses an SMC32 function number (which is allowed), we should make sure we drop the top 32bits from the input arguments, as they could legitimately be junk. Reported-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-04-15KVM: arm: vgic: Fix limit condition when writing to GICD_I[CS]ACTIVERMarc Zyngier
When deciding whether a guest has to be stopped we check whether this is a private interrupt or not. Unfortunately, there's an off-by-one bug here, and we fail to recognize a whole range of interrupts as being global (GICv2 SPIs 32-63). Fix the condition from > to be >=. Cc: stable@vger.kernel.org Fixes: abd7229626b93 ("KVM: arm/arm64: Simplify active_change_prepare and plug race") Reported-by: André Przywara <andre.przywara@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-03-31KVM: Pass kvm_init()'s opaque param to additional arch funcsSean Christopherson
Pass @opaque to kvm_arch_hardware_setup() and kvm_arch_check_processor_compat() to allow architecture specific code to reference @opaque without having to stash it away in a temporary global variable. This will enable x86 to separate its vendor specific callback ops, which are passed via @opaque, into "init" and "runtime" ops without having to stash away the "init" ops. No functional change intended. Reviewed-by: Cornelia Huck <cohuck@redhat.com> Tested-by: Cornelia Huck <cohuck@redhat.com> #s390 Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200321202603.19355-2-sean.j.christopherson@intel.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-31Merge tag 'kvmarm-5.7' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm updates for Linux 5.7 - GICv4.1 support - 32bit host removal
2020-03-24Merge branch 'kvm-arm64/gic-v4.1' into kvmarm-master/nextMarc Zyngier
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-03-24KVM: arm64: GICv4.1: Expose HW-based SGIs in debugfsMarc Zyngier
The vgic-state debugfs file could do with showing the pending state of the HW-backed SGIs. Plug it into the low-level code. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-24-maz@kernel.org
2020-03-24KVM: arm64: GICv4.1: Reload VLPI configuration on distributor enable/disableMarc Zyngier
Each time a Group-enable bit gets flipped, the state of these bits needs to be forwarded to the hardware. This is a pretty heavy handed operation, requiring all vcpus to reload their GICv4 configuration. It is thus implemented as a new request type. These enable bits are programmed into the HW by setting the VGrp{0,1}En fields of GICR_VPENDBASER when the vPEs are made resident again. Of course, we only support Group-1 for now... Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-22-maz@kernel.org
2020-03-24KVM: arm64: GICv4.1: Plumb SGI implementation selection in the distributorMarc Zyngier
The GICv4.1 architecture gives the hypervisor the option to let the guest choose whether it wants the good old SGIs with an active state, or the new, HW-based ones that do not have one. For this, plumb the configuration of SGIs into the GICv3 MMIO handling, present the GICD_TYPER2.nASSGIcap to the guest, and handle the GICD_CTLR.nASSGIreq setting. In order to be able to deal with the restore of a guest, also apply the GICD_CTLR.nASSGIreq setting at first run so that we can move the restored SGIs to the HW if that's what the guest had selected in a previous life. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-21-maz@kernel.org
2020-03-24KVM: arm64: GICv4.1: Allow SGIs to switch between HW and SW interruptsMarc Zyngier
In order to let a guest buy in the new, active-less SGIs, we need to be able to switch between the two modes. Handle this by stopping all guest activity, transfer the state from one mode to the other, and resume the guest. Nothing calls this code so far, but a later patch will plug it into the MMIO emulation. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-20-maz@kernel.org
2020-03-24KVM: arm64: GICv4.1: Add direct injection capability to SGI registersMarc Zyngier
Most of the GICv3 emulation code that deals with SGIs now has to be aware of the v4.1 capabilities in order to benefit from it. Add such support, keyed on the interrupt having the hw flag set and being a SGI. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-19-maz@kernel.org
2020-03-24KVM: arm64: GICv4.1: Let doorbells be auto-enabledMarc Zyngier
As GICv4.1 understands the life cycle of doorbells (instead of just randomly firing them at the most inconvenient time), just enable them at irq_request time, and be done with it. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-18-maz@kernel.org
2020-03-24irqchip/gic-v4.1: Move doorbell management to the GICv4 abstraction layerMarc Zyngier
In order to hide some of the differences between v4.0 and v4.1, move the doorbell management out of the KVM code, and into the GICv4-specific layer. This allows the calling code to ask for the doorbell when blocking, and otherwise to leave the doorbell permanently disabled. This matches the v4.1 code perfectly, and only results in a minor refactoring of the v4.0 code. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-14-maz@kernel.org
2020-03-16KVM: Remove unnecessary asm/kvm_host.h includesPeter Xu
Remove includes of asm/kvm_host.h from files that already include linux/kvm_host.h to make it more obvious that there is no ordering issue between the two headers. linux/kvm_host.h includes asm/kvm_host.h to pick up architecture specific settings, and this will never change, i.e. including asm/kvm_host.h after linux/kvm_host.h may seem problematic, but in practice is simply redundant. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: Terminate memslot walks via used_slotsSean Christopherson
Refactor memslot handling to treat the number of used slots as the de facto size of the memslot array, e.g. return NULL from id_to_memslot() when an invalid index is provided instead of relying on npages==0 to detect an invalid memslot. Rework the sorting and walking of memslots in advance of dynamically sizing memslots to aid bisection and debug, e.g. with luck, a bug in the refactoring will bisect here and/or hit a WARN instead of randomly corrupting memory. Alternatively, a global null/invalid memslot could be returned, i.e. so callers of id_to_memslot() don't have to explicitly check for a NULL memslot, but that approach runs the risk of introducing difficult-to- debug issues, e.g. if the global null slot is modified. Constifying the return from id_to_memslot() to combat such issues is possible, but would require a massive refactoring of arch specific code and would still be susceptible to casting shenanigans. Add function comments to update_memslots() and search_memslots() to explicitly (and loudly) state how memslots are sorted. Opportunistically stuff @hva with a non-canonical value when deleting a private memslot on x86 to detect bogus usage of the freed slot. No functional change intended. Tested-by: Christoffer Dall <christoffer.dall@arm.com> Tested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: Provide common implementation for generic dirty log functionsSean Christopherson
Move the implementations of KVM_GET_DIRTY_LOG and KVM_CLEAR_DIRTY_LOG for CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT into common KVM code. The arch specific implemenations are extremely similar, differing only in whether the dirty log needs to be sync'd from hardware (x86) and how the TLBs are flushed. Add new arch hooks to handle sync and TLB flush; the sync will also be used for non-generic dirty log support in a future patch (s390). The ulterior motive for providing a common implementation is to eliminate the dependency between arch and common code with respect to the memslot referenced by the dirty log, i.e. to make it obvious in the code that the validity of the memslot is guaranteed, as a future patch will rework memslot handling such that id_to_memslot() can return NULL. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: Simplify kvm_free_memslot() and all its descendentsSean Christopherson
Now that all callers of kvm_free_memslot() pass NULL for @dont, remove the param from the top-level routine and all arch's implementations. No functional change intended. Tested-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: Drop "const" attribute from old memslot in commit_memory_region()Sean Christopherson
Drop the "const" attribute from @old in kvm_arch_commit_memory_region() to allow arch specific code to free arch specific resources in the old memslot without having to cast away the attribute. Freeing resources in kvm_arch_commit_memory_region() paves the way for simplifying kvm_free_memslot() by eliminating the last usage of its @dont param. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: Drop kvm_arch_create_memslot()Sean Christopherson
Remove kvm_arch_create_memslot() now that all arch implementations are effectively nops. Removing kvm_arch_create_memslot() eliminates the possibility for arch specific code to allocate memory prior to setting a memslot, which sets the stage for simplifying kvm_free_memslot(). Cc: Janosch Frank <frankja@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: arm64: Use the correct timer structure to access the physical counterKarimAllah Ahmed
Use the physical timer structure when reading the physical counter instead of using the virtual timer structure. Thankfully, nothing is accessing this code path yet (at least not until we enable save/restore of the physical counter). It doesn't hurt for this to be correct though. Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> [maz: amended commit log] Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Fixes: 84135d3d18da ("KVM: arm/arm64: consolidate arch timer trap handlers") Link: https://lore.kernel.org/r/1584351546-5018-1-git-send-email-karahmed@amazon.de
2020-02-28Merge tag 'kvmarm-fixes-5.6-1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm fixes for 5.6, take #1 - Fix compilation on 32bit - Move VHE guest entry/exit into the VHE-specific entry code - Make sure all functions called by the non-VHE HYP code is tagged as __always_inline
2020-02-17kvm: arm/arm64: Fold VHE entry/exit work into kvm_vcpu_run_vhe()Mark Rutland
With VHE, running a vCPU always requires the sequence: 1. kvm_arm_vhe_guest_enter(); 2. kvm_vcpu_run_vhe(); 3. kvm_arm_vhe_guest_exit() ... and as we invoke this from the shared arm/arm64 KVM code, 32-bit arm has to provide stubs for all three functions. To simplify the common code, and make it easier to make further modifications to the arm64-specific portions in the near future, let's fold kvm_arm_vhe_guest_enter() and kvm_arm_vhe_guest_exit() into kvm_vcpu_run_vhe(). The 32-bit stubs for kvm_arm_vhe_guest_enter() and kvm_arm_vhe_guest_exit() are removed, as they are no longer used. The 32-bit stub for kvm_vcpu_run_vhe() is left as-is. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200210114757.2889-1-mark.rutland@arm.com
2020-02-12KVM: Disable preemption in kvm_get_running_vcpu()Marc Zyngier
Accessing a per-cpu variable only makes sense when preemption is disabled (and the kernel does check this when the right debug options are switched on). For kvm_get_running_vcpu(), it is fine to return the value after re-enabling preemption, as the preempt notifiers will make sure that this is kept consistent across task migration (the comment above the function hints at it, but lacks the crucial preemption management). While we're at it, move the comment from the ARM code, which explains why the whole thing works. Fixes: 7495e22bb165 ("KVM: Move running VCPU from ARM to common code"). Cc: Paolo Bonzini <pbonzini@redhat.com> Reported-by: Zenghui Yu <yuzenghui@huawei.com> Tested-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/318984f6-bc36-33a3-abc6-bf2295974b06@huawei.com Message-id: <20200207163410.31276-1-maz@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-05KVM: arm/arm64: Fix up includes for trace.hJeremy Cline
Fedora kernel builds on armv7hl began failing recently because kvm_arm_exception_type and kvm_arm_exception_class were undeclared in trace.h. Add the missing include. Fixes: 0e20f5e25556 ("KVM: arm/arm64: Cleanup MMIO handling") Signed-off-by: Jeremy Cline <jcline@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200205134146.82678-1-jcline@redhat.com
2020-01-30Merge tag 'kvmarm-5.6' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm updates for Linux 5.6 - Fix MMIO sign extension - Fix HYP VA tagging on tag space exhaustion - Fix PSTATE/CPSR handling when generating exception - Fix MMU notifier's advertizing of young pages - Fix poisoned page handling - Fix PMU SW event handling - Fix TVAL register access - Fix AArch32 external abort injection - Fix ITS unmapped collection handling - Various cleanups
2020-01-28KVM: arm64: Treat emulated TVAL TimerValue as a signed 32-bit integerAlexandru Elisei
According to the ARM ARM, registers CNT{P,V}_TVAL_EL0 have bits [63:32] RES0 [1]. When reading the register, the value is truncated to the least significant 32 bits [2], and on writes, TimerValue is treated as a signed 32-bit integer [1, 2]. When the guest behaves correctly and writes 32-bit values, treating TVAL as an unsigned 64 bit register works as expected. However, things start to break down when the guest writes larger values, because (u64)0x1_ffff_ffff = 8589934591. but (s32)0x1_ffff_ffff = -1, and the former will cause the timer interrupt to be asserted in the future, but the latter will cause it to be asserted now. Let's treat TVAL as a signed 32-bit register on writes, to match the behaviour described in the architecture, and the behaviour experimentally exhibited by the virtual timer on a non-vhe host. [1] Arm DDI 0487E.a, section D13.8.18 [2] Arm DDI 0487E.a, section D11.2.4 Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> [maz: replaced the read-side mask with lower_32_bits] Signed-off-by: Marc Zyngier <maz@kernel.org> Fixes: 8fa761624871 ("KVM: arm/arm64: arch_timer: Fix CNTP_TVAL calculation") Link: https://lore.kernel.org/r/20200127103652.2326-1-alexandru.elisei@arm.com
2020-01-28KVM: arm64: pmu: Only handle supported event countersEric Auger
Let the code never use unsupported event counters. Change kvm_pmu_handle_pmcr() to only reset supported counters and kvm_pmu_vcpu_reset() to only stop supported counters. Other actions are filtered on the supported counters in kvm/sysregs.c Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200124142535.29386-5-eric.auger@redhat.com
2020-01-28KVM: arm64: pmu: Fix chained SW_INCR countersEric Auger
At the moment a SW_INCR counter always overflows on 32-bit boundary, independently on whether the n+1th counter is programmed as CHAIN. Check whether the SW_INCR counter is a 64b counter and if so, implement the 64b logic. Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters") Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200124142535.29386-4-eric.auger@redhat.com
2020-01-28KVM: arm64: pmu: Don't mark a counter as chained if the odd one is disabledEric Auger
At the moment we update the chain bitmap on type setting. This does not take into account the enable state of the odd register. Let's make sure a counter is never considered as chained if the high counter is disabled. We recompute the chain state on enable/disable and type changes. Also let create_perf_event() use the chain bitmap and not use kvm_pmu_idx_has_chain_evtype(). Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200124142535.29386-3-eric.auger@redhat.com
2020-01-28KVM: arm64: pmu: Don't increment SW_INCR if PMCR.E is unsetEric Auger
The specification says PMSWINC increments PMEVCNTR<n>_EL1 by 1 if PMEVCNTR<n>_EL0 is enabled and configured to count SW_INCR. For PMEVCNTR<n>_EL0 to be enabled, we need both PMCNTENSET to be set for the corresponding event counter but we also need the PMCR.E bit to be set. Fixes: 7a0adc7064b8 ("arm64: KVM: Add access handler for PMSWINC register") Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Andrew Murray <andrew.murray@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200124142535.29386-2-eric.auger@redhat.com
2020-01-27mm: thp: KVM: Explicitly check for THP when populating secondary MMUSean Christopherson
Add a helper, is_transparent_hugepage(), to explicitly check whether a compound page is a THP and use it when populating KVM's secondary MMU. The explicit check fixes a bug where a remapped compound page, e.g. for an XDP Rx socket, is mapped into a KVM guest and is mistaken for a THP, which results in KVM incorrectly creating a huge page in its secondary MMU. Fixes: 936a5fe6e6148 ("thp: kvm mmu transparent hugepage support") Reported-by: syzbot+c9d1fb51ac9d0d10c39d@syzkaller.appspotmail.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: Move running VCPU from ARM to common codePaolo Bonzini
For ring-based dirty log tracking, it will be more efficient to account writes during schedule-out or schedule-in to the currently running VCPU. We would like to do it even if the write doesn't use the current VCPU's address space, as is the case for cached writes (see commit 4e335d9e7ddb, "Revert "KVM: Support vCPU-based gfn->hva cache"", 2017-05-02). Therefore, add a mechanism to track the currently-loaded kvm_vcpu struct. There is already something similar in KVM/ARM; one important difference is that kvm_arch_vcpu_{load,put} have two callers in virt/kvm/kvm_main.c: we have to update both the architecture-independent vcpu_{load,put} and the preempt notifiers. Another change made in the process is to allow using kvm_get_running_vcpu() in preemptible code. This is allowed because preempt notifiers ensure that the value does not change even after the VCPU thread is migrated. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: Drop kvm_arch_vcpu_init() and kvm_arch_vcpu_uninit()Sean Christopherson
Remove kvm_arch_vcpu_init() and kvm_arch_vcpu_uninit() now that all arch specific implementations are nops. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: arm64: Free sve_state via arm specific hookSean Christopherson
Add an arm specific hook to free the arm64-only sve_state. Doing so eliminates the last functional code from kvm_arch_vcpu_uninit() across all architectures and paves the way for removing kvm_arch_vcpu_init() and kvm_arch_vcpu_uninit() entirely. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: ARM: Move all vcpu init code into kvm_arch_vcpu_create()Sean Christopherson
Fold init() into create() now that the two are called back-to-back by common KVM code (kvm_vcpu_init() calls kvm_arch_vcpu_init() as its last action, and kvm_vm_ioctl_create_vcpu() calls kvm_arch_vcpu_create() immediately thereafter). This paves the way for removing kvm_arch_vcpu_{un}init() entirely. Note, there is no associated unwinding in kvm_arch_vcpu_uninit() that needs to be relocated (to kvm_arch_vcpu_destroy()). No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: Move vcpu alloc and init invocation to common codeSean Christopherson
Now that all architectures tightly couple vcpu allocation/free with the mandatory calls to kvm_{un}init_vcpu(), move the sequences verbatim to common KVM code. Move both allocation and initialization in a single patch to eliminate thrash in arch specific code. The bisection benefits of moving the two pieces in separate patches is marginal at best, whereas the odds of introducing a transient arch specific bug are non-zero. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24KVM: Introduce kvm_vcpu_destroy()Sean Christopherson
Add kvm_vcpu_destroy() and wire up all architectures to call the common function instead of their arch specific implementation. The common destruction function will be used by future patches to move allocation and initialization of vCPUs to common KVM code, i.e. to free resources that are allocated by arch agnostic code. No functional change intended. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24KVM: Add kvm_arch_vcpu_precreate() to handle pre-allocation issuesSean Christopherson
Add a pre-allocation arch hook to handle checks that are currently done by arch specific code prior to allocating the vCPU object. This paves the way for moving the allocation to common KVM code. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24KVM: arm: Drop kvm_arch_vcpu_free()Sean Christopherson
Remove the superfluous kvm_arch_vcpu_free() as it is no longer called from commmon KVM code. Note, kvm_arch_vcpu_destroy() *is* called from common code, i.e. choosing which function to whack is not completely arbitrary. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-23KVM: arm: Make inject_abt32() inject an external abort insteadJames Morse
KVM's inject_abt64() injects an external-abort into an aarch64 guest. The KVM_CAP_ARM_INJECT_EXT_DABT is intended to do exactly this, but for an aarch32 guest inject_abt32() injects an implementation-defined exception, 'Lockdown fault'. Change this to external abort. For non-LPAE we now get the documented: | Unhandled fault: external abort on non-linefetch (0x008) at 0x9c800f00 and for LPAE: | Unhandled fault: synchronous external abort (0x210) at 0x9c800f00 Fixes: 74a64a981662a ("KVM: arm/arm64: Unify 32bit fault injection") Reported-by: Beata Michalska <beata.michalska@linaro.org> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200121123356.203000-3-james.morse@arm.com
2020-01-23KVM: arm: Fix DFSR setting for non-LPAE aarch32 guestsJames Morse
Beata reports that KVM_SET_VCPU_EVENTS doesn't inject the expected exception to a non-LPAE aarch32 guest. The host intends to inject DFSR.FS=0x14 "IMPLEMENTATION DEFINED fault (Lockdown fault)", but the guest receives DFSR.FS=0x04 "Fault on instruction cache maintenance". This fault is hooked by do_translation_fault() since ARMv6, which goes on to silently 'handle' the exception, and restart the faulting instruction. It turns out, when TTBCR.EAE is clear DFSR is split, and FS[4] has to shuffle up to DFSR[10]. As KVM only does this in one place, fix up the static values. We now get the expected: | Unhandled fault: lock abort (0x404) at 0x9c800f00 Fixes: 74a64a981662a ("KVM: arm/arm64: Unify 32bit fault injection") Reported-by: Beata Michalska <beata.michalska@linaro.org> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200121123356.203000-2-james.morse@arm.com