summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2024-03-18Merge branch 'master' into mm-stableAndrew Morton
2024-03-10Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "KVM GUEST_MEMFD fixes for 6.8: - Make KVM_MEM_GUEST_MEMFD mutually exclusive with KVM_MEM_READONLY to avoid creating an inconsistent ABI (KVM_MEM_GUEST_MEMFD is not writable from userspace, so there would be no way to write to a read-only guest_memfd). - Update documentation for KVM_SW_PROTECTED_VM to make it abundantly clear that such VMs are purely for development and testing. - Limit KVM_SW_PROTECTED_VM guests to the TDP MMU, as the long term plan is to support confidential VMs with deterministic private memory (SNP and TDX) only in the TDP MMU. - Fix a bug in a GUEST_MEMFD dirty logging test that caused false passes. x86 fixes: - Fix missing marking of a guest page as dirty when emulating an atomic access. - Check for mmu_notifier invalidation events before faulting in the pfn, and before acquiring mmu_lock, to avoid unnecessary work and lock contention with preemptible kernels (including CONFIG_PREEMPT_DYNAMIC in non-preemptible mode). - Disable AMD DebugSwap by default, it breaks VMSA signing and will be re-enabled with a better VM creation API in 6.10. - Do the cache flush of converted pages in svm_register_enc_region() before dropping kvm->lock, to avoid a race with unregistering of the same region and the consequent use-after-free issue" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: SEV: disable SEV-ES DebugSwap by default KVM: x86/mmu: Retry fault before acquiring mmu_lock if mapping is changing KVM: SVM: Flush pages under kvm->lock to fix UAF in svm_register_enc_region() KVM: selftests: Add a testcase to verify GUEST_MEMFD and READONLY are exclusive KVM: selftests: Create GUEST_MEMFD for relevant invalid flags testcases KVM: x86/mmu: Restrict KVM_SW_PROTECTED_VM to the TDP MMU KVM: x86: Update KVM_SW_PROTECTED_VM docs to make it clear they're a WIP KVM: Make KVM_MEM_GUEST_MEMFD mutually exclusive with KVM_MEM_READONLY KVM: x86: Mark target gfn of emulated atomic instruction as dirty
2024-03-09SEV: disable SEV-ES DebugSwap by defaultPaolo Bonzini
The DebugSwap feature of SEV-ES provides a way for confidential guests to use data breakpoints. However, because the status of the DebugSwap feature is recorded in the VMSA, enabling it by default invalidates the attestation signatures. In 6.10 we will introduce a new API to create SEV VMs that will allow enabling DebugSwap based on what the user tells KVM to do. Contextually, we will change the legacy KVM_SEV_ES_INIT API to never enable DebugSwap. For compatibility with kernels that pre-date the introduction of DebugSwap, as well as with those where KVM_SEV_ES_INIT will never enable it, do not enable the feature by default. If anybody wants to use it, for now they can enable the sev_es_debug_swap_enabled module parameter, but this will result in a warning. Fixes: d1f85fbe836e ("KVM: SEV: Enable data breakpoints in SEV-ES") Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-03-09Merge tag 'kvm-x86-guest_memfd_fixes-6.8' of ↵Paolo Bonzini
https://github.com/kvm-x86/linux into HEAD KVM GUEST_MEMFD fixes for 6.8: - Make KVM_MEM_GUEST_MEMFD mutually exclusive with KVM_MEM_READONLY to avoid creating ABI that KVM can't sanely support. - Update documentation for KVM_SW_PROTECTED_VM to make it abundantly clear that such VMs are purely a development and testing vehicle, and come with zero guarantees. - Limit KVM_SW_PROTECTED_VM guests to the TDP MMU, as the long term plan is to support confidential VMs with deterministic private memory (SNP and TDX) only in the TDP MMU. - Fix a bug in a GUEST_MEMFD negative test that resulted in false passes when verifying that KVM_MEM_GUEST_MEMFD memslots can't be dirty logged.
2024-03-09Merge tag 'kvm-x86-fixes-6.8-2' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini
KVM x86 fixes for 6.8, round 2: - When emulating an atomic access, mark the gfn as dirty in the memslot to fix a bug where KVM could fail to mark the slot as dirty during live migration, ultimately resulting in guest data corruption due to a dirty page not being re-copied from the source to the target. - Check for mmu_notifier invalidation events before faulting in the pfn, and before acquiring mmu_lock, to avoid unnecessary work and lock contention. Contending mmu_lock is especially problematic on preemptible kernels, as KVM may yield mmu_lock in response to the contention, which severely degrades overall performance due to vCPUs making it difficult for the task that triggered invalidation to make forward progress. Note, due to another kernel bug, this fix isn't limited to preemtible kernels, as any kernel built with CONFIG_PREEMPT_DYNAMIC=y will yield contended rwlocks and spinlocks. https://lore.kernel.org/all/20240110214723.695930-1-seanjc@google.com
2024-03-07Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fix from Will Deacon: "A lonely arm64 fix addressing a kprobes regression that we introduced during the merge window: - Fix recursive kprobes regression when probing the stack unwinder" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: prohibit probing on arch_kunwind_consume_entry()
2024-03-06mm/treewide: align up pXd_leaf() retval across archsPeter Xu
Even if pXd_leaf() API is defined globally, it's not clear on the retval, and there are three types used (bool, int, unsigned log). Always return a boolean for pXd_leaf() APIs. Link: https://lkml.kernel.org/r/20240305043750.93762-11-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Muchun Song <muchun.song@linux.dev> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-06mm/treewide: drop pXd_large()Peter Xu
They're not used anymore, drop all of them. Link: https://lkml.kernel.org/r/20240305043750.93762-10-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Muchun Song <muchun.song@linux.dev> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-06mm/treewide: replace pud_large() with pud_leaf()Peter Xu
pud_large() is always defined as pud_leaf(). Merge their usages. Chose pud_leaf() because pud_leaf() is a global API, while pud_large() is not. Link: https://lkml.kernel.org/r/20240305043750.93762-9-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Muchun Song <muchun.song@linux.dev> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-06mm/treewide: replace pmd_large() with pmd_leaf()Peter Xu
pmd_large() is always defined as pmd_leaf(). Merge their usages. Chose pmd_leaf() because pmd_leaf() is a global API, while pmd_large() is not. Link: https://lkml.kernel.org/r/20240305043750.93762-8-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Muchun Song <muchun.song@linux.dev> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-06mm/x86: drop two unnecessary pud_leaf() definitionsPeter Xu
pud_leaf() has a fallback macro defined in include/linux/pgtable.h already. Drop the extra two for x86. Link: https://lkml.kernel.org/r/20240305043750.93762-6-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Muchun Song <muchun.song@linux.dev> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-06mm/x86: replace pgd_large() with pgd_leaf()Peter Xu
pgd_leaf() is a global API while pgd_large() is not. Always use the global pgd_leaf(), then drop pgd_large(). Link: https://lkml.kernel.org/r/20240305043750.93762-5-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Muchun Song <muchun.song@linux.dev> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-06mm/x86: replace p4d_large() with p4d_leaf()Peter Xu
p4d_large() is always defined as p4d_leaf(). Merge their usages. Chose p4d_leaf() because p4d_leaf() is a global API, while p4d_large() is not. Only x86 has p4d_leaf() defined as of now. So it also means after this patch we removed all p4d_large() usages. Link: https://lkml.kernel.org/r/20240305043750.93762-4-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Muchun Song <muchun.song@linux.dev> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-06mm/powerpc: replace pXd_is_leaf() with pXd_leaf()Peter Xu
They're the same macros underneath. Drop pXd_is_leaf(), instead always use pXd_leaf(). At the meantime, instead of renames, drop the pXd_is_leaf() fallback definitions directly in arch/powerpc/include/asm/pgtable.h. because similar fallback macros for pXd_leaf() are already defined in include/linux/pgtable.h. Link: https://lkml.kernel.org/r/20240305043750.93762-3-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Muchun Song <muchun.song@linux.dev> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-06mm/powerpc: define pXd_large() with pXd_leaf()Peter Xu
Patch series "mm/treewide: Replace pXd_large() with pXd_leaf()", v3. These two APIs are mostly always the same. It's confusing to have both of them. Merge them into one. Here I used pXd_leaf() only because pXd_leaf() is a global API which is always defined, while pXd_large() is not. We have yet one more API that is similar which is pXd_huge(), but that's even trickier, so let's do it step by step. Some special cares are taken for ppc and x86, they're done as separate cleanups first. This patch (of 10): The two definitions are the same. The only difference is that pXd_large() is only defined with THP selected, and only on book3s 64bits. Instead of implementing it twice, make pXd_large() a macro to pXd_leaf(). Define it unconditionally just like pXd_leaf(). This helps to prepare merging the two APIs. Link: https://lkml.kernel.org/r/20240305043750.93762-1-peterx@redhat.com Link: https://lkml.kernel.org/r/20240305043750.93762-2-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Muchun Song <muchun.song@linux.dev> Cc: Yang Shi <shy828301@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-06s390: supplement for ptdesc conversionQi Zheng
After commit 6326c26c1514 ("s390: convert various pgalloc functions to use ptdescs"), there are still some positions that use page->{lru, index} instead of ptdesc->{pt_list, pt_index}. In order to make the use of ptdesc->{pt_list, pt_index} clearer, it would be better to convert them as well. [zhengqi.arch@bytedance.com: fix build failure] Link: https://lkml.kernel.org/r/20240305072154.26168-1-zhengqi.arch@bytedance.com Link: https://lkml.kernel.org/r/04beaf3255056ffe131a5ea595736066c1e84756.1709541697.git.zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Janosch Frank <frankja@linux.ibm.com> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-06hugetlb: parallelize 1G hugetlb initializationGang Li
Optimizing the initialization speed of 1G huge pages through parallelization. 1G hugetlbs are allocated from bootmem, a process that is already very fast and does not currently require optimization. Therefore, we focus on parallelizing only the initialization phase in `gather_bootmem_prealloc`. Here are some test results: test case no patch(ms) patched(ms) saved ------------------- -------------- ------------- -------- 256c2T(4 node) 1G 4745 2024 57.34% 128c1T(2 node) 1G 3358 1712 49.02% 12T 1G 77000 18300 76.23% [akpm@linux-foundation.org: s/initialied/initialized/, per Alexey] Link: https://lkml.kernel.org/r/20240222140422.393911-9-gang.li@linux.dev Signed-off-by: Gang Li <ligang.bdlg@bytedance.com> Tested-by: David Rientjes <rientjes@google.com> Reviewed-by: Muchun Song <muchun.song@linux.dev> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jane Chu <jane.chu@oracle.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-06Merge tag 'arm-fixes-6.8-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "These should be the final fixes for the soc tree for 6.8, as usual they mostly deal wtih dts files: - Qualcomm fixes for pcie4 on sc8280xp, a revert of msm8996 mpm support, sm6115 interconnect and sm8650 gpio. - Two fixes for Tegra234 ethernet - A Makefile fix to actually build the allwinner based orange pi zero 2w device tree - Fixes for clocks and reset on imx8mp and a DSI display regression on imx7. The non-DT fixes are: - Firmware fixes addressing a kernel panic in op-tee and a minor regression in microchip/riscv. - A defconfig change to bring back backlight support after a Kconfig change" * tag 'arm-fixes-6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: firmware: microchip: Fix over-requested allocation size tee: optee: Fix kernel panic caused by incorrect error handling Revert "arm64: dts: qcom: msm8996: Hook up MPM" arm64: dts: qcom: sc8280xp-x13s: limit pcie4 link speed arm64: dts: qcom: sc8280xp-crd: limit pcie4 link speed arm64: dts: imx8mp: Fix LDB clocks property arm64: dts: imx8mp: Fix TC9595 reset GPIO on DH i.MX8M Plus DHCOM SoM MAINTAINERS: Use a proper mailinglist for NXP i.MX development ARM: dts: imx7: remove DSI port endpoints arm64: dts: allwinner: h616: Add Orange Pi Zero 2W to Makefile ARM: imx_v6_v7_defconfig: Restore CONFIG_BACKLIGHT_CLASS_DEVICE arm64: tegra: Fix Tegra234 MGBE power-domains arm64: tegra: Set the correct PHY mode for MGBE arm64: dts: qcom: sm6115: Fix missing interconnect-names arm64: dts: qcom: sm8650-mtp: add gpio74 as reserved gpio arm64: dts: qcom: sm8650-qrd: add gpio74 as reserved gpio
2024-03-06Merge tag 'qcom-arm64-fixes-for-6.8-2' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes A few more Qualcomm Arm64 DeviceTree fixes for v6.8 This reduces the link speed of the PCIe bus with WiFi-card connected on the Lenovo ThinkPad X13s and the Qualcomm Compute Reference Device, avoid link errors and initialization issues reported by users. It also reverts the enablement of MPM on MSM8996, which is reported to prevent boards on this platform from booting for some users. * tag 'qcom-arm64-fixes-for-6.8-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: Revert "arm64: dts: qcom: msm8996: Hook up MPM" arm64: dts: qcom: sc8280xp-x13s: limit pcie4 link speed arm64: dts: qcom: sc8280xp-crd: limit pcie4 link speed Link: https://lore.kernel.org/r/20240306031208.4218-1-andersson@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-03-05Merge tag 'hyperv-fixes-signed-20240303' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - Multiple fixes, cleanups and documentations for Hyper-V core code and drivers * tag 'hyperv-fixes-signed-20240303' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: Drivers: hv: vmbus: make hv_bus const x86/hyperv: Allow 15-bit APIC IDs for VTL platforms x86/hyperv: Make encrypted/decrypted changes safe for load_unaligned_zeropad() x86/mm: Regularize set_memory_p() parameters and make non-static x86/hyperv: Use slow_virt_to_phys() in page transition hypervisor callback Documentation: hyperv: Add overview of PCI pass-thru device support Drivers: hv: vmbus: Update indentation in create_gpadl_header() Drivers: hv: vmbus: Remove duplication and cleanup code in create_gpadl_header() fbdev/hyperv_fb: Fix logic error for Gen2 VMs in hvfb_getmem() Drivers: hv: vmbus: Calculate ring buffer size for more efficient use of memory hv_utils: Allow implicit ICTIMESYNCFLAG_SYNC
2024-03-04arm64/mm: improve comment in contpte_ptep_get_lockless()Ryan Roberts
Make clear the atmicity/consistency requirements of the API and how we achieve them. Link: https://lore.kernel.org/linux-mm/Zc-Tqqfksho3BHmU@arm.com/ Link: https://lkml.kernel.org/r/20240226120321.1055731-3-ryan.roberts@arm.com Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-04arm64/mm: export contpte symbols only to GPL usersRyan Roberts
Patch series "Address some contpte nits". These 2 patches address some nits raised by Catalin late in the review cycle for my contpte series [1]. [1] https://lore.kernel.org/linux-mm/20240215103205.2607016-1-ryan.roberts@arm.com/ This patch (of 2): The contpte symbols must be exported since some of the public inline ptep_* APIs are called from modules and these inlines now call the contpte functions. Originally they were exported as EXPORT_SYMBOL() for fear of breaking out-of-tree modules. But we subsequently concluded that EXPORT_SYMBOL_GPL() should be safe since these functions are deeply core mm routines, and any module operating at this level is not going to be able to survive on EXPORT_SYMBOL alone. Link: https://lkml.kernel.org/r/20240226120321.1055731-1-ryan.roberts@arm.com Link: https://lore.kernel.org/linux-mm/f9fc2b31-11cb-4969-8961-9c89fea41b74@nvidia.com/ Link: https://lkml.kernel.org/r/20240226120321.1055731-2-ryan.roberts@arm.com Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-04x86/mm: always pass NULL as the first argument of switch_mm_irqs_off()Yosry Ahmed
The first argument of switch_mm_irqs_off() is unused by the x86 implementation. Make sure that x86 code never passes a non-NULL value to make this clear. Update the only non violating caller, switch_mm(). Link: https://lkml.kernel.org/r/20240222190911.1903054-2-yosryahmed@google.com Signed-off-by: Yosry Ahmed <yosryahmed@google.com> Suggested-by: Dave Hansen <dave.hansen@intel.com> Acked-by: Dave Hansen <dave.hansen@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-04x86/mm: further clarify switch_mm_irqs_off() documentationYosry Ahmed
Commit accf6b23d1e5a ("x86/mm: clarify "prev" usage in switch_mm_irqs_off()") attempted to clarify x86's usage of the arguments passed by generic code, specifically the "prev" argument the is unused by x86. However, it could have done a better job with the comment above switch_mm_irqs_off(). Rewrite this comment according to Dave Hansen's suggestion. Link: https://lkml.kernel.org/r/20240222190911.1903054-1-yosryahmed@google.com Fixes: 3cfd6625a6cf ("x86/mm: clarify "prev" usage in switch_mm_irqs_off()") Signed-off-by: Yosry Ahmed <yosryahmed@google.com> Suggested-by: Dave Hansen <dave.hansen@intel.com> Acked-by: Dave Hansen <dave.hansen@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-04Merge tag 'tegra-for-6.8-arm64-dt' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into arm/fixes arm64: tegra: Device tree fixes for v6.8 This contains two fixes to make the MGBE Ethernet devices found on Tegra234 work properly. * tag 'tegra-for-6.8-arm64-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux: arm64: tegra: Fix Tegra234 MGBE power-domains arm64: tegra: Set the correct PHY mode for MGBE Link: https://lore.kernel.org/r/20240226144536.1525704-1-thierry.reding@gmail.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-03-04Merge tag 'imx-fixes-6.8-2' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes i.MX fixes for 6.8, round 2: - Update MAINTAINERS to use a public mailing list for NXP i.MX development. - Re-enable CONFIG_BACKLIGHT_CLASS_DEVICE in imx_v6_v7_defconfig to fix a backlight regression. - Remove DSI port endpoints from i.MX7 SoC DTSI to fix a display regression. - Fix LDB clocks property for i.MX8MP device tree. - Fix TC9595 reset GPIO on DH i.MX8M Plus DHCOM SoM. * tag 'imx-fixes-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: arm64: dts: imx8mp: Fix LDB clocks property arm64: dts: imx8mp: Fix TC9595 reset GPIO on DH i.MX8M Plus DHCOM SoM MAINTAINERS: Use a proper mailinglist for NXP i.MX development ARM: dts: imx7: remove DSI port endpoints ARM: imx_v6_v7_defconfig: Restore CONFIG_BACKLIGHT_CLASS_DEVICE Link: https://lore.kernel.org/r/ZdtPJzdenRybI+Bq@dragon Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-03-04Merge tag 'qcom-arm64-fixes-for-6.8' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes Qualcomm ARM64 DeviceTree fixes for 6.8 This marks an additional GPIO as protected on SM8650 devices, to avoid a system reset caused by a security violation with some firmware versions. It also adds the missing interconnect-names, which resolves a regression where one of the I2C busses on SM6115 devices would no longer probe in Linux. * tag 'qcom-arm64-fixes-for-6.8' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: arm64: dts: qcom: sm6115: Fix missing interconnect-names arm64: dts: qcom: sm8650-mtp: add gpio74 as reserved gpio arm64: dts: qcom: sm8650-qrd: add gpio74 as reserved gpio Link: https://lore.kernel.org/r/20240225025205.479589-1-andersson@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-03-04Merge tag 'sunxi-fixes-for-6.8-1' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into arm/fixes - include Orange Pi Zero 2W DT in Makefile * tag 'sunxi-fixes-for-6.8-1' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux: arm64: dts: allwinner: h616: Add Orange Pi Zero 2W to Makefile Link: https://lore.kernel.org/r/20240223205450.GA8881@jernej-laptop Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-03-04arm64: prohibit probing on arch_kunwind_consume_entry()Puranjay Mohan
Make arch_kunwind_consume_entry() as __always_inline otherwise the compiler might not inline it and allow attaching probes to it. Without this, just probing arch_kunwind_consume_entry() via <tracefs>/kprobe_events will crash the kernel on arm64. The crash can be reproduced using the following compiler and kernel combination: clang version 19.0.0git (https://github.com/llvm/llvm-project.git d68d29516102252f6bf6dc23fb22cef144ca1cb3) commit 87adedeba51a ("Merge tag 'net-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net") [root@localhost ~]# echo 'p arch_kunwind_consume_entry' > /sys/kernel/debug/tracing/kprobe_events [root@localhost ~]# echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable Modules linked in: aes_ce_blk aes_ce_cipher ghash_ce sha2_ce virtio_net sha256_arm64 sha1_ce arm_smccc_trng net_failover failover virtio_mmio uio_pdrv_genirq uio sch_fq_codel dm_mod dax configfs CPU: 3 PID: 1405 Comm: bash Not tainted 6.8.0-rc6+ #14 Hardware name: linux,dummy-virt (DT) pstate: 604003c5 (nZCv DAIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : kprobe_breakpoint_handler+0x17c/0x258 lr : kprobe_breakpoint_handler+0x17c/0x258 sp : ffff800085d6ab60 x29: ffff800085d6ab60 x28: ffff0000066f0040 x27: ffff0000066f0b20 x26: ffff800081fa7b0c x25: 0000000000000002 x24: ffff00000b29bd18 x23: ffff00007904c590 x22: ffff800081fa6590 x21: ffff800081fa6588 x20: ffff00000b29bd18 x19: ffff800085d6ac40 x18: 0000000000000079 x17: 0000000000000001 x16: ffffffffffffffff x15: 0000000000000004 x14: ffff80008277a940 x13: 0000000000000003 x12: 0000000000000003 x11: 00000000fffeffff x10: c0000000fffeffff x9 : aa95616fdf80cc00 x8 : aa95616fdf80cc00 x7 : 205d343137373231 x6 : ffff800080fb48ec x5 : 0000000000000000 x4 : 0000000000000001 x3 : 0000000000000000 x2 : 0000000000000000 x1 : ffff800085d6a910 x0 : 0000000000000079 Call trace: kprobes: Failed to recover from reentered kprobes. kprobes: Dump kprobe: .symbol_name = arch_kunwind_consume_entry, .offset = 0, .addr = arch_kunwind_consume_entry+0x0/0x40 ------------[ cut here ]------------ kernel BUG at arch/arm64/kernel/probes/kprobes.c:241! kprobes: Failed to recover from reentered kprobes. kprobes: Dump kprobe: .symbol_name = arch_kunwind_consume_entry, .offset = 0, .addr = arch_kunwind_consume_entry+0x0/0x40 Fixes: 1aba06e7b2b4 ("arm64: stacktrace: factor out kunwind_stack_walk()") Signed-off-by: Puranjay Mohan <puranjay12@gmail.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20240229231620.24846-1-puranjay12@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-03Revert "arm64: dts: qcom: msm8996: Hook up MPM"Dmitry Baryshkov
Commit 09896da07315 ("arm64: dts: qcom: msm8996: Hook up MPM") has hooked up the MPM irq chip on the MSM8996 platform. However this causes my Dragonboard 820c crash during bootup (usually when probing IOMMUs). Revert the offending commit for now. Quick debug shows that making tlmm's wakeup-parent point to the MPM is enough to trigger the crash. Fixes: 09896da07315 ("arm64: dts: qcom: msm8996: Hook up MPM") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20240221-msm8996-revert-mpm-v1-1-cdca9e30c9b4@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-03-03Merge tag 'powerpc-6.8-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix IOMMU table initialisation when doing kdump over SR-IOV - Fix incorrect RTAS function name for resetting TCE tables - Fix fpu_signal selftest failures since a recent change Thanks to Gaurav Batra and Nathan Lynch. * tag 'powerpc-6.8-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: selftests/powerpc: Fix fpu_signal failures powerpc/rtas: use correct function name for resetting TCE tables powerpc/pseries/iommu: IOMMU table is not initialized for kdump over SR-IOV
2024-03-03Merge tag 'x86_urgent_for_v6.8_rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Do not reserve SETUP_RNG_SEED setup data in the e820 map as it should be used by kexec only - Make sure MKTME feature detection happens at an earlier time in the boot process so that the physical address size supported by the CPU is properly corrected and MTRR masks are programmed properly, leading to TDX systems booting without disable_mtrr_cleanup on the cmdline - Make sure the different address sizes supported by the CPU are read out as early as possible * tag 'x86_urgent_for_v6.8_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/e820: Don't reserve SETUP_RNG_SEED in e820 x86/cpu/intel: Detect TME keyid bits before setting MTRR mask registers x86/cpu: Allow reducing x86_phys_bits during early_identify_cpu()
2024-03-01Merge tag 'riscv-for-linus-6.8-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - detect ".option arch" support on not-yet-released LLVM builds - fix missing TLB flush when modifying non-leaf PTEs - fixes for T-Head custom extensions - fix for systems with the legacy PMU, that manifests as a crash on kernels built without SBI PMU support - fix for systems that clear *envcfg on suspend, which manifests as cbo.zero trapping after resume - fixes for Svnapot systems, including removing Svnapot support for huge vmalloc/vmap regions * tag 'riscv-for-linus-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Sparse-Memory/vmemmap out-of-bounds fix riscv: Fix pte_leaf_size() for NAPOT Revert "riscv: mm: support Svnapot in huge vmap" riscv: Save/restore envcfg CSR during CPU suspend riscv: Add a custom ISA extension for the [ms]envcfg CSR riscv: Fix enabling cbo.zero when running in M-mode perf: RISCV: Fix panic on pmu overflow handler MAINTAINERS: Update SiFive driver maintainers drivers: perf: ctr_get_width function for legacy is not defined drivers: perf: added capabilities for legacy PMU RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs riscv: Fix build error if !CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION riscv: mm: fix NOCACHE_THEAD does not set bit[61] correctly riscv: add CALLER_ADDRx support RISC-V: Drop invalid test from CONFIG_AS_HAS_OPTION_ARCH kbuild: Add -Wa,--fatal-warnings to as-instr invocation riscv: tlb: fix __p*d_free_tlb()
2024-03-01x86/e820: Don't reserve SETUP_RNG_SEED in e820Jiri Bohac
SETUP_RNG_SEED in setup_data is supplied by kexec and should not be reserved in the e820 map. Doing so reserves 16 bytes of RAM when booting with kexec. (16 bytes because data->len is zeroed by parse_setup_data so only sizeof(setup_data) is reserved.) When kexec is used repeatedly, each boot adds two entries in the kexec-provided e820 map as the 16-byte range splits a larger range of usable memory. Eventually all of the 128 available entries get used up. The next split will result in losing usable memory as the new entries cannot be added to the e820 map. Fixes: 68b8e9713c8e ("x86/setup: Use rng seeds from setup_data") Signed-off-by: Jiri Bohac <jbohac@suse.cz> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/ZbmOjKnARGiaYBd5@dwarf.suse.cz
2024-03-01arm64: dts: qcom: sc8280xp-x13s: limit pcie4 link speedJohan Hovold
Limit the WiFi PCIe link speed to Gen2 speed (500 MB/s), which is the speed that the boot firmware has brought up the link at (and that Windows uses). This is specifically needed to avoid a large amount of link errors when restarting the link during boot (but which are currently not reported). This also appears to fix intermittent failures to download the ath11k firmware during boot which can be seen when there is a longer delay between restarting the link and loading the WiFi driver (e.g. when using full disk encryption). Fixes: 123b30a75623 ("arm64: dts: qcom: sc8280xp-x13s: enable WiFi controller") Cc: stable@vger.kernel.org # 6.2 Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20240223152124.20042-8-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-03-01arm64: dts: qcom: sc8280xp-crd: limit pcie4 link speedJohan Hovold
Limit the WiFi PCIe link speed to Gen2 speed (500 MB/s), which is the speed that Windows uses. Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20240223152124.20042-7-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-03-01x86/hyperv: Allow 15-bit APIC IDs for VTL platformsSaurabh Sengar
The current method for signaling the compatibility of a Hyper-V host with MSIs featuring 15-bit APIC IDs relies on a synthetic cpuid leaf. However, for higher VTLs, this leaf is not reported, due to the absence of an IO-APIC. As an alternative, assume that when running at a high VTL, the host supports 15-bit APIC IDs. This assumption is safe, as Hyper-V does not employ any architectural MSIs at higher VTLs This unblocks startup of VTL2 environments with more than 256 CPUs. Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/1705341460-18394-1-git-send-email-ssengar@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1705341460-18394-1-git-send-email-ssengar@linux.microsoft.com>
2024-03-01x86/hyperv: Make encrypted/decrypted changes safe for load_unaligned_zeropad()Michael Kelley
In a CoCo VM, when transitioning memory from encrypted to decrypted, or vice versa, the caller of set_memory_encrypted() or set_memory_decrypted() is responsible for ensuring the memory isn't in use and isn't referenced while the transition is in progress. The transition has multiple steps, and the memory is in an inconsistent state until all steps are complete. A reference while the state is inconsistent could result in an exception that can't be cleanly fixed up. However, the kernel load_unaligned_zeropad() mechanism could cause a stray reference that can't be prevented by the caller of set_memory_encrypted() or set_memory_decrypted(), so there's specific code to handle this case. But a CoCo VM running on Hyper-V may be configured to run with a paravisor, with the #VC or #VE exception routed to the paravisor. There's no architectural way to forward the exceptions back to the guest kernel, and in such a case, the load_unaligned_zeropad() specific code doesn't work. To avoid this problem, mark pages as "not present" while a transition is in progress. If load_unaligned_zeropad() causes a stray reference, a normal page fault is generated instead of #VC or #VE, and the page-fault-based fixup handlers for load_unaligned_zeropad() resolve the reference. When the encrypted/decrypted transition is complete, mark the pages as "present" again. Signed-off-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://lore.kernel.org/r/20240116022008.1023398-4-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20240116022008.1023398-4-mhklinux@outlook.com>
2024-03-01x86/mm: Regularize set_memory_p() parameters and make non-staticMichael Kelley
set_memory_p() is currently static. It has parameters that don't match set_memory_p() under arch/powerpc and that aren't congruent with the other set_memory_* functions. There's no good reason for the difference. Fix this by making the parameters consistent, and update the one existing call site. Make the function non-static and add it to include/asm/set_memory.h so that it is completely parallel to set_memory_np() and is usable in other modules. No functional change. Signed-off-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Link: https://lore.kernel.org/r/20240116022008.1023398-3-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20240116022008.1023398-3-mhklinux@outlook.com>
2024-03-01x86/hyperv: Use slow_virt_to_phys() in page transition hypervisor callbackMichael Kelley
In preparation for temporarily marking pages not present during a transition between encrypted and decrypted, use slow_virt_to_phys() in the hypervisor callback. As long as the PFN is correct, slow_virt_to_phys() works even if the leaf PTE is not present. The existing functions that depend on vmalloc_to_page() all require that the leaf PTE be marked present, so they don't work. Update the comments for slow_virt_to_phys() to note this broader usage and the requirement to work even if the PTE is not marked present. Signed-off-by: Michael Kelley <mhklinux@outlook.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Link: https://lore.kernel.org/r/20240116022008.1023398-2-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20240116022008.1023398-2-mhklinux@outlook.com>
2024-02-29riscv: Sparse-Memory/vmemmap out-of-bounds fixDimitris Vlachos
Offset vmemmap so that the first page of vmemmap will be mapped to the first page of physical memory in order to ensure that vmemmap’s bounds will be respected during pfn_to_page()/page_to_pfn() operations. The conversion macros will produce correct SV39/48/57 addresses for every possible/valid DRAM_BASE inside the physical memory limits. v2:Address Alex's comments Suggested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Dimitris Vlachos <dvlachos@ics.forth.gr> Reported-by: Dimitris Vlachos <dvlachos@ics.forth.gr> Closes: https://lore.kernel.org/linux-riscv/20240202135030.42265-1-csd4492@csd.uoc.gr Fixes: d95f1a542c3d ("RISC-V: Implement sparsemem") Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240229191723.32779-1-dvlachos@ics.forth.gr Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29Merge patch series "NAPOT Fixes"Palmer Dabbelt
Alexandre Ghiti <alexghiti@rivosinc.com> says: This contains 2 fixes for NAPOT: patch 1 disables the use of NAPOT mapping for vmalloc/vmap and patch 2 implements pte_leaf_size() to report NAPOT size. * b4-shazam-merge: riscv: Fix pte_leaf_size() for NAPOT Revert "riscv: mm: support Svnapot in huge vmap" Link: https://lore.kernel.org/r/20240227205016.121901-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29riscv: Fix pte_leaf_size() for NAPOTAlexandre Ghiti
pte_leaf_size() must be reimplemented to add support for NAPOT mappings. Fixes: 82a1a1f3bfb6 ("riscv: mm: support Svnapot in hugetlb page") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240227205016.121901-3-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29Revert "riscv: mm: support Svnapot in huge vmap"Alexandre Ghiti
This reverts commit ce173474cf19fe7fbe8f0fc74e3c81ec9c3d9807. We cannot correctly deal with NAPOT mappings in vmalloc/vmap because if some part of a NAPOT mapping is unmapped, the remaining mapping is not updated accordingly. For example: ptr = vmalloc_huge(64 * 1024, GFP_KERNEL); vunmap_range((unsigned long)(ptr + PAGE_SIZE), (unsigned long)(ptr + 64 * 1024)); leads to the following kernel page table dump: 0xffff8f8000ef0000-0xffff8f8000ef1000 0x00000001033c0000 4K PTE N .. .. D A G . . W R V Meaning the first entry which was not unmapped still has the N bit set, which, if accessed first and cached in the TLB, could allow access to the unmapped range. That's because the logic to break the NAPOT mapping does not exist and likely won't. Indeed, to break a NAPOT mapping, we first have to clear the whole mapping, flush the TLB and then set the new mapping ("break- before-make" equivalent). That works fine in userspace since we can handle any pagefault occurring on the remaining mapping but we can't handle a kernel pagefault on such mapping. So fix this by reverting the commit that introduced the vmap/vmalloc support. Fixes: ce173474cf19 ("riscv: mm: support Svnapot in huge vmap") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240227205016.121901-2-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29Merge patch series "riscv: cbo.zero fixes"Palmer Dabbelt
Samuel Holland <samuel.holland@sifive.com> says: This series fixes a couple of issues related to using the cbo.zero instruction in userspace. The first patch fixes a bug where the wrong enable bit gets set if the kernel is running in M-mode. The remaining patches fix a bug where the enable bit gets reset to its default value after a nonretentive idle state. I have hardware which reproduces this: Before this series: $ tools/testing/selftests/riscv/hwprobe/cbo TAP version 13 1..3 ok 1 Zicboz block size # Zicboz block size: 64 Illegal instruction After applying this series: $ tools/testing/selftests/riscv/hwprobe/cbo TAP version 13 1..3 ok 1 Zicboz block size # Zicboz block size: 64 ok 2 cbo.zero ok 3 cbo.zero check # Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0 * b4-shazam-merge: riscv: Save/restore envcfg CSR during CPU suspend riscv: Add a custom ISA extension for the [ms]envcfg CSR riscv: Fix enabling cbo.zero when running in M-mode Link: https://lore.kernel.org/r/20240228065559.3434837-1-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29riscv: Save/restore envcfg CSR during CPU suspendSamuel Holland
The value of the [ms]envcfg CSR is lost when entering a nonretentive idle state, so the CSR must be rewritten when resuming the CPU. Cc: <stable@vger.kernel.org> # v6.7+ Fixes: 43c16d51a19b ("RISC-V: Enable cbo.zero in usermode") Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20240228065559.3434837-4-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29riscv: Add a custom ISA extension for the [ms]envcfg CSRSamuel Holland
The [ms]envcfg CSR was added in version 1.12 of the RISC-V privileged ISA (aka S[ms]1p12). However, bits in this CSR are defined by several other extensions which may be implemented separately from any particular version of the privileged ISA (for example, some unrelated errata may prevent an implementation from claiming conformance with Ss1p12). As a result, Linux cannot simply use the privileged ISA version to determine if the CSR is present. It must also check if any of these other extensions are implemented. It also cannot probe the existence of the CSR at runtime, because Linux does not require Sstrict, so (in the absence of additional information) it cannot know if a CSR at that address is [ms]envcfg or part of some non-conforming vendor extension. Since there are several standard extensions that imply the existence of the [ms]envcfg CSR, it becomes unwieldy to check for all of them wherever the CSR is accessed. Instead, define a custom Xlinuxenvcfg ISA extension bit that is implied by the other extensions and denotes that the CSR exists as defined in the privileged ISA, containing at least one of the fields common between menvcfg and senvcfg. This extension does not need to be parsed from the devicetree or ISA string because it can only be implemented as a subset of some other standard extension. Cc: <stable@vger.kernel.org> # v6.7+ Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20240228065559.3434837-3-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29riscv: Fix enabling cbo.zero when running in M-modeSamuel Holland
When the kernel is running in M-mode, the CBZE bit must be set in the menvcfg CSR, not in senvcfg. Cc: <stable@vger.kernel.org> Fixes: 43c16d51a19b ("RISC-V: Enable cbo.zero in usermode") Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20240228065559.3434837-2-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-28Merge tag 'v6.8-p5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "This fixes a regression in lskcipher and an out-of-bound access in arm64/neonbs" * tag 'v6.8-p5' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: arm64/neonbs - fix out-of-bounds access on short input crypto: lskcipher - Copy IV in lskcipher glue code always
2024-02-26x86/cpu/intel: Detect TME keyid bits before setting MTRR mask registersPaolo Bonzini
MKTME repurposes the high bit of physical address to key id for encryption key and, even though MAXPHYADDR in CPUID[0x80000008] remains the same, the valid bits in the MTRR mask register are based on the reduced number of physical address bits. detect_tme() in arch/x86/kernel/cpu/intel.c detects TME and subtracts it from the total usable physical bits, but it is called too late. Move the call to early_init_intel() so that it is called in setup_arch(), before MTRRs are setup. This fixes boot on TDX-enabled systems, which until now only worked with "disable_mtrr_cleanup". Without the patch, the values written to the MTRRs mask registers were 52-bit wide (e.g. 0x000fffff_80000800) and the writes failed; with the patch, the values are 46-bit wide, which matches the reduced MAXPHYADDR that is shown in /proc/cpuinfo. Reported-by: Zixi Chen <zixchen@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc:stable@vger.kernel.org Link: https://lore.kernel.org/all/20240131230902.1867092-3-pbonzini%40redhat.com