Age | Commit message (Collapse) | Author |
|
|
|
Pull kvm fixes from Paolo Bonzini:
"KVM GUEST_MEMFD fixes for 6.8:
- Make KVM_MEM_GUEST_MEMFD mutually exclusive with KVM_MEM_READONLY
to avoid creating an inconsistent ABI (KVM_MEM_GUEST_MEMFD is not
writable from userspace, so there would be no way to write to a
read-only guest_memfd).
- Update documentation for KVM_SW_PROTECTED_VM to make it abundantly
clear that such VMs are purely for development and testing.
- Limit KVM_SW_PROTECTED_VM guests to the TDP MMU, as the long term
plan is to support confidential VMs with deterministic private
memory (SNP and TDX) only in the TDP MMU.
- Fix a bug in a GUEST_MEMFD dirty logging test that caused false
passes.
x86 fixes:
- Fix missing marking of a guest page as dirty when emulating an
atomic access.
- Check for mmu_notifier invalidation events before faulting in the
pfn, and before acquiring mmu_lock, to avoid unnecessary work and
lock contention with preemptible kernels (including
CONFIG_PREEMPT_DYNAMIC in non-preemptible mode).
- Disable AMD DebugSwap by default, it breaks VMSA signing and will
be re-enabled with a better VM creation API in 6.10.
- Do the cache flush of converted pages in svm_register_enc_region()
before dropping kvm->lock, to avoid a race with unregistering of
the same region and the consequent use-after-free issue"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
SEV: disable SEV-ES DebugSwap by default
KVM: x86/mmu: Retry fault before acquiring mmu_lock if mapping is changing
KVM: SVM: Flush pages under kvm->lock to fix UAF in svm_register_enc_region()
KVM: selftests: Add a testcase to verify GUEST_MEMFD and READONLY are exclusive
KVM: selftests: Create GUEST_MEMFD for relevant invalid flags testcases
KVM: x86/mmu: Restrict KVM_SW_PROTECTED_VM to the TDP MMU
KVM: x86: Update KVM_SW_PROTECTED_VM docs to make it clear they're a WIP
KVM: Make KVM_MEM_GUEST_MEMFD mutually exclusive with KVM_MEM_READONLY
KVM: x86: Mark target gfn of emulated atomic instruction as dirty
|
|
The DebugSwap feature of SEV-ES provides a way for confidential guests to use
data breakpoints. However, because the status of the DebugSwap feature is
recorded in the VMSA, enabling it by default invalidates the attestation
signatures. In 6.10 we will introduce a new API to create SEV VMs that
will allow enabling DebugSwap based on what the user tells KVM to do.
Contextually, we will change the legacy KVM_SEV_ES_INIT API to never
enable DebugSwap.
For compatibility with kernels that pre-date the introduction of DebugSwap,
as well as with those where KVM_SEV_ES_INIT will never enable it, do not enable
the feature by default. If anybody wants to use it, for now they can enable
the sev_es_debug_swap_enabled module parameter, but this will result in a
warning.
Fixes: d1f85fbe836e ("KVM: SEV: Enable data breakpoints in SEV-ES")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
https://github.com/kvm-x86/linux into HEAD
KVM GUEST_MEMFD fixes for 6.8:
- Make KVM_MEM_GUEST_MEMFD mutually exclusive with KVM_MEM_READONLY to
avoid creating ABI that KVM can't sanely support.
- Update documentation for KVM_SW_PROTECTED_VM to make it abundantly
clear that such VMs are purely a development and testing vehicle, and
come with zero guarantees.
- Limit KVM_SW_PROTECTED_VM guests to the TDP MMU, as the long term plan
is to support confidential VMs with deterministic private memory (SNP
and TDX) only in the TDP MMU.
- Fix a bug in a GUEST_MEMFD negative test that resulted in false passes
when verifying that KVM_MEM_GUEST_MEMFD memslots can't be dirty logged.
|
|
KVM x86 fixes for 6.8, round 2:
- When emulating an atomic access, mark the gfn as dirty in the memslot
to fix a bug where KVM could fail to mark the slot as dirty during live
migration, ultimately resulting in guest data corruption due to a dirty
page not being re-copied from the source to the target.
- Check for mmu_notifier invalidation events before faulting in the pfn,
and before acquiring mmu_lock, to avoid unnecessary work and lock
contention. Contending mmu_lock is especially problematic on preemptible
kernels, as KVM may yield mmu_lock in response to the contention, which
severely degrades overall performance due to vCPUs making it difficult
for the task that triggered invalidation to make forward progress.
Note, due to another kernel bug, this fix isn't limited to preemtible
kernels, as any kernel built with CONFIG_PREEMPT_DYNAMIC=y will yield
contended rwlocks and spinlocks.
https://lore.kernel.org/all/20240110214723.695930-1-seanjc@google.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Will Deacon:
"A lonely arm64 fix addressing a kprobes regression that we introduced
during the merge window:
- Fix recursive kprobes regression when probing the stack unwinder"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: prohibit probing on arch_kunwind_consume_entry()
|
|
Even if pXd_leaf() API is defined globally, it's not clear on the retval,
and there are three types used (bool, int, unsigned log).
Always return a boolean for pXd_leaf() APIs.
Link: https://lkml.kernel.org/r/20240305043750.93762-11-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
They're not used anymore, drop all of them.
Link: https://lkml.kernel.org/r/20240305043750.93762-10-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
pud_large() is always defined as pud_leaf(). Merge their usages. Chose
pud_leaf() because pud_leaf() is a global API, while pud_large() is not.
Link: https://lkml.kernel.org/r/20240305043750.93762-9-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
pmd_large() is always defined as pmd_leaf(). Merge their usages. Chose
pmd_leaf() because pmd_leaf() is a global API, while pmd_large() is not.
Link: https://lkml.kernel.org/r/20240305043750.93762-8-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
pud_leaf() has a fallback macro defined in include/linux/pgtable.h
already. Drop the extra two for x86.
Link: https://lkml.kernel.org/r/20240305043750.93762-6-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
pgd_leaf() is a global API while pgd_large() is not. Always use the
global pgd_leaf(), then drop pgd_large().
Link: https://lkml.kernel.org/r/20240305043750.93762-5-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
p4d_large() is always defined as p4d_leaf(). Merge their usages. Chose
p4d_leaf() because p4d_leaf() is a global API, while p4d_large() is not.
Only x86 has p4d_leaf() defined as of now. So it also means after this
patch we removed all p4d_large() usages.
Link: https://lkml.kernel.org/r/20240305043750.93762-4-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
They're the same macros underneath. Drop pXd_is_leaf(), instead always use
pXd_leaf().
At the meantime, instead of renames, drop the pXd_is_leaf() fallback
definitions directly in arch/powerpc/include/asm/pgtable.h. because
similar fallback macros for pXd_leaf() are already defined in
include/linux/pgtable.h.
Link: https://lkml.kernel.org/r/20240305043750.93762-3-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "mm/treewide: Replace pXd_large() with pXd_leaf()", v3.
These two APIs are mostly always the same. It's confusing to have both of
them. Merge them into one. Here I used pXd_leaf() only because
pXd_leaf() is a global API which is always defined, while pXd_large() is
not.
We have yet one more API that is similar which is pXd_huge(), but that's
even trickier, so let's do it step by step.
Some special cares are taken for ppc and x86, they're done as separate
cleanups first.
This patch (of 10):
The two definitions are the same. The only difference is that pXd_large()
is only defined with THP selected, and only on book3s 64bits.
Instead of implementing it twice, make pXd_large() a macro to pXd_leaf().
Define it unconditionally just like pXd_leaf(). This helps to prepare
merging the two APIs.
Link: https://lkml.kernel.org/r/20240305043750.93762-1-peterx@redhat.com
Link: https://lkml.kernel.org/r/20240305043750.93762-2-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
After commit 6326c26c1514 ("s390: convert various pgalloc functions to use
ptdescs"), there are still some positions that use page->{lru, index}
instead of ptdesc->{pt_list, pt_index}. In order to make the use of
ptdesc->{pt_list, pt_index} clearer, it would be better to convert them as
well.
[zhengqi.arch@bytedance.com: fix build failure]
Link: https://lkml.kernel.org/r/20240305072154.26168-1-zhengqi.arch@bytedance.com
Link: https://lkml.kernel.org/r/04beaf3255056ffe131a5ea595736066c1e84756.1709541697.git.zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Optimizing the initialization speed of 1G huge pages through
parallelization.
1G hugetlbs are allocated from bootmem, a process that is already very
fast and does not currently require optimization. Therefore, we focus on
parallelizing only the initialization phase in `gather_bootmem_prealloc`.
Here are some test results:
test case no patch(ms) patched(ms) saved
------------------- -------------- ------------- --------
256c2T(4 node) 1G 4745 2024 57.34%
128c1T(2 node) 1G 3358 1712 49.02%
12T 1G 77000 18300 76.23%
[akpm@linux-foundation.org: s/initialied/initialized/, per Alexey]
Link: https://lkml.kernel.org/r/20240222140422.393911-9-gang.li@linux.dev
Signed-off-by: Gang Li <ligang.bdlg@bytedance.com>
Tested-by: David Rientjes <rientjes@google.com>
Reviewed-by: Muchun Song <muchun.song@linux.dev>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jane Chu <jane.chu@oracle.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"These should be the final fixes for the soc tree for 6.8, as usual
they mostly deal wtih dts files:
- Qualcomm fixes for pcie4 on sc8280xp, a revert of msm8996 mpm
support, sm6115 interconnect and sm8650 gpio.
- Two fixes for Tegra234 ethernet
- A Makefile fix to actually build the allwinner based orange pi zero
2w device tree
- Fixes for clocks and reset on imx8mp and a DSI display regression
on imx7.
The non-DT fixes are:
- Firmware fixes addressing a kernel panic in op-tee and a minor
regression in microchip/riscv.
- A defconfig change to bring back backlight support after a Kconfig
change"
* tag 'arm-fixes-6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
firmware: microchip: Fix over-requested allocation size
tee: optee: Fix kernel panic caused by incorrect error handling
Revert "arm64: dts: qcom: msm8996: Hook up MPM"
arm64: dts: qcom: sc8280xp-x13s: limit pcie4 link speed
arm64: dts: qcom: sc8280xp-crd: limit pcie4 link speed
arm64: dts: imx8mp: Fix LDB clocks property
arm64: dts: imx8mp: Fix TC9595 reset GPIO on DH i.MX8M Plus DHCOM SoM
MAINTAINERS: Use a proper mailinglist for NXP i.MX development
ARM: dts: imx7: remove DSI port endpoints
arm64: dts: allwinner: h616: Add Orange Pi Zero 2W to Makefile
ARM: imx_v6_v7_defconfig: Restore CONFIG_BACKLIGHT_CLASS_DEVICE
arm64: tegra: Fix Tegra234 MGBE power-domains
arm64: tegra: Set the correct PHY mode for MGBE
arm64: dts: qcom: sm6115: Fix missing interconnect-names
arm64: dts: qcom: sm8650-mtp: add gpio74 as reserved gpio
arm64: dts: qcom: sm8650-qrd: add gpio74 as reserved gpio
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes
A few more Qualcomm Arm64 DeviceTree fixes for v6.8
This reduces the link speed of the PCIe bus with WiFi-card connected on the
Lenovo ThinkPad X13s and the Qualcomm Compute Reference Device, avoid
link errors and initialization issues reported by users.
It also reverts the enablement of MPM on MSM8996, which is reported to
prevent boards on this platform from booting for some users.
* tag 'qcom-arm64-fixes-for-6.8-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
Revert "arm64: dts: qcom: msm8996: Hook up MPM"
arm64: dts: qcom: sc8280xp-x13s: limit pcie4 link speed
arm64: dts: qcom: sc8280xp-crd: limit pcie4 link speed
Link: https://lore.kernel.org/r/20240306031208.4218-1-andersson@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv fixes from Wei Liu:
- Multiple fixes, cleanups and documentations for Hyper-V core code and
drivers
* tag 'hyperv-fixes-signed-20240303' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
Drivers: hv: vmbus: make hv_bus const
x86/hyperv: Allow 15-bit APIC IDs for VTL platforms
x86/hyperv: Make encrypted/decrypted changes safe for load_unaligned_zeropad()
x86/mm: Regularize set_memory_p() parameters and make non-static
x86/hyperv: Use slow_virt_to_phys() in page transition hypervisor callback
Documentation: hyperv: Add overview of PCI pass-thru device support
Drivers: hv: vmbus: Update indentation in create_gpadl_header()
Drivers: hv: vmbus: Remove duplication and cleanup code in create_gpadl_header()
fbdev/hyperv_fb: Fix logic error for Gen2 VMs in hvfb_getmem()
Drivers: hv: vmbus: Calculate ring buffer size for more efficient use of memory
hv_utils: Allow implicit ICTIMESYNCFLAG_SYNC
|
|
Make clear the atmicity/consistency requirements of the API and how we
achieve them.
Link: https://lore.kernel.org/linux-mm/Zc-Tqqfksho3BHmU@arm.com/
Link: https://lkml.kernel.org/r/20240226120321.1055731-3-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "Address some contpte nits".
These 2 patches address some nits raised by Catalin late in the review cycle for
my contpte series [1].
[1] https://lore.kernel.org/linux-mm/20240215103205.2607016-1-ryan.roberts@arm.com/
This patch (of 2):
The contpte symbols must be exported since some of the public inline
ptep_* APIs are called from modules and these inlines now call the contpte
functions. Originally they were exported as EXPORT_SYMBOL() for fear of
breaking out-of-tree modules. But we subsequently concluded that
EXPORT_SYMBOL_GPL() should be safe since these functions are deeply core
mm routines, and any module operating at this level is not going to be
able to survive on EXPORT_SYMBOL alone.
Link: https://lkml.kernel.org/r/20240226120321.1055731-1-ryan.roberts@arm.com
Link: https://lore.kernel.org/linux-mm/f9fc2b31-11cb-4969-8961-9c89fea41b74@nvidia.com/
Link: https://lkml.kernel.org/r/20240226120321.1055731-2-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The first argument of switch_mm_irqs_off() is unused by the x86
implementation. Make sure that x86 code never passes a non-NULL value to
make this clear. Update the only non violating caller, switch_mm().
Link: https://lkml.kernel.org/r/20240222190911.1903054-2-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Suggested-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Commit accf6b23d1e5a ("x86/mm: clarify "prev" usage in
switch_mm_irqs_off()") attempted to clarify x86's usage of the arguments
passed by generic code, specifically the "prev" argument the is unused by
x86. However, it could have done a better job with the comment above
switch_mm_irqs_off(). Rewrite this comment according to Dave Hansen's
suggestion.
Link: https://lkml.kernel.org/r/20240222190911.1903054-1-yosryahmed@google.com
Fixes: 3cfd6625a6cf ("x86/mm: clarify "prev" usage in switch_mm_irqs_off()")
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Suggested-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into arm/fixes
arm64: tegra: Device tree fixes for v6.8
This contains two fixes to make the MGBE Ethernet devices found on
Tegra234 work properly.
* tag 'tegra-for-6.8-arm64-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
arm64: tegra: Fix Tegra234 MGBE power-domains
arm64: tegra: Set the correct PHY mode for MGBE
Link: https://lore.kernel.org/r/20240226144536.1525704-1-thierry.reding@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes
i.MX fixes for 6.8, round 2:
- Update MAINTAINERS to use a public mailing list for NXP i.MX
development.
- Re-enable CONFIG_BACKLIGHT_CLASS_DEVICE in imx_v6_v7_defconfig to fix
a backlight regression.
- Remove DSI port endpoints from i.MX7 SoC DTSI to fix a display
regression.
- Fix LDB clocks property for i.MX8MP device tree.
- Fix TC9595 reset GPIO on DH i.MX8M Plus DHCOM SoM.
* tag 'imx-fixes-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
arm64: dts: imx8mp: Fix LDB clocks property
arm64: dts: imx8mp: Fix TC9595 reset GPIO on DH i.MX8M Plus DHCOM SoM
MAINTAINERS: Use a proper mailinglist for NXP i.MX development
ARM: dts: imx7: remove DSI port endpoints
ARM: imx_v6_v7_defconfig: Restore CONFIG_BACKLIGHT_CLASS_DEVICE
Link: https://lore.kernel.org/r/ZdtPJzdenRybI+Bq@dragon
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes
Qualcomm ARM64 DeviceTree fixes for 6.8
This marks an additional GPIO as protected on SM8650 devices, to avoid
a system reset caused by a security violation with some firmware
versions.
It also adds the missing interconnect-names, which resolves a regression
where one of the I2C busses on SM6115 devices would no longer probe in
Linux.
* tag 'qcom-arm64-fixes-for-6.8' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
arm64: dts: qcom: sm6115: Fix missing interconnect-names
arm64: dts: qcom: sm8650-mtp: add gpio74 as reserved gpio
arm64: dts: qcom: sm8650-qrd: add gpio74 as reserved gpio
Link: https://lore.kernel.org/r/20240225025205.479589-1-andersson@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into arm/fixes
- include Orange Pi Zero 2W DT in Makefile
* tag 'sunxi-fixes-for-6.8-1' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
arm64: dts: allwinner: h616: Add Orange Pi Zero 2W to Makefile
Link: https://lore.kernel.org/r/20240223205450.GA8881@jernej-laptop
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
Make arch_kunwind_consume_entry() as __always_inline otherwise the
compiler might not inline it and allow attaching probes to it.
Without this, just probing arch_kunwind_consume_entry() via
<tracefs>/kprobe_events will crash the kernel on arm64.
The crash can be reproduced using the following compiler and kernel
combination:
clang version 19.0.0git (https://github.com/llvm/llvm-project.git d68d29516102252f6bf6dc23fb22cef144ca1cb3)
commit 87adedeba51a ("Merge tag 'net-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net")
[root@localhost ~]# echo 'p arch_kunwind_consume_entry' > /sys/kernel/debug/tracing/kprobe_events
[root@localhost ~]# echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable
Modules linked in: aes_ce_blk aes_ce_cipher ghash_ce sha2_ce virtio_net sha256_arm64 sha1_ce arm_smccc_trng net_failover failover virtio_mmio uio_pdrv_genirq uio sch_fq_codel dm_mod dax configfs
CPU: 3 PID: 1405 Comm: bash Not tainted 6.8.0-rc6+ #14
Hardware name: linux,dummy-virt (DT)
pstate: 604003c5 (nZCv DAIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : kprobe_breakpoint_handler+0x17c/0x258
lr : kprobe_breakpoint_handler+0x17c/0x258
sp : ffff800085d6ab60
x29: ffff800085d6ab60 x28: ffff0000066f0040 x27: ffff0000066f0b20
x26: ffff800081fa7b0c x25: 0000000000000002 x24: ffff00000b29bd18
x23: ffff00007904c590 x22: ffff800081fa6590 x21: ffff800081fa6588
x20: ffff00000b29bd18 x19: ffff800085d6ac40 x18: 0000000000000079
x17: 0000000000000001 x16: ffffffffffffffff x15: 0000000000000004
x14: ffff80008277a940 x13: 0000000000000003 x12: 0000000000000003
x11: 00000000fffeffff x10: c0000000fffeffff x9 : aa95616fdf80cc00
x8 : aa95616fdf80cc00 x7 : 205d343137373231 x6 : ffff800080fb48ec
x5 : 0000000000000000 x4 : 0000000000000001 x3 : 0000000000000000
x2 : 0000000000000000 x1 : ffff800085d6a910 x0 : 0000000000000079
Call trace:
kprobes: Failed to recover from reentered kprobes.
kprobes: Dump kprobe:
.symbol_name = arch_kunwind_consume_entry, .offset = 0, .addr = arch_kunwind_consume_entry+0x0/0x40
------------[ cut here ]------------
kernel BUG at arch/arm64/kernel/probes/kprobes.c:241!
kprobes: Failed to recover from reentered kprobes.
kprobes: Dump kprobe:
.symbol_name = arch_kunwind_consume_entry, .offset = 0, .addr = arch_kunwind_consume_entry+0x0/0x40
Fixes: 1aba06e7b2b4 ("arm64: stacktrace: factor out kunwind_stack_walk()")
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20240229231620.24846-1-puranjay12@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Commit 09896da07315 ("arm64: dts: qcom: msm8996: Hook up MPM") has
hooked up the MPM irq chip on the MSM8996 platform. However this causes
my Dragonboard 820c crash during bootup (usually when probing IOMMUs).
Revert the offending commit for now. Quick debug shows that making
tlmm's wakeup-parent point to the MPM is enough to trigger the crash.
Fixes: 09896da07315 ("arm64: dts: qcom: msm8996: Hook up MPM")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20240221-msm8996-revert-mpm-v1-1-cdca9e30c9b4@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix IOMMU table initialisation when doing kdump over SR-IOV
- Fix incorrect RTAS function name for resetting TCE tables
- Fix fpu_signal selftest failures since a recent change
Thanks to Gaurav Batra and Nathan Lynch.
* tag 'powerpc-6.8-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
selftests/powerpc: Fix fpu_signal failures
powerpc/rtas: use correct function name for resetting TCE tables
powerpc/pseries/iommu: IOMMU table is not initialized for kdump over SR-IOV
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Do not reserve SETUP_RNG_SEED setup data in the e820 map as it should
be used by kexec only
- Make sure MKTME feature detection happens at an earlier time in the
boot process so that the physical address size supported by the CPU
is properly corrected and MTRR masks are programmed properly, leading
to TDX systems booting without disable_mtrr_cleanup on the cmdline
- Make sure the different address sizes supported by the CPU are read
out as early as possible
* tag 'x86_urgent_for_v6.8_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/e820: Don't reserve SETUP_RNG_SEED in e820
x86/cpu/intel: Detect TME keyid bits before setting MTRR mask registers
x86/cpu: Allow reducing x86_phys_bits during early_identify_cpu()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- detect ".option arch" support on not-yet-released LLVM builds
- fix missing TLB flush when modifying non-leaf PTEs
- fixes for T-Head custom extensions
- fix for systems with the legacy PMU, that manifests as a crash on
kernels built without SBI PMU support
- fix for systems that clear *envcfg on suspend, which manifests as
cbo.zero trapping after resume
- fixes for Svnapot systems, including removing Svnapot support for
huge vmalloc/vmap regions
* tag 'riscv-for-linus-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Sparse-Memory/vmemmap out-of-bounds fix
riscv: Fix pte_leaf_size() for NAPOT
Revert "riscv: mm: support Svnapot in huge vmap"
riscv: Save/restore envcfg CSR during CPU suspend
riscv: Add a custom ISA extension for the [ms]envcfg CSR
riscv: Fix enabling cbo.zero when running in M-mode
perf: RISCV: Fix panic on pmu overflow handler
MAINTAINERS: Update SiFive driver maintainers
drivers: perf: ctr_get_width function for legacy is not defined
drivers: perf: added capabilities for legacy PMU
RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs
riscv: Fix build error if !CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
riscv: mm: fix NOCACHE_THEAD does not set bit[61] correctly
riscv: add CALLER_ADDRx support
RISC-V: Drop invalid test from CONFIG_AS_HAS_OPTION_ARCH
kbuild: Add -Wa,--fatal-warnings to as-instr invocation
riscv: tlb: fix __p*d_free_tlb()
|
|
SETUP_RNG_SEED in setup_data is supplied by kexec and should
not be reserved in the e820 map.
Doing so reserves 16 bytes of RAM when booting with kexec.
(16 bytes because data->len is zeroed by parse_setup_data so only
sizeof(setup_data) is reserved.)
When kexec is used repeatedly, each boot adds two entries in the
kexec-provided e820 map as the 16-byte range splits a larger
range of usable memory. Eventually all of the 128 available entries
get used up. The next split will result in losing usable memory
as the new entries cannot be added to the e820 map.
Fixes: 68b8e9713c8e ("x86/setup: Use rng seeds from setup_data")
Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/ZbmOjKnARGiaYBd5@dwarf.suse.cz
|
|
Limit the WiFi PCIe link speed to Gen2 speed (500 MB/s), which is the
speed that the boot firmware has brought up the link at (and that
Windows uses).
This is specifically needed to avoid a large amount of link errors when
restarting the link during boot (but which are currently not reported).
This also appears to fix intermittent failures to download the ath11k
firmware during boot which can be seen when there is a longer delay
between restarting the link and loading the WiFi driver (e.g. when using
full disk encryption).
Fixes: 123b30a75623 ("arm64: dts: qcom: sc8280xp-x13s: enable WiFi controller")
Cc: stable@vger.kernel.org # 6.2
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20240223152124.20042-8-johan+linaro@kernel.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
Limit the WiFi PCIe link speed to Gen2 speed (500 MB/s), which is the
speed that Windows uses.
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20240223152124.20042-7-johan+linaro@kernel.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
The current method for signaling the compatibility of a Hyper-V host
with MSIs featuring 15-bit APIC IDs relies on a synthetic cpuid leaf.
However, for higher VTLs, this leaf is not reported, due to the absence
of an IO-APIC.
As an alternative, assume that when running at a high VTL, the host
supports 15-bit APIC IDs. This assumption is safe, as Hyper-V does not
employ any architectural MSIs at higher VTLs
This unblocks startup of VTL2 environments with more than 256 CPUs.
Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Link: https://lore.kernel.org/r/1705341460-18394-1-git-send-email-ssengar@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <1705341460-18394-1-git-send-email-ssengar@linux.microsoft.com>
|
|
In a CoCo VM, when transitioning memory from encrypted to decrypted, or
vice versa, the caller of set_memory_encrypted() or set_memory_decrypted()
is responsible for ensuring the memory isn't in use and isn't referenced
while the transition is in progress. The transition has multiple steps,
and the memory is in an inconsistent state until all steps are complete.
A reference while the state is inconsistent could result in an exception
that can't be cleanly fixed up.
However, the kernel load_unaligned_zeropad() mechanism could cause a stray
reference that can't be prevented by the caller of set_memory_encrypted()
or set_memory_decrypted(), so there's specific code to handle this case.
But a CoCo VM running on Hyper-V may be configured to run with a paravisor,
with the #VC or #VE exception routed to the paravisor. There's no
architectural way to forward the exceptions back to the guest kernel, and
in such a case, the load_unaligned_zeropad() specific code doesn't work.
To avoid this problem, mark pages as "not present" while a transition
is in progress. If load_unaligned_zeropad() causes a stray reference, a
normal page fault is generated instead of #VC or #VE, and the
page-fault-based fixup handlers for load_unaligned_zeropad() resolve the
reference. When the encrypted/decrypted transition is complete, mark the
pages as "present" again.
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://lore.kernel.org/r/20240116022008.1023398-4-mhklinux@outlook.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20240116022008.1023398-4-mhklinux@outlook.com>
|
|
set_memory_p() is currently static. It has parameters that don't
match set_memory_p() under arch/powerpc and that aren't congruent
with the other set_memory_* functions. There's no good reason for
the difference.
Fix this by making the parameters consistent, and update the one
existing call site. Make the function non-static and add it to
include/asm/set_memory.h so that it is completely parallel to
set_memory_np() and is usable in other modules.
No functional change.
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Link: https://lore.kernel.org/r/20240116022008.1023398-3-mhklinux@outlook.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20240116022008.1023398-3-mhklinux@outlook.com>
|
|
In preparation for temporarily marking pages not present during a
transition between encrypted and decrypted, use slow_virt_to_phys()
in the hypervisor callback. As long as the PFN is correct,
slow_virt_to_phys() works even if the leaf PTE is not present.
The existing functions that depend on vmalloc_to_page() all
require that the leaf PTE be marked present, so they don't work.
Update the comments for slow_virt_to_phys() to note this broader usage
and the requirement to work even if the PTE is not marked present.
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Link: https://lore.kernel.org/r/20240116022008.1023398-2-mhklinux@outlook.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20240116022008.1023398-2-mhklinux@outlook.com>
|
|
Offset vmemmap so that the first page of vmemmap will be mapped
to the first page of physical memory in order to ensure that
vmemmap’s bounds will be respected during
pfn_to_page()/page_to_pfn() operations.
The conversion macros will produce correct SV39/48/57 addresses
for every possible/valid DRAM_BASE inside the physical memory limits.
v2:Address Alex's comments
Suggested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Dimitris Vlachos <dvlachos@ics.forth.gr>
Reported-by: Dimitris Vlachos <dvlachos@ics.forth.gr>
Closes: https://lore.kernel.org/linux-riscv/20240202135030.42265-1-csd4492@csd.uoc.gr
Fixes: d95f1a542c3d ("RISC-V: Implement sparsemem")
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240229191723.32779-1-dvlachos@ics.forth.gr
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Alexandre Ghiti <alexghiti@rivosinc.com> says:
This contains 2 fixes for NAPOT: patch 1 disables the use of NAPOT
mapping for vmalloc/vmap and patch 2 implements pte_leaf_size() to
report NAPOT size.
* b4-shazam-merge:
riscv: Fix pte_leaf_size() for NAPOT
Revert "riscv: mm: support Svnapot in huge vmap"
Link: https://lore.kernel.org/r/20240227205016.121901-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
pte_leaf_size() must be reimplemented to add support for NAPOT mappings.
Fixes: 82a1a1f3bfb6 ("riscv: mm: support Svnapot in hugetlb page")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240227205016.121901-3-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
This reverts commit ce173474cf19fe7fbe8f0fc74e3c81ec9c3d9807.
We cannot correctly deal with NAPOT mappings in vmalloc/vmap because if
some part of a NAPOT mapping is unmapped, the remaining mapping is not
updated accordingly. For example:
ptr = vmalloc_huge(64 * 1024, GFP_KERNEL);
vunmap_range((unsigned long)(ptr + PAGE_SIZE),
(unsigned long)(ptr + 64 * 1024));
leads to the following kernel page table dump:
0xffff8f8000ef0000-0xffff8f8000ef1000 0x00000001033c0000 4K PTE N .. .. D A G . . W R V
Meaning the first entry which was not unmapped still has the N bit set,
which, if accessed first and cached in the TLB, could allow access to the
unmapped range.
That's because the logic to break the NAPOT mapping does not exist and
likely won't. Indeed, to break a NAPOT mapping, we first have to clear
the whole mapping, flush the TLB and then set the new mapping ("break-
before-make" equivalent). That works fine in userspace since we can handle
any pagefault occurring on the remaining mapping but we can't handle a kernel
pagefault on such mapping.
So fix this by reverting the commit that introduced the vmap/vmalloc
support.
Fixes: ce173474cf19 ("riscv: mm: support Svnapot in huge vmap")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240227205016.121901-2-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Samuel Holland <samuel.holland@sifive.com> says:
This series fixes a couple of issues related to using the cbo.zero
instruction in userspace. The first patch fixes a bug where the wrong
enable bit gets set if the kernel is running in M-mode. The remaining
patches fix a bug where the enable bit gets reset to its default value
after a nonretentive idle state. I have hardware which reproduces this:
Before this series:
$ tools/testing/selftests/riscv/hwprobe/cbo
TAP version 13
1..3
ok 1 Zicboz block size
# Zicboz block size: 64
Illegal instruction
After applying this series:
$ tools/testing/selftests/riscv/hwprobe/cbo
TAP version 13
1..3
ok 1 Zicboz block size
# Zicboz block size: 64
ok 2 cbo.zero
ok 3 cbo.zero check
# Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0
* b4-shazam-merge:
riscv: Save/restore envcfg CSR during CPU suspend
riscv: Add a custom ISA extension for the [ms]envcfg CSR
riscv: Fix enabling cbo.zero when running in M-mode
Link: https://lore.kernel.org/r/20240228065559.3434837-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
The value of the [ms]envcfg CSR is lost when entering a nonretentive
idle state, so the CSR must be rewritten when resuming the CPU.
Cc: <stable@vger.kernel.org> # v6.7+
Fixes: 43c16d51a19b ("RISC-V: Enable cbo.zero in usermode")
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240228065559.3434837-4-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
The [ms]envcfg CSR was added in version 1.12 of the RISC-V privileged
ISA (aka S[ms]1p12). However, bits in this CSR are defined by several
other extensions which may be implemented separately from any particular
version of the privileged ISA (for example, some unrelated errata may
prevent an implementation from claiming conformance with Ss1p12). As a
result, Linux cannot simply use the privileged ISA version to determine
if the CSR is present. It must also check if any of these other
extensions are implemented. It also cannot probe the existence of the
CSR at runtime, because Linux does not require Sstrict, so (in the
absence of additional information) it cannot know if a CSR at that
address is [ms]envcfg or part of some non-conforming vendor extension.
Since there are several standard extensions that imply the existence of
the [ms]envcfg CSR, it becomes unwieldy to check for all of them
wherever the CSR is accessed. Instead, define a custom Xlinuxenvcfg ISA
extension bit that is implied by the other extensions and denotes that
the CSR exists as defined in the privileged ISA, containing at least one
of the fields common between menvcfg and senvcfg.
This extension does not need to be parsed from the devicetree or ISA
string because it can only be implemented as a subset of some other
standard extension.
Cc: <stable@vger.kernel.org> # v6.7+
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240228065559.3434837-3-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
When the kernel is running in M-mode, the CBZE bit must be set in the
menvcfg CSR, not in senvcfg.
Cc: <stable@vger.kernel.org>
Fixes: 43c16d51a19b ("RISC-V: Enable cbo.zero in usermode")
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240228065559.3434837-2-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"This fixes a regression in lskcipher and an out-of-bound access
in arm64/neonbs"
* tag 'v6.8-p5' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: arm64/neonbs - fix out-of-bounds access on short input
crypto: lskcipher - Copy IV in lskcipher glue code always
|
|
MKTME repurposes the high bit of physical address to key id for encryption
key and, even though MAXPHYADDR in CPUID[0x80000008] remains the same,
the valid bits in the MTRR mask register are based on the reduced number
of physical address bits.
detect_tme() in arch/x86/kernel/cpu/intel.c detects TME and subtracts
it from the total usable physical bits, but it is called too late.
Move the call to early_init_intel() so that it is called in setup_arch(),
before MTRRs are setup.
This fixes boot on TDX-enabled systems, which until now only worked with
"disable_mtrr_cleanup". Without the patch, the values written to the
MTRRs mask registers were 52-bit wide (e.g. 0x000fffff_80000800) and
the writes failed; with the patch, the values are 46-bit wide, which
matches the reduced MAXPHYADDR that is shown in /proc/cpuinfo.
Reported-by: Zixi Chen <zixchen@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240131230902.1867092-3-pbonzini%40redhat.com
|