Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Misc fixes:
- Fix an UV boot crash
- Skip spurious ENDBR generation on _THIS_IP_
- Fix ENDBR use in putuser() asm methods
- Fix corner case boot crashes on 5-level paging
- and fix a false positive WARNING on LTO kernels"
* tag 'x86-urgent-2023-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/purgatory: Remove LTO flags
x86/boot/compressed: Reserve more memory for page tables
x86/ibt: Avoid duplicate ENDBR in __put_user_nocheck*()
x86/ibt: Suppress spurious ENDBR
x86/platform/uv: Use alternate source for socket to node data
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"Fix a performance regression on large SMT systems, an Intel SMT4
balancing bug, and a topology setup bug on (Intel) hybrid processors"
* tag 'sched-urgent-2023-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sched: Restore the SD_ASYM_PACKING flag in the DIE domain
sched/fair: Fix SMT4 group_smt_balance handling
sched/fair: Optimize should_we_balance() for large SMT systems
|
|
-flto* implies -ffunction-sections. With LTO enabled, ld.lld generates
multiple .text sections for purgatory.ro:
$ readelf -S purgatory.ro | grep " .text"
[ 1] .text PROGBITS 0000000000000000 00000040
[ 7] .text.purgatory PROGBITS 0000000000000000 000020e0
[ 9] .text.warn PROGBITS 0000000000000000 000021c0
[13] .text.sha256_upda PROGBITS 0000000000000000 000022f0
[15] .text.sha224_upda PROGBITS 0000000000000000 00002be0
[17] .text.sha256_fina PROGBITS 0000000000000000 00002bf0
[19] .text.sha224_fina PROGBITS 0000000000000000 00002cc0
This causes WARNING from kexec_purgatory_setup_sechdrs():
WARNING: CPU: 26 PID: 110894 at kernel/kexec_file.c:919
kexec_load_purgatory+0x37f/0x390
Fix this by disabling LTO for purgatory.
[ AFAICT, x86 is the only arch that supports LTO and purgatory. ]
We could also fix this with an explicit linker script to rejoin .text.*
sections back into .text. However, given the benefit of LTOing purgatory
is small, simply disable the production of more .text.* sections for now.
Fixes: b33fff07e3e3 ("x86, build: allow LTO to be selected")
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/r/20230914170138.995606-1-song@kernel.org
|
|
The decompressor has a hard limit on the number of page tables it can
allocate. This limit is defined at compile-time and will cause boot
failure if it is reached.
The kernel is very strict and calculates the limit precisely for the
worst-case scenario based on the current configuration. However, it is
easy to forget to adjust the limit when a new use-case arises. The
worst-case scenario is rarely encountered during sanity checks.
In the case of enabling 5-level paging, a use-case was overlooked. The
limit needs to be increased by one to accommodate the additional level.
This oversight went unnoticed until Aaron attempted to run the kernel
via kexec with 5-level paging and unaccepted memory enabled.
Update wost-case calculations to include 5-level paging.
To address this issue, let's allocate some extra space for page tables.
128K should be sufficient for any use-case. The logic can be simplified
by using a single value for all kernel configurations.
[ Also add a warning, should this memory run low - by Dave Hansen. ]
Fixes: 34bbb0009f3b ("x86/boot/compressed: Enable 5-level paging during decompression stage")
Reported-by: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230915070221.10266-1-kirill.shutemov@linux.intel.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fixes from Ard Biesheuvel:
- Missing x86 patch for the runtime cleanup that was merged in -rc1
- Kconfig tweak for kexec on x86 so EFI support does not get disabled
inadvertently
- Use the right EFI memory type for the unaccepted memory table so
kexec/kdump exposes it to the crash kernel as well
- Work around EFI implementations which do not implement
QueryVariableInfo, which is now called by statfs() on efivarfs
* tag 'efi-fixes-for-v6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efivarfs: fix statfs() on efivarfs
efi/unaccepted: Use ACPI reclaim memory for unaccepted memory table
efi/x86: Ensure that EFI_RUNTIME_MAP is enabled for kexec
efi/x86: Move EFI runtime call setup/teardown helpers out of line
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- A fix to align kexec'd kernels to PMD boundries
- The T-Head dcache.cva encoding was incorrect, it has been fixed to
invalidate all caches (as opposed to just the L1)
* tag 'riscv-for-linus-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: errata: fix T-Head dcache.cva encoding
riscv: kexec: Align the kexeced kernel entry
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc architecture fixes from Helge Deller:
- fix reference to exported symbols for parisc64 [Masahiro Yamada]
- Block-TLB (BTLB) support on 32-bit CPUs
- sparse and build-warning fixes
* tag 'parisc-for-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
linux/export: fix reference to exported functions for parisc64
parisc: BTLB: Initialize BTLB tables at CPU startup
parisc: firmware: Simplify calling non-PA20 functions
parisc: BTLB: _edata symbol has to be page aligned for BTLB support
parisc: BTLB: Add BTLB insert and purge firmware function wrappers
parisc: BTLB: Clear possibly existing BTLB entries
parisc: Prepare for Block-TLB support on 32-bit kernel
parisc: shmparam.h: Document aliasing requirements of PA-RISC
parisc: irq: Make irq_stack_union static to avoid sparse warning
parisc: drivers: Fix sparse warning
parisc: iosapic.c: Fix sparse warnings
parisc: ccio-dma: Fix sparse warnings
parisc: sba-iommu: Fix sparse warnigs
parisc: sba: Fix compile warning wrt list of SBA devices
parisc: sba_iommu: Fix build warning if procfs if disabled
|
|
Commit 8f2d6c41e5a6 ("x86/sched: Rewrite topology setup") dropped the
SD_ASYM_PACKING flag in the DIE domain added in commit 044f0e27dec6
("x86/sched: Add the SD_ASYM_PACKING flag to the die domain of hybrid
processors"). Restore it on hybrid processors.
The die-level domain does not depend on any build configuration and now
x86_sched_itmt_flags() is always needed. Remove the build dependency on
CONFIG_SCHED_[SMT|CLUSTER|MC].
Fixes: 8f2d6c41e5a6 ("x86/sched: Rewrite topology setup")
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Chen Yu <yu.c.chen@intel.com>
Tested-by: Caleb Callaway <caleb.callaway@intel.com>
Link: https://lkml.kernel.org/r/20230815035747.11529-1-ricardo.neri-calderon@linux.intel.com
|
|
The dcache.cva encoding shown in the comments are wrong, it's for
dcache.cval1 (which is restricted to L1) instead.
Fix this in the comment and in the hardcoded instruction.
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Tested-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Guo Ren <guoren@kernel.org>
Tested-by: Drew Fustini <dfustini@baylibre.com>
Link: https://lore.kernel.org/r/20230912072410.2481-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
The current riscv boot protocol requires 2MB alignment for RV64
and 4MB alignment for RV32.
In KEXEC_FILE path, the elf_find_pbase() function should align
the kexeced kernel entry according to the requirement, otherwise
the kexeced kernel would silently BUG at the setup_vm().
Fixes: 8acea455fafa ("RISC-V: Support for kexec_file on panic")
Signed-off-by: Song Shuai <songshuaishuai@tinylab.org>
Link: https://lore.kernel.org/r/20230906095817.364390-1-songshuaishuai@tinylab.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Commit cb855971d717 ("x86/putuser: Provide room for padding") changed
__put_user_nocheck_*() into proper functions but failed to note that
SYM_FUNC_START() already provides ENDBR, rendering the explicit ENDBR
superfluous.
Fixes: cb855971d717 ("x86/putuser: Provide room for padding")
Reported-by: David Kaplan <David.Kaplan@amd.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230802110323.086971726@infradead.org
|
|
It was reported that under certain circumstances GCC emits ENDBR
instructions for _THIS_IP_ usage. Specifically, when it appears at the
start of a basic block -- but not elsewhere.
Since _THIS_IP_ is never used for control flow, these ENDBR
instructions are completely superfluous. Override the _THIS_IP_
definition for x86_64 to avoid this.
Less ENDBR instructions is better.
Fixes: 156ff4a544ae ("x86/ibt: Base IBT bits")
Reported-by: David Kaplan <David.Kaplan@amd.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230802110323.016197440@infradead.org
|
|
The UV code attempts to build a set of tables to allow it to do
bidirectional socket<=>node lookups.
But when nr_cpus is set to a smaller number than actually present, the
cpu_to_node() mapping information for unused CPUs is not available to
build_socket_tables(). This results in skipping some nodes or sockets
when creating the tables and leaving some -1's for later code to trip.
over, causing oopses.
The problem is that the socket<=>node lookups are created by doing a
loop over all CPUs, then looking up the CPU's APICID and socket. But
if a CPU is not present, there is no way to start this lookup.
Instead of looping over all CPUs, take CPUs out of the equation
entirely. Loop over all APICIDs which are mapped to a valid NUMA node.
Then just extract the socket-id from the APICID.
This avoid tripping over disabled CPUs.
Fixes: 8a50c5851927 ("x86/platform/uv: UV support for sub-NUMA clustering")
Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20230807141730.1117278-1-steve.wahl%40hpe.com
|
|
CONFIG_EFI_RUNTIME_MAP needs to be enabled in order for kexec to be able
to provide the required information about the EFI runtime mappings to
the incoming kernel, regardless of whether kexec_load() or
kexec_file_load() is being used. Without this information, kexec boot in
EFI mode is not possible.
The CONFIG_EFI_RUNTIME_MAP option is currently directly configurable if
CONFIG_EXPERT is enabled, so that it can be turned on for debugging
purposes even if KEXEC is not enabled. However, the upshot of this is
that it can also be disabled even when it shouldn't.
So tweak the Kconfig declarations to avoid this situation.
Reported-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
Only the arch_efi_call_virt() macro that some architectures override
needs to be a macro, given that it is variadic and encapsulates calls
via function pointers that have different prototypes.
The associated setup and teardown code are not special in this regard,
and don't need to be instantiated at each call site. So turn them into
ordinary C functions and move them out of line.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Fix preemption delays in the SGX code, remove unnecessarily
UAPI-exported code, fix a ld.lld linker (in)compatibility quirk and
make the x86 SMP init code a bit more conservative to fix kexec()
lockups"
* tag 'x86-urgent-2023-09-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sgx: Break up long non-preemptible delays in sgx_vepc_release()
x86: Remove the arch_calc_vm_prot_bits() macro from the UAPI
x86/build: Fix linker fill bytes quirk/incompatibility for ld.lld
x86/smp: Don't send INIT to non-present and non-booted CPUs
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 perf event fix from Ingo Molnar:
"Work around a firmware bug in the uncore PMU driver, affecting certain
Intel systems"
* tag 'perf-urgent-2023-09-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/uncore: Correct the number of CHAs on EMR
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux
Pull sh updates from Adrian Glaubitz:
- Fix a use-after-free bug in the push-switch driver (Duoming Zhou)
- Fix calls to dma_declare_coherent_memory() that incorrectly passed
the buffer end address instead of the buffer size as the size
parameter
* tag 'sh-for-v6.6-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux:
sh: push-switch: Reorder cleanup operations to avoid use-after-free bug
sh: boards: Fix CEU buffer size passed to dma_declare_coherent_memory()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull more RISC-V updates from Palmer Dabbelt:
- The kernel now dynamically probes for misaligned access speed, as
opposed to relying on a table of known implementations.
- Support for non-coherent devices on systems using the Andes AX45MP
core, including the RZ/Five SoCs.
- Support for the V extension in ptrace(), again.
- Support for KASLR.
- Support for the BPF prog pack allocator in RISC-V.
- A handful of bug fixes and cleanups.
* tag 'riscv-for-linus-6.6-mw2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (25 commits)
soc: renesas: Kconfig: For ARCH_R9A07G043 select the required configs if dependencies are met
riscv: Kconfig.errata: Add dependency for RISCV_SBI in ERRATA_ANDES config
riscv: Kconfig.errata: Drop dependency for MMU in ERRATA_ANDES_CMO config
riscv: Kconfig: Select DMA_DIRECT_REMAP only if MMU is enabled
bpf, riscv: use prog pack allocator in the BPF JIT
riscv: implement a memset like function for text
riscv: extend patch_text_nosync() for multiple pages
bpf: make bpf_prog_pack allocator portable
riscv: libstub: Implement KASLR by using generic functions
libstub: Fix compilation warning for rv32
arm64: libstub: Move KASLR handling functions to kaslr.c
riscv: Dump out kernel offset information on panic
riscv: Introduce virtual kernel mapping KASLR
RISC-V: Add ptrace support for vectors
soc: renesas: Kconfig: Select the required configs for RZ/Five SoC
cache: Add L2 cache management for Andes AX45MP RISC-V core
dt-bindings: cache: andestech,ax45mp-cache: Add DT binding documentation for L2 cache controller
riscv: mm: dma-noncoherent: nonstandard cache operations support
riscv: errata: Add Andes alternative ports
riscv: asm: vendorid_list: Add Andes Technology to the vendors list
...
|
|
The original code puts flush_work() before timer_shutdown_sync()
in switch_drv_remove(). Although we use flush_work() to stop
the worker, it could be rescheduled in switch_timer(). As a result,
a use-after-free bug can occur. The details are shown below:
(cpu 0) | (cpu 1)
switch_drv_remove() |
flush_work() |
... | switch_timer // timer
| schedule_work(&psw->work)
timer_shutdown_sync() |
... | switch_work_handler // worker
kfree(psw) // free |
| psw->state = 0 // use
This patch puts timer_shutdown_sync() before flush_work() to
mitigate the bugs. As a result, the worker and timer will be
stopped safely before the deallocate operations.
Fixes: 9f5e8eee5cfe ("sh: generic push-switch framework.")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Link: https://lore.kernel.org/r/20230802033737.9738-1-duoming@zju.edu.cn
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
|
|
In all these cases, the last argument to dma_declare_coherent_memory() is
the buffer end address, but the expected value should be the size of the
reserved region.
Fixes: 39fb993038e1 ("media: arch: sh: ap325rxa: Use new renesas-ceu camera driver")
Fixes: c2f9b05fd5c1 ("media: arch: sh: ecovec: Use new renesas-ceu camera driver")
Fixes: f3590dc32974 ("media: arch: sh: kfr2r09: Use new renesas-ceu camera driver")
Fixes: 186c446f4b84 ("media: arch: sh: migor: Use new renesas-ceu camera driver")
Fixes: 1a3c230b4151 ("media: arch: sh: ms7724se: Use new renesas-ceu camera driver")
Signed-off-by: Petr Tesarik <petr.tesarik.ext@huawei.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Link: https://lore.kernel.org/r/20230724120742.2187-1-petrtesarik@huaweicloud.com
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"The main one is a fix for a broken strscpy() conversion that landed in
the merge window and broke early parsing of the kernel command line.
- Fix an incorrect mask in the CXL PMU driver
- Fix a regression in early parsing of the kernel command line
- Fix an IP checksum OoB access reported by syzbot"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: csum: Fix OoB access in IP checksum code for negative lengths
arm64/sysreg: Fix broken strncpy() -> strscpy() conversion
perf: CXL: fix mismatched number of counters mask
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:
- Allow usage of LSX/LASX in the kernel, and use them for
SIMD-optimized RAID5/RAID6 routines
- Add Loongson Binary Translation (LBT) extension support
- Add basic KGDB & KDB support
- Add building with kcov coverage
- Add KFENCE (Kernel Electric-Fence) support
- Add KASAN (Kernel Address Sanitizer) support
- Some bug fixes and other small changes
- Update the default config file
* tag 'loongarch-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (25 commits)
LoongArch: Update Loongson-3 default config file
LoongArch: Add KASAN (Kernel Address Sanitizer) support
LoongArch: Simplify the processing of jumping new kernel for KASLR
kasan: Add (pmd|pud)_init for LoongArch zero_(pud|p4d)_populate process
kasan: Add __HAVE_ARCH_SHADOW_MAP to support arch specific mapping
LoongArch: Add KFENCE (Kernel Electric-Fence) support
LoongArch: Get partial stack information when providing regs parameter
LoongArch: mm: Add page table mapped mode support for virt_to_page()
kfence: Defer the assignment of the local variable addr
LoongArch: Allow building with kcov coverage
LoongArch: Provide kaslr_offset() to get kernel offset
LoongArch: Add basic KGDB & KDB support
LoongArch: Add Loongson Binary Translation (LBT) extension support
raid6: Add LoongArch SIMD recovery implementation
raid6: Add LoongArch SIMD syndrome calculation
LoongArch: Add SIMD-optimized XOR routines
LoongArch: Allow usage of LSX/LASX in the kernel
LoongArch: Define symbol 'fault' as a local label in fpu.S
LoongArch: Adjust {copy, clear}_user exception handler behavior
LoongArch: Use static defined zero page rather than allocated
...
|
|
Andes errata uses sbi_ecalll() which is only available if RISCV_SBI is
enabled. So add an dependency for RISCV_SBI in ERRATA_ANDES config to
avoid any build failures.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202308311610.ec6bm2G8-lkp@intel.com/
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20230901110320.312674-1-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Now that RISCV_DMA_NONCOHERENT conditionally selects DMA_DIRECT_REMAP
ie only if MMU is enabled, we no longer need the MMU dependency in
ERRATA_ANDES_CMO config.
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20230901105858.311745-1-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
kernel/dma/mapping.c has its use of pgprot_dmacoherent() inside
an #ifdef CONFIG_MMU block. kernel/dma/pool.c has its use of
pgprot_dmacoherent() inside an #ifdef CONFIG_DMA_DIRECT_REMAP block.
So select DMA_DIRECT_REMAP only if MMU is enabled for RISCV_DMA_NONCOHERENT
config.
This avoids users to explicitly select MMU.
Suggested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20230901105111.311200-1-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Puranjay Mohan <puranjay12@gmail.com> says:
Here is some data to prove the V2 fixes the problem:
Without this series:
root@rv-selftester:~/src/kselftest/bpf# time ./test_tag
test_tag: OK (40945 tests)
real 7m47.562s
user 0m24.145s
sys 6m37.064s
With this series applied:
root@rv-selftester:~/src/selftest/bpf# time ./test_tag
test_tag: OK (40945 tests)
real 7m29.472s
user 0m25.865s
sys 6m18.401s
BPF programs currently consume a page each on RISCV. For systems with many BPF
programs, this adds significant pressure to instruction TLB. High iTLB pressure
usually causes slow down for the whole system.
Song Liu introduced the BPF prog pack allocator[1] to mitigate the above issue.
It packs multiple BPF programs into a single huge page. It is currently only
enabled for the x86_64 BPF JIT.
I enabled this allocator on the ARM64 BPF JIT[2]. It is being reviewed now.
This patch series enables the BPF prog pack allocator for the RISCV BPF JIT.
======================================================
Performance Analysis of prog pack allocator on RISCV64
======================================================
Test setup:
===========
Host machine: Debian GNU/Linux 11 (bullseye)
Qemu Version: QEMU emulator version 8.0.3 (Debian 1:8.0.3+dfsg-1)
u-boot-qemu Version: 2023.07+dfsg-1
opensbi Version: 1.3-1
To test the performance of the BPF prog pack allocator on RV, a stresser
tool[4] linked below was built. This tool loads 8 BPF programs on the system and
triggers 5 of them in an infinite loop by doing system calls.
The runner script starts 20 instances of the above which loads 8*20=160 BPF
programs on the system, 5*20=100 of which are being constantly triggered.
The script is passed a command which would be run in the above environment.
The script was run with following perf command:
./run.sh "perf stat -a \
-e iTLB-load-misses \
-e dTLB-load-misses \
-e dTLB-store-misses \
-e instructions \
--timeout 60000"
The output of the above command is discussed below before and after enabling the
BPF prog pack allocator.
The tests were run on qemu-system-riscv64 with 8 cpus, 16G memory. The rootfs
was created using Bjorn's riscv-cross-builder[5] docker container linked below.
Results
=======
Before enabling prog pack allocator:
------------------------------------
Performance counter stats for 'system wide':
4939048 iTLB-load-misses
5468689 dTLB-load-misses
465234 dTLB-store-misses
1441082097998 instructions
60.045791200 seconds time elapsed
After enabling prog pack allocator:
-----------------------------------
Performance counter stats for 'system wide':
3430035 iTLB-load-misses
5008745 dTLB-load-misses
409944 dTLB-store-misses
1441535637988 instructions
60.046296600 seconds time elapsed
Improvements in metrics
=======================
It was expected that the iTLB-load-misses would decrease as now a single huge
page is used to keep all the BPF programs compared to a single page for each
program earlier.
--------------------------------------------
The improvement in iTLB-load-misses: -30.5 %
--------------------------------------------
I repeated this expriment more than 100 times in different setups and the
improvement was always greater than 30%.
This patch series is boot tested on the Starfive VisionFive 2 board[6].
The performance analysis was not done on the board because it doesn't
expose iTLB-load-misses, etc. The stresser program was run on the board to test
the loading and unloading of BPF programs
[1] https://lore.kernel.org/bpf/20220204185742.271030-1-song@kernel.org/
[2] https://lore.kernel.org/all/20230626085811.3192402-1-puranjay12@gmail.com/
[3] https://lore.kernel.org/all/20230626085811.3192402-2-puranjay12@gmail.com/
[4] https://github.com/puranjaymohan/BPF-Allocator-Bench
[5] https://github.com/bjoto/riscv-cross-builder
[6] https://www.starfivetech.com/en/site/boards
* b4-shazam-merge:
bpf, riscv: use prog pack allocator in the BPF JIT
riscv: implement a memset like function for text
riscv: extend patch_text_nosync() for multiple pages
bpf: make bpf_prog_pack allocator portable
Link: https://lore.kernel.org/r/20230831131229.497941-1-puranjay12@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Alexandre Ghiti <alexghiti@rivosinc.com> says:
The following KASLR implementation allows to randomize the kernel mapping:
- virtually: we expect the bootloader to provide a seed in the device-tree
- physically: only implemented in the EFI stub, it relies on the firmware to
provide a seed using EFI_RNG_PROTOCOL. arm64 has a similar implementation
hence the patch 3 factorizes KASLR related functions for riscv to take
advantage.
The new virtual kernel location is limited by the early page table that only
has one PUD and with the PMD alignment constraint, the kernel can only take
< 512 positions.
* b4-shazam-merge:
riscv: libstub: Implement KASLR by using generic functions
libstub: Fix compilation warning for rv32
arm64: libstub: Move KASLR handling functions to kaslr.c
riscv: Dump out kernel offset information on panic
riscv: Introduce virtual kernel mapping KASLR
Link: https://lore.kernel.org/r/20230722123850.634544-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
This resurrects the vector ptrace() support that was removed for 6.5 due
to some bugs cropping up as part of the GDB review process.
* b4-shazam-merge:
RISC-V: Add ptrace support for vectors
Link: https://lore.kernel.org/r/20230825050248.32681-1-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Prabhakar <prabhakar.csengg@gmail.com> says:
From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
non-coherent DMA support for AX45MP
====================================
On the Andes AX45MP core, cache coherency is a specification option so it
may not be supported. In this case DMA will fail. To get around with this
issue this patch series does the below:
1] Andes alternative ports is implemented as errata which checks if the
IOCP is missing and only then applies to CMO errata. One vendor specific
SBI EXT (ANDES_SBI_EXT_IOCP_SW_WORKAROUND) is implemented as part of
errata.
Below are the configs which Andes port provides (and are selected by
RZ/Five):
- ERRATA_ANDES
- ERRATA_ANDES_CMO
OpenSBI patch supporting ANDES_SBI_EXT_IOCP_SW_WORKAROUND SBI is now
part v1.3 release.
2] Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
block that allows dynamic adjustment of memory attributes in the runtime.
It contains a configurable amount of PMA entries implemented as CSR
registers to control the attributes of memory locations in interest.
OpenSBI configures the PMA regions as required and creates a reserve memory
node and propagates it to the higher boot stack.
Currently OpenSBI (upstream) configures the required PMA region and passes
this a shared DMA pool to Linux.
reserved-memory {
#address-cells = <2>;
#size-cells = <2>;
ranges;
pma_resv0@58000000 {
compatible = "shared-dma-pool";
reg = <0x0 0x58000000 0x0 0x08000000>;
no-map;
linux,dma-default;
};
};
The above shared DMA pool gets appended to Linux DTB so the DMA memory
requests go through this region.
3] We provide callbacks to synchronize specific content between memory and
cache.
4] RZ/Five SoC selects the below configs
- AX45MP_L2_CACHE
- DMA_GLOBAL_POOL
- ERRATA_ANDES
- ERRATA_ANDES_CMO
----------x---------------------x--------------------x---------------x----
* b4-shazam-merge:
soc: renesas: Kconfig: Select the required configs for RZ/Five SoC
cache: Add L2 cache management for Andes AX45MP RISC-V core
dt-bindings: cache: andestech,ax45mp-cache: Add DT binding documentation for L2 cache controller
riscv: mm: dma-noncoherent: nonstandard cache operations support
riscv: errata: Add Andes alternative ports
riscv: asm: vendorid_list: Add Andes Technology to the vendors list
Link: https://lore.kernel.org/r/20230818135723.80612-1-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Prabhakar <prabhakar.csengg@gmail.com> says:
From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
This patch series is a subset from Arnd's original series [0]. Ive just
picked up the bits required for RISC-V unification of cache flushing.
Remaining patches from the series [0] will be taken care by Arnd soon.
* b4-shazam-merge:
riscv: dma-mapping: switch over to generic implementation
riscv: dma-mapping: skip invalidation before bidirectional DMA
riscv: dma-mapping: only invalidate after DMA, not flush
Link: https://lore.kernel.org/r/20230816232336.164413-1-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Evan Green <evan@rivosinc.com> says:
The current setting for the hwprobe bit indicating misaligned access
speed is controlled by a vendor-specific feature probe function. This is
essentially a per-SoC table we have to maintain on behalf of each vendor
going forward. Let's convert that instead to something we detect at
runtime.
We have two assembly routines at the heart of our probe: one that
does a bunch of word-sized accesses (without aligning its input buffer),
and the other that does byte accesses. If we can move a larger number of
bytes using misaligned word accesses than we can with the same amount of
time doing byte accesses, then we can declare misaligned accesses as
"fast".
The tradeoff of reducing this maintenance burden is boot time. We spend
4-6 jiffies per core doing this measurement (0-2 on jiffie edge
alignment, and 4 on measurement). The timing loop was based on
raid6_choose_gen(), which uses (16+1)*N jiffies (where N is the number
of algorithms). By taking only the fastest iteration out of all
attempts for use in the comparison, variance between runs is very low.
On my THead C906, it looks like this:
[ 0.047563] cpu0: Ratio of byte access time to unaligned word access is 4.34, unaligned accesses are fast
Several others have chimed in with results on slow machines with the
older algorithm, which took all runs into account, including noise like
interrupts. Even with this variation, results indicate that in all cases
(fast, slow, and emulated) the measured numbers are nowhere near each
other (always multiple factors away).
* b4-shazam-merge:
RISC-V: alternative: Remove feature_probe_func
RISC-V: Probe for unaligned access speed
Link: https://lore.kernel.org/r/20230818194136.4084400-1-evan@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking updates from Jakub Kicinski:
"Including fixes from netfilter and bpf.
Current release - regressions:
- eth: stmmac: fix failure to probe without MAC interface specified
Current release - new code bugs:
- docs: netlink: fix missing classic_netlink doc reference
Previous releases - regressions:
- deal with integer overflows in kmalloc_reserve()
- use sk_forward_alloc_get() in sk_get_meminfo()
- bpf_sk_storage: fix the missing uncharge in sk_omem_alloc
- fib: avoid warn splat in flow dissector after packet mangling
- skb_segment: call zero copy functions before using skbuff frags
- eth: sfc: check for zero length in EF10 RX prefix
Previous releases - always broken:
- af_unix: fix msg_controllen test in scm_pidfd_recv() for
MSG_CMSG_COMPAT
- xsk: fix xsk_build_skb() dereferencing possible ERR_PTR()
- netfilter:
- nft_exthdr: fix non-linear header modification
- xt_u32, xt_sctp: validate user space input
- nftables: exthdr: fix 4-byte stack OOB write
- nfnetlink_osf: avoid OOB read
- one more fix for the garbage collection work from last release
- igmp: limit igmpv3_newpack() packet size to IP_MAX_MTU
- bpf, sockmap: fix preempt_rt splat when using raw_spin_lock_t
- handshake: fix null-deref in handshake_nl_done_doit()
- ip: ignore dst hint for multipath routes to ensure packets are
hashed across the nexthops
- phy: micrel:
- correct bit assignments for cable test errata
- disable EEE according to the KSZ9477 errata
Misc:
- docs/bpf: document compile-once-run-everywhere (CO-RE) relocations
- Revert "net: macsec: preserve ingress frame ordering", it appears
to have been developed against an older kernel, problem doesn't
exist upstream"
* tag 'net-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (95 commits)
net: enetc: distinguish error from valid pointers in enetc_fixup_clear_rss_rfs()
Revert "net: team: do not use dynamic lockdep key"
net: hns3: remove GSO partial feature bit
net: hns3: fix the port information display when sfp is absent
net: hns3: fix invalid mutex between tc qdisc and dcb ets command issue
net: hns3: fix debugfs concurrency issue between kfree buffer and read
net: hns3: fix byte order conversion issue in hclge_dbg_fd_tcam_read()
net: hns3: Support query tx timeout threshold by debugfs
net: hns3: fix tx timeout issue
net: phy: Provide Module 4 KSZ9477 errata (DS80000754C)
netfilter: nf_tables: Unbreak audit log reset
netfilter: ipset: add the missing IP_SET_HASH_WITH_NET0 macro for ip_set_hash_netportnet.c
netfilter: nft_set_rbtree: skip sync GC for new elements in this transaction
netfilter: nf_tables: uapi: Describe NFTA_RULE_CHAIN_ID
netfilter: nfnetlink_osf: avoid OOB read
netfilter: nftables: exthdr: fix 4-byte stack OOB write
selftests/bpf: Check bpf_sk_storage has uncharged sk_omem_alloc
bpf: bpf_sk_storage: Fix the missing uncharge in sk_omem_alloc
bpf: bpf_sk_storage: Fix invalid wait context lockdep report
s390/bpf: Pass through tail call counter in trampolines
...
|
|
Pull kvm updates from Paolo Bonzini:
"ARM:
- Clean up vCPU targets, always returning generic v8 as the preferred
target
- Trap forwarding infrastructure for nested virtualization (used for
traps that are taken from an L2 guest and are needed by the L1
hypervisor)
- FEAT_TLBIRANGE support to only invalidate specific ranges of
addresses when collapsing a table PTE to a block PTE. This avoids
that the guest refills the TLBs again for addresses that aren't
covered by the table PTE.
- Fix vPMU issues related to handling of PMUver.
- Don't unnecessary align non-stack allocations in the EL2 VA space
- Drop HCR_VIRT_EXCP_MASK, which was never used...
- Don't use smp_processor_id() in kvm_arch_vcpu_load(), but the cpu
parameter instead
- Drop redundant call to kvm_set_pfn_accessed() in user_mem_abort()
- Remove prototypes without implementations
RISC-V:
- Zba, Zbs, Zicntr, Zicsr, Zifencei, and Zihpm support for guest
- Added ONE_REG interface for SATP mode
- Added ONE_REG interface to enable/disable multiple ISA extensions
- Improved error codes returned by ONE_REG interfaces
- Added KVM_GET_REG_LIST ioctl() implementation for KVM RISC-V
- Added get-reg-list selftest for KVM RISC-V
s390:
- PV crypto passthrough enablement (Tony, Steffen, Viktor, Janosch)
Allows a PV guest to use crypto cards. Card access is governed by
the firmware and once a crypto queue is "bound" to a PV VM every
other entity (PV or not) looses access until it is not bound
anymore. Enablement is done via flags when creating the PV VM.
- Guest debug fixes (Ilya)
x86:
- Clean up KVM's handling of Intel architectural events
- Intel bugfixes
- Add support for SEV-ES DebugSwap, allowing SEV-ES guests to use
debug registers and generate/handle #DBs
- Clean up LBR virtualization code
- Fix a bug where KVM fails to set the target pCPU during an IRTE
update
- Fix fatal bugs in SEV-ES intrahost migration
- Fix a bug where the recent (architecturally correct) change to
reinject #BP and skip INT3 broke SEV guests (can't decode INT3 to
skip it)
- Retry APIC map recalculation if a vCPU is added/enabled
- Overhaul emergency reboot code to bring SVM up to par with VMX, tie
the "emergency disabling" behavior to KVM actually being loaded,
and move all of the logic within KVM
- Fix user triggerable WARNs in SVM where KVM incorrectly assumes the
TSC ratio MSR cannot diverge from the default when TSC scaling is
disabled up related code
- Add a framework to allow "caching" feature flags so that KVM can
check if the guest can use a feature without needing to search
guest CPUID
- Rip out the ancient MMU_DEBUG crud and replace the useful bits with
CONFIG_KVM_PROVE_MMU
- Fix KVM's handling of !visible guest roots to avoid premature
triple fault injection
- Overhaul KVM's page-track APIs, and KVMGT's usage, to reduce the
API surface that is needed by external users (currently only
KVMGT), and fix a variety of issues in the process
Generic:
- Wrap kvm_{gfn,hva}_range.pte in a union to allow mmu_notifier
events to pass action specific data without needing to constantly
update the main handlers.
- Drop unused function declarations
Selftests:
- Add testcases to x86's sync_regs_test for detecting KVM TOCTOU bugs
- Add support for printf() in guest code and covert all guest asserts
to use printf-based reporting
- Clean up the PMU event filter test and add new testcases
- Include x86 selftests in the KVM x86 MAINTAINERS entry"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (279 commits)
KVM: x86/mmu: Include mmu.h in spte.h
KVM: x86/mmu: Use dummy root, backed by zero page, for !visible guest roots
KVM: x86/mmu: Disallow guest from using !visible slots for page tables
KVM: x86/mmu: Harden TDP MMU iteration against root w/o shadow page
KVM: x86/mmu: Harden new PGD against roots without shadow pages
KVM: x86/mmu: Add helper to convert root hpa to shadow page
drm/i915/gvt: Drop final dependencies on KVM internal details
KVM: x86/mmu: Handle KVM bookkeeping in page-track APIs, not callers
KVM: x86/mmu: Drop @slot param from exported/external page-track APIs
KVM: x86/mmu: Bug the VM if write-tracking is used but not enabled
KVM: x86/mmu: Assert that correct locks are held for page write-tracking
KVM: x86/mmu: Rename page-track APIs to reflect the new reality
KVM: x86/mmu: Drop infrastructure for multiple page-track modes
KVM: x86/mmu: Use page-track notifiers iff there are external users
KVM: x86/mmu: Move KVM-only page-track declarations to internal header
KVM: x86: Remove the unused page-track hook track_flush_slot()
drm/i915/gvt: switch from ->track_flush_slot() to ->track_remove_region()
KVM: x86: Add a new page-track hook to handle memslot deletion
drm/i915/gvt: Don't bother removing write-protection on to-be-deleted slot
KVM: x86: Reject memslot MOVE operations if KVMGT is attached
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull more s390 updates from Heiko Carstens:
- A couple of virtual vs physical address confusion fixes
- Rework locking in dcssblk driver to address a lockdep warning
- Remove support for "noexec" kernel command line option since there is
no use case where it would make sense
- Simplify kernel mapping setup and get rid of quite a bit of code
- Add architecture specific __set_memory_yy() functions which allow us
to modify kernel mappings. Unlike the set_memory_xx() variants they
take void pointer start and end parameters, which allows using them
without the usual casts, and also to use them on areas larger than
8TB.
Note that the set_memory_xx() family comes with an int num_pages
parameter which overflows with 8TB. This could be addressed by
changing the num_pages parameter to unsigned long, however requires
to change all architectures, since the module code expects an int
parameter (see module_set_memory()).
This was indeed an issue since for debug_pagealloc() we call
set_memory_4k() on the whole identity mapping. Therefore address this
for now with the __set_memory_yy() variant, and address common code
later
- Use dev_set_name() and also fix memory leak in zcrypt driver error
handling
- Remove unused lsi_mask from airq_struct
- Add warning for invalid kernel mapping requests
* tag 's390-6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/vmem: do not silently ignore mapping limit
s390/zcrypt: utilize dev_set_name() ability to use a formatted string
s390/zcrypt: don't leak memory if dev_set_name() fails
s390/mm: fix MAX_DMA_ADDRESS physical vs virtual confusion
s390/airq: remove lsi_mask from airq_struct
s390/mm: use __set_memory() variants where useful
s390/set_memory: add __set_memory() variant
s390/set_memory: generate all set_memory() functions
s390/mm: improve description of mapping permissions of prefix pages
s390/amode31: change type of __samode31, __eamode31, etc
s390/mm: simplify kernel mapping setup
s390: remove "noexec" option
s390/vmem: fix virtual vs physical address confusion
s390/dcssblk: fix lockdep warning
s390/monreader: fix virtual vs physical address confusion
|
|
Pull MIPS updates from Thomas Bogendoerfer:
"Just cleanups and fixes"
* tag 'mips_6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: TXx9: Do PCI error checks on own line
arch/mips/configs/*_defconfig cleanup
MIPS: VDSO: Conditionally export __vdso_gettimeofday()
Mips: loongson3_defconfig: Enable ast drm driver by default
mips: remove <asm/export.h>
mips: replace #include <asm/export.h> with #include <linux/export.h>
mips: remove unneeded #include <asm/export.h>
MIPS: Loongson64: Fix more __iomem attributes
MIPS: loongson32: Remove regs-rtc.h
MIPS: loongson32: Remove regs-clk.h
MIPS: More explicit DT include clean-ups
MIPS: Fixup explicit DT include clean-up
Revert MIPS: Loongson: Fix build error when make modules_install
MIPS: Only fiddle with CHECKFLAGS if `need-compiler'
MIPS: Fix CONFIG_CPU_DADDI_WORKAROUNDS `modules_install' regression
MIPS: Explicitly include correct DT includes
|
|
Pull xtensa updates from Max Filippov:
- enable MTD XIP support
- fix base address of the xtensa perf module in newer hardware
* tag 'xtensa-20230905' of https://github.com/jcmvbkbc/linux-xtensa:
xtensa: add XIP-aware MTD support
xtensa: PMU: fix base address for the newer hardware
|
|
Although commit c2c24edb1d9c ("arm64: csum: Fix pathological zero-length
calls") added an early return for zero-length input, syzkaller has
popped up with an example of a _negative_ length which causes an
undefined shift and an out-of-bounds read:
| BUG: KASAN: slab-out-of-bounds in do_csum+0x44/0x254 arch/arm64/lib/csum.c:39
| Read of size 4294966928 at addr ffff0000d7ac0170 by task syz-executor412/5975
|
| CPU: 0 PID: 5975 Comm: syz-executor412 Not tainted 6.4.0-rc4-syzkaller-g908f31f2a05b #0
| Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/25/2023
| Call trace:
| dump_backtrace+0x1b8/0x1e4 arch/arm64/kernel/stacktrace.c:233
| show_stack+0x2c/0x44 arch/arm64/kernel/stacktrace.c:240
| __dump_stack lib/dump_stack.c:88 [inline]
| dump_stack_lvl+0xd0/0x124 lib/dump_stack.c:106
| print_address_description mm/kasan/report.c:351 [inline]
| print_report+0x174/0x514 mm/kasan/report.c:462
| kasan_report+0xd4/0x130 mm/kasan/report.c:572
| kasan_check_range+0x264/0x2a4 mm/kasan/generic.c:187
| __kasan_check_read+0x20/0x30 mm/kasan/shadow.c:31
| do_csum+0x44/0x254 arch/arm64/lib/csum.c:39
| csum_partial+0x30/0x58 lib/checksum.c:128
| gso_make_checksum include/linux/skbuff.h:4928 [inline]
| __udp_gso_segment+0xaf4/0x1bc4 net/ipv4/udp_offload.c:332
| udp6_ufo_fragment+0x540/0xca0 net/ipv6/udp_offload.c:47
| ipv6_gso_segment+0x5cc/0x1760 net/ipv6/ip6_offload.c:119
| skb_mac_gso_segment+0x2b4/0x5b0 net/core/gro.c:141
| __skb_gso_segment+0x250/0x3d0 net/core/dev.c:3401
| skb_gso_segment include/linux/netdevice.h:4859 [inline]
| validate_xmit_skb+0x364/0xdbc net/core/dev.c:3659
| validate_xmit_skb_list+0x94/0x130 net/core/dev.c:3709
| sch_direct_xmit+0xe8/0x548 net/sched/sch_generic.c:327
| __dev_xmit_skb net/core/dev.c:3805 [inline]
| __dev_queue_xmit+0x147c/0x3318 net/core/dev.c:4210
| dev_queue_xmit include/linux/netdevice.h:3085 [inline]
| packet_xmit+0x6c/0x318 net/packet/af_packet.c:276
| packet_snd net/packet/af_packet.c:3081 [inline]
| packet_sendmsg+0x376c/0x4c98 net/packet/af_packet.c:3113
| sock_sendmsg_nosec net/socket.c:724 [inline]
| sock_sendmsg net/socket.c:747 [inline]
| __sys_sendto+0x3b4/0x538 net/socket.c:2144
Extend the early return to reject negative lengths as well, aligning our
implementation with the generic code in lib/checksum.c
Cc: Robin Murphy <robin.murphy@arm.com>
Fixes: 5777eaed566a ("arm64: Implement optimised checksum routine")
Reported-by: syzbot+4a9f9820bd8d302e22f7@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/000000000000e0e94c0603f8d213@google.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Initialize the BTLB entries when starting up a CPU.
Note that BTLBs are not available on 64-bit CPUs.
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
Instead of usig #ifdefs, simply return PDC_BAD_PROC for functions
which aren't available on 64-bit CPUs.
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
Call PDC to remove all existing BTLB entries (which may exist from
some previous operating system runs) before switching to virtual mode.
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
Change HUGEPAGE_SIZE to become 4 MB on 32-bit kernels, which leads
that kernel code and kernel data will start on 4 MB boundaries.
Although a 32-bit kernel does not support huge pages, most
machines have support for Block-TLBs (BTLB) which allow to
configure the system to use large pages (block TLBs) to minimize
the TLB contention. This is done through calls to PDC and the
32-bit kernel can then call BTLB PDC functions to tell
the machine to optimize the TLBs.
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
Add some documentation why PA-RISC uses SHMLBA and SHM_COLOUR.
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
1, Enable LSX and LASX.
2, Enable KASLR (CONFIG_RANDOMIZE_BASE).
3, Enable jump label (patching mechanism for static key).
4, Enable LoongArch CRC32(c) Acceleration.
5, Enable Loongson-specific drivers: I2C/RTC/DRM/SOC/CLK/PINCTRL/GPIO/SPI.
6, Enable EXFAT/NTFS3/JFS/GFS2/OCFS2/UBIFS/EROFS/CEPH file systems.
7, Enable WangXun NGBE/TXGBE NIC drivers.
8, Enable some IPVS options.
9, Remove CONFIG_SYSFS_DEPRECATED since it is removed in Kconfig.
10, Remove CONFIG_IP_NF_TARGET_CLUSTERIP since it is removed in Kconfig.
11, Remove CONFIG_NFT_OBJREF since it is removed in Kconfig.
12, Remove CONFIG_R8188EU since it is replaced by CONFIG_RTL8XXXU.
Signed-off-by: Trevor Woerner <twoerner@gmail.com>
Signed-off-by: Xuewen Wang <wangxuewen@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
On large enclaves we hit the softlockup warning with following call trace:
xa_erase()
sgx_vepc_release()
__fput()
task_work_run()
do_exit()
The latency issue is similar to the one fixed in:
8795359e35bc ("x86/sgx: Silence softlockup detection when releasing large enclaves")
The test system has 64GB of enclave memory, and all is assigned to a single VM.
Release of 'vepc' takes a longer time and causes long latencies, which triggers
the softlockup warning.
Add cond_resched() to give other tasks a chance to run and reduce
latencies, which also avoids the softlockup detector.
[ mingo: Rewrote the changelog. ]
Fixes: 540745ddbc70 ("x86/sgx: Introduce virtual EPC for use by KVM guests")
Reported-by: Yu Zhang <yu.zhang@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Yu Zhang <yu.zhang@ionos.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Acked-by: Haitao Huang <haitao.huang@linux.intel.com>
Cc: stable@vger.kernel.org
|
|
The arch_calc_vm_prot_bits() macro uses VM_PKEY_BIT0 etc. which are
not part of the UAPI, so the macro is completely useless for userspace.
It is also hidden behind the CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
config switch which we shouldn't expose to userspace. Thus let's move
this macro into a new internal header instead.
Fixes: 8f62c883222c ("x86/mm/pkeys: Add arch-specific VMA protection bits")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Link: https://lore.kernel.org/r/20230906162658.142511-1-thuth@redhat.com
|
|
With ":text =0xcccc", ld.lld fills unused text area with 0xcccc0000.
Example objdump -D output:
ffffffff82b04203: 00 00 add %al,(%rax)
ffffffff82b04205: cc int3
ffffffff82b04206: cc int3
ffffffff82b04207: 00 00 add %al,(%rax)
ffffffff82b04209: cc int3
ffffffff82b0420a: cc int3
Replace it with ":text =0xcccccccc", so we get the following instead:
ffffffff82b04203: cc int3
ffffffff82b04204: cc int3
ffffffff82b04205: cc int3
ffffffff82b04206: cc int3
ffffffff82b04207: cc int3
ffffffff82b04208: cc int3
gcc/ld doesn't seem to have the same issue. The generated code stays the
same for gcc/ld.
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Fixes: 7705dc855797 ("x86/vmlinux: Use INT3 instead of NOP for linker fill bytes")
Link: https://lore.kernel.org/r/20230906175215.2236033-1-song@kernel.org
|
|
Mostafa reports that commit d232606773a0 ("arm64/sysreg: refactor
deprecated strncpy") breaks our early command-line parsing because the
original code is working on space-delimited substrings rather than
NUL-terminated strings.
Rather than simply reverting the broken conversion patch, replace the
strscpy() with a simple memcpy() with an explicit NUL-termination of the
result.
Reported-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Mostafa Saleh <smostafa@google.com>
Fixes: d232606773a0 ("arm64/sysreg: refactor deprecated strncpy")
Signed-off-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20230905-strncpy-arch-arm64-v4-1-bc4b14ddfaef@google.com
Link: https://lore.kernel.org/r/20230831162227.2307863-1-smostafa@google.com
Signed-off-by: Will Deacon <will@kernel.org>
|