Age | Commit message (Collapse) | Author |
|
The use of config_enabled() is ambiguous. For config options,
IS_ENABLED(), IS_REACHABLE(), etc. will make intention clearer.
Sometimes config_enabled() has been used for non-config options because
it is useful to check whether the given symbol is defined or not.
I have been tackling on deprecating config_enabled(), and now is the
time to finish this work.
Some new users have appeared for v4.9-rc1, but it is trivial to replace
them:
- arch/x86/mm/kaslr.c
replace config_enabled() with IS_ENABLED() because
CONFIG_X86_ESPFIX64 and CONFIG_EFI are boolean.
- include/asm-generic/export.h
replace config_enabled() with __is_defined().
Then, config_enabled() can be removed now.
Going forward, please use IS_ENABLED(), IS_REACHABLE(), etc. for config
options, and __is_defined() for non-config symbols.
Link: http://lkml.kernel.org/r/1476616078-32252-1-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michal Marek <mmarek@suse.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Back in commit f56141e3e2d9 ("all arches, signal: move restart_block to
struct task_struct"), all architectures and core code were changed to
use task_struct::restart_block. However, when h8300 support was
subsequently restored in v4.2, it was not updated to account for this,
and maintains thread_info::restart_block, which is not kept in sync.
This patch drops the redundant restart_block from thread_info, and moves
h8300 to the common one in task_struct, ensuring that syscall restarting
always works as expected.
Fixes: f56141e3e2d9 ("all arches, signal: move restart_block to struct task_struct")
Link: http://lkml.kernel.org/r/1476714934-11635-1-git-send-email-mark.rutland@arm.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: uclinux-h8-devel@lists.sourceforge.jp
Cc: <stable@vger.kernel.org> [4.2+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from David Vrabel:
- advertise control feature flags in xenstore
- fix x86 build when XEN_PVHVM is disabled
* tag 'for-linus-4.9-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xenbus: check return value of xenbus_scanf()
xenbus: prefer list_for_each()
x86: xen: move cpu_up functions out of ifdef
xenbus: advertise control feature flags
|
|
Three newly introduced functions are not defined when CONFIG_XEN_PVHVM is
disabled, but are still being used:
arch/x86/xen/enlighten.c:141:12: warning: ‘xen_cpu_up_prepare’ used but never defined
arch/x86/xen/enlighten.c:142:12: warning: ‘xen_cpu_up_online’ used but never defined
arch/x86/xen/enlighten.c:143:12: warning: ‘xen_cpu_dead’ used but never defined
Fixes: 4d737042d6c4 ("xen/x86: Convert to hotplug state machine")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Three fixes, a hw-enablement and a cross-arch fix/enablement change:
- SGI/UV fix for older platforms
- x32 signal handling fix
- older x86 platform bootup APIC fix
- AVX512-4VNNIW (Neural Network Instructions) and AVX512-4FMAPS
(Multiply Accumulation Single precision instructions) enablement.
- move thread_info back into x86 specific code, to make life easier
for other architectures trying to make use of
CONFIG_THREAD_INFO_IN_TASK_STRUCT=y"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot/smp: Don't try to poke disabled/non-existent APIC
sched/core, x86: Make struct thread_info arch specific again
x86/signal: Remove bogus user_64bit_mode() check from sigaction_compat_abi()
x86/platform/UV: Fix support for EFI_OLD_MEMMAP after BIOS callback updates
x86/cpufeature: Add AVX512_4VNNIW and AVX512_4FMAPS features
x86/vmware: Skip timer_irq_works() check on VMware
|
|
Apparently trying to poke a disabled or non-existent APIC
leads to a box that doesn't even boot. Let's not do that.
No real clue if this is the right fix, but at least my
P3 machine boots again.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: dyoung@redhat.com
Cc: kexec@lists.infradead.org
Cc: stable@vger.kernel.org
Fixes: 2a51fe083eba ("arch/x86: Handle non enumerated CPU after physical hotplug")
Link: http://lkml.kernel.org/r/1477102684-5092-1-git-send-email-ville.syrjala@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Fixes marked for stable:
- Prevent unlikely crash in copro_calculate_slb() (Frederic Barrat)
- cxl: Prevent adapter reset if an active context exists (Vaibhav Jain)
Fixes for code merged this cycle:
- Fix boot on systems with uncompressed kernel image (Heiner Kallweit)
- Drop dump_numa_memory_topology() (Michael Ellerman)
- Fix numa topology console print (Aneesh Kumar K.V)
- Ignore the pkey system calls for now (Stephen Rothwell)"
* tag 'powerpc-4.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc: Ignore the pkey system calls for now
powerpc: Fix numa topology console print
powerpc/mm: Drop dump_numa_memory_topology()
cxl: Prevent adapter reset if an active context exists
powerpc/boot: Fix boot on systems with uncompressed kernel image
powerpc/mm: Prevent unlikely crash in copro_calculate_slb()
|
|
Pull KVM fixes from Radim Krčmář:
"ARM:
- avoid livelock when walking guest page tables
- fix HYP mode static keys without CC_HAVE_ASM_GOTO
MIPS:
- fix a build error without TRACEPOINTS_ENABLED
s390:
- reject a malformed userspace configuration
x86:
- suppress a warning without CONFIG_CPU_FREQ
- initialize whole irq_eoi array"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
arm/arm64: KVM: Map the BSS at HYP
arm64: KVM: Take S1 walks into account when determining S2 write faults
KVM: s390: reject invalid modes for runtime instrumentation
kvm: x86: memset whole irq_eoi
kvm/x86: Fix unused variable warning in kvm_timer_init()
KVM: MIPS: Add missing uaccess.h include
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm
KVM/ARM updates for 4.9-rc2
- Handle faults generated by the page table walker as being writes
- Map the BSS at EL2
|
|
When used with a compiler that doesn't implement "asm goto"
(such as the AArch64 port of GCC 4.8), jump labels generate a
memory access to find out about the value of the key (instead
of just patching the code). The key itself is likely to be
stored in the BSS.
This is perfectly fine, except that we don't map the BSS at HYP,
leading to an exploding kernel at the first access. The obvious
fix is simply to map the BSS there (which should have been done
a long while ago, but hey...).
Reported-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
The WnR bit in the HSR/ESR_EL2 indicates whether a data abort was
generated by a read or a write instruction. For stage 2 data aborts
generated by a stage 1 translation table walk (i.e. the actual page
table access faults at EL2), the WnR bit therefore reports whether the
instruction generating the walk was a load or a store, *not* whether the
page table walker was reading or writing the entry.
For page tables marked as read-only at stage 2 (e.g. due to KSM merging
them with the tables from another guest), this could result in livelock,
where a page table walk generated by a load instruction attempts to
set the access flag in the stage 1 descriptor, but fails to trigger
CoW in the host since only a read fault is reported.
This patch modifies the arm64 kvm_vcpu_dabt_iswrite function to
take into account stage 2 faults in stage 1 walks. Since DBM cannot be
disabled at EL2 for CPUs that implement it, we assume that these faults
are always causes by writes, avoiding the livelock situation at the
expense of occasional, spurious CoWs.
We could, in theory, do a bit better by checking the guest TCR
configuration and inspecting the page table to see why the PTE faulted.
However, I doubt this is measurable in practice, and the threat of
livelock is real.
Cc: <stable@vger.kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux
KVM: s390: Fix for user-triggerable WARN_ON
A malicious user space can provide an invalid mode for runtime
instrumentation via the interfaces that are normally used on
the target host during migration. This would trigger a WARN_ON
via validity intercept. Let's detect this special case.
|
|
Usually a validity intercept is a programming error of the host
because of invalid entries in the state description.
We can get a validity intercept if the mode of the runtime
instrumentation control block is wrong. As the host does not know
which modes are valid, this can be used by userspace to trigger
a WARN.
Instead of printing a WARN let's return an error to userspace as
this can only happen if userspace provides a malformed initial
value (e.g. on migration). The kernel should never warn on bogus
input. Instead let's log it into the s390 debug feature.
While at it, let's return -EINVAL for all validity intercepts as
this will trigger an error in QEMU like
error: kvm run failed Invalid argument
PSW=mask 0404c00180000000 addr 000000000063c226 cc 00
R00=000000000000004f R01=0000000000000004 R02=0000000000760005 R03=000000007fe0a000
R04=000000000064ba2a R05=000000049db73dd0 R06=000000000082c4b0 R07=0000000000000041
R08=0000000000000002 R09=000003e0804042a8 R10=0000000496152c42 R11=000000007fe0afb0
[...]
This will avoid an endless loop of validity intercepts.
Cc: stable@vger.kernel.org # v4.5+
Fixes: c6e5f166373a ("KVM: s390: implement the RI support of guest")
Acked-by: Fan Zhang <zhangfan@linux.vnet.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"Most of these are CC'd for stable, but there are a few fixing issues
introduced during the recent merge window too.
There's also a fix for the xgene PMU driver, but it seemed daft to
send as a separate pull request, so I've included it here with the
rest of the fixes.
- Fix ACPI boot due to recent broken NUMA changes
- Fix remote enabling of CPU features requiring PSTATE bit manipulation
- Add address range check when emulating user cache maintenance
- Fix LL/SC loops that allow compiler to introduce memory accesses
- Fix recently added write_sysreg_s macro
- Ensure MDCR_EL2 is initialised on qemu targets without a PMU
- Avoid kaslr breakage due to MODVERSIONs and DYNAMIC_FTRACE
- Correctly drive recent ld when building relocatable Image
- Remove junk IS_ERR check from xgene PMU driver added during merge window
- pr_cont fixes after core changes in the merge window"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: remove pr_cont abuse from mem_init
arm64: fix show_regs fallout from KERN_CONT changes
arm64: kernel: force ET_DYN ELF type for CONFIG_RELOCATABLE=y
arm64: suspend: Reconfigure PSTATE after resume from idle
arm64: mm: Set PSTATE.PAN from the cpu_enable_pan() call
arm64: cpufeature: Schedule enable() calls instead of calling them via IPI
arm64: Cortex-A53 errata workaround: check for kernel addresses
arm64: percpu: rewrite ll/sc loops in assembly
arm64: swp emulation: bound LL/SC retries before rescheduling
arm64: sysreg: Fix use of XZR in write_sysreg_s
arm64: kaslr: keep modules close to the kernel when DYNAMIC_FTRACE=y
arm64: kernel: Init MDCR_EL2 even in the absence of a PMU
perf: xgene: Remove bogus IS_ERR() check
arm64: kernel: numa: fix ACPI boot cpu numa node mapping
arm64: kaslr: fix breakage with CONFIG_MODVERSIONS=y
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/kvm-mips
MIPS KVM fix for v4.9-rc2
- Fix build error introduced during the 4.9 merge window when
tracepoints are disabled.
|
|
All the lines printed by mem_init are independent, with each ending with
a newline. While they logically form a large block, none are actually
continuations of previous lines.
The kernel-side printk code and the userspace demsg tool differ in their
handling of KERN_CONT following a newline, and while this isn't always a
problem kernel-side, it does cause difficulty for userspace. Using
pr_cont causes the userspace tool to not print line prefix (e.g.
timestamps) even when following a newline, mis-aligning the output and
making it harder to read, e.g.
[ 0.000000] Virtual kernel memory layout:
[ 0.000000] modules : 0xffff000000000000 - 0xffff000008000000 ( 128 MB)
vmalloc : 0xffff000008000000 - 0xffff7dffbfff0000 (129022 GB)
.text : 0xffff000008080000 - 0xffff0000088b0000 ( 8384 KB)
.rodata : 0xffff0000088b0000 - 0xffff000008c50000 ( 3712 KB)
.init : 0xffff000008c50000 - 0xffff000008d50000 ( 1024 KB)
.data : 0xffff000008d50000 - 0xffff000008e25200 ( 853 KB)
.bss : 0xffff000008e25200 - 0xffff000008e6bec0 ( 284 KB)
fixed : 0xffff7dfffe7fd000 - 0xffff7dfffec00000 ( 4108 KB)
PCI I/O : 0xffff7dfffee00000 - 0xffff7dffffe00000 ( 16 MB)
vmemmap : 0xffff7e0000000000 - 0xffff800000000000 ( 2048 GB maximum)
0xffff7e0000000000 - 0xffff7e0026000000 ( 608 MB actual)
memory : 0xffff800000000000 - 0xffff800980000000 ( 38912 MB)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=6, Nodes=1
Fix this by using pr_notice consistently for all lines, which both the
kernel and userspace are happy with.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Recently in commit 4bcc595ccd80decb ("printk: reinstate KERN_CONT for
printing continuation lines"), the behaviour of printk changed w.r.t.
KERN_CONT. Now, KERN_CONT is mandatory to continue existing lines.
Without this, prefixes are inserted, making output illegible, e.g.
[ 1007.069010] pc : [<ffff00000871898c>] lr : [<ffff000008718948>] pstate: 40000145
[ 1007.076329] sp : ffff000008d53ec0
[ 1007.079606] x29: ffff000008d53ec0 [ 1007.082797] x28: 0000000080c50018
[ 1007.086160]
[ 1007.087630] x27: ffff000008e0c7f8 [ 1007.090820] x26: ffff80097631ca00
[ 1007.094183]
[ 1007.095653] x25: 0000000000000001 [ 1007.098843] x24: 000000ea68b61cac
[ 1007.102206]
... or when dumped with the userpace dmesg tool, which has slightly
different implicit newline behaviour. e.g.
[ 1007.069010] pc : [<ffff00000871898c>] lr : [<ffff000008718948>] pstate: 40000145
[ 1007.076329] sp : ffff000008d53ec0
[ 1007.079606] x29: ffff000008d53ec0
[ 1007.082797] x28: 0000000080c50018
[ 1007.086160]
[ 1007.087630] x27: ffff000008e0c7f8
[ 1007.090820] x26: ffff80097631ca00
[ 1007.094183]
[ 1007.095653] x25: 0000000000000001
[ 1007.098843] x24: 000000ea68b61cac
[ 1007.102206]
We can't simply always use KERN_CONT for lines which may or may not be
continuations. That causes line prefixes (e.g. timestamps) to be
supressed, and the alignment of all but the first line will be broken.
For even more fun, we can't simply insert some dummy empty-string printk
calls, as GCC warns for an empty printk string, and even if we pass
KERN_DEFAULT explcitly to silence the warning, the prefix gets swallowed
unless there is an additional part to the string.
Instead, we must manually iterate over pairs of registers, which gives
us the legible output we want in either case, e.g.
[ 169.771790] pc : [<ffff00000871898c>] lr : [<ffff000008718948>] pstate: 40000145
[ 169.779109] sp : ffff000008d53ec0
[ 169.782386] x29: ffff000008d53ec0 x28: 0000000080c50018
[ 169.787650] x27: ffff000008e0c7f8 x26: ffff80097631de00
[ 169.792913] x25: 0000000000000001 x24: 00000027827b2cf4
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
gcc 7 warns:
arch/x86/kvm/ioapic.c: In function 'kvm_ioapic_reset':
arch/x86/kvm/ioapic.c:597:2: warning: 'memset' used with length equal to number of elements without multiplication by element size [-Wmemset-elt-size]
And it is right. Memset whole array using sizeof operator.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
[Added x86 subject tag]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
When CONFIG_CPU_FREQ is not set, int cpu is unused and gcc rightfully
warns about it:
arch/x86/kvm/x86.c: In function ‘kvm_timer_init’:
arch/x86/kvm/x86.c:5697:6: warning: unused variable ‘cpu’ [-Wunused-variable]
int cpu;
^~~
But since it is used only in the CONFIG_CPU_FREQ block, simply move it
there, thus squashing the warning too.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
The following commit:
c65eacbe290b ("sched/core: Allow putting thread_info into task_struct")
... made 'struct thread_info' a generic struct with only a
single ::flags member, if CONFIG_THREAD_INFO_IN_TASK_STRUCT=y is
selected.
This change however seems to be quite x86 centric, since at least the
generic preemption code (asm-generic/preempt.h) assumes that struct
thread_info also has a preempt_count member, which apparently was not
true for x86.
We could add a bit more #ifdefs to solve this problem too, but it seems
to be much simpler to make struct thread_info arch specific
again. This also makes the conversion to THREAD_INFO_IN_TASK_STRUCT a
bit easier for architectures that have a couple of arch specific stuff
in their thread_info definition.
The arch specific stuff _could_ be moved to thread_struct. However
keeping them in thread_info makes it easier: accessing thread_info
members is simple, since it is at the beginning of the task_struct,
while the thread_struct is at the end. At least on s390 the offsets
needed to access members of the thread_struct (with task_struct as
base) are too large for various asm instructions. This is not a
problem when keeping these members within thread_info.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: keescook@chromium.org
Cc: linux-arch@vger.kernel.org
Link: http://lkml.kernel.org/r/1476901693-8492-2-git-send-email-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The recent introduction of SA_X32/IA32 sa_flags added a check for
user_64bit_mode() into sigaction_compat_abi(). user_64bit_mode() is true
for native 64-bit processes and x32 processes.
Due to that the function returns w/o setting the SA_X32_ABI flag for X32
processes. In consequence the kernel attempts to deliver the signal to the
X32 process in native 64-bit mode causing the process to segfault.
Remove the check, so the actual check for X32 mode which sets the ABI flag
can be reached. There is no side effect for native 64-bit mode.
[ tglx: Rewrote changelog ]
Fixes: 6846351052e6 ("x86/signal: Add SA_{X32,IA32}_ABI sa_flags")
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Tested-by: Adam Borowski <kilobyte@angband.pl>
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: linux-mm@kvack.org
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Link: http://lkml.kernel.org/r/CAJwJo6Z8ZWPqNfT6t-i8GW1MKxQrKDUagQqnZ%2B0%2B697%3DMyVeGg@mail.gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
GNU ld used to set the ELF file type to ET_DYN for PIE executables, which
is the same file type used for shared libraries. However, this was changed
recently, and now PIE executables are emitted as ET_EXEC instead.
The distinction is only relevant for ELF loaders, and so there is little
reason to care about the difference when building the kernel, which is
why the change has gone unnoticed until now.
However, debuggers do use the ELF binary, and expect ET_EXEC type files
to appear in memory at the exact offset described in the ELF metadata.
This means source level debugging is no longer possible when KASLR is in
effect or when executing the stub.
So add the -shared LD option when building with CONFIG_RELOCATABLE=y. This
forces the ELF file type to be set to ET_DYN (which is what you get when
building with binutils 2.24 and earlier anyway), and has no other ill
effects.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
The suspend/resume path in kernel/sleep.S, as used by cpu-idle, does not
save/restore PSTATE. As a result of this cpufeatures that were detected
and have bits in PSTATE get lost when we resume from idle.
UAO gets set appropriately on the next context switch. PAN will be
re-enabled next time we return from user-space, but on a preemptible
kernel we may run work accessing user space before this point.
Add code to re-enable theses two features in __cpu_suspend_exit().
We re-use uao_thread_switch() passing current.
Signed-off-by: James Morse <james.morse@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Commit 338d4f49d6f7 ("arm64: kernel: Add support for Privileged Access
Never") enabled PAN by enabling the 'SPAN' feature-bit in SCTLR_EL1.
This means the PSTATE.PAN bit won't be set until the next return to the
kernel from userspace. On a preemptible kernel we may schedule work that
accesses userspace on a CPU before it has done this.
Now that cpufeature enable() calls are scheduled via stop_machine(), we
can set PSTATE.PAN from the cpu_enable_pan() call.
Add WARN_ON_ONCE(in_interrupt()) to check the PSTATE value we updated
is not immediately discarded.
Reported-by: Tony Thompson <anthony.thompson@arm.com>
Reported-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
[will: fixed typo in comment]
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
The enable() call for a cpufeature/errata is called using on_each_cpu().
This issues a cross-call IPI to get the work done. Implicitly, this
stashes the running PSTATE in SPSR when the CPU receives the IPI, and
restores it when we return. This means an enable() call can never modify
PSTATE.
To allow PAN to do this, change the on_each_cpu() call to use
stop_machine(). This schedules the work on each CPU which allows
us to modify PSTATE.
This involves changing the protype of all the enable() functions.
enable_cpu_capabilities() is called during boot and enables the feature
on all online CPUs. This path now uses stop_machine(). CPU features for
hotplug'd CPUs are enabled by verify_local_cpu_features() which only
acts on the local CPU, and can already modify the running PSTATE as it
is called from secondary_start_kernel().
Reported-by: Tony Thompson <anthony.thompson@arm.com>
Reported-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Commit 7dd01aef0557 ("arm64: trap userspace "dc cvau" cache operation on
errata-affected core") adds code to execute cache maintenance instructions
in the kernel on behalf of userland on CPUs with certain ARM CPU errata.
It turns out that the address hasn't been checked to be a valid user
space address, allowing userland to clean cache lines in kernel space.
Fix this by introducing an address check before executing the
instructions on behalf of userland.
Since the address doesn't come via a syscall parameter, we can't just
reject tagged pointers and instead have to remove the tag when checking
against the user address limit.
Cc: <stable@vger.kernel.org>
Fixes: 7dd01aef0557 ("arm64: trap userspace "dc cvau" cache operation on errata-affected core")
Reported-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
[will: rework commit message + replace access_ok with max_user_addr()]
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Some time ago, we brought our UV BIOS callback code up to speed with the
new EFI memory mapping scheme, in commit:
d1be84a232e3 ("x86/uv: Update uv_bios_call() to use efi_call_virt_pointer()")
By leveraging some changes that I made to a few of the EFI runtime
callback mechanisms, in commit:
80e75596079f ("efi: Convert efi_call_virt() to efi_call_virt_pointer()")
This got everything running smoothly on UV, with the new EFI mapping
code. However, this left one, small loose end, in that EFI_OLD_MEMMAP
(a.k.a. efi=old_map) will no longer work on UV, on kernels that include
the aforementioned changes.
At the time this was not a major issue (in fact, it still really isn't),
but there's no reason that EFI_OLD_MEMMAP *shouldn't* work on our
systems. This commit adds a check into uv_bios_call(), to see if we have
the EFI_OLD_MEMMAP bit set in efi.flags. If it is set, we fall back to
using our old callback method, which uses efi_call() directly on the __va()
of our function pointer.
Signed-off-by: Alex Thorlton <athorlton@sgi.com>
Acked-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: <stable@vger.kernel.org> # v4.7 and later
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Mike Travis <travis@sgi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <rja@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1476928131-170101-1-git-send-email-athorlton@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Pull arch/sh updates from Rich Felker:
"Minor changes to improve J2 support and match Kconfig expectations of
other subsystems"
* tag 'sh-for-4.9' of git://git.libc.org/linux-sh:
sh: add earlycon support to j2_defconfig
sh: add Kconfig option for J-Core SoC core drivers
sh: support CPU_J2 when compiler lacks -mj2
|
|
Merge the gup_flags cleanups from Lorenzo Stoakes:
"This patch series adjusts functions in the get_user_pages* family such
that desired FOLL_* flags are passed as an argument rather than
implied by flags.
The purpose of this change is to make the use of FOLL_FORCE explicit
so it is easier to grep for and clearer to callers that this flag is
being used. The use of FOLL_FORCE is an issue as it overrides missing
VM_READ/VM_WRITE flags for the VMA whose pages we are reading
from/writing to, which can result in surprising behaviour.
The patch series came out of the discussion around commit 38e088546522
("mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing"),
which addressed a BUG_ON() being triggered when a page was faulted in
with PROT_NONE set but having been overridden by FOLL_FORCE.
do_numa_page() was run on the assumption the page _must_ be one marked
for NUMA node migration as an actual PROT_NONE page would have been
dealt with prior to this code path, however FOLL_FORCE introduced a
situation where this assumption did not hold.
See
https://marc.info/?l=linux-mm&m=147585445805166
for the patch proposal"
Additionally, there's a fix for an ancient bug related to FOLL_FORCE and
FOLL_WRITE by me.
[ This branch was rebased recently to add a few more acked-by's and
reviewed-by's ]
* gup_flag-cleanups:
mm: replace access_process_vm() write parameter with gup_flags
mm: replace access_remote_vm() write parameter with gup_flags
mm: replace __access_remote_vm() write parameter with gup_flags
mm: replace get_user_pages_remote() write/force parameters with gup_flags
mm: replace get_user_pages() write/force parameters with gup_flags
mm: replace get_vaddr_frames() write/force parameters with gup_flags
mm: replace get_user_pages_locked() write/force parameters with gup_flags
mm: replace get_user_pages_unlocked() write/force parameters with gup_flags
mm: remove write/force parameters from __get_user_pages_unlocked()
mm: remove write/force parameters from __get_user_pages_locked()
mm: remove gup_flags FOLL_WRITE games from __get_user_pages()
|
|
AVX512_4VNNIW - Vector instructions for deep learning enhanced word
variable precision.
AVX512_4FMAPS - Vector instructions for deep learning floating-point
single precision.
These new instructions are to be used in future Intel Xeon & Xeon Phi
processors. The bits 2&3 of CPUID[level:0x07, EDX] inform that new
instructions are supported by a processor.
The spec can be found in the Intel Software Developer Manual (SDM) or in
the Instruction Set Extensions Programming Reference (ISE).
Define new feature flags to enumerate the new instructions in /proc/cpuinfo
accordingly to CPUID bits and add the required xsave extensions which are
required for proper operation.
Signed-off-by: Piotr Luc <piotr.luc@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20161018150111.29926-1-piotr.luc@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The timer_irq_works() boot check may sometimes fail in a VM, when
the Host is overcommitted or when the Guest is running nested.
Since the intended check is unnecessary on VMware's virtual
hardware, by-pass it.
Signed-off-by: Renat Valiullin <rvaliullin@vmware.com>
Acked-by: Alok N Kataria <akataria@vmware.com>
Cc: virtualization@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/20161013184539.GA11497@rvaliullin-vm
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
This removes the 'write' argument from access_process_vm() and replaces
it with 'gup_flags' as use of this function previously silently implied
FOLL_FORCE, whereas after this patch callers explicitly pass this flag.
We make this explicit as use of FOLL_FORCE can result in surprising
behaviour (and hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This removes the 'write' and 'force' from get_user_pages() and replaces
them with 'gup_flags' to make the use of FOLL_FORCE explicit in callers
as use of this flag can result in surprising behaviour (and hence bugs)
within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Writing the outer loop of an LL/SC sequence using do {...} while
constructs potentially allows the compiler to hoist memory accesses
between the STXR and the branch back to the LDXR. On CPUs that do not
guarantee forward progress of LL/SC loops when faced with memory
accesses to the same ERG (up to 2k) between the failed STXR and the
branch back, we may end up livelocking.
This patch avoids this issue in our percpu atomics by rewriting the
outer loop as part of the LL/SC inline assembly block.
Cc: <stable@vger.kernel.org>
Fixes: f97fc810798c ("arm64: percpu: Implement this_cpu operations")
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
If a CPU does not implement a global monitor for certain memory types,
then userspace can attempt a kernel DoS by issuing SWP instructions
targetting the problematic memory (for example, a framebuffer mapped
with non-cacheable attributes).
The SWP emulation code protects against these sorts of attacks by
checking for pending signals and potentially rescheduling when the STXR
instruction fails during the emulation. Whilst this is good for avoiding
livelock, it harms emulation of legitimate SWP instructions on CPUs
where forward progress is not guaranteed if there are memory accesses to
the same reservation granule (up to 2k) between the failing STXR and
the retry of the LDXR.
This patch solves the problem by retrying the STXR a bounded number of
times (4) before breaking out of the LL/SC loop and looking for
something else to do.
Cc: <stable@vger.kernel.org>
Fixes: bd35a4adc413 ("arm64: Port SWP/SWPB emulation support from arm")
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Eliminates warning messages:
<stdin>:1316:2: warning: #warning syscall pkey_mprotect not implemented [-Wcpp]
<stdin>:1319:2: warning: #warning syscall pkey_alloc not implemented [-Wcpp]
<stdin>:1322:2: warning: #warning syscall pkey_free not implemented [-Wcpp]
Hopefully we will remember to revert this commit if we ever implement
them.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
With recent update to printk, we get console output like below:
[ 0.550639] Brought up 160 CPUs
[ 0.550718] Node 0 CPUs:
[ 0.550721] 0
[ 0.550754] -39
[ 0.550794] Node 1 CPUs:
[ 0.550798] 40
[ 0.550817] -79
[ 0.550856] Node 16 CPUs:
[ 0.550860] 80
[ 0.550880] -119
[ 0.550917] Node 17 CPUs:
[ 0.550923] 120
[ 0.550942] -159
Fix this by properly using pr_cont(), ie. KERN_CONT.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
At boot we dump the NUMA memory topology in dump_numa_memory_topology(),
at KERN_DEBUG level, resulting in output like:
Node 0 Memory: 0x0-0x100000000
Node 1 Memory: 0x100000000-0x200000000
Which is nice enough, but immediately after that we iterate over each
node and call setup_node_data(), which also prints out the node ranges,
at KERN_INFO, giving eg:
numa: Initmem setup node 0 [mem 0x00000000-0xffffffff]
numa: Initmem setup node 1 [mem 0x100000000-0x1ffffffff]
Additionally dump_numa_memory_topology() does not use KERN_CONT
correctly, resulting in split output lines on recent kernels.
So drop dump_numa_memory_topology() as superfluous chatter.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
This commit broke boot on systems with an uncompressed kernel image,
namely systems using a cuImage. On such systems the compressed boot
image (boot wrapper, uncompressed kernel image, ..) is decompressed
by u-boot already, therefore the boot wrapper code sees an
uncompressed kernel image.
The old decompression code silently assumed an uncompressed kernel
image if it found no valid gzip signature, whilst the new code
bailed out in this case.
Fix this by re-introducing such a fallback if no valid compressed
image is found.
Fixes: 1b7898ee276b ("Use the pre-boot decompression API")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
If a cxl adapter faults on an invalid address for a kernel context, we
may enter copro_calculate_slb() with a NULL mm pointer (kernel
context) and an effective address which looks like a user
address. Which will cause a crash when dereferencing mm. It is clearly
an AFU bug, but there's no reason to crash either. So return an error,
so that cxl can ack the interrupt with an address error.
Fixes: 73d16a6e0e51 ("powerpc/cell: Move data segment faulting code out of cell platform")
Cc: stable@vger.kernel.org # v3.18+
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
MIPS KVM uses user memory accessors but mips.c doesn't directly include
uaccess.h, so include it now.
This wasn't too much of a problem before v4.9-rc1 as asm/module.h
included asm/uaccess.h, however since commit 29abfbd9cbba ("mips:
separate extable.h, switch module.h to it") this is no longer the case.
This resulted in build failures when trace points were disabled, as
trace/define_trace.h includes trace/trace_events.h only ifdef
TRACEPOINTS_ENABLED, which goes on to include asm/uaccess.h via a couple
of other headers.
Fixes: 29abfbd9cbba ("mips: separate extable.h, switch module.h to it")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
|
|
Signed-off-by: Rich Felker <dalias@libc.org>
|
|
Signed-off-by: Rich Felker <dalias@libc.org>
|
|
This removes the 'write' and 'force' use from get_user_pages_unlocked()
and replaces them with 'gup_flags' to make the use of FOLL_FORCE
explicit in callers as use of this flag can result in surprising
behaviour (and hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Misc fixes, plus hw-enablement changes:
- fix persistent RAM handling
- remove pkeys warning
- remove duplicate macro
- fix debug warning in irq handler
- add new 'Knights Mill' CPU related constants and enable the perf bits"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/uncore: Add Knights Mill CPUID
perf/x86/intel/rapl: Add Knights Mill CPUID
perf/x86/intel: Add Knights Mill CPUID
x86/cpu/intel: Add Knights Mill to Intel family
x86/e820: Don't merge consecutive E820_PRAM ranges
pkeys: Remove easily triggered WARN
x86: Remove duplicate rtit status MSR macro
x86/smp: Add irq_enter/exit() in smp_reschedule_interrupt()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Four tooling fixes, two kprobes KASAN related fixes and an x86 PMU
driver fix/cleanup"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf jit: Fix build issue on Ubuntu
perf jevents: Handle events including .c and .o
perf/x86/intel: Remove an inconsistent NULL check
kprobes: Unpoison stack in jprobe_return() for KASAN
kprobes: Avoid false KASAN reports during stack copy
perf header: Set nr_numa_nodes only when we parsed all the data
perf top: Fix refreshing hierarchy entries on TUI
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
"Two fixes:
- a file locks fix (missing critical section, bug introduced in this
merge window)
- an x86 down_write() stack frame annotation"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking, fs/locks: Add missing file_sem locks
locking/rwsem/x86: Add stack frame dependency for ____down_write()
|
|
Arnd reported the following objtool warning:
kernel/locking/rwsem.o: warning: objtool: down_write_killable()+0x16: call without frame pointer save/setup
The warning means gcc placed the ____down_write() inline asm (and its
call instruction) before the frame pointer setup in
down_write_killable(), which breaks frame pointer convention and can
result in incorrect stack traces.
Force the stack frame to be created before the call instruction by
listing the stack pointer as an output operand in the inline asm
statement.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1188b7015f04baf361e59de499ee2d7272c59dce.1476393828.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
pkey_set() and pkey_get() were syscalls present in older versions
of the protection keys patches. The syscall number definitions
were inadvertently left in place. This patch removes them.
I did a git grep and verified that these are the last places in
the tree that these appear, save for the protection_keys.c tests
and Documentation. Those spots talk about functions called
pkey_get/set() which are wrappers for the direct PKRU
instructions, not the syscalls.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: mgorman@techsingularity.net
Cc: arnd@arndb.de
Cc: linux-api@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: luto@kernel.org
Cc: akpm@linux-foundation.org
Fixes: f9afc6197e9bb ("x86: Wire up protection keys system calls")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit 8a71f0c656e0 ("arm64: sysreg: replace open-coded mrs_s/msr_s with
{read,write}_sysreg_s") introduced a write_sysreg_s macro for writing
to system registers that are not supported by binutils.
Unfortunately, this was implemented with the wrong template (%0 vs %x0),
so in the case that we are writing a constant 0, we will generate
invalid instruction syntax and bail with a cryptic assembler error:
| Error: constant expression required
This patch fixes the template.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|