|
CPUmasks are never big enough to warrant 64-bit code.
Space savings:
add/remove: 0/0 grow/shrink: 1/4 up/down: 3/-17 (-14)
Function old new delta
sched_init_numa 1530 1533 +3
compat_sys_sched_setaffinity 160 159 -1
sys_sched_getaffinity 197 195 -2
sys_sched_setaffinity 183 176 -7
compat_sys_sched_getaffinity 179 172 -7
Link: http://lkml.kernel.org/r/20171204165531.GA8221@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Fix grammar and add an omitted word.
Link: http://lkml.kernel.org/r/1a5a021c-0207-f793-7f07-addca26772d5@infradead.org
Fixes: f9886bc50a8e ("signal: Document the strange si_codes used by ptrace event stops")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
print_ip_sym() is mostly used for debugging, so I think it should print
the raw addresses.
Link: http://lkml.kernel.org/r/1514519382-405-1-git-send-email-chenhc@lemote.com
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: "Tobin C. Harding" <me@tobin.cc>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
We've measured that we spend ~0.6% of sys cpu time in cpumask_next_and().
It's essentially a joined iteration in search for a non-zero bit, which is
currently implemented as a lookup join (find a nonzero bit on the lhs,
lookup the rhs to see if it's set there).
Implement a direct join (find a nonzero bit on the incrementally built
join). Also add generic bitmap benchmarks in the new `test_find_bit`
module for the new function (see `find_next_and_bit` in [2] and [3] below).
For cpumask_next_and, direct benchmarking shows that it's 1.17x to 14x
faster with a geometric mean of 2.1 on 32 CPUs [1]. No impact on memory
usage. Note that on Arm, the new pure-C implementation still outperforms
the old one that uses a mix of C and asm (`find_next_bit`) [3].
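The difference between the two joins can be sketched in userspace C. This
is an illustration only, not the kernel implementation: the names
next_and_lookup() and next_and_direct() are made up for this sketch, the
lookup variant walks bit by bit for brevity, and __builtin_ctzl() assumes
GCC or Clang.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Lookup join: find a set bit on the lhs, then look it up on the rhs. */
static size_t next_and_lookup(const unsigned long *lhs,
			      const unsigned long *rhs,
			      size_t nbits, size_t start)
{
	for (size_t i = start; i < nbits; i++) {
		unsigned long mask = 1UL << (i % BITS_PER_LONG);

		if ((lhs[i / BITS_PER_LONG] & mask) &&
		    (rhs[i / BITS_PER_LONG] & mask))
			return i;
	}
	return nbits;
}

/* Direct join: AND one word at a time, then scan the joined word. */
static size_t next_and_direct(const unsigned long *lhs,
			      const unsigned long *rhs,
			      size_t nbits, size_t start)
{
	size_t word, bit;
	unsigned long tmp;

	if (start >= nbits)
		return nbits;

	word = start / BITS_PER_LONG;
	/* Ignore bits below 'start' in the first word. */
	tmp = lhs[word] & rhs[word] & (~0UL << (start % BITS_PER_LONG));

	while (!tmp) {
		word++;
		if (word * BITS_PER_LONG >= nbits)
			return nbits;
		tmp = lhs[word] & rhs[word];
	}

	bit = word * BITS_PER_LONG + (size_t)__builtin_ctzl(tmp);
	return bit < nbits ? bit : nbits;
}
```

Both functions return nbits when no common bit is found at or after
start. The direct variant touches each pair of words once, which is why an
empty rhs no longer forces a walk over every set bit of the lhs.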
[1] Approximate benchmark code:
```
unsigned long src1p[nr_cpumask_longs] = {pattern1};
unsigned long src2p[nr_cpumask_longs] = {pattern2};
for (/*a bunch of repetitions*/) {
	for (int n = -1; n <= nr_cpu_ids; ++n) {
		asm volatile("" : "+rm"(src1p)); // prevent any optimization
		asm volatile("" : "+rm"(src2p));
		unsigned long result = cpumask_next_and(n, src1p, src2p);
		asm volatile("" : "+rm"(result));
	}
}
```
Results:
pattern1    pattern2    time_before/time_after
0x0000ffff  0x0000ffff  1.65
0x0000ffff  0x00005555  2.24
0x0000ffff  0x00001111  2.94
0x0000ffff  0x00000000  14.0
0x00005555  0x0000ffff  1.67
0x00005555  0x00005555  1.71
0x00005555  0x00001111  1.90
0x00005555  0x00000000  6.58
0x00001111  0x0000ffff  1.46
0x00001111  0x00005555  1.49
0x00001111  0x00001111  1.45
0x00001111  0x00000000  3.10
0x00000000  0x0000ffff  1.18
0x00000000  0x00005555  1.18
0x00000000  0x00001111  1.17
0x00000000  0x00000000  1.25
-----------------------------
geo.mean                2.06
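The geometric mean can be cross-checked from the table: it is the 16th
root of the product of the 16 ratios. A small sketch (the helper names are
made up here) that brackets the mean using only multiplication, so it does
not need libm:

```c
/* time_before/time_after ratios copied from the table above. */
static const double speedups[16] = {
	1.65, 2.24, 2.94, 14.0,
	1.67, 1.71, 1.90, 6.58,
	1.46, 1.49, 1.45, 3.10,
	1.18, 1.18, 1.17, 1.25,
};

/* Product of all ratios; the geometric mean is its 16th root. */
static double speedup_product(void)
{
	double product = 1.0;

	for (int i = 0; i < 16; i++)
		product *= speedups[i];
	return product;
}

/* x^16 by four squarings, used to turn a bound on the mean into a
 * bound on the product. */
static double pow16(double x)
{
	x *= x;
	x *= x;
	x *= x;
	x *= x;
	return x;
}
```

speedup_product() falls between pow16(2.05) and pow16(2.07), which is
consistent with the reported 2.06.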
[2] test_find_next_bit, X86 (skylake)
[ 3913.477422] Start testing find_bit() with random-filled bitmap
[ 3913.477847] find_next_bit: 160868 cycles, 16484 iterations
[ 3913.477933] find_next_zero_bit: 169542 cycles, 16285 iterations
[ 3913.478036] find_last_bit: 201638 cycles, 16483 iterations
[ 3913.480214] find_first_bit: 4353244 cycles, 16484 iterations
[ 3913.480216] Start testing find_next_and_bit() with random-filled bitmap
[ 3913.481074] find_next_and_bit: 89604 cycles, 8216 iterations
[ 3913.481075] Start testing find_bit() with sparse bitmap
[ 3913.481078] find_next_bit: 2536 cycles, 66 iterations
[ 3913.481252] find_next_zero_bit: 344404 cycles, 32703 iterations
[ 3913.481255] find_last_bit: 2006 cycles, 66 iterations
[ 3913.481265] find_first_bit: 17488 cycles, 66 iterations
[ 3913.481266] Start testing find_next_and_bit() with sparse bitmap
[ 3913.481272] find_next_and_bit: 764 cycles, 1 iterations
[3] test_find_next_bit, arm (v7 odroid XU3).
[ 267.206928] Start testing find_bit() with random-filled bitmap
[ 267.214752] find_next_bit: 4474 cycles, 16419 iterations
[ 267.221850] find_next_zero_bit: 5976 cycles, 16350 iterations
[ 267.229294] find_last_bit: 4209 cycles, 16419 iterations
[ 267.279131] find_first_bit: 1032991 cycles, 16420 iterations
[ 267.286265] Start testing find_next_and_bit() with random-filled bitmap
[ 267.302386] find_next_and_bit: 2290 cycles, 8140 iterations
[ 267.309422] Start testing find_bit() with sparse bitmap
[ 267.316054] find_next_bit: 191 cycles, 66 iterations
[ 267.322726] find_next_zero_bit: 8758 cycles, 32703 iterations
[ 267.329803] find_last_bit: 84 cycles, 66 iterations
[ 267.336169] find_first_bit: 4118 cycles, 66 iterations
[ 267.342627] Start testing find_next_and_bit() with sparse bitmap
[ 267.356919] find_next_and_bit: 91 cycles, 1 iterations
[courbet@google.com: v6]
Link: http://lkml.kernel.org/r/20171129095715.23430-1-courbet@google.com
[geert@linux-m68k.org: m68k/bitops: always include <asm-generic/bitops/find.h>]
Link: http://lkml.kernel.org/r/1512556816-28627-1-git-send-email-geert@linux-m68k.org
Link: http://lkml.kernel.org/r/20171128131334.23491-1-courbet@google.com
Signed-off-by: Clement Courbet <courbet@google.com>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Yury Norov <ynorov@caviumnetworks.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The behaviour of bitmap_fill() differs from bitmap_zero() in how the bits
beyond the end of the bitmap are handled. bitmap_zero() clears the entire
bitmap up to the unsigned long boundary, while bitmap_fill() mimics
bitmap_set().
Here we change bitmap_fill()'s behaviour to be consistent with bitmap_zero()
and add a note to the documentation.
The change might reveal bugs in code where unused bits are handled
differently; in such cases bitmap_set() has to be used.
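A userspace sketch of the difference, assuming nbits > 0. The names
sketch_fill_old() and sketch_fill_new() are made up for illustration and
only approximate the before and after behaviour described above:

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define BITS_PER_LONG	 (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Old behaviour: like bitmap_set(), bits past nbits in the last word
 * are left clear. */
static void sketch_fill_old(unsigned long *dst, unsigned int nbits)
{
	size_t nlongs = BITS_TO_LONGS(nbits);

	memset(dst, 0xff, (nlongs - 1) * sizeof(unsigned long));
	/* Mask with only the low (nbits % BITS_PER_LONG) bits set. */
	dst[nlongs - 1] = ~0UL >> (-nbits & (BITS_PER_LONG - 1));
}

/* New behaviour: like bitmap_zero(), every long that holds the bitmap
 * is written whole. */
static void sketch_fill_new(unsigned long *dst, unsigned int nbits)
{
	memset(dst, 0xff, BITS_TO_LONGS(nbits) * sizeof(unsigned long));
}
```

For nbits = 5 the old variant leaves 0x1f in the word, while the new one
sets every bit of it. Code that compares or counts whole words would see
that difference.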
Link: http://lkml.kernel.org/r/20180109172430.87452-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Replace bitmap_{from,to}_u32array with bitmap_{from,to}_arr32 over the
kernel. In addition:
* __check_eq_bitmap() now takes a single nbits argument.
* __check_eq_u32_array is not used in the new test but may be used in
the future, so I don't remove it here but annotate it as __used.
Tested on arm64 and 32-bit BE mips.
[arnd@arndb.de: perf: arm_dsu_pmu: convert to bitmap_from_arr32]
Link: http://lkml.kernel.org/r/20180201172508.5739-2-ynorov@caviumnetworks.com
[ynorov@caviumnetworks.com: fix net/core/ethtool.c]
Link: http://lkml.kernel.org/r/20180205071747.4ekxtsbgxkj5b2fz@yury-thinkpad
Link: http://lkml.kernel.org/r/20171228150019.27953-2-ynorov@caviumnetworks.com
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: David Decotigny <decot@googlers.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This patchset replaces bitmap_{to,from}_u32array with simpler, more
standard-looking copy-like functions.
bitmap_from_u32array() takes 4 arguments (bitmap_to_u32array is similar):
- unsigned long *bitmap, which is destination;
- unsigned int nbits, the length of destination bitmap, in bits;
- const u32 *buf, the source; and
- unsigned int nwords, the length of source buffer in ints.
The function's description details it like this:
* copy min(nbits, 32*nwords) bits from @buf to @bitmap, remaining
* bits between nword and nbits in @bitmap (if any) are cleared.
Having two size arguments looks unneeded and potentially dangerous.
It is unneeded because normally the user of a copy-like function should
take care of the size of the destination and make it big enough to fit the
source data.
And it is dangerous because the function may hide a possible error if the
user doesn't provide a big enough bitmap, and data is silently dropped.
That's why all copy-like functions have 1 argument for the size of the
data to copy, and I don't see any reason to make bitmap_from_u32array()
different.
One exception that comes to mind is strncpy(), which also takes the size
of the destination as an argument, but that is strongly justified by the
possibility of receiving broken strings as the source. This is not the
case for bitmap_{from,to}_u32array().
There are not many real users of bitmap_{from,to}_u32array(), and they all
very clearly provide a destination size matched to the size of the
source, so the additional functionality is in fact not used. Like this:
bitmap_from_u32array(to->link_modes.supported,
__ETHTOOL_LINK_MODE_MASK_NBITS,
link_usettings.link_modes.supported,
__ETHTOOL_LINK_MODE_MASK_NU32);
Where:
#define __ETHTOOL_LINK_MODE_MASK_NU32 \
DIV_ROUND_UP(__ETHTOOL_LINK_MODE_MASK_NBITS, 32)
In this patch, bitmap_copy_safe and bitmap_{from,to}_arr32 are introduced.
'Safe' in bitmap_copy_safe() stands for clearing the unused bits in the
bitmap beyond the last bit, up to the end of the last word. It is useful
for hardening the API when the bitmap is assumed to be exposed to userspace.
The bitmap_{from,to}_arr32 functions are replacements for
bitmap_{from,to}_u32array. They don't take the unneeded nwords argument,
and so are simpler to implement and understand.
This patch suggests an optimization for 32-bit systems: aliasing
bitmap_{from,to}_arr32 to bitmap_copy_safe.
Another possible optimization is aliasing 64-bit LE bitmap_{from,to}_arr32
to more generic function(s). But I didn't end up with a function that would
be helpful by itself and could be used to alias 64-bit LE
bitmap_{from,to}_arr32, like bitmap_copy_safe() does. So I preferred to
leave things as is.
The following patch switches the kernel to the new API and introduces a
test for it.
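As a rough illustration of the single-size-argument idea, here is a
userspace sketch of copying a u32 array into a bitmap and clearing the
tail. It is not the kernel code: sketch_from_arr32() is a made-up name,
and the sketch assumes 64-bit bitmap words (uint64_t) regardless of the
host.

```c
#include <stdint.h>

/* Assumption for this sketch: bitmap words are 64 bits wide. */
static void sketch_from_arr32(uint64_t *bitmap, const uint32_t *buf,
			      unsigned int nbits)
{
	unsigned int halfwords = (nbits + 31) / 32;
	unsigned int i;

	for (i = 0; i < halfwords; i++) {
		if (i % 2 == 0)
			bitmap[i / 2] = buf[i];			 /* low half */
		else
			bitmap[i / 2] |= (uint64_t)buf[i] << 32; /* high half */
	}

	/* Clear the tail: bits past nbits in the last word. */
	if (nbits % 64)
		bitmap[(halfwords - 1) / 2] &= ~UINT64_C(0) >> (-nbits & 63);
}
```

The only size passed in is nbits. The caller is responsible for making the
destination large enough, as with other copy-like functions.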
Discussion is here: https://lkml.org/lkml/2017/11/15/592
[ynorov@caviumnetworks.com: rename bitmap_copy_safe to bitmap_copy_clear_tail]
Link: http://lkml.kernel.org/r/20180201172508.5739-3-ynorov@caviumnetworks.com
Link: http://lkml.kernel.org/r/20171228150019.27953-1-ynorov@caviumnetworks.com
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: David Decotigny <decot@googlers.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Exported header doesn't use anything from <linux/string.h>,
it is <linux/uuid.h> which uses memcmp().
Link: http://lkml.kernel.org/r/20171225171121.GA22754@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Right now the fact that KASAN uses a single shadow byte for 8 bytes of
memory is scattered all over the code.
This change defines KASAN_SHADOW_SCALE_SHIFT early in asm include files
and makes use of this constant where necessary.
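For illustration, the address-to-shadow mapping that this constant
describes can be sketched as below. sketch_mem_to_shadow() is a made-up
name, and the shadow offset is passed in as a placeholder because the real
offset is architecture-specific.

```c
#include <stdint.h>

#define KASAN_SHADOW_SCALE_SHIFT 3	/* 1 shadow byte covers 8 bytes */

/* shadow_offset is a placeholder; real offsets are per-architecture. */
static uint64_t sketch_mem_to_shadow(uint64_t addr, uint64_t shadow_offset)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + shadow_offset;
}
```

With a shift of 3, addresses 0x1000 through 0x1007 share one shadow byte
and 0x1008 starts the next one.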
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/34937ca3b90736eaad91b568edf5684091f662e3.1515775666.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Detect frees of pointers into the middle of mempool objects.
I did a one-off test, but it turned out to be very tricky, so I reverted
it. First, mempool does not call kasan_poison_kfree() unless the allocation
function fails. I stubbed an allocation function to fail on the second and
subsequent allocations. But then mempool stopped calling
kasan_poison_kfree() at all, because it does so only when the allocation
function is mempool_kmalloc(). We could support this special failing
test allocation function in mempool, but it also can't live with the kasan
tests, because these are in a module.
Link: http://lkml.kernel.org/r/bf7a7d035d7a5ed62d2dd0e3d2e8a4fcdf456aa7.1514378558.git.dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
__builtin_return_address(1) is unreliable without frame pointers.
With defconfig on kmalloc_pagealloc_invalid_free test I am getting:
BUG: KASAN: double-free or invalid-free in (null)
Pass caller PC from callers explicitly.
Link: http://lkml.kernel.org/r/9b01bc2d237a4df74ff8472a3bf6b7635908de01.1514378558.git.dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Patch series "kasan: detect invalid frees".
KASAN detects double-frees, but does not detect invalid-frees (when a
pointer into the middle of a heap object is passed to free). We recently had
a very unpleasant case in crypto code which freed an inner object inside
of a heap allocation. This went unnoticed during free, but totally
corrupted the heap and later led to a bunch of random crashes all over
kernel code.
Detect invalid frees.
This patch (of 5):
Detect frees of pointers into the middle of large heap objects.
I dropped const from kasan_kfree_large() because it starts propagating
through a bunch of functions in kasan_report.c, slab/slub nearest_obj(),
all of their local variables, fixup_red_left(), etc.
Link: http://lkml.kernel.org/r/1b45b4fe1d20fc0de1329aab674c1dd973fee723.1514378558.git.dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Patch series "kasan: support alloca, LLVM", v4.
This patch (of 5):
For now we can hard-code ASAN ABI level 5, since historical clang builds
can't build the kernel anyway. We also need to emulate gcc's
__SANITIZE_ADDRESS__ flag, or memset() calls won't be instrumented.
Link: http://lkml.kernel.org/r/20171204191735.132544-2-paullawrence@google.com
Signed-off-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Every flow_offload entry is added into the table twice. Because of this,
rhashtable_free_and_destroy can't be used, since it would call kfree for
each flow_offload object twice.
This patch cleans up the flowtable via nf_flow_table_iterate() to
schedule removal of entries by setting the dying bit, followed by an
explicit invocation of the garbage collector to release resources.
Based on patch from Felix Fietkau.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Move the flowtable cleanup routines to nf_flow_table and expose the
nf_flow_table_cleanup() helper function.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
git://git.infradead.org/linux-platform-drivers-x86
Pull x86 platform-driver updates from Darren Hart:
"New model support added for Dell, Ideapad, Acer, Asus, Thinkpad, and
GPD laptops. Improvements to the common intel-vbtn driver, including
tablet mode, rotate, and front button support. Intel CPU support added
for Cannonlake and platform support for Dollar Cove power button.
Overhaul of the mellanox platform driver, creating a new
platform/mellanox directory for the newly multi-architecture regmap
interface.
Significant Intel PMC update with CannonLake support, Coffeelake
update, CPUID enumeration, module support, new read64 API, refactoring
and cleanups.
Revert the apple-gmux iGP IO lock, addressing reported issues with
non-binary drivers, leaving Nvidia binary driver users to comment out
conflicting code.
Miscellaneous fixes and cleanups"
* tag 'platform-drivers-x86-v4.16-1' of git://git.infradead.org/linux-platform-drivers-x86: (81 commits)
platform/x86: mlx-platform: Fix an ERR_PTR vs NULL issue
platform/x86: intel_pmc_core: Special case for Coffeelake
platform/x86: intel_pmc_core: Add CannonLake PCH support
x86/cpu: Add Cannonlake to Intel family
platform/x86: intel_pmc_core: Read base address from LPIT
ACPI / LPIT: Export lpit_read_residency_count_address()
platform/x86: intel-vbtn: Replace License by SDPX identifier
platform/x86: intel-vbtn: Remove redundant inclusions
platform/x86: intel-vbtn: Support tablet mode switch
platform/x86: dell-laptop: Allocate buffer on heap rather than globally
platform/x86: intel_pmc_core: Remove unused header file
platform/x86: mlx-platform: Add hotplug device unregister to error path
platform/x86: mlx-platform: fix module aliases
platform/mellanox: mlxreg-hotplug: Add check for negative adapter number
platform/x86: mlx-platform: Add IO access verification callbacks
platform/x86: mlx-platform: Document pdev_hotplug field
platform/x86: mlx-platform: Allow compilation for 32 bit arch
platform/mellanox: mlxreg-hotplug: Enable building for ARM
platform/mellanox: mlxreg-hotplug: Modify to use a regmap interface
platform/mellanox: Group create/destroy with attribute functions
...
|
|
One of the major improvements of SMCCC v1.1 is that it only clobbers
the first 4 registers, both on 32 and 64bit. This means that it
becomes very easy to provide an inline version of the SMC call
primitive, and avoid performing a function call to stash the
registers that would otherwise be clobbered by SMCCC v1.0.
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Function identifiers are a 32bit, unsigned quantity. But we never
tell the compiler so, resulting in the following:
4ac: b26187e0 mov x0, #0xffffffff80000001
We thus rely on the firmware narrowing it for us, which is not
always a reasonable expectation.
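The effect is ordinary C integer widening and can be reproduced in
userspace. In this sketch (function names are made up), an ID with bit 31
set is widened to 64 bits once through a signed type and once through an
unsigned type:

```c
#include <stdint.h>

/* Signed 32-bit ID with bit 31 set: widening sign-extends. */
static uint64_t widen_signed_id(void)
{
	int32_t id = INT32_MIN + 1;	/* bit pattern 0x80000001 */

	return (uint64_t)id;
}

/* Unsigned 32-bit ID: widening zero-extends. */
static uint64_t widen_unsigned_id(void)
{
	uint32_t id = UINT32_C(0x80000001);

	return (uint64_t)id;
}
```

The signed variant yields 0xffffffff80000001, matching the constant in the
disassembly above. The unsigned variant yields 0x0000000080000001.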
Cc: stable@vger.kernel.org
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Since PSCI 1.0 allows the SMCCC version to be (indirectly) probed,
let's do that at boot time, and expose the version of the calling
convention as part of the psci_ops structure.
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
In order to call into the firmware to apply workarounds, it is
useful to find out whether we're using HVC or SMC. Let's expose
this through the psci_ops.
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
A new feature of SMCCC 1.1 is that it offers firmware-based CPU
workarounds. In particular, SMCCC_ARCH_WORKAROUND_1 provides
BP hardening for CVE-2017-5715.
If the host has some mitigation for this issue, report that
we deal with it using SMCCC_ARCH_WORKAROUND_1, as we apply the
host workaround on every guest exit.
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
We're about to need kvm_psci_version in HYP too. So let's turn it
into a static inline, and pass the kvm structure as a second
parameter (so that HYP can do a kern_hyp_va on it).
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The new SMC Calling Convention (v1.1) allows for a reduced overhead
when calling into the firmware, and provides a new feature discovery
mechanism.
Make it visible to KVM guests.
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
PSCI 1.0 can be trivially implemented by providing the FEATURES
call on top of PSCI 0.2 and returning 1.0 as the PSCI version.
We happily ignore everything else, as they are either optional or
are clarifications that do not require any additional change.
PSCI 1.0 is now the default until we decide to add a userspace
selection API.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
As we're about to trigger a PSCI version explosion, it doesn't
hurt to introduce a PSCI_VERSION helper that is going to be
used everywhere.
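A PSCI version value carries the major number in its upper 16 bits and the
minor number in its lower 16 bits. A helper along these lines could look
like the sketch below. The sketch_ names are made up for illustration and
are not the kernel's macros.

```c
#include <stdint.h>

#define SKETCH_PSCI_MAJOR_SHIFT	16
#define SKETCH_PSCI_MINOR_MASK	((UINT32_C(1) << SKETCH_PSCI_MAJOR_SHIFT) - 1)

/* Pack major/minor into one 32-bit version value. */
static uint32_t sketch_psci_version(uint32_t major, uint32_t minor)
{
	return (major << SKETCH_PSCI_MAJOR_SHIFT) |
	       (minor & SKETCH_PSCI_MINOR_MASK);
}

static uint32_t sketch_psci_major(uint32_t version)
{
	return version >> SKETCH_PSCI_MAJOR_SHIFT;
}

static uint32_t sketch_psci_minor(uint32_t version)
{
	return version & SKETCH_PSCI_MINOR_MASK;
}
```

With this encoding, version 1.0 is 0x10000 and version 0.2 is 0x2, so
versions can be compared as plain integers.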
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
As we're about to update the PSCI support, and because I'm lazy,
let's move the PSCI include file to include/kvm so that both
ARM architectures can find it.
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Move the idr kernel-doc to its own idr.rst file and add a few
paragraphs about how to use it. Also add some more kernel-doc.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
|
|
About 20% of the IDR users in the kernel want the allocated IDs to start
at 1. The implementation currently searches all the way down the left
hand side of the tree, finds no free ID other than ID 0, walks all the
way back up, and then all the way down again. This patch 'rebases' the
ID so we fill the entire radix tree, rather than leave a gap at 0.
Chris Wilson says: "I did the quick hack of allocating index 0 of the
idr and that eradicated idr_get_free() from being at the top of the
profiles for the many-object stress tests. This improvement will be
much appreciated."
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
|
|
In most places in the kernel where we need to distinguish functions by the
type of their arguments, we use '_ul' as a suffix for the unsigned long
variant, not '_ext'. Also add kernel-doc.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
|
|
It has no more users, so remove it. Move idr_alloc() back into idr.c,
move the guts of idr_alloc_cmn() into idr_alloc_u32(), remove the
wrappers around idr_get_free_cmn() and rename it to idr_get_free().
While there is now no interface to allocate IDs larger than a u32,
the IDR internals remain ready to handle a larger ID should a need arise.
These changes make it possible to provide the guarantee that, if the
nextid pointer points into the object, the object's ID will be initialised
before a concurrent lookup can find the object.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
|
|
All current users of idr_alloc_ext() actually want to allocate a u32
and idr_alloc_u32() fits their needs better.
Like idr_get_next(), it uses a 'nextid' argument which serves as both
a pointer to the start ID and the assigned ID (instead of a separate
minimum and pointer-to-assigned-ID argument). It uses a 'max' argument
rather than 'end' because the semantics that idr_alloc has for 'end'
don't work well for unsigned types.
Since idr_alloc_u32() returns an errno instead of the allocated ID, mark
it as __must_check to help callers use it correctly. Include copious
kernel-doc. Chris Mi <chrism@mellanox.com> has promised to contribute
test-cases for idr_alloc_u32.
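The calling convention described above (nextid as both input and output,
an inclusive max, and an errno return) can be illustrated with a toy
allocator. This is not the IDR: toy_idr and toy_alloc_u32() are made-up
names, and the toy only tracks IDs 0 to 63 in a single word.

```c
#include <errno.h>
#include <stdint.h>

#define TOY_IDS 64u

struct toy_idr {
	uint64_t used;	/* bit n set means ID n is allocated */
};

/*
 * On entry *nextid is the first ID to try; on success it holds the ID
 * that was assigned. 'max' is inclusive. Returns 0 or a negative errno.
 */
static int toy_alloc_u32(struct toy_idr *idr, uint32_t *nextid, uint32_t max)
{
	uint32_t last = max < TOY_IDS - 1 ? max : TOY_IDS - 1;

	/* 64-bit counter so that id++ cannot wrap past 'last'. */
	for (uint64_t id = *nextid; id <= last; id++) {
		if (!(idr->used & (UINT64_C(1) << id))) {
			idr->used |= UINT64_C(1) << id;
			*nextid = (uint32_t)id;
			return 0;
		}
	}
	return -ENOSPC;
}
```

Because the return value is only a status, a caller that ignores it cannot
tell whether *nextid was updated, which is the kind of mistake
__must_check is meant to catch.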
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
|
|
Simply changing idr_remove's 'id' argument to 'unsigned long' works
for all callers.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
|
|
Changing idr_replace's 'id' argument to 'unsigned long' works for all
callers. Callers which passed a negative ID now get -ENOENT instead of
-EINVAL. No callers relied on this error value.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
|
|
Simply changing idr_remove's 'id' argument to 'unsigned long' suffices
for all callers.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
|
|
Conflicts:
arch/arm64/kernel/entry.S
arch/x86/Kconfig
include/linux/sched/mm.h
kernel/fork.c
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:
- videobuf2 was moved to a media/common dir, as it is now used by the
DVB subsystem too
- Digital TV core memory mapped support interface
- new sensor driver: ov7740
- several improvements to the ddbridge driver
- new V4L2 driver: IPU3 CIO2 CSI-2 receiver unit, found on some Intel
SoCs
- new tuner driver: tda18250
- finally got rid of all LIRC staging drivers
- as we don't have old lirc drivers anymore, restructure the lirc device
code
- add support for UVC metadata
- add a new staging driver for NVIDIA Tegra Video Decoder Engine
- DVB kAPI headers moved to include/media
- synchronize the kAPI and uAPI for the DVB subsystem, removing the gap
for non-legacy APIs
- reduce the kAPI gap for V4L2
- lots of other driver enhancements, cleanups, etc.
* tag 'media/v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (407 commits)
media: v4l2-compat-ioctl32.c: make ctrl_is_pointer work for subdevs
media: v4l2-compat-ioctl32.c: refactor compat ioctl32 logic
media: v4l2-compat-ioctl32.c: don't copy back the result for certain errors
media: v4l2-compat-ioctl32.c: drop pr_info for unknown buffer type
media: v4l2-compat-ioctl32.c: copy clip list in put_v4l2_window32
media: v4l2-compat-ioctl32.c: fix ctrl_is_pointer
media: v4l2-compat-ioctl32.c: copy m.userptr in put_v4l2_plane32
media: v4l2-compat-ioctl32.c: avoid sizeof(type)
media: v4l2-compat-ioctl32.c: move 'helper' functions to __get/put_v4l2_format32
media: v4l2-compat-ioctl32.c: fix the indentation
media: v4l2-compat-ioctl32.c: add missing VIDIOC_PREPARE_BUF
media: v4l2-ioctl.c: don't copy back the result for -ENOTTY
media: v4l2-ioctl.c: use check_fmt for enum/g/s/try_fmt
media: vivid: fix module load error when enabling fb and no_error_inj=1
media: dvb_demux: improve debug messages
media: dvb_demux: Better handle discontinuity errors
media: cxusb, dib0700: ignore XC2028_I2C_FLUSH
media: ts2020: avoid integer overflows on 32 bit machines
media: i2c: ov7740: use gpio/consumer.h instead of gpio.h
media: entity: Add a nop variant of media_entity_cleanup
...
|
|
Pull more rdma updates from Doug Ledford:
"Items of note:
- two patches fix a regression in the 4.15 kernel. The 4.14 kernel
worked fine with NVMe over Fabrics and mlx5 adapters. That broke in
4.15. The fix is here.
- one of the patches (the endian notation patch from Lijun) looks
like a lot of lines of change, but it's mostly mechanical in
nature. It amounts to the biggest chunk of change in it (it's about
2/3rds of the overall pull request).
Summary:
- Clean up some function signatures in rxe for clarity
- Tidy the RDMA netlink header to remove unimplemented constants
- bnxt_re driver fixes, one is a regression this window.
- Minor hns driver fixes
- Various fixes from Dan Carpenter and his tool
- Fix IRQ cleanup race in HFI1
- HFI1 performance optimizations and a fix to report counters in the right units
- Fix for an IPoIB startup sequence race with the external manager
- Oops fix for the new kabi path
- Endian cleanups for hns
- Fix for mlx5 related to the new automatic affinity support"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (38 commits)
net/mlx5: increase async EQ to avoid EQ overrun
mlx5: fix mlx5_get_vector_affinity to start from completion vector 0
RDMA/hns: Fix the endian problem for hns
IB/uverbs: Use the standard kConfig format for experimental
IB: Update references to libibverbs
IB/hfi1: Add 16B rcvhdr trace support
IB/hfi1: Convert kzalloc_node and kcalloc to use kcalloc_node
IB/core: Avoid a potential OOPs for an unused optional parameter
IB/core: Map iWarp AH type to undefined in rdma_ah_find_type
IB/ipoib: Fix for potential no-carrier state
IB/hfi1: Show fault stats in both TX and RX directions
IB/hfi1: Remove blind constants from 16B update
IB/hfi1: Convert PortXmitWait/PortVLXmitWait counters to flit times
IB/hfi1: Do not override given pcie_pset value
IB/hfi1: Optimize process_receive_ib()
IB/hfi1: Remove unnecessary fecn and becn fields
IB/hfi1: Look up ibport using a pointer in receive path
IB/hfi1: Optimize packet type comparison using 9B and bypass code paths
IB/hfi1: Compute BTH only for RDMA_WRITE_LAST/SEND_LAST packet
IB/hfi1: Remove dependence on qp->s_hdrwords
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Ross Zwisler:
- Require struct page by default for filesystem DAX to remove a number
of surprising failure cases. This includes failures with direct I/O,
gdb and fork(2).
- Add support for the new Platform Capabilities Structure added to the
NFIT in ACPI 6.2a. This new table tells us whether the platform
supports flushing of CPU and memory controller caches on unexpected
power loss events.
- Revamp vmem_altmap and dev_pagemap handling to clean up code and
better support future PCI P2P uses.
- Deprecate the ND_IOCTL_SMART_THRESHOLD command whose payload has
become out-of-sync with recent versions of the NVDIMM_FAMILY_INTEL
spec, and instead rely on the generic ND_CMD_CALL approach used by
the two other IOCTL families, NVDIMM_FAMILY_{HPE,MSFT}.
- Enhance nfit_test so we can test some of the new things added in
version 1.6 of the DSM specification. This includes testing firmware
download and simulating the Last Shutdown State (LSS) status.
* tag 'libnvdimm-for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (37 commits)
libnvdimm, namespace: remove redundant initialization of 'nd_mapping'
acpi, nfit: fix register dimm error handling
libnvdimm, namespace: make min namespace size 4K
tools/testing/nvdimm: force nfit_test to depend on instrumented modules
libnvdimm/nfit_test: adding support for unit testing enable LSS status
libnvdimm/nfit_test: add firmware download emulation
nfit-test: Add platform cap support from ACPI 6.2a to test
libnvdimm: expose platform persistence attribute for nd_region
acpi: nfit: add persistent memory control flag for nd_region
acpi: nfit: Add support for detect platform CPU cache flush on power loss
device-dax: Fix trailing semicolon
libnvdimm, btt: fix uninitialized err_lock
dax: require 'struct page' by default for filesystem dax
ext2: auto disable dax instead of failing mount
ext4: auto disable dax instead of failing mount
mm, dax: introduce pfn_t_special()
mm: Fix devm_memremap_pages() collision handling
mm: Fix memory size alignment in devm_memremap_pages_release()
memremap: merge find_dev_pagemap into get_dev_pagemap
memremap: change devm_memremap_pages interface to use struct dev_pagemap
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
- skip AER driver error recovery callbacks for correctable errors
reported via ACPI APEI, as we already do for errors reported via the
native path (Tyler Baicar)
- fix DPC shared interrupt handling (Alex Williamson)
- print full DPC interrupt number (Keith Busch)
- enable DPC only if AER is available (Keith Busch)
- simplify DPC code (Bjorn Helgaas)
- calculate ASPM L1 substate parameter instead of hardcoding it (Bjorn
Helgaas)
- enable Latency Tolerance Reporting for ASPM L1 substates (Bjorn
Helgaas)
- move ASPM internal interfaces out of public header (Bjorn Helgaas)
- allow hot-removal of VGA devices (Mika Westerberg)
- speed up unplug and shutdown by assuming Thunderbolt controllers
don't support Command Completed events (Lukas Wunner)
- add AtomicOps support for GPU and Infiniband drivers (Felix Kuehling,
Jay Cornwall)
- expose "ari_enabled" in sysfs to help NIC naming (Stuart Hayes)
- clean up PCI DMA interface usage (Christoph Hellwig)
- remove PCI pool API (replaced with DMA pool) (Romain Perier)
- deprecate pci_get_bus_and_slot(), which assumed PCI domain 0 (Sinan
Kaya)
- move DT PCI code from drivers/of/ to drivers/pci/ (Rob Herring)
- add PCI-specific wrappers for dev_info(), etc (Frederick Lawler)
- remove warnings on sysfs mmap failure (Bjorn Helgaas)
- quiet ROM validation messages (Alex Deucher)
- remove redundant memory alloc failure messages (Markus Elfring)
- fill in types for compile-time VGA and other I/O port resources
(Bjorn Helgaas)
- make "pci=pcie_scan_all" work for Root Ports as well as Downstream
Ports to help AmigaOne X1000 (Bjorn Helgaas)
- add SPDX tags to all PCI files (Bjorn Helgaas)
- quirk Marvell 9128 DMA aliases (Alex Williamson)
- quirk broken INTx disable on Ceton InfiniTV4 (Bjorn Helgaas)
- fix CONFIG_PCI=n build by adding dummy pci_irqd_intx_xlate() (Niklas
Cassel)
- use DMA API to get MSI address for DesignWare IP (Niklas Cassel)
- fix endpoint-mode DMA mask configuration (Kishon Vijay Abraham I)
- fix ARTPEC-6 incorrect IS_ERR() usage (Wei Yongjun)
- add support for ARTPEC-7 SoC (Niklas Cassel)
- add endpoint-mode support for ARTPEC (Niklas Cassel)
- add Cadence PCIe host and endpoint controller driver (Cyrille
Pitchen)
- handle multiple INTx status bits being set in dra7xx (Vignesh R)
- translate dra7xx hwirq range to fix INTD handling (Vignesh R)
- remove deprecated Exynos PHY initialization code (Jaehoon Chung)
- fix MSI erratum workaround for HiSilicon Hip06/Hip07 (Dongdong Liu)
- fix NULL pointer dereference in iProc BCMA driver (Ray Jui)
- fix Keystone interrupt-controller-node lookup (Johan Hovold)
- constify qcom driver structures (Julia Lawall)
- rework Tegra config space mapping to increase space available for
endpoints (Vidya Sagar)
- simplify Tegra driver by using bus->sysdata (Manikanta Maddireddy)
- remove PCI_REASSIGN_ALL_BUS usage on Tegra (Manikanta Maddireddy)
- add support for Global Fabric Manager Server (GFMS) event to
Microsemi Switchtec switch driver (Logan Gunthorpe)
- add IDs for Switchtec PSX 24xG3 and PSX 48xG3 (Kelvin Cao)
* tag 'pci-v4.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (140 commits)
PCI: cadence: Add EndPoint Controller driver for Cadence PCIe controller
dt-bindings: PCI: cadence: Add DT bindings for Cadence PCIe endpoint controller
PCI: endpoint: Fix EPF device name to support multi-function devices
PCI: endpoint: Add the function number as argument to EPC ops
PCI: cadence: Add host driver for Cadence PCIe controller
dt-bindings: PCI: cadence: Add DT bindings for Cadence PCIe host controller
PCI: Add vendor ID for Cadence
PCI: Add generic function to probe PCI host controllers
PCI: generic: fix missing call of pci_free_resource_list()
PCI: OF: Add generic function to parse and allocate PCI resources
PCI: Regroup all PCI related entries into drivers/pci/Makefile
PCI/DPC: Reformat DPC register definitions
PCI/DPC: Add and use DPC Status register field definitions
PCI/DPC: Squash dpc_rp_pio_get_info() into dpc_process_rp_pio_error()
PCI/DPC: Remove unnecessary RP PIO register structs
PCI/DPC: Push dpc->rp_pio_status assignment into dpc_rp_pio_get_info()
PCI/DPC: Squash dpc_rp_pio_print_error() into dpc_rp_pio_get_info()
PCI/DPC: Make RP PIO log size check more generic
PCI/DPC: Rename local "status" to "dpc_status"
PCI/DPC: Squash dpc_rp_pio_print_tlp_header() into dpc_rp_pio_print_error()
...
|
|
Commit d350a823020e ("net: erspan: create erspan metadata uapi header")
moves the erspan 'version' in front of the 'struct erspan_md2' for
later extensibility. This breaks the existing erspan metadata
extraction code because the erspan_md2 then has a 4-byte offset
between the erspan_metadata and erspan_base_hdr. This patch
fixes it.
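The layout problem can be illustrated with a small user-space sketch. The
struct definitions below are simplified stand-ins (assumptions for
illustration, not the kernel's uapi definitions): with a leading 4-byte
version field, the md2 payload starts 4 bytes into the metadata, so code
that reads it at offset 0 picks up the wrong bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the ERSPAN v2 metadata payload. */
struct demo_md2 {
	uint32_t timestamp;
	uint16_t sgt;
	uint16_t flags;
};

/* Hypothetical stand-in for the metadata once 'version' comes first. */
struct demo_metadata {
	int32_t version;
	union {
		uint32_t index;        /* version 1 payload */
		struct demo_md2 md2;   /* version 2 payload */
	} u;
};

/* Byte offset of the md2 payload from the start of the metadata. */
static size_t demo_md2_offset(void)
{
	return offsetof(struct demo_metadata, u.md2);
}

/* Locate the md2 payload through the member, not by assuming offset 0. */
static const struct demo_md2 *demo_get_md2(const struct demo_metadata *md)
{
	return &md->u.md2;
}
```

Going through the named member keeps the extraction code correct even if
further fields are placed in front of the union later.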
Fixes: 1a66a836da63 ("gre: add collect_md mode to ERSPAN tunnel")
Fixes: ef7baf5e083c ("ip6_gre: add ip6 erspan collect_md mode")
Fixes: 1d7e2ed22f8d ("net: erspan: refactor existing erspan code")
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The selftests test_maps program was leaving dangling BPF sockmap
programs around because not all psock elements were removed from
the map. The elements in turn hold a reference on the BPF program
they are attached to causing BPF programs to stay open even after
test_maps has completed.
The original intent was that sk_state_change() would be called
when TCP socks went through TCP_CLOSE state. However, because
socks may be in SOCK_DEAD state or the sock may be a listening
socket the event is not always triggered.
To resolve this use the ULP infrastructure and register our own
proto close() handler. This fixes the above case.
Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support")
Reported-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Create a UID field and enum that can be used to assign ULPs to
sockets. This saves a set of string comparisons if the ULP id
is known.
For sockmap, which is added in the next patches, a ULP is used to
hook into the TCP socket close state. In this case the ULP is added
at map insert time, and it is known and assigned on the kernel side,
so the named lookup is not needed. Because we don't want to expose
psock internals to user space socket options, a user visible flag is
also added. For TLS this is set; for BPF it will be cleared.
Also remove pr_notice; the user gets an error code back and should
check that rather than rely on logs.
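A minimal sketch of the idea, assuming a hypothetical table of ULPs (all
names and values below are illustrative, not the kernel's): lookup by id
is an integer compare, lookup by name costs a string compare per entry,
and the user visible flag decides whether the name-based path may return
an entry at all.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical ULP ids; the real enum values are not given in the text. */
enum demo_ulp_id { DEMO_ULP_TLS, DEMO_ULP_BPF };

struct demo_ulp_ops {
	const char *name;
	enum demo_ulp_id uid;
	bool user_visible;   /* may user space select this ULP by name? */
};

static const struct demo_ulp_ops demo_ulps[] = {
	{ "tls",     DEMO_ULP_TLS, true  },
	{ "bpf_tcp", DEMO_ULP_BPF, false },
};

#define DEMO_NR_ULPS (sizeof(demo_ulps) / sizeof(demo_ulps[0]))

/* Kernel-side path: the id is known, so no string comparison is needed. */
static const struct demo_ulp_ops *demo_find_by_id(enum demo_ulp_id uid)
{
	for (size_t i = 0; i < DEMO_NR_ULPS; i++)
		if (demo_ulps[i].uid == uid)
			return &demo_ulps[i];
	return NULL;
}

/* User-space path: match by name and only expose user visible ULPs. */
static const struct demo_ulp_ops *demo_find_by_name(const char *name)
{
	for (size_t i = 0; i < DEMO_NR_ULPS; i++)
		if (demo_ulps[i].user_visible &&
		    strcmp(demo_ulps[i].name, name) == 0)
			return &demo_ulps[i];
	return NULL;
}
```

In this sketch a request for "bpf_tcp" by name fails, while the same
entry can still be found by id from kernel-side code.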
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Version 20180105.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Erik Schmauss <erik.schmauss@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
including tool signons.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Erik Schmauss <erik.schmauss@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Moving the qrwlock struct definition into a header file introduced
a subtle bug on all little-endian machines, where some files in some
configurations would see the fields in an incorrect order. This was
found by building with an LTO enabled compiler that warns every time we
try to link together files with incompatible data structures.
A second patch changes linux/kconfig.h to always define the symbols,
but this seems to be the root cause of most of the issues, so I'd suggest
we do both.
On a current linux-next kernel, I verified that this header is
responsible for all type mismatches resulting from the endianness
confusion.
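The failure mode can be sketched in user space, assuming the field order
is selected by a preprocessor endianness macro that not every file sees.
The two structs below are hypothetical stand-ins for the two layouts; a
byte meant to alias the low byte of the 32-bit counter only does so in
the layout that matches the host.

```c
#include <stdint.h>
#include <string.h>

/* Layout seen by a file where the little-endian macro is defined. */
struct demo_lock_le {
	union {
		uint32_t cnts;
		struct {
			uint8_t wlocked;     /* meant to alias the low byte of cnts */
			uint8_t lstate[3];
		};
	};
};

/* Layout seen by a file where the macro is missing: fields are swapped. */
struct demo_lock_be {
	union {
		uint32_t cnts;
		struct {
			uint8_t lstate[3];
			uint8_t wlocked;
		};
	};
};

/* Returns 1 if the host stores the least significant byte first. */
static int demo_host_is_little_endian(void)
{
	uint32_t one = 1;
	uint8_t first;

	memcpy(&first, &one, 1);
	return first == 1;
}

/* Store cnts, then read the wlocked byte through the first layout. */
static uint8_t demo_wlocked_via_le(uint32_t cnts)
{
	struct demo_lock_le l = { .cnts = cnts };

	return l.wlocked;
}

/* Same value, read through the swapped layout. */
static uint8_t demo_wlocked_via_be(uint32_t cnts)
{
	struct demo_lock_be l = { .cnts = cnts };

	return l.wlocked;
}
```

Two files that disagree on the layout would read different bytes for the
same lock word, which is the kind of mismatch an LTO link can flag.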
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Babu Moger <babu.moger@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nicolas Pitre <nico@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Fixes: e0d02285f16e ("locking/qrwlock: Use 'struct qrwlock' instead of 'struct __qrwlock'")
Link: http://lkml.kernel.org/r/20180202154104.1522809-1-arnd@arndb.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
For some reason these were missing. I've not observed this patch
making a difference in the few code locations I checked, but this
makes sense.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The select_idle_sibling() (SIS) rewrite in commit:
10e2f1acd010 ("sched/core: Rewrite and improve select_idle_siblings()")
... replaced a domain iteration with a search that broadly speaking
does a wrapped walk of the scheduler domain sharing a last-level-cache.
While this had a number of improvements, one consequence is that two tasks
that share a waker/wakee relationship push each other around a socket. Even
though two tasks may be active, all cores are evenly used. This is great from
a search perspective and spreads a load across individual cores, but it has
adverse consequences for cpufreq. As each CPU has relatively low utilisation,
cpufreq may decide the utilisation is too low to use a higher P-state and
overall computation throughput suffers.
While individual cpufreq and cpuidle drivers may compensate by artificially
boosting P-state (at c0) or avoiding lower C-states (during idle), it does
not help if hardware-based cpufreq (e.g. HWP) is used.
This patch tracks a recently used CPU based on what CPU a task was running
on when it was last a waker, or a CPU it was recently using when the task
is a wakee. During SIS, the recently used CPU is used as a target if it's
still allowed by the task and is idle.
The benefit may be non-obvious so consider an example of two tasks
communicating back and forth. Task A may be an application doing IO where
task B is a kworker or kthread like journald. Task A may issue IO, wake
B and B wakes up A on completion. With the existing scheme this may look
like the following (potentially different IDs if SMT is in use but a similar
principle applies).
A (cpu 0) wake B (wakes on cpu 1)
B (cpu 1) wake A (wakes on cpu 2)
A (cpu 2) wake B (wakes on cpu 3)
etc.
A careful reader may wonder why CPU 0 was not idle when B wakes A the
first time and it's simply due to the fact that A can be rescheduled to
another CPU and the pattern is that prev == target when B tries to wakeup A
and the information about CPU 0 has been lost.
With this patch, the pattern is more likely to be:
A (cpu 0) wake B (wakes on cpu 1)
B (cpu 1) wake A (wakes on cpu 0)
A (cpu 0) wake B (wakes on cpu 1)
etc
i.e. two communicating tasks are more likely to use just two cores instead
of all available cores sharing a LLC.
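The selection order can be sketched as follows. This is a simplified
user-space model, not the scheduler code: the domain size, the bitmask
representation and all demo_* names are assumptions made for
illustration.

```c
#include <stdbool.h>

#define DEMO_NR_CPUS 4   /* hypothetical LLC domain size */

struct demo_task {
	int recent_used_cpu;     /* -1 when nothing is recorded */
	unsigned int allowed;    /* bitmask of CPUs the task may run on */
};

static bool demo_idle(unsigned int idle_mask, int cpu)
{
	return cpu >= 0 && cpu < DEMO_NR_CPUS && (idle_mask & (1u << cpu));
}

/*
 * Pick a wakeup CPU: target, then prev, then the recently used CPU,
 * and only then a wrapped walk over the rest of the domain.
 */
static int demo_select_idle(struct demo_task *p, int prev, int target,
			    unsigned int idle_mask)
{
	int recent = p->recent_used_cpu;

	if (demo_idle(idle_mask, target))
		return target;
	if (prev != target && demo_idle(idle_mask, prev))
		return prev;
	if (recent != prev && recent != target &&
	    demo_idle(idle_mask, recent) && (p->allowed & (1u << recent))) {
		p->recent_used_cpu = prev;   /* prev is a candidate next time */
		return recent;
	}
	/* Fallback: wrapped walk starting after target. */
	for (int i = 1; i < DEMO_NR_CPUS; i++) {
		int cpu = (target + i) % DEMO_NR_CPUS;

		if (demo_idle(idle_mask, cpu) && (p->allowed & (1u << cpu)))
			return cpu;
	}
	return target;
}
```

With prev == target == 1 (B's busy CPU) and CPUs 0, 2 and 3 idle, a task
that remembers CPU 0 is placed back on CPU 0, while a task with no
record falls through to the wrapped walk and lands on CPU 2, matching
the two patterns shown above.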
The most dramatic speedup was noticed on dbench using the XFS filesystem on
UMA as clients interact heavily with workqueues in that configuration. Note
that a similar speedup is not observed on ext4 as the wakeup pattern
is different:
                          4.15.0-rc9             4.15.0-rc9
                           waprev-v1        biasancestor-v1
Hmean     1       287.54 (   0.00%)      817.01 ( 184.14%)
Hmean     2      1268.12 (   0.00%)     1781.24 (  40.46%)
Hmean     4      1739.68 (   0.00%)     1594.47 (  -8.35%)
Hmean     8      2464.12 (   0.00%)     2479.56 (   0.63%)
Hmean     64     1455.57 (   0.00%)     1434.68 (  -1.44%)
The results can be less dramatic on NUMA where automatic balancing interferes
with the test. It's also known that network benchmarks running on localhost
benefit quite a bit from this patch (roughly 10% on netperf RR for UDP
and TCP depending on the machine). Hackbench also sees small improvements
(6-11% depending on machine and thread count). The facebook schbench was also
tested but in most cases showed little or no difference in wakeup latencies.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180130104555.4125-5-mgorman@techsingularity.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs updates from Miklos Szeredi:
"This work from Amir adds NFS export capability to overlayfs. NFS
exporting an overlay filesystem is a challenge because we want to keep
track of any copy-up of a file or directory between encoding the file
handle and decoding it.
This is achieved by indexing copied up objects by lower layer file
handle. The index is already used for hard links, this patchset
extends the use to NFS file handle decoding"
* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: (51 commits)
ovl: check ERR_PTR() return value from ovl_encode_fh()
ovl: fix regression in fsnotify of overlay merge dir
ovl: wire up NFS export operations
ovl: lookup indexed ancestor of lower dir
ovl: lookup connected ancestor of dir in inode cache
ovl: hash non-indexed dir by upper inode for NFS export
ovl: decode pure lower dir file handles
ovl: decode indexed dir file handles
ovl: decode lower file handles of unlinked but open files
ovl: decode indexed non-dir file handles
ovl: decode lower non-dir file handles
ovl: encode lower file handles
ovl: copy up before encoding non-connectable dir file handle
ovl: encode non-indexed upper file handles
ovl: decode connected upper dir file handles
ovl: decode pure upper file handles
ovl: encode pure upper file handles
ovl: document NFS export
vfs: factor out helpers d_instantiate_anon() and d_alloc_anon()
ovl: store 'has_upper' and 'opaque' as bit flags
...
|
|
Provide core serializing membarrier command to support memory reclaim
by JIT.
Each architecture needs to explicitly opt into that support by
documenting in their architecture code how they provide the core
serializing instructions required when returning from the membarrier
IPI, and after the scheduler has updated the curr->mm pointer (before
going back to user-space). They should then select
ARCH_HAS_MEMBARRIER_SYNC_CORE to enable support for that command on
their architecture.
Architectures selecting this feature need to either document that
they issue core serializing instructions when returning to user-space,
or implement their architecture-specific sync_core_before_usermode().
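From user space, a JIT could use the command roughly as sketched below.
This assumes a Linux system whose uapi headers already define the
SYNC_CORE commands in <linux/membarrier.h>; the demo_* function names
are hypothetical, and error handling is reduced to a single return code.

```c
#define _GNU_SOURCE
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper: the C library may not provide a membarrier() function. */
static int demo_membarrier(int cmd, unsigned int flags)
{
	return (int)syscall(__NR_membarrier, cmd, flags);
}

/*
 * Register for, then issue, a core serializing barrier.
 * Returns 0 on success, -1 if the command is unavailable or fails.
 */
static int demo_sync_core_after_code_reclaim(void)
{
	int supported = demo_membarrier(MEMBARRIER_CMD_QUERY, 0);

	if (supported < 0)
		return -1;   /* membarrier(2) itself is unavailable */
	if (!(supported & MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE))
		return -1;   /* the architecture did not opt in */
	if (demo_membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE, 0))
		return -1;
	/* A JIT would issue this before reusing memory that held old code. */
	if (demo_membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, 0))
		return -1;
	return 0;
}
```

The query step matters because the command is only advertised on
architectures that select ARCH_HAS_MEMBARRIER_SYNC_CORE; in a real JIT
the registration would typically be done once at startup rather than on
every call.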
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Andrew Hunter <ahh@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Avi Kivity <avi@scylladb.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Dave Watson <davejwatson@fb.com>
Cc: David Sehr <sehr@google.com>
Cc: Greg Hackmann <ghackmann@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Maged Michael <maged.michael@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-api@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Link: http://lkml.kernel.org/r/20180129202020.8515-9-mathieu.desnoyers@efficios.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|