Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull zstd support from Chris Mason:
"Nick Terrell's patch series to add zstd support to the kernel has been
floating around for a while. After talking with Dave Sterba, Herbert
and Phillip, we decided to send the whole thing in as one pull
request.
zstd is a big win in speed over zlib and in compression ratio over
lzo, and the compression team here at FB has gotten great results
using it in production. Nick will continue to update the kernel side
with new improvements from the open source zstd userland code.
Nick has a number of benchmarks for the main zstd code in his lib/zstd
commit:
I ran the benchmarks on a Ubuntu 14.04 VM with 2 cores and 4 GiB
of RAM. The VM is running on a MacBook Pro with a 3.1 GHz Intel
Core i7 processor, 16 GB of RAM, and a SSD. I benchmarked using
`silesia.tar` [3], which is 211,988,480 B large. Run the following
commands for the benchmark:
sudo modprobe zstd_compress_test
sudo mknod zstd_compress_test c 245 0
sudo cp silesia.tar zstd_compress_test
The time is reported by the time of the userland `cp`.
The MB/s is computed with
1,536,217,008 B / time(buffer size, hash)
which includes the time to copy from userland.
The Adjusted MB/s is computed with
1,536,217,088 B / (time(buffer size, hash) - time(buffer size, none)).
The memory reported is the amount of memory the compressor
requests.
| Method | Size (B) | Time (s) | Ratio | MB/s | Adj MB/s | Mem (MB) |
|----------|----------|----------|-------|---------|----------|----------|
| none | 11988480 | 0.100 | 1 | 2119.88 | - | - |
| zstd -1 | 73645762 | 1.044 | 2.878 | 203.05 | 224.56 | 1.23 |
| zstd -3 | 66988878 | 1.761 | 3.165 | 120.38 | 127.63 | 2.47 |
| zstd -5 | 65001259 | 2.563 | 3.261 | 82.71 | 86.07 | 2.86 |
| zstd -10 | 60165346 | 13.242 | 3.523 | 16.01 | 16.13 | 13.22 |
| zstd -15 | 58009756 | 47.601 | 3.654 | 4.45 | 4.46 | 21.61 |
| zstd -19 | 54014593 | 102.835 | 3.925 | 2.06 | 2.06 | 60.15 |
| zlib -1 | 77260026 | 2.895 | 2.744 | 73.23 | 75.85 | 0.27 |
| zlib -3 | 72972206 | 4.116 | 2.905 | 51.50 | 52.79 | 0.27 |
| zlib -6 | 68190360 | 9.633 | 3.109 | 22.01 | 22.24 | 0.27 |
| zlib -9 | 67613382 | 22.554 | 3.135 | 9.40 | 9.44 | 0.27 |
I benchmarked zstd decompression using the same method on the same
machine. The benchmark file is located in the upstream zstd repo
under `contrib/linux-kernel/zstd_decompress_test.c` [4]. The
memory reported is the amount of memory required to decompress
data compressed with the given compression level. If you know the
maximum size of your input, you can reduce the memory usage of
decompression irrespective of the compression level.
| Method | Time (s) | MB/s | Adjusted MB/s | Memory (MB) |
|----------|----------|---------|---------------|-------------|
| none | 0.025 | 8479.54 | - | - |
| zstd -1 | 0.358 | 592.15 | 636.60 | 0.84 |
| zstd -3 | 0.396 | 535.32 | 571.40 | 1.46 |
| zstd -5 | 0.396 | 535.32 | 571.40 | 1.46 |
| zstd -10 | 0.374 | 566.81 | 607.42 | 2.51 |
| zstd -15 | 0.379 | 559.34 | 598.84 | 4.61 |
| zstd -19 | 0.412 | 514.54 | 547.77 | 8.80 |
| zlib -1 | 0.940 | 225.52 | 231.68 | 0.04 |
| zlib -3 | 0.883 | 240.08 | 247.07 | 0.04 |
| zlib -6 | 0.844 | 251.17 | 258.84 | 0.04 |
| zlib -9 | 0.837 | 253.27 | 287.64 | 0.04 |
I ran a long series of tests and benchmarks on the btrfs side and the
gains are very similar to the core benchmarks Nick ran"
* 'zstd-minimal' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
squashfs: Add zstd support
btrfs: Add zstd support
lib: Add zstd modules
lib: Add xxhash module
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Use Make-builtin $(abspath ...) helper to get absolute path
- Add W=2 extra warning option to detect unused macros
- Use more KCONFIG_CONFIG instead hard-coded .config
- Fix bugs of tar*-pkg targets
* tag 'kbuild-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: buildtar: do not print successful message if tar returns error
kbuild: buildtar: fix tar error when CONFIG_MODULES is disabled
kbuild: Use KCONFIG_CONFIG in buildtar
Kbuild: enable -Wunused-macros warning for "make W=2"
kbuild: use $(abspath ...) instead of $(shell cd ... && /bin/pwd)
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper updates from Mike Snitzer:
- Some request-based DM core and DM multipath fixes and cleanups
- Constify a few variables in DM core and DM integrity
- Add bufio optimization and checksum failure accounting to DM
integrity
- Fix DM integrity to avoid checking integrity of failed reads
- Fix DM integrity to use init_completion
- A couple DM log-writes target fixes
- Simplify DAX flushing by eliminating the unnecessary flush
abstraction that was stood up for DM's use.
* tag 'for-4.14/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dax: remove the pmem_dax_ops->flush abstraction
dm integrity: use init_completion instead of COMPLETION_INITIALIZER_ONSTACK
dm integrity: make blk_integrity_profile structure const
dm integrity: do not check integrity for failed read operations
dm log writes: fix >512b sectorsize support
dm log writes: don't use all the cpu while waiting to log blocks
dm ioctl: constify ioctl lookup table
dm: constify argument arrays
dm integrity: count and display checksum failures
dm integrity: optimize writing dm-bufio buffers that are partially changed
dm rq: do not update rq partially in each ending bio
dm rq: make dm-sq requeuing behavior consistent with dm-mq behavior
dm mpath: complain about unsupported __multipath_map_bio() return values
dm mpath: avoid that building with W=1 causes gcc 7 to complain about fall-through
|
|
Pull fbdev updates from Bartlomiej Zolnierkiewicz:
- make fbcon a built-time depency for fbdev (fbcon was tristate option
before, now it is a bool) - this is a first step in preparations for
making console_lock usage saner (currently it acts like the BKL for
all things fbdev/fbcon) (Daniel Vetter)
- add fbcon=margin:<color> command line option to select the fbcon
margin color (David Lechner)
- add DMI quirk table for x86 systems which need fbcon rotation
(devices like Asus T100HA, GPD Pocket, the GPD win and the I.T.Works
TW891) (Hans de Goede)
- fix 1bpp logo support for unusual width (needed by LEGO MINDSTORMS
EV3) (David Lechner)
- enable Xilinx FB driver for ARM ZynqMP platform (Michal Simek)
- fix use after free in the error path of udlfb driver (Anton Vasilyev)
- fix error return code handling in pxa3xx_gcu driver (Gustavo A. R.
Silva)
- fix bootparams.screeninfo arguments checking in vgacon (Jan H.
Schönherr)
- do not leak uninitialized padding in clk to userspace in the debug
code of atyfb driver (Vladis Dronov)
- fix compiler warnings in fbcon code and matroxfb driver (Arnd
Bergmann)
- convert fbdev susbsytem to using %pOF instead of full_name (Rob
Herring)
- structures constifications (Arvind Yadav, Bhumika Goyal, Gustavo A.
R. Silva, Julia Lawall)
- misc cleanups (Gustavo A. R. Silva, Hyun Kwon, Julia Lawall, Kuninori
Morimoto, Lynn Lei)
* tag 'fbdev-v4.14' of git://github.com/bzolnier/linux: (75 commits)
video/console: Update BIOS dates list for GPD win console rotation DMI quirk
video/console: Add rotated LCD-panel DMI quirk for the VIOS LTH17
video: fbdev: sis: fix duplicated code for different branches
video: fbdev: make fb_var_screeninfo const
video: fbdev: aty: do not leak uninitialized padding in clk to userspace
vgacon: Prevent faulty bootparams.screeninfo from causing harm
video: fbdev: make fb_videomode const
video/console: Add new BIOS date for GPD pocket to dmi quirk table
fbcon: remove restriction on margin color
video: ARM CLCD: constify amba_id
video: fm2fb: constify zorro_device_id
video: fbdev: annotate fb_fix_screeninfo with const and __initconst
omapfb: constify omap_video_timings structures
video: fbdev: udlfb: Fix use after free on dlfb_usb_probe error path
fbdev: i810: make fb_ops const
fbdev: matrox: make fb_ops const
video: fbdev: pxa3xx_gcu: fix error return code in pxa3xx_gcu_probe()
video: fbdev: Enable Xilinx FB for ZynqMP
video: fbdev: Fix multiple style issues in xilinxfb
video: fbdev: udlfb: constify usb_device_id.
...
|
|
Pull watchdog updates from Wim Van Sebroeck:
- add support for the watchdog on Meson8 and Meson8m2
- add support for MediaTek MT7623 and MT7622 SoC
- add support for the r8a77995 wdt
- explicitly request exclusive reset control for asm9260_wdt,
zx2967_wdt, rt2880_wdt and mt7621_wdt
- improvements to asm9260_wdt, aspeed_wdt, renesas_wdt and cadence_wdt
- add support for reading freq via CCF + suspend/resume support for
of_xilinx_wdt
- constify watchdog_ops and various device-id structures
- revert of commit 1fccb73011ea ("iTCO_wdt: all versions count down
twice") (Bug 196509)
* git://www.linux-watchdog.org/linux-watchdog: (40 commits)
watchdog: mei_wdt: constify mei_cl_device_id
watchdog: sp805: constify amba_id
watchdog: ziirave: constify i2c_device_id
watchdog: sc1200: constify pnp_device_id
dt-bindings: watchdog: renesas-wdt: Add support for the r8a77995 wdt
watchdog: renesas_wdt: update copyright dates
watchdog: renesas_wdt: make 'clk' a variable local to probe()
watchdog: renesas_wdt: consistently use RuntimePM for clock management
watchdog: aspeed: Support configuration of external signal properties
dt-bindings: watchdog: aspeed: External reset signal properties
drivers/watchdog: Add optional ASPEED device tree properties
drivers/watchdog: ASPEED reference dev tree properties for config
watchdog: da9063_wdt: Simplify by removing unneeded struct...
watchdog: bcm7038: Check the return value from clk_prepare_enable()
watchdog: qcom: Check for platform_get_resource() failure
watchdog: of_xilinx_wdt: Add suspend/resume support
watchdog: of_xilinx_wdt: Add support for reading freq via CCF
dt-bindings: watchdog: mediatek: add support for MediaTek MT7623 and MT7622 SoC
watchdog: max77620_wdt: constify platform_device_id
watchdog: pcwd_usb: constify usb_device_id
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
Pull dmi update from Jean Delvare:
"Mark all struct dmi_system_id instances const"
* 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
dmi: Mark all struct dmi_system_id instances const
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"This slew of fixes for pin control was noticed and patched up early,
so to get the annoyance out of the way for -rc1 it would make sense to
send them already.
- Fix a build include in the Uniphier driver to keep pace with
ongoing refactorings.
- Fix a slew of minor semantic and syntactic issues as well as
stricting up Kconfig for the new Spreadtrum driver.
- Fix the GPIO interrupt set-up on the Marvell 37xx Armada as fallout
for dynamically allocating irq descriptors from the core. (Also
tagged for stable.)
- Fix AMD register suspend/resume state spool/unspooling so that
wakeup works as it should. (Also tagged for stable.)"
* tag 'pinctrl-v4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl/amd: save pin registers over suspend/resume
pinctrl: armada-37xx: Fix gpio interrupt setup
pinctrl: sprd: fix off by one bugs
pinctrl: sprd: check for allocation failure
pinctrl: sprd: Restrict PINCTRL_SPRD to ARCH_SPRD or COMPILE_TEST
pinctrl: sprd: fix build errors and dependencies
pinctrl: sprd: make three local functions static
pinctrl: uniphier: include <linux/build_bug.h> instead of <linux/bug.h>
|
|
Merge misc fixes from Andrew Morton:
"A few leftovers"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm, page_owner: skip unnecessary stack_trace entries
arm64: stacktrace: avoid listing stacktrace functions in stacktrace
mm: treewide: remove GFP_TEMPORARY allocation flag
IB/mlx4: fix sprintf format warning
fscache: fix fscache_objlist_show format processing
lib/test_bitmap.c: use ULL suffix for 64-bit constants
procfs: remove unused variable
drivers/media/cec/cec-adap.c: fix build with gcc-4.4.4
idr: remove WARN_ON_ONCE() when trying to replace negative ID
|
|
Now that we have added breaks in the wait queue scan and allow bookmark
on scan position, we put this logic in the wake_up_page_bit function.
We can have very long page wait list in large system where multiple
pages share the same wait list. We break the wake up walk here to allow
other cpus a chance to access the list, and not to disable the interrupts
when traversing the list for too long. This reduces the interrupt and
rescheduling latency, and excessive page wait queue lock hold time.
[ v2: Remove bookmark_wake_function ]
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
We encountered workloads that have very long wake up list on large
systems. A waker takes a long time to traverse the entire wake list and
execute all the wake functions.
We saw page wait list that are up to 3700+ entries long in tests of
large 4 and 8 socket systems. It took 0.8 sec to traverse such list
during wake up. Any other CPU that contends for the list spin lock will
spin for a long time. It is a result of the numa balancing migration of
hot pages that are shared by many threads.
Multiple CPUs waking are queued up behind the lock, and the last one
queued has to wait until all CPUs did all the wakeups.
The page wait list is traversed with interrupt disabled, which caused
various problems. This was the original cause that triggered the NMI
watch dog timer in: https://patchwork.kernel.org/patch/9800303/ . Only
extending the NMI watch dog timer there helped.
This patch bookmarks the waker's scan position in wake list and break
the wake up walk, to allow access to the list before the waker resume
its walk down the rest of the wait list. It lowers the interrupt and
rescheduling latency.
This patch also provides a performance boost when combined with the next
patch to break up page wakeup list walk. We saw 22% improvement in the
will-it-scale file pread2 test on a Xeon Phi system running 256 threads.
[ v2: Merged in Linus' changes to remove the bookmark_wake_function, and
simply access to flags. ]
Reported-by: Kan Liang <kan.liang@intel.com>
Tested-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
... and __initconst if applicable.
Based on similar work for an older kernel in the Grsecurity patch.
[JD: fix toshiba-wmi build]
[JD: add htcpen]
[JD: move __initconst where checkscript wants it]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
|
|
The page_owner stacktrace always begin as follows:
[<ffffff987bfd48f4>] save_stack+0x40/0xc8
[<ffffff987bfd4da8>] __set_page_owner+0x3c/0x6c
These two entries do not provide any useful information and limits the
available stacktrace depth. The page_owner stacktrace was skipping
caller function from stack entries but this was missed with commit
f2ca0b557107 ("mm/page_owner: use stackdepot to store stacktrace")
Example page_owner entry after the patch:
Page allocated via order 0, mask 0x8(ffffff80085fb714)
PFN 654411 type Movable Block 639 type CMA Flags 0x0(ffffffbe5c7f12c0)
[<ffffff9b64989c14>] post_alloc_hook+0x70/0x80
...
[<ffffff9b651216e8>] msm_comm_try_state+0x5f8/0x14f4
[<ffffff9b6512486c>] msm_vidc_open+0x5e4/0x7d0
[<ffffff9b65113674>] msm_v4l2_open+0xa8/0x224
Link: http://lkml.kernel.org/r/1504078343-28754-2-git-send-email-guptap@codeaurora.org
Fixes: f2ca0b557107 ("mm/page_owner: use stackdepot to store stacktrace")
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The stacktraces always begin as follows:
[<c00117b4>] save_stack_trace_tsk+0x0/0x98
[<c0011870>] save_stack_trace+0x24/0x28
...
This is because the stack trace code includes the stack frames for
itself. This is incorrect behaviour, and also leads to "skip" doing the
wrong thing (which is the number of stack frames to avoid recording.)
Perversely, it does the right thing when passed a non-current thread.
Fix this by ensuring that we have a known constant number of frames
above the main stack trace function, and always skip these.
This was fixed for arch arm by commit 3683f44c42e9 ("ARM: stacktrace:
avoid listing stacktrace functions in stacktrace")
Link: http://lkml.kernel.org/r/1504078343-28754-1-git-send-email-guptap@codeaurora.org
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
GFP_TEMPORARY was introduced by commit e12ba74d8ff3 ("Group short-lived
and reclaimable kernel allocations") along with __GFP_RECLAIMABLE. It's
primary motivation was to allow users to tell that an allocation is
short lived and so the allocator can try to place such allocations close
together and prevent long term fragmentation. As much as this sounds
like a reasonable semantic it becomes much less clear when to use the
highlevel GFP_TEMPORARY allocation flag. How long is temporary? Can the
context holding that memory sleep? Can it take locks? It seems there is
no good answer for those questions.
The current implementation of GFP_TEMPORARY is basically GFP_KERNEL |
__GFP_RECLAIMABLE which in itself is tricky because basically none of
the existing caller provide a way to reclaim the allocated memory. So
this is rather misleading and hard to evaluate for any benefits.
I have checked some random users and none of them has added the flag
with a specific justification. I suspect most of them just copied from
other existing users and others just thought it might be a good idea to
use without any measuring. This suggests that GFP_TEMPORARY just
motivates for cargo cult usage without any reasoning.
I believe that our gfp flags are quite complex already and especially
those with highlevel semantic should be clearly defined to prevent from
confusion and abuse. Therefore I propose dropping GFP_TEMPORARY and
replace all existing users to simply use GFP_KERNEL. Please note that
SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and
so they will be placed properly for memory fragmentation prevention.
I can see reasons we might want some gfp flag to reflect shorterm
allocations but I propose starting from a clear semantic definition and
only then add users with proper justification.
This was been brought up before LSF this year by Matthew [1] and it
turned out that GFP_TEMPORARY really doesn't have a clear semantic. It
seems to be a heuristic without any measured advantage for most (if not
all) its current users. The follow up discussion has revealed that
opinions on what might be temporary allocation differ a lot between
developers. So rather than trying to tweak existing users into a
semantic which they haven't expected I propose to simply remove the flag
and start from scratch if we really need a semantic for short term
allocations.
[1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org
[akpm@linux-foundation.org: fix typo]
[akpm@linux-foundation.org: coding-style fixes]
[sfr@canb.auug.org.au: drm/i915: fix up]
Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au
Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
gcc-7 points out that a negative port_num value would overflow the
string buffer:
drivers/infiniband/hw/mlx4/sysfs.c: In function 'mlx4_ib_device_register_sysfs':
drivers/infiniband/hw/mlx4/sysfs.c:251:16: error: 'sprintf' may write a terminating nul past the end of the destination [-Werror=format-overflow=]
drivers/infiniband/hw/mlx4/sysfs.c:251:2: note: 'sprintf' output between 2 and 11 bytes into a destination of size 10
drivers/infiniband/hw/mlx4/sysfs.c:303:17: error: 'sprintf' may write a terminating nul past the end of the destination [-Werror=format-overflow=]
drivers/infiniband/hw/mlx4/sysfs.c:303:3: note: 'sprintf' output between 2 and 11 bytes into a destination of size 10
While we should be able to assume that port_num is positive here, making
the buffer one byte longer has no downsides and avoids the warning.
Fixes: c1e7e466120b ("IB/mlx4: Add iov directory in sysfs under the ib device")
Link: http://lkml.kernel.org/r/20170714120720.906842-23-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
gcc points out a minor bug in the handling of unknown cookie types,
which could result in a string overflow when the integer is copied into
a 3-byte string:
fs/fscache/object-list.c: In function 'fscache_objlist_show':
fs/fscache/object-list.c:265:19: error: 'sprintf' may write a terminating nul past the end of the destination [-Werror=format-overflow=]
sprintf(_type, "%02u", cookie->def->type);
^~~~~~
fs/fscache/object-list.c:265:4: note: 'sprintf' output between 3 and 4 bytes into a destination of size 3
This is currently harmless as no code sets a type other than 0 or 1, but
it makes sense to use snprintf() here to avoid overflowing the array if
that changes.
Link: http://lkml.kernel.org/r/20170714120720.906842-22-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
With gcc 4.1.2:
lib/test_bitmap.c:189: warning: integer constant is too large for `long' type
lib/test_bitmap.c:190: warning: integer constant is too large for `long' type
lib/test_bitmap.c:194: warning: integer constant is too large for `long' type
lib/test_bitmap.c:195: warning: integer constant is too large for `long' type
Add the missing "ULL" suffix to fix this.
Link: http://lkml.kernel.org/r/1505040523-31230-1-git-send-email-geert@linux-m68k.org
Fixes: 60ef690018b262dd ("bitmap: introduce BITMAP_FROM_U64()")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
In NOMMU configurations, we get a warning about a variable that has become
unused:
fs/proc/task_nommu.c: In function 'nommu_vma_show':
fs/proc/task_nommu.c:148:28: error: unused variable 'priv' [-Werror=unused-variable]
Link: http://lkml.kernel.org/r/20170911200231.3171415-1-arnd@arndb.de
Fixes: 1240ea0dc3bb ("fs, proc: remove priv argument from is_stack")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
gcc-4.4.4 has issues with initialization of anonymous unions:
drivers/media/cec/cec-adap.c: In function 'cec_queue_msg_fh':
drivers/media/cec/cec-adap.c:184: error: unknown field 'lost_msgs' specified in initializer
work around this.
Fixes: 6b2bbb08747a5 ("media: cec: rework the cec event handling")
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Hans Verkuil <hans.verkuil@cisco.com>
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
IDR only supports non-negative IDs. There used to be a 'WARN_ON_ONCE(id <
0)' in idr_replace(), but it was intentionally removed by commit
2e1c9b286765 ("idr: remove WARN_ON_ONCE() on negative IDs").
Then it was added back by commit 0a835c4f090a ("Reimplement IDR and IDA
using the radix tree"). However it seems that adding it back was a
mistake, given that some users such as drm_gem_handle_delete()
(DRM_IOCTL_GEM_CLOSE) pass in a value from userspace to idr_replace(),
allowing the WARN_ON_ONCE to be triggered. drm_gem_handle_delete()
actually just wants idr_replace() to return an error code if the ID is
not allocated, including in the case where the ID is invalid (negative).
So once again remove the bogus WARN_ON_ONCE().
This bug was found by syzkaller, which encountered the following
warning:
WARNING: CPU: 3 PID: 3008 at lib/idr.c:157 idr_replace+0x1d8/0x240 lib/idr.c:157
Kernel panic - not syncing: panic_on_warn set ...
CPU: 3 PID: 3008 Comm: syzkaller218828 Not tainted 4.13.0-rc4-next-20170811 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:190
do_trap_no_signal arch/x86/kernel/traps.c:224 [inline]
do_trap+0x260/0x390 arch/x86/kernel/traps.c:273
do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:310
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:323
invalid_op+0x1e/0x30 arch/x86/entry/entry_64.S:930
RIP: 0010:idr_replace+0x1d8/0x240 lib/idr.c:157
RSP: 0018:ffff8800394bf9f8 EFLAGS: 00010297
RAX: ffff88003c6c60c0 RBX: 1ffff10007297f43 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800394bfa78
RBP: ffff8800394bfae0 R08: ffffffff82856487 R09: 0000000000000000
R10: ffff8800394bf9a8 R11: ffff88006c8bae28 R12: ffffffffffffffff
R13: ffff8800394bfab8 R14: dffffc0000000000 R15: ffff8800394bfbc8
drm_gem_handle_delete+0x33/0xa0 drivers/gpu/drm/drm_gem.c:297
drm_gem_close_ioctl+0xa1/0xe0 drivers/gpu/drm/drm_gem.c:671
drm_ioctl_kernel+0x1e7/0x2e0 drivers/gpu/drm/drm_ioctl.c:729
drm_ioctl+0x72e/0xa50 drivers/gpu/drm/drm_ioctl.c:825
vfs_ioctl fs/ioctl.c:45 [inline]
do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:685
SYSC_ioctl fs/ioctl.c:700 [inline]
SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
entry_SYSCALL_64_fastpath+0x1f/0xbe
Here is a C reproducer:
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/drm.h>
int main(void)
{
int cardfd = open("/dev/dri/card0", O_RDONLY);
ioctl(cardfd, DRM_IOCTL_GEM_CLOSE,
&(struct drm_gem_close) { .handle = -1 } );
}
Link: http://lkml.kernel.org/r/20170906235306.20534-1-ebiggers3@gmail.com
Fixes: 0a835c4f090a ("Reimplement IDR and IDA using the radix tree")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: <stable@vger.kernel.org> [v4.11+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"A handful of tooling fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf stat: Wait for the correct child
perf tools: Support running perf binaries with a dash in their name
perf config: Check not only section->from_system_config but also item's
perf ui progress: Fix progress update
perf ui progress: Make sure we always define step value
perf tools: Open perf.data with O_CLOEXEC flag
tools lib api: Fix make DEBUG=1 build
perf tests: Fix compile when libunwind's unwind.h is available
tools include linux: Guard against redefinition of some macros
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"Three CPU hotplug related fixes and a debugging improvement"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/debug: Add debugfs knob for "sched_debug"
sched/core: WARN() when migrating to an offline CPU
sched/fair: Plug hole between hotplug and active_load_balance()
sched/fair: Avoid newidle balance for !active CPUs
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"The main changes are the PCID fixes from Andy, but there's also two
hyperv fixes and two paravirt updates"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/hyper-v: Remove duplicated HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED definition
x86/hyper-V: Allocate the IDT entry early in boot
paravirt: Switch maintainer
x86/paravirt: Remove no longer used paravirt functions
x86/mm/64: Initialize CR4.PCIDE early
x86/hibernate/64: Mask off CR3's PCID bits in the saved CR3
x86/mm: Get rid of VM_BUG_ON in switch_tlb_irqs_off()
|
|
Pull OpenRISC fixlet from Stafford Horne:
"Fix warning for upcoming work to remove linux/vmalloc.h from
asm-generic/io.h"
* tag 'openrisc-for-linus' of git://github.com/openrisc/linux:
openrisc: add forward declaration for struct vm_area_struct
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux
Pull modules updates from Jessica Yu:
"Summary of modules changes for the 4.14 merge window:
- minor code cleanups and fixes
- modpost: avoid building modules that have names that exceed the
size of the name field in struct module"
* tag 'modules-for-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
module: Remove const attribute from alias for MODULE_DEVICE_TABLE
module: fix ddebug_remove_module()
modpost: abort if module name is too long
|
|
Another merge window, another MAINTAINERS file disaster.
People have serious problems with the alphabet and sorting, and poor
Jérôme Glisse and Radim Krčmář get their names mangled by locale issues,
turning them into some mangled mess (probably others do too, but those
two stood out when sorting things again).
And we now have two copies of the same 'AS3645A LED FLASH CONTROLLER
DRIVER' in the tree and in the MAINTAINERS file, but that's a separate
issue - the duplication is real, and I left them as two entries for the
same name.
This does not try to sort the actual section pattern entries, although I
may end up doing that later.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk updates from Stephen Boyd:
"The diff is dominated by the Allwinner A10/A20 SoCs getting converted
to the sunxi-ng framework. Otherwise, the heavy hitters are various
drivers for SoCs like AT91, Amlogic, Renesas, and Rockchip. There are
some other new clk drivers in here too but overall this is just a
bunch of clk drivers for various different pieces of hardware and a
collection of non-critical fixes for clk drivers.
New Drivers:
- Allwinner R40 SoCs
- Renesas R-Car Gen3 USB 2.0 clock selector PHY
- Atmel AT91 audio PLL
- Uniphier PXs3 SoCs
- ARC HSDK Board PLLs
- AXS10X Board PLLs
- STMicroelectronics STM32H743 SoCs
Removed Drivers:
- Non-compiling mb86s7x support
Updates:
- Allwinner A10/A20 SoCs converted to sunxi-ng framework
- Allwinner H3 CPU clk fixes
- Renesas R-Car D3 SoC
- Renesas V2H and M3-W modules
- Samsung Exynos5420/5422/5800 audio fixes
- Rockchip fractional clk approximation fixes
- Rockchip rk3126 SoC support within the rk3128 driver
- Amlogic gxbb CEC32 and sd_emmc clks
- Amlogic meson8b reset controller support
- IDT VersaClock 5P49V5925/5P49V6901 support
- Qualcomm MSM8996 SMMU clks
- Various 'const' applications for struct clk_ops
- si5351 PLL reset bugfix
- Uniphier audio on LD11/LD20 and ethernet support on LD11/LD20/Pro4/PXs2
- Assorted Tegra clk driver fixes"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (120 commits)
clk: si5351: fix PLL reset
ASoC: atmel-classd: remove aclk clock
ASoC: atmel-classd: remove aclk clock from DT binding
clk: at91: clk-generated: make gclk determine audio_pll rate
clk: at91: clk-generated: create function to find best_diff
clk: at91: add audio pll clock drivers
dt-bindings: clk: at91: add audio plls to the compatible list
clk: at91: clk-generated: remove useless divisor loop
clk: mb86s7x: Drop non-building driver
clk: ti: check for null return in strrchr to avoid null dereferencing
clk: Don't write error code into divider register
clk: uniphier: add video input subsystem clock
clk: uniphier: add audio system clock
clk: stm32h7: Add stm32h743 clock driver
clk: gate: expose clk_gate_ops::is_enabled
clk: nxp: clk-lpc32xx: rename clk_gate_is_enabled()
clk: uniphier: add PXs3 clock data
clk: hi6220: change watchdog clock source
clk: Kconfig: Name RK805 in Kconfig for COMMON_CLK_RK808
clk: cs2000: Add cs2000_set_saved_rate
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Pull RTC updates from Alexandre Belloni:
"Subsystem:
- remove .open() and .release() RTC ops
- constify i2c_device_id
New driver:
- Realtek RTD1295
- Android emulator (goldfish) RTC
Drivers:
- ds1307: Beginning of a huge cleanup
- s35390a: handle invalid RTC time
- sun6i: external oscillator gate support"
* tag 'rtc-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (40 commits)
rtc: ds1307: use octal permissions
rtc: ds1307: fix braces
rtc: ds1307: fix alignments and blank lines
rtc: ds1307: use BIT
rtc: ds1307: use u32
rtc: ds1307: use sizeof
rtc: ds1307: remove regs member
rtc: Add Realtek RTD1295
dt-bindings: rtc: Add Realtek RTD1295
rtc: sun6i: Add support for the external oscillator gate
rtc: goldfish: Add RTC driver for Android emulator
dt-bindings: Add device tree binding for Goldfish RTC driver
rtc: ds1307: add basic support for ds1341 chip
rtc: ds1307: remove member nvram_offset from struct ds1307
rtc: ds1307: factor out offset to struct chip_desc
rtc: ds1307: factor out rtc_ops to struct chip_desc
rtc: ds1307: factor out irq_handler to struct chip_desc
rtc: ds1307: improve irq setup
rtc: ds1307: constify struct chip_desc variables
rtc: ds1307: improve trickle charger initialization
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Most of the commits are trivial cleanup patches, while one commit is a
significant fix for the race at ALSA sequencer that was spotted by
syzkaller"
* tag 'sound-fix-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: seq: Cancel pending autoload work at unbinding device
ALSA: firewire: Use common error handling code in snd_motu_stream_start_duplex()
ALSA: asihpi: Kill BUG_ON() usages
ALSA: core: Use %pS printk format for direct addresses
ALSA: ymfpci: Use common error handling code in snd_ymfpci_create()
ALSA: ymfpci: Use common error handling code in snd_card_ymfpci_probe()
ALSA: 6fire: Use common error handling code in usb6fire_chip_probe()
ALSA: usx2y: Use common error handling code in submit_urbs()
ALSA: us122l: Use common error handling code in us122l_create_card()
ALSA: hdspm: Use common error handling code in snd_hdspm_probe()
ALSA: rme9652: Use common code in hdsp_get_iobox_version()
ALSA: maestro3: Use common error handling code in two functions
|
|
Pull SCSI fixes from James Bottomley:
"A tiny update: one patch corrects a Kconfig problem with the shift of
the SAS SMP code to BSG and the other removes a vestige of user space
target mode"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: scsi_transport_sas: select BLK_DEV_BSGLIB
scsi: Remove Scsi_Host.uspace_req_q
|
|
Pull block fixes from Jens Axboe:
"Small collection of fixes that would be nice to have in -rc1. This
contains:
- NVMe pull request form Christoph, mostly with fixes for nvme-pci,
host memory buffer in particular.
- Error handling fixup for cgwb_create(), in case allocation of 'wb'
fails. From Christophe Jaillet.
- Ensure that trace_block_getrq() gets the 'dev' in an appropriate
fashion, to avoid a potential NULL deref. From Greg Thelen.
- Regression fix for dm-mq with blk-mq, fixing a problem with
stacking IO schedulers. From me.
- string.h fixup, fixing an issue with memcpy_and_pad(). This
original change came in through an NVMe dependency, which is why
I'm including it here. From Martin Wilck.
- Fix potential int overflow in __blkdev_sectors_to_bio_pages(), from
Mikulas.
- MBR enable fix for sed-opal, from Scott"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: directly insert blk-mq request from blk_insert_cloned_request()
mm/backing-dev.c: fix an error handling path in 'cgwb_create()'
string.h: un-fortify memcpy_and_pad
nvme-pci: implement the HMB entry number and size limitations
nvme-pci: propagate (some) errors from host memory buffer setup
nvme-pci: use appropriate initial chunk size for HMB allocation
nvme-pci: fix host memory buffer allocation fallback
nvme: fix lightnvm check
block: fix integer overflow in __blkdev_sectors_to_bio_pages()
block: sed-opal: Set MBRDone on S3 resume path if TPER is MBREnabled
block: tolerate tracing of NULL bio
|
|
Pull documentation fixes from Jonathan Corbet:
"A cleanup from Mauro that needed to wait for the media pull, plus a
handful of other fixes that wandered in"
* tag 'docs-4.14' of git://git.lwn.net/linux:
kokr/memory-barriers.txt: Apply atomic_t.txt change
kokr/doc: Update memory-barriers.txt for read-to-write dependencies
docs-rst: don't require adjustbox anymore
docs-rst: conf.py: only setup notice box colors if Sphinx < 1.6
docs-rst: conf.py: remove lscape from LaTeX preamble
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse updates from Miklos Szeredi:
"This fixes a regression (spotted by the Sandstorm.io folks) in the pid
namespace handling introduced in 4.12.
There's also a fix for honoring sync/dsync flags for pwritev2()"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: getattr cleanup
fuse: honor iocb sync flags on write
fuse: allow server to run in different pid_ns
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs updates from Miklos Szeredi:
"This fixes d_ino correctness in readdir, which brings overlayfs on par
with normal filesystems regarding inode number semantics, as long as
all layers are on the same filesystem.
There are also some bug fixes, one in particular (random ioctl's
shouldn't be able to modify lower layers) that touches some vfs code,
but of course no-op for non-overlay fs"
* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: fix false positive ESTALE on lookup
ovl: don't allow writing ioctl on lower layer
ovl: fix relatime for directories
vfs: add flags to d_real()
ovl: cleanup d_real for negative
ovl: constant d_ino for non-merge dirs
ovl: constant d_ino across copy up
ovl: fix readdir error value
ovl: check snprintf return
|
|
Commits:
7dcf90e9e032 ("PCI: hv: Use vPCI protocol version 1.2")
628f54cc6451 ("x86/hyper-v: Support extended CPU ranges for TLB flush hypercalls")
added the same definition and they came in through different trees.
Fix the duplication.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: devel@linuxdriverproject.org
Link: http://lkml.kernel.org/r/20170911150620.3998-1-vkuznets@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Allocate the hypervisor callback IDT entry early in the boot sequence.
The previous code would allocate the entry as part of registering the handler
when the vmbus driver loaded, and this caused a problem for the IDT cleanup
that Thomas is working on for v4.15.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: apw@canonical.com
Cc: devel@linuxdriverproject.org
Cc: gregkh@linuxfoundation.org
Cc: jasowang@redhat.com
Cc: olaf@aepfle.de
Link: http://lkml.kernel.org/r/20170908231557.2419-1-kys@exchange.microsoft.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Jeremy Fitzhardinge is stepping down as a paravirt maintainer. I'll
replace him.
While at it, update the file list to the actual pattern.
Signed-off-by: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akataria@vmware.com
Cc: chrisw@sous-sol.org
Cc: jeremy@goop.org
Cc: rusty@rustcorp.com.au
Cc: virtualization@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/20170905143407.9227-1-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
With removal of lguest some of the paravirt functions are no longer
needed:
->read_cr4()
->store_idt()
->set_pmd_at()
->set_pud_at()
->pte_update()
Remove them.
Signed-off-by: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akataria@vmware.com
Cc: boris.ostrovsky@oracle.com
Cc: chrisw@sous-sol.org
Cc: jeremy@goop.org
Cc: rusty@rustcorp.com.au
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20170904102527.25409-1-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
cpu_init() is weird: it's called rather late (after early
identification and after most MMU state is initialized) on the boot
CPU but is called extremely early (before identification) on secondary
CPUs. It's called just late enough on the boot CPU that its CR4 value
isn't propagated to mmu_cr4_features.
Even if we put CR4.PCIDE into mmu_cr4_features, we'd hit two
problems. First, we'd crash in the trampoline code. That's
fixable, and I tried that. It turns out that mmu_cr4_features is
totally ignored by secondary_start_64(), though, so even with the
trampoline code fixed, it wouldn't help.
This means that we don't currently have CR4.PCIDE reliably initialized
before we start playing with cpu_tlbstate. This is very fragile and
tends to cause boot failures if I make even small changes to the TLB
handling code.
Make it more robust: initialize CR4.PCIDE earlier on the boot CPU
and propagate it to secondary CPUs in start_secondary().
( Yes, this is ugly. I think we should have improved mmu_cr4_features
to actually control CR4 during secondary bootup, but that would be
fairly intrusive at this stage. )
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reported-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Tested-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Fixes: 660da7c9228f ("x86/mm: Enable CR4.PCIDE on supported systems")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Jiri reported a resume-from-hibernation failure triggered by PCID.
The root cause appears to be rather odd. The hibernation asm
restores a CR3 value that comes from the image header. If the image
kernel has PCID on, it's entirely reasonable for this CR3 value to
have one of the low 12 bits set. The restore code restores it with
CR4.PCIDE=0, which means that those low 12 bits are accepted by the
CPU but are either ignored or interpreted as a caching mode. This
is odd, but still works. We blow up later when the image kernel
restores CR4, though, since changing CR4.PCIDE with CR3[11:0] != 0
is illegal. Boom!
FWIW, it's entirely unclear to me what's supposed to happen if a PAE
kernel restores a non-PAE image or vice versa. Ditto for LA57.
Reported-by: Jiri Kosina <jikos@kernel.org>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 660da7c9228f ("x86/mm: Enable CR4.PCIDE on supported systems")
Link: http://lkml.kernel.org/r/18ca57090651a6341e97083883f9e814c4f14684.1504847163.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
If we hit the VM_BUG_ON(), we're detecting a genuinely bad situation,
but we're very unlikely to get a useful call trace.
Make it a warning instead.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/3b4e06bbb382ca54a93218407c93925ff5871546.1504847163.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Fix TUI progress bar when delta from new total from that of the
previous update is greater than the progress "step" (screen width
progress bar block)) (Jiri Olsa)
- Make tools/lib/api make DEBUG=1 build use -D_FORTIFY_SOURCE=2 not
to cripple debuginfo, just like tools/perf/ does (Jiri Olsa)
- Avoid leaking the 'perf.data' file to workloads started from the
'perf record' command line by using the O_CLOEXEC open flag (Jiri Olsa)
- Fix building when libunwind's 'unwind.h' file is present in the
include path, clashing with tools/perf/util/unwind.h (Milian Wolff)
- Check per .perfconfig section entry flag, not just per section (Taeung Song)
- Support running perf binaries with a dash in their name, needed to
run perf as an AppImage (Milian Wolff)
- Wait for the right child by using waitpid() when running workloads
from 'perf stat', also to fix using perf as an AppImage (Milian Wolff)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this round, we've mostly tuned f2fs to provide better user
experience for Android. Especially, we've worked on atomic write
feature again with SQLite community in order to support it officially.
And we added or modified several facilities to analyze and enhance IO
behaviors.
Major changes include:
- add app/fs io stat
- add inode checksum feature
- support project/journalled quota
- enhance atomic write with new ioctl() which exposes feature set
- enhance background gc/discard/fstrim flows with new gc_urgent mode
- add F2FS_IOC_FS{GET,SET}XATTR
- fix some quota flows"
* tag 'f2fs-for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (63 commits)
f2fs: hurry up to issue discard after io interruption
f2fs: fix to show correct discard_granularity in sysfs
f2fs: detect dirty inode in evict_inode
f2fs: clear radix tree dirty tag of pages whose dirty flag is cleared
f2fs: speed up gc_urgent mode with SSR
f2fs: better to wait for fstrim completion
f2fs: avoid race in between read xattr & write xattr
f2fs: make get_lock_data_page to handle encrypted inode
f2fs: use generic terms used for encrypted block management
f2fs: introduce f2fs_encrypted_file for clean-up
Revert "f2fs: add a new function get_ssr_cost"
f2fs: constify super_operations
f2fs: fix to wake up all sleeping flusher
f2fs: avoid race in between atomic_read & atomic_inc
f2fs: remove unneeded parameter of change_curseg
f2fs: update i_flags correctly
f2fs: don't check inode's checksum if it was dirtied or writebacked
f2fs: don't need to update inode checksum for recovery
f2fs: trigger fdatasync for non-atomic_write file
f2fs: fix to avoid race in between aio and gc
...
|
|
Pull ceph updates from Ilya Dryomov:
"The highlights include:
- a large series of fixes and improvements to the snapshot-handling
code (Zheng Yan)
- individual read/write OSD requests passed down to libceph are now
limited to 16M in size to avoid hitting OSD-side limits (Zheng Yan)
- encode MStatfs v2 message to allow for more accurate space usage
reporting (Douglas Fuller)
- switch to the new writeback error tracking infrastructure (Jeff
Layton)"
* tag 'ceph-for-4.14-rc1' of git://github.com/ceph/ceph-client: (35 commits)
ceph: stop on-going cached readdir if mds revokes FILE_SHARED cap
ceph: wait on writeback after writing snapshot data
ceph: fix capsnap dirty pages accounting
ceph: ignore wbc->range_{start,end} when write back snapshot data
ceph: fix "range cyclic" mode writepages
ceph: cleanup local variables in ceph_writepages_start()
ceph: optimize pagevec iterating in ceph_writepages_start()
ceph: make writepage_nounlock() invalidate page that beyonds EOF
ceph: properly get capsnap's size in get_oldest_context()
ceph: remove stale check in ceph_invalidatepage()
ceph: queue cap snap only when snap realm's context changes
ceph: handle race between vmtruncate and queuing cap snap
ceph: fix message order check in handle_cap_export()
ceph: fix NULL pointer dereference in ceph_flush_snaps()
ceph: adjust 36 checks for NULL pointers
ceph: delete an unnecessary return statement in update_dentry_lease()
ceph: ENOMEM pr_err in __get_or_create_frag() is redundant
ceph: check negative offsets in ceph_llseek()
ceph: more accurate statfs
ceph: properly set snap follows for cap reconnect
...
|
|
If using a kernel with CONFIG_XFS_RT=y and we set the RHINHERIT flag on
a directory in a filesystem that does not have a realtime device and
create a new file in that directory, it gets marked as a real time file.
When data is written and a fsync is issued, the filesystem attempts to
flush a non-existent rt device during the fsync process.
This results in a crash dereferencing a null buftarg pointer in
xfs_blkdev_issue_flush():
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: xfs_blkdev_issue_flush+0xd/0x20
.....
Call Trace:
xfs_file_fsync+0x188/0x1c0
vfs_fsync_range+0x3b/0xa0
do_fsync+0x3d/0x70
SyS_fsync+0x10/0x20
do_syscall_64+0x4d/0xb0
entry_SYSCALL64_slow_path+0x25/0x25
Setting RT inode flags does not require special privileges so any
unprivileged user can cause this oops to occur. To reproduce, confirm
kernel is compiled with CONFIG_XFS_RT=y and run:
# mkfs.xfs -f /dev/pmem0
# mount /dev/pmem0 /mnt/test
# mkdir /mnt/test/foo
# xfs_io -c 'chattr +t' /mnt/test/foo
# xfs_io -f -c 'pwrite 0 5m' -c fsync /mnt/test/foo/bar
Or just run xfstests with MKFS_OPTIONS="-d rtinherit=1" and wait.
Kernels built with CONFIG_XFS_RT=n are not exposed to this bug.
Fixes: f538d4da8d52 ("[XFS] write barrier support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Richard Wareing <rwareing@fb.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Pull dma-mapping updates from Christoph Hellwig:
- removal of the old dma_alloc_noncoherent interface
- remove unused flags to dma_declare_coherent_memory
- restrict OF DMA configuration to specific physical busses
- use the iommu mailing list for dma-mapping questions and patches
* tag 'dma-mapping-4.14' of git://git.infradead.org/users/hch/dma-mapping:
dma-coherent: fix dma_declare_coherent_memory() logic error
ARM: imx: mx31moboard: Remove unused 'dma' variable
dma-coherent: remove an unused variable
MAINTAINERS: use the iommu list for the dma-mapping subsystem
dma-coherent: remove the DMA_MEMORY_MAP and DMA_MEMORY_IO flags
dma-coherent: remove the DMA_MEMORY_INCLUDES_CHILDREN flag
of: restrict DMA configuration
dma-mapping: remove dma_alloc_noncoherent and dma_free_noncoherent
i825xx: switch to switch to dma_alloc_attrs
au1000_eth: switch to dma_alloc_attrs
sgiseeq: switch to dma_alloc_attrs
dma-mapping: reduce dma_mapping_error inline bloat
|
|
Pull uuid updates from Christoph Hellwig:
"Just a single conversion to the new UUID API for this merge window"
* tag 'uuid-for-4.14' of git://git.infradead.org/users/hch/uuid:
efi: switch to use new generic UUID API
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux updates from Paul Moore:
"A relatively quiet period for SELinux, 11 patches with only two/three
having any substantive changes.
These noteworthy changes include another tweak to the NNP/nosuid
handling, per-file labeling for cgroups, and an object class fix for
AF_UNIX/SOCK_RAW sockets; the rest of the changes are minor tweaks or
administrative updates (Stephen's email update explains the file
explosion in the diffstat).
Everything passes the selinux-testsuite"
[ Also a couple of small patches from the security tree from Tetsuo
Handa for Tomoyo and LSM cleanup. The separation of security policy
updates wasn't all that clean - Linus ]
* tag 'selinux-pr-20170831' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: constify nf_hook_ops
selinux: allow per-file labeling for cgroupfs
lsm_audit: update my email address
selinux: update my email address
MAINTAINERS: update the NetLabel and Labeled Networking information
selinux: use GFP_NOWAIT in the AVC kmem_caches
selinux: Generalize support for NNP/nosuid SELinux domain transitions
selinux: genheaders should fail if too many permissions are defined
selinux: update the selinux info in MAINTAINERS
credits: update Paul Moore's info
selinux: Assign proper class to PF_UNIX/SOCK_RAW sockets
tomoyo: Update URLs in Documentation/admin-guide/LSM/tomoyo.rst
LSM: Remove security_task_create() hook.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Two fixes: dead code removal, plus a SME memory encryption fix on
32-bit kernels that crashed Xen guests"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu: Remove unused and undefined __generic_processor_info() declaration
x86/mm: Make the SME mask a u64
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"Three fixes:
- fix a suspend/resume cpusets bug
- fix a !CONFIG_NUMA_BALANCING bug
- fix a kerneldoc warning"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Fix nuisance kernel-doc warning
sched/cpuset/pm: Fix cpuset vs. suspend-resume bugs
sched/fair: Fix wake_affine_llc() balancing rules
|