summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-09-14Merge branch 'zstd-minimal' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull zstd support from Chris Mason: "Nick Terrell's patch series to add zstd support to the kernel has been floating around for a while. After talking with Dave Sterba, Herbert and Phillip, we decided to send the whole thing in as one pull request. zstd is a big win in speed over zlib and in compression ratio over lzo, and the compression team here at FB has gotten great results using it in production. Nick will continue to update the kernel side with new improvements from the open source zstd userland code. Nick has a number of benchmarks for the main zstd code in his lib/zstd commit: I ran the benchmarks on a Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM. The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor, 16 GB of RAM, and a SSD. I benchmarked using `silesia.tar` [3], which is 211,988,480 B large. Run the following commands for the benchmark: sudo modprobe zstd_compress_test sudo mknod zstd_compress_test c 245 0 sudo cp silesia.tar zstd_compress_test The time is reported by the time of the userland `cp`. The MB/s is computed with 1,536,217,008 B / time(buffer size, hash) which includes the time to copy from userland. The Adjusted MB/s is computed with 1,536,217,088 B / (time(buffer size, hash) - time(buffer size, none)). The memory reported is the amount of memory the compressor requests. | Method | Size (B) | Time (s) | Ratio | MB/s | Adj MB/s | Mem (MB) | |----------|----------|----------|-------|---------|----------|----------| | none | 11988480 | 0.100 | 1 | 2119.88 | - | - | | zstd -1 | 73645762 | 1.044 | 2.878 | 203.05 | 224.56 | 1.23 | | zstd -3 | 66988878 | 1.761 | 3.165 | 120.38 | 127.63 | 2.47 | | zstd -5 | 65001259 | 2.563 | 3.261 | 82.71 | 86.07 | 2.86 | | zstd -10 | 60165346 | 13.242 | 3.523 | 16.01 | 16.13 | 13.22 | | zstd -15 | 58009756 | 47.601 | 3.654 | 4.45 | 4.46 | 21.61 | | zstd -19 | 54014593 | 102.835 | 3.925 | 2.06 | 2.06 | 60.15 | | zlib -1 | 77260026 | 2.895 | 2.744 | 73.23 | 75.85 | 0.27 | | zlib -3 | 72972206 | 4.116 | 2.905 | 51.50 | 52.79 | 0.27 | | zlib -6 | 68190360 | 9.633 | 3.109 | 22.01 | 22.24 | 0.27 | | zlib -9 | 67613382 | 22.554 | 3.135 | 9.40 | 9.44 | 0.27 | I benchmarked zstd decompression using the same method on the same machine. The benchmark file is located in the upstream zstd repo under `contrib/linux-kernel/zstd_decompress_test.c` [4]. The memory reported is the amount of memory required to decompress data compressed with the given compression level. If you know the maximum size of your input, you can reduce the memory usage of decompression irrespective of the compression level. | Method | Time (s) | MB/s | Adjusted MB/s | Memory (MB) | |----------|----------|---------|---------------|-------------| | none | 0.025 | 8479.54 | - | - | | zstd -1 | 0.358 | 592.15 | 636.60 | 0.84 | | zstd -3 | 0.396 | 535.32 | 571.40 | 1.46 | | zstd -5 | 0.396 | 535.32 | 571.40 | 1.46 | | zstd -10 | 0.374 | 566.81 | 607.42 | 2.51 | | zstd -15 | 0.379 | 559.34 | 598.84 | 4.61 | | zstd -19 | 0.412 | 514.54 | 547.77 | 8.80 | | zlib -1 | 0.940 | 225.52 | 231.68 | 0.04 | | zlib -3 | 0.883 | 240.08 | 247.07 | 0.04 | | zlib -6 | 0.844 | 251.17 | 258.84 | 0.04 | | zlib -9 | 0.837 | 253.27 | 287.64 | 0.04 | I ran a long series of tests and benchmarks on the btrfs side and the gains are very similar to the core benchmarks Nick ran" * 'zstd-minimal' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: squashfs: Add zstd support btrfs: Add zstd support lib: Add zstd modules lib: Add xxhash module
2017-09-14Merge tag 'kbuild-v4.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Use Make-builtin $(abspath ...) helper to get absolute path - Add W=2 extra warning option to detect unused macros - Use more KCONFIG_CONFIG instead hard-coded .config - Fix bugs of tar*-pkg targets * tag 'kbuild-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: buildtar: do not print successful message if tar returns error kbuild: buildtar: fix tar error when CONFIG_MODULES is disabled kbuild: Use KCONFIG_CONFIG in buildtar Kbuild: enable -Wunused-macros warning for "make W=2" kbuild: use $(abspath ...) instead of $(shell cd ... && /bin/pwd)
2017-09-14Merge tag 'for-4.14/dm-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mike Snitzer: - Some request-based DM core and DM multipath fixes and cleanups - Constify a few variables in DM core and DM integrity - Add bufio optimization and checksum failure accounting to DM integrity - Fix DM integrity to avoid checking integrity of failed reads - Fix DM integrity to use init_completion - A couple DM log-writes target fixes - Simplify DAX flushing by eliminating the unnecessary flush abstraction that was stood up for DM's use. * tag 'for-4.14/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dax: remove the pmem_dax_ops->flush abstraction dm integrity: use init_completion instead of COMPLETION_INITIALIZER_ONSTACK dm integrity: make blk_integrity_profile structure const dm integrity: do not check integrity for failed read operations dm log writes: fix >512b sectorsize support dm log writes: don't use all the cpu while waiting to log blocks dm ioctl: constify ioctl lookup table dm: constify argument arrays dm integrity: count and display checksum failures dm integrity: optimize writing dm-bufio buffers that are partially changed dm rq: do not update rq partially in each ending bio dm rq: make dm-sq requeuing behavior consistent with dm-mq behavior dm mpath: complain about unsupported __multipath_map_bio() return values dm mpath: avoid that building with W=1 causes gcc 7 to complain about fall-through
2017-09-14Merge tag 'fbdev-v4.14' of git://github.com/bzolnier/linuxLinus Torvalds
Pull fbdev updates from Bartlomiej Zolnierkiewicz: - make fbcon a built-time depency for fbdev (fbcon was tristate option before, now it is a bool) - this is a first step in preparations for making console_lock usage saner (currently it acts like the BKL for all things fbdev/fbcon) (Daniel Vetter) - add fbcon=margin:<color> command line option to select the fbcon margin color (David Lechner) - add DMI quirk table for x86 systems which need fbcon rotation (devices like Asus T100HA, GPD Pocket, the GPD win and the I.T.Works TW891) (Hans de Goede) - fix 1bpp logo support for unusual width (needed by LEGO MINDSTORMS EV3) (David Lechner) - enable Xilinx FB driver for ARM ZynqMP platform (Michal Simek) - fix use after free in the error path of udlfb driver (Anton Vasilyev) - fix error return code handling in pxa3xx_gcu driver (Gustavo A. R. Silva) - fix bootparams.screeninfo arguments checking in vgacon (Jan H. Schönherr) - do not leak uninitialized padding in clk to userspace in the debug code of atyfb driver (Vladis Dronov) - fix compiler warnings in fbcon code and matroxfb driver (Arnd Bergmann) - convert fbdev susbsytem to using %pOF instead of full_name (Rob Herring) - structures constifications (Arvind Yadav, Bhumika Goyal, Gustavo A. R. Silva, Julia Lawall) - misc cleanups (Gustavo A. R. Silva, Hyun Kwon, Julia Lawall, Kuninori Morimoto, Lynn Lei) * tag 'fbdev-v4.14' of git://github.com/bzolnier/linux: (75 commits) video/console: Update BIOS dates list for GPD win console rotation DMI quirk video/console: Add rotated LCD-panel DMI quirk for the VIOS LTH17 video: fbdev: sis: fix duplicated code for different branches video: fbdev: make fb_var_screeninfo const video: fbdev: aty: do not leak uninitialized padding in clk to userspace vgacon: Prevent faulty bootparams.screeninfo from causing harm video: fbdev: make fb_videomode const video/console: Add new BIOS date for GPD pocket to dmi quirk table fbcon: remove restriction on margin color video: ARM CLCD: constify amba_id video: fm2fb: constify zorro_device_id video: fbdev: annotate fb_fix_screeninfo with const and __initconst omapfb: constify omap_video_timings structures video: fbdev: udlfb: Fix use after free on dlfb_usb_probe error path fbdev: i810: make fb_ops const fbdev: matrox: make fb_ops const video: fbdev: pxa3xx_gcu: fix error return code in pxa3xx_gcu_probe() video: fbdev: Enable Xilinx FB for ZynqMP video: fbdev: Fix multiple style issues in xilinxfb video: fbdev: udlfb: constify usb_device_id. ...
2017-09-14Merge git://www.linux-watchdog.org/linux-watchdogLinus Torvalds
Pull watchdog updates from Wim Van Sebroeck: - add support for the watchdog on Meson8 and Meson8m2 - add support for MediaTek MT7623 and MT7622 SoC - add support for the r8a77995 wdt - explicitly request exclusive reset control for asm9260_wdt, zx2967_wdt, rt2880_wdt and mt7621_wdt - improvements to asm9260_wdt, aspeed_wdt, renesas_wdt and cadence_wdt - add support for reading freq via CCF + suspend/resume support for of_xilinx_wdt - constify watchdog_ops and various device-id structures - revert of commit 1fccb73011ea ("iTCO_wdt: all versions count down twice") (Bug 196509) * git://www.linux-watchdog.org/linux-watchdog: (40 commits) watchdog: mei_wdt: constify mei_cl_device_id watchdog: sp805: constify amba_id watchdog: ziirave: constify i2c_device_id watchdog: sc1200: constify pnp_device_id dt-bindings: watchdog: renesas-wdt: Add support for the r8a77995 wdt watchdog: renesas_wdt: update copyright dates watchdog: renesas_wdt: make 'clk' a variable local to probe() watchdog: renesas_wdt: consistently use RuntimePM for clock management watchdog: aspeed: Support configuration of external signal properties dt-bindings: watchdog: aspeed: External reset signal properties drivers/watchdog: Add optional ASPEED device tree properties drivers/watchdog: ASPEED reference dev tree properties for config watchdog: da9063_wdt: Simplify by removing unneeded struct... watchdog: bcm7038: Check the return value from clk_prepare_enable() watchdog: qcom: Check for platform_get_resource() failure watchdog: of_xilinx_wdt: Add suspend/resume support watchdog: of_xilinx_wdt: Add support for reading freq via CCF dt-bindings: watchdog: mediatek: add support for MediaTek MT7623 and MT7622 SoC watchdog: max77620_wdt: constify platform_device_id watchdog: pcwd_usb: constify usb_device_id ...
2017-09-14Merge branch 'dmi-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging Pull dmi update from Jean Delvare: "Mark all struct dmi_system_id instances const" * 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging: dmi: Mark all struct dmi_system_id instances const
2017-09-14Merge tag 'pinctrl-v4.14-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: "This slew of fixes for pin control was noticed and patched up early, so to get the annoyance out of the way for -rc1 it would make sense to send them already. - Fix a build include in the Uniphier driver to keep pace with ongoing refactorings. - Fix a slew of minor semantic and syntactic issues as well as stricting up Kconfig for the new Spreadtrum driver. - Fix the GPIO interrupt set-up on the Marvell 37xx Armada as fallout for dynamically allocating irq descriptors from the core. (Also tagged for stable.) - Fix AMD register suspend/resume state spool/unspooling so that wakeup works as it should. (Also tagged for stable.)" * tag 'pinctrl-v4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl/amd: save pin registers over suspend/resume pinctrl: armada-37xx: Fix gpio interrupt setup pinctrl: sprd: fix off by one bugs pinctrl: sprd: check for allocation failure pinctrl: sprd: Restrict PINCTRL_SPRD to ARCH_SPRD or COMPILE_TEST pinctrl: sprd: fix build errors and dependencies pinctrl: sprd: make three local functions static pinctrl: uniphier: include <linux/build_bug.h> instead of <linux/bug.h>
2017-09-14Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "A few leftovers" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm, page_owner: skip unnecessary stack_trace entries arm64: stacktrace: avoid listing stacktrace functions in stacktrace mm: treewide: remove GFP_TEMPORARY allocation flag IB/mlx4: fix sprintf format warning fscache: fix fscache_objlist_show format processing lib/test_bitmap.c: use ULL suffix for 64-bit constants procfs: remove unused variable drivers/media/cec/cec-adap.c: fix build with gcc-4.4.4 idr: remove WARN_ON_ONCE() when trying to replace negative ID
2017-09-14sched/wait: Introduce wakeup boomark in wake_up_page_bitTim Chen
Now that we have added breaks in the wait queue scan and allow bookmark on scan position, we put this logic in the wake_up_page_bit function. We can have very long page wait list in large system where multiple pages share the same wait list. We break the wake up walk here to allow other cpus a chance to access the list, and not to disable the interrupts when traversing the list for too long. This reduces the interrupt and rescheduling latency, and excessive page wait queue lock hold time. [ v2: Remove bookmark_wake_function ] Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-14sched/wait: Break up long wake list walkTim Chen
We encountered workloads that have very long wake up list on large systems. A waker takes a long time to traverse the entire wake list and execute all the wake functions. We saw page wait list that are up to 3700+ entries long in tests of large 4 and 8 socket systems. It took 0.8 sec to traverse such list during wake up. Any other CPU that contends for the list spin lock will spin for a long time. It is a result of the numa balancing migration of hot pages that are shared by many threads. Multiple CPUs waking are queued up behind the lock, and the last one queued has to wait until all CPUs did all the wakeups. The page wait list is traversed with interrupt disabled, which caused various problems. This was the original cause that triggered the NMI watch dog timer in: https://patchwork.kernel.org/patch/9800303/ . Only extending the NMI watch dog timer there helped. This patch bookmarks the waker's scan position in wake list and break the wake up walk, to allow access to the list before the waker resume its walk down the rest of the wait list. It lowers the interrupt and rescheduling latency. This patch also provides a performance boost when combined with the next patch to break up page wakeup list walk. We saw 22% improvement in the will-it-scale file pread2 test on a Xeon Phi system running 256 threads. [ v2: Merged in Linus' changes to remove the bookmark_wake_function, and simply access to flags. ] Reported-by: Kan Liang <kan.liang@intel.com> Tested-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-14dmi: Mark all struct dmi_system_id instances constChristoph Hellwig
... and __initconst if applicable. Based on similar work for an older kernel in the Grsecurity patch. [JD: fix toshiba-wmi build] [JD: add htcpen] [JD: move __initconst where checkscript wants it] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jean Delvare <jdelvare@suse.de>
2017-09-13mm, page_owner: skip unnecessary stack_trace entriesPrakash Gupta
The page_owner stacktrace always begin as follows: [<ffffff987bfd48f4>] save_stack+0x40/0xc8 [<ffffff987bfd4da8>] __set_page_owner+0x3c/0x6c These two entries do not provide any useful information and limits the available stacktrace depth. The page_owner stacktrace was skipping caller function from stack entries but this was missed with commit f2ca0b557107 ("mm/page_owner: use stackdepot to store stacktrace") Example page_owner entry after the patch: Page allocated via order 0, mask 0x8(ffffff80085fb714) PFN 654411 type Movable Block 639 type CMA Flags 0x0(ffffffbe5c7f12c0) [<ffffff9b64989c14>] post_alloc_hook+0x70/0x80 ... [<ffffff9b651216e8>] msm_comm_try_state+0x5f8/0x14f4 [<ffffff9b6512486c>] msm_vidc_open+0x5e4/0x7d0 [<ffffff9b65113674>] msm_v4l2_open+0xa8/0x224 Link: http://lkml.kernel.org/r/1504078343-28754-2-git-send-email-guptap@codeaurora.org Fixes: f2ca0b557107 ("mm/page_owner: use stackdepot to store stacktrace") Signed-off-by: Prakash Gupta <guptap@codeaurora.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-13arm64: stacktrace: avoid listing stacktrace functions in stacktracePrakash Gupta
The stacktraces always begin as follows: [<c00117b4>] save_stack_trace_tsk+0x0/0x98 [<c0011870>] save_stack_trace+0x24/0x28 ... This is because the stack trace code includes the stack frames for itself. This is incorrect behaviour, and also leads to "skip" doing the wrong thing (which is the number of stack frames to avoid recording.) Perversely, it does the right thing when passed a non-current thread. Fix this by ensuring that we have a known constant number of frames above the main stack trace function, and always skip these. This was fixed for arch arm by commit 3683f44c42e9 ("ARM: stacktrace: avoid listing stacktrace functions in stacktrace") Link: http://lkml.kernel.org/r/1504078343-28754-1-git-send-email-guptap@codeaurora.org Signed-off-by: Prakash Gupta <guptap@codeaurora.org> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-13mm: treewide: remove GFP_TEMPORARY allocation flagMichal Hocko
GFP_TEMPORARY was introduced by commit e12ba74d8ff3 ("Group short-lived and reclaimable kernel allocations") along with __GFP_RECLAIMABLE. It's primary motivation was to allow users to tell that an allocation is short lived and so the allocator can try to place such allocations close together and prevent long term fragmentation. As much as this sounds like a reasonable semantic it becomes much less clear when to use the highlevel GFP_TEMPORARY allocation flag. How long is temporary? Can the context holding that memory sleep? Can it take locks? It seems there is no good answer for those questions. The current implementation of GFP_TEMPORARY is basically GFP_KERNEL | __GFP_RECLAIMABLE which in itself is tricky because basically none of the existing caller provide a way to reclaim the allocated memory. So this is rather misleading and hard to evaluate for any benefits. I have checked some random users and none of them has added the flag with a specific justification. I suspect most of them just copied from other existing users and others just thought it might be a good idea to use without any measuring. This suggests that GFP_TEMPORARY just motivates for cargo cult usage without any reasoning. I believe that our gfp flags are quite complex already and especially those with highlevel semantic should be clearly defined to prevent from confusion and abuse. Therefore I propose dropping GFP_TEMPORARY and replace all existing users to simply use GFP_KERNEL. Please note that SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and so they will be placed properly for memory fragmentation prevention. I can see reasons we might want some gfp flag to reflect shorterm allocations but I propose starting from a clear semantic definition and only then add users with proper justification. This was been brought up before LSF this year by Matthew [1] and it turned out that GFP_TEMPORARY really doesn't have a clear semantic. It seems to be a heuristic without any measured advantage for most (if not all) its current users. The follow up discussion has revealed that opinions on what might be temporary allocation differ a lot between developers. So rather than trying to tweak existing users into a semantic which they haven't expected I propose to simply remove the flag and start from scratch if we really need a semantic for short term allocations. [1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org [akpm@linux-foundation.org: fix typo] [akpm@linux-foundation.org: coding-style fixes] [sfr@canb.auug.org.au: drm/i915: fix up] Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Mel Gorman <mgorman@suse.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Cc: Neil Brown <neilb@suse.de> Cc: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-13IB/mlx4: fix sprintf format warningArnd Bergmann
gcc-7 points out that a negative port_num value would overflow the string buffer: drivers/infiniband/hw/mlx4/sysfs.c: In function 'mlx4_ib_device_register_sysfs': drivers/infiniband/hw/mlx4/sysfs.c:251:16: error: 'sprintf' may write a terminating nul past the end of the destination [-Werror=format-overflow=] drivers/infiniband/hw/mlx4/sysfs.c:251:2: note: 'sprintf' output between 2 and 11 bytes into a destination of size 10 drivers/infiniband/hw/mlx4/sysfs.c:303:17: error: 'sprintf' may write a terminating nul past the end of the destination [-Werror=format-overflow=] drivers/infiniband/hw/mlx4/sysfs.c:303:3: note: 'sprintf' output between 2 and 11 bytes into a destination of size 10 While we should be able to assume that port_num is positive here, making the buffer one byte longer has no downsides and avoids the warning. Fixes: c1e7e466120b ("IB/mlx4: Add iov directory in sysfs under the ib device") Link: http://lkml.kernel.org/r/20170714120720.906842-23-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-13fscache: fix fscache_objlist_show format processingArnd Bergmann
gcc points out a minor bug in the handling of unknown cookie types, which could result in a string overflow when the integer is copied into a 3-byte string: fs/fscache/object-list.c: In function 'fscache_objlist_show': fs/fscache/object-list.c:265:19: error: 'sprintf' may write a terminating nul past the end of the destination [-Werror=format-overflow=] sprintf(_type, "%02u", cookie->def->type); ^~~~~~ fs/fscache/object-list.c:265:4: note: 'sprintf' output between 3 and 4 bytes into a destination of size 3 This is currently harmless as no code sets a type other than 0 or 1, but it makes sense to use snprintf() here to avoid overflowing the array if that changes. Link: http://lkml.kernel.org/r/20170714120720.906842-22-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-13lib/test_bitmap.c: use ULL suffix for 64-bit constantsGeert Uytterhoeven
With gcc 4.1.2: lib/test_bitmap.c:189: warning: integer constant is too large for `long' type lib/test_bitmap.c:190: warning: integer constant is too large for `long' type lib/test_bitmap.c:194: warning: integer constant is too large for `long' type lib/test_bitmap.c:195: warning: integer constant is too large for `long' type Add the missing "ULL" suffix to fix this. Link: http://lkml.kernel.org/r/1505040523-31230-1-git-send-email-geert@linux-m68k.org Fixes: 60ef690018b262dd ("bitmap: introduce BITMAP_FROM_U64()") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Yury Norov <ynorov@caviumnetworks.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-13procfs: remove unused variableArnd Bergmann
In NOMMU configurations, we get a warning about a variable that has become unused: fs/proc/task_nommu.c: In function 'nommu_vma_show': fs/proc/task_nommu.c:148:28: error: unused variable 'priv' [-Werror=unused-variable] Link: http://lkml.kernel.org/r/20170911200231.3171415-1-arnd@arndb.de Fixes: 1240ea0dc3bb ("fs, proc: remove priv argument from is_stack") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-13drivers/media/cec/cec-adap.c: fix build with gcc-4.4.4Andrew Morton
gcc-4.4.4 has issues with initialization of anonymous unions: drivers/media/cec/cec-adap.c: In function 'cec_queue_msg_fh': drivers/media/cec/cec-adap.c:184: error: unknown field 'lost_msgs' specified in initializer work around this. Fixes: 6b2bbb08747a5 ("media: cec: rework the cec event handling") Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Hans Verkuil <hans.verkuil@cisco.com> Cc: Maxime Ripard <maxime.ripard@free-electrons.com> Cc: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-13idr: remove WARN_ON_ONCE() when trying to replace negative IDEric Biggers
IDR only supports non-negative IDs. There used to be a 'WARN_ON_ONCE(id < 0)' in idr_replace(), but it was intentionally removed by commit 2e1c9b286765 ("idr: remove WARN_ON_ONCE() on negative IDs"). Then it was added back by commit 0a835c4f090a ("Reimplement IDR and IDA using the radix tree"). However it seems that adding it back was a mistake, given that some users such as drm_gem_handle_delete() (DRM_IOCTL_GEM_CLOSE) pass in a value from userspace to idr_replace(), allowing the WARN_ON_ONCE to be triggered. drm_gem_handle_delete() actually just wants idr_replace() to return an error code if the ID is not allocated, including in the case where the ID is invalid (negative). So once again remove the bogus WARN_ON_ONCE(). This bug was found by syzkaller, which encountered the following warning: WARNING: CPU: 3 PID: 3008 at lib/idr.c:157 idr_replace+0x1d8/0x240 lib/idr.c:157 Kernel panic - not syncing: panic_on_warn set ... CPU: 3 PID: 3008 Comm: syzkaller218828 Not tainted 4.13.0-rc4-next-20170811 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:190 do_trap_no_signal arch/x86/kernel/traps.c:224 [inline] do_trap+0x260/0x390 arch/x86/kernel/traps.c:273 do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:310 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:323 invalid_op+0x1e/0x30 arch/x86/entry/entry_64.S:930 RIP: 0010:idr_replace+0x1d8/0x240 lib/idr.c:157 RSP: 0018:ffff8800394bf9f8 EFLAGS: 00010297 RAX: ffff88003c6c60c0 RBX: 1ffff10007297f43 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800394bfa78 RBP: ffff8800394bfae0 R08: ffffffff82856487 R09: 0000000000000000 R10: ffff8800394bf9a8 R11: ffff88006c8bae28 R12: ffffffffffffffff R13: ffff8800394bfab8 R14: dffffc0000000000 R15: ffff8800394bfbc8 drm_gem_handle_delete+0x33/0xa0 drivers/gpu/drm/drm_gem.c:297 drm_gem_close_ioctl+0xa1/0xe0 drivers/gpu/drm/drm_gem.c:671 drm_ioctl_kernel+0x1e7/0x2e0 drivers/gpu/drm/drm_ioctl.c:729 drm_ioctl+0x72e/0xa50 drivers/gpu/drm/drm_ioctl.c:825 vfs_ioctl fs/ioctl.c:45 [inline] do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:685 SYSC_ioctl fs/ioctl.c:700 [inline] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691 entry_SYSCALL_64_fastpath+0x1f/0xbe Here is a C reproducer: #include <fcntl.h> #include <stddef.h> #include <stdint.h> #include <sys/ioctl.h> #include <drm/drm.h> int main(void) { int cardfd = open("/dev/dri/card0", O_RDONLY); ioctl(cardfd, DRM_IOCTL_GEM_CLOSE, &(struct drm_gem_close) { .handle = -1 } ); } Link: http://lkml.kernel.org/r/20170906235306.20534-1-ebiggers3@gmail.com Fixes: 0a835c4f090a ("Reimplement IDR and IDA using the radix tree") Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: <stable@vger.kernel.org> [v4.11+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-13Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "A handful of tooling fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf stat: Wait for the correct child perf tools: Support running perf binaries with a dash in their name perf config: Check not only section->from_system_config but also item's perf ui progress: Fix progress update perf ui progress: Make sure we always define step value perf tools: Open perf.data with O_CLOEXEC flag tools lib api: Fix make DEBUG=1 build perf tests: Fix compile when libunwind's unwind.h is available tools include linux: Guard against redefinition of some macros
2017-09-13Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Three CPU hotplug related fixes and a debugging improvement" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/debug: Add debugfs knob for "sched_debug" sched/core: WARN() when migrating to an offline CPU sched/fair: Plug hole between hotplug and active_load_balance() sched/fair: Avoid newidle balance for !active CPUs
2017-09-13Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "The main changes are the PCID fixes from Andy, but there's also two hyperv fixes and two paravirt updates" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/hyper-v: Remove duplicated HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED definition x86/hyper-V: Allocate the IDT entry early in boot paravirt: Switch maintainer x86/paravirt: Remove no longer used paravirt functions x86/mm/64: Initialize CR4.PCIDE early x86/hibernate/64: Mask off CR3's PCID bits in the saved CR3 x86/mm: Get rid of VM_BUG_ON in switch_tlb_irqs_off()
2017-09-13Merge tag 'openrisc-for-linus' of git://github.com/openrisc/linuxLinus Torvalds
Pull OpenRISC fixlet from Stafford Horne: "Fix warning for upcoming work to remove linux/vmalloc.h from asm-generic/io.h" * tag 'openrisc-for-linus' of git://github.com/openrisc/linux: openrisc: add forward declaration for struct vm_area_struct
2017-09-13Merge tag 'modules-for-v4.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux Pull modules updates from Jessica Yu: "Summary of modules changes for the 4.14 merge window: - minor code cleanups and fixes - modpost: avoid building modules that have names that exceed the size of the name field in struct module" * tag 'modules-for-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux: module: Remove const attribute from alias for MODULE_DEVICE_TABLE module: fix ddebug_remove_module() modpost: abort if module name is too long
2017-09-13Fix up MAINTAINERS file sortingLinus Torvalds
Another merge window, another MAINTAINERS file disaster. People have serious problems with the alphabet and sorting, and poor Jérôme Glisse and Radim Krčmář get their names mangled by locale issues, turning them into some mangled mess (probably others do too, but those two stood out when sorting things again). And we now have two copies of the same 'AS3645A LED FLASH CONTROLLER DRIVER' in the tree and in the MAINTAINERS file, but that's a separate issue - the duplication is real, and I left them as two entries for the same name. This does not try to sort the actual section pattern entries, although I may end up doing that later. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-13Merge tag 'clk-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk updates from Stephen Boyd: "The diff is dominated by the Allwinner A10/A20 SoCs getting converted to the sunxi-ng framework. Otherwise, the heavy hitters are various drivers for SoCs like AT91, Amlogic, Renesas, and Rockchip. There are some other new clk drivers in here too but overall this is just a bunch of clk drivers for various different pieces of hardware and a collection of non-critical fixes for clk drivers. New Drivers: - Allwinner R40 SoCs - Renesas R-Car Gen3 USB 2.0 clock selector PHY - Atmel AT91 audio PLL - Uniphier PXs3 SoCs - ARC HSDK Board PLLs - AXS10X Board PLLs - STMicroelectronics STM32H743 SoCs Removed Drivers: - Non-compiling mb86s7x support Updates: - Allwinner A10/A20 SoCs converted to sunxi-ng framework - Allwinner H3 CPU clk fixes - Renesas R-Car D3 SoC - Renesas V2H and M3-W modules - Samsung Exynos5420/5422/5800 audio fixes - Rockchip fractional clk approximation fixes - Rockchip rk3126 SoC support within the rk3128 driver - Amlogic gxbb CEC32 and sd_emmc clks - Amlogic meson8b reset controller support - IDT VersaClock 5P49V5925/5P49V6901 support - Qualcomm MSM8996 SMMU clks - Various 'const' applications for struct clk_ops - si5351 PLL reset bugfix - Uniphier audio on LD11/LD20 and ethernet support on LD11/LD20/Pro4/PXs2 - Assorted Tegra clk driver fixes" * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (120 commits) clk: si5351: fix PLL reset ASoC: atmel-classd: remove aclk clock ASoC: atmel-classd: remove aclk clock from DT binding clk: at91: clk-generated: make gclk determine audio_pll rate clk: at91: clk-generated: create function to find best_diff clk: at91: add audio pll clock drivers dt-bindings: clk: at91: add audio plls to the compatible list clk: at91: clk-generated: remove useless divisor loop clk: mb86s7x: Drop non-building driver clk: ti: check for null return in strrchr to avoid null dereferencing clk: Don't write error code into divider register clk: uniphier: add video input subsystem clock clk: uniphier: add audio system clock clk: stm32h7: Add stm32h743 clock driver clk: gate: expose clk_gate_ops::is_enabled clk: nxp: clk-lpc32xx: rename clk_gate_is_enabled() clk: uniphier: add PXs3 clock data clk: hi6220: change watchdog clock source clk: Kconfig: Name RK805 in Kconfig for COMMON_CLK_RK808 clk: cs2000: Add cs2000_set_saved_rate ...
2017-09-13Merge tag 'rtc-4.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux Pull RTC updates from Alexandre Belloni: "Subsystem: - remove .open() and .release() RTC ops - constify i2c_device_id New driver: - Realtek RTD1295 - Android emulator (goldfish) RTC Drivers: - ds1307: Beginning of a huge cleanup - s35390a: handle invalid RTC time - sun6i: external oscillator gate support" * tag 'rtc-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (40 commits) rtc: ds1307: use octal permissions rtc: ds1307: fix braces rtc: ds1307: fix alignments and blank lines rtc: ds1307: use BIT rtc: ds1307: use u32 rtc: ds1307: use sizeof rtc: ds1307: remove regs member rtc: Add Realtek RTD1295 dt-bindings: rtc: Add Realtek RTD1295 rtc: sun6i: Add support for the external oscillator gate rtc: goldfish: Add RTC driver for Android emulator dt-bindings: Add device tree binding for Goldfish RTC driver rtc: ds1307: add basic support for ds1341 chip rtc: ds1307: remove member nvram_offset from struct ds1307 rtc: ds1307: factor out offset to struct chip_desc rtc: ds1307: factor out rtc_ops to struct chip_desc rtc: ds1307: factor out irq_handler to struct chip_desc rtc: ds1307: improve irq setup rtc: ds1307: constify struct chip_desc variables rtc: ds1307: improve trickle charger initialization ...
2017-09-13Merge tag 'sound-fix-4.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Most of the commits are trivial cleanup patches, while one commit is a significant fix for the race at ALSA sequencer that was spotted by syzkaller" * tag 'sound-fix-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: seq: Cancel pending autoload work at unbinding device ALSA: firewire: Use common error handling code in snd_motu_stream_start_duplex() ALSA: asihpi: Kill BUG_ON() usages ALSA: core: Use %pS printk format for direct addresses ALSA: ymfpci: Use common error handling code in snd_ymfpci_create() ALSA: ymfpci: Use common error handling code in snd_card_ymfpci_probe() ALSA: 6fire: Use common error handling code in usb6fire_chip_probe() ALSA: usx2y: Use common error handling code in submit_urbs() ALSA: us122l: Use common error handling code in us122l_create_card() ALSA: hdspm: Use common error handling code in snd_hdspm_probe() ALSA: rme9652: Use common code in hdsp_get_iobox_version() ALSA: maestro3: Use common error handling code in two functions
2017-09-13Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull SCSI fixes from James Bottomley: "A tiny update: one patch corrects a Kconfig problem with the shift of the SAS SMP code to BSG and the other removes a vestige of user space target mode" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: scsi_transport_sas: select BLK_DEV_BSGLIB scsi: Remove Scsi_Host.uspace_req_q
2017-09-13Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "Small collection of fixes that would be nice to have in -rc1. This contains: - NVMe pull request form Christoph, mostly with fixes for nvme-pci, host memory buffer in particular. - Error handling fixup for cgwb_create(), in case allocation of 'wb' fails. From Christophe Jaillet. - Ensure that trace_block_getrq() gets the 'dev' in an appropriate fashion, to avoid a potential NULL deref. From Greg Thelen. - Regression fix for dm-mq with blk-mq, fixing a problem with stacking IO schedulers. From me. - string.h fixup, fixing an issue with memcpy_and_pad(). This original change came in through an NVMe dependency, which is why I'm including it here. From Martin Wilck. - Fix potential int overflow in __blkdev_sectors_to_bio_pages(), from Mikulas. - MBR enable fix for sed-opal, from Scott" * 'for-linus' of git://git.kernel.dk/linux-block: block: directly insert blk-mq request from blk_insert_cloned_request() mm/backing-dev.c: fix an error handling path in 'cgwb_create()' string.h: un-fortify memcpy_and_pad nvme-pci: implement the HMB entry number and size limitations nvme-pci: propagate (some) errors from host memory buffer setup nvme-pci: use appropriate initial chunk size for HMB allocation nvme-pci: fix host memory buffer allocation fallback nvme: fix lightnvm check block: fix integer overflow in __blkdev_sectors_to_bio_pages() block: sed-opal: Set MBRDone on S3 resume path if TPER is MBREnabled block: tolerate tracing of NULL bio
2017-09-13Merge tag 'docs-4.14' of git://git.lwn.net/linuxLinus Torvalds
Pull documentation fixes from Jonathan Corbet: "A cleanup from Mauro that needed to wait for the media pull, plus a handful of other fixes that wandered in" * tag 'docs-4.14' of git://git.lwn.net/linux: kokr/memory-barriers.txt: Apply atomic_t.txt change kokr/doc: Update memory-barriers.txt for read-to-write dependencies docs-rst: don't require adjustbox anymore docs-rst: conf.py: only setup notice box colors if Sphinx < 1.6 docs-rst: conf.py: remove lscape from LaTeX preamble
2017-09-13Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse updates from Miklos Szeredi: "This fixes a regression (spotted by the Sandstorm.io folks) in the pid namespace handling introduced in 4.12. There's also a fix for honoring sync/dsync flags for pwritev2()" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: getattr cleanup fuse: honor iocb sync flags on write fuse: allow server to run in different pid_ns
2017-09-13Merge branch 'overlayfs-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs updates from Miklos Szeredi: "This fixes d_ino correctness in readdir, which brings overlayfs on par with normal filesystems regarding inode number semantics, as long as all layers are on the same filesystem. There are also some bug fixes, one in particular (random ioctl's shouldn't be able to modify lower layers) that touches some vfs code, but of course no-op for non-overlay fs" * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: fix false positive ESTALE on lookup ovl: don't allow writing ioctl on lower layer ovl: fix relatime for directories vfs: add flags to d_real() ovl: cleanup d_real for negative ovl: constant d_ino for non-merge dirs ovl: constant d_ino across copy up ovl: fix readdir error value ovl: check snprintf return
2017-09-13x86/hyper-v: Remove duplicated HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED definitionVitaly Kuznetsov
Commits: 7dcf90e9e032 ("PCI: hv: Use vPCI protocol version 1.2") 628f54cc6451 ("x86/hyper-v: Support extended CPU ranges for TLB flush hypercalls") added the same definition and they came in through different trees. Fix the duplication. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: devel@linuxdriverproject.org Link: http://lkml.kernel.org/r/20170911150620.3998-1-vkuznets@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-13x86/hyper-V: Allocate the IDT entry early in bootK. Y. Srinivasan
Allocate the hypervisor callback IDT entry early in the boot sequence. The previous code would allocate the entry as part of registering the handler when the vmbus driver loaded, and this caused a problem for the IDT cleanup that Thomas is working on for v4.15. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: apw@canonical.com Cc: devel@linuxdriverproject.org Cc: gregkh@linuxfoundation.org Cc: jasowang@redhat.com Cc: olaf@aepfle.de Link: http://lkml.kernel.org/r/20170908231557.2419-1-kys@exchange.microsoft.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-13paravirt: Switch maintainerJuergen Gross
Jeremy Fitzhardinge is stepping down as a paravirt maintainer. I'll replace him. While at it, update the file list to the actual pattern. Signed-off-by: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: akataria@vmware.com Cc: chrisw@sous-sol.org Cc: jeremy@goop.org Cc: rusty@rustcorp.com.au Cc: virtualization@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20170905143407.9227-1-jgross@suse.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-13x86/paravirt: Remove no longer used paravirt functionsJuergen Gross
With removal of lguest some of the paravirt functions are no longer needed: ->read_cr4() ->store_idt() ->set_pmd_at() ->set_pud_at() ->pte_update() Remove them. Signed-off-by: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: akataria@vmware.com Cc: boris.ostrovsky@oracle.com Cc: chrisw@sous-sol.org Cc: jeremy@goop.org Cc: rusty@rustcorp.com.au Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/20170904102527.25409-1-jgross@suse.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-13x86/mm/64: Initialize CR4.PCIDE earlyAndy Lutomirski
cpu_init() is weird: it's called rather late (after early identification and after most MMU state is initialized) on the boot CPU but is called extremely early (before identification) on secondary CPUs. It's called just late enough on the boot CPU that its CR4 value isn't propagated to mmu_cr4_features. Even if we put CR4.PCIDE into mmu_cr4_features, we'd hit two problems. First, we'd crash in the trampoline code. That's fixable, and I tried that. It turns out that mmu_cr4_features is totally ignored by secondary_start_64(), though, so even with the trampoline code fixed, it wouldn't help. This means that we don't currently have CR4.PCIDE reliably initialized before we start playing with cpu_tlbstate. This is very fragile and tends to cause boot failures if I make even small changes to the TLB handling code. Make it more robust: initialize CR4.PCIDE earlier on the boot CPU and propagate it to secondary CPUs in start_secondary(). ( Yes, this is ugly. I think we should have improved mmu_cr4_features to actually control CR4 during secondary bootup, but that would be fairly intrusive at this stage. ) Signed-off-by: Andy Lutomirski <luto@kernel.org> Reported-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Tested-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Fixes: 660da7c9228f ("x86/mm: Enable CR4.PCIDE on supported systems") Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-13x86/hibernate/64: Mask off CR3's PCID bits in the saved CR3Andy Lutomirski
Jiri reported a resume-from-hibernation failure triggered by PCID. The root cause appears to be rather odd. The hibernation asm restores a CR3 value that comes from the image header. If the image kernel has PCID on, it's entirely reasonable for this CR3 value to have one of the low 12 bits set. The restore code restores it with CR4.PCIDE=0, which means that those low 12 bits are accepted by the CPU but are either ignored or interpreted as a caching mode. This is odd, but still works. We blow up later when the image kernel restores CR4, though, since changing CR4.PCIDE with CR3[11:0] != 0 is illegal. Boom! FWIW, it's entirely unclear to me what's supposed to happen if a PAE kernel restores a non-PAE image or vice versa. Ditto for LA57. Reported-by: Jiri Kosina <jikos@kernel.org> Tested-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 660da7c9228f ("x86/mm: Enable CR4.PCIDE on supported systems") Link: http://lkml.kernel.org/r/18ca57090651a6341e97083883f9e814c4f14684.1504847163.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-13x86/mm: Get rid of VM_BUG_ON in switch_tlb_irqs_off()Andy Lutomirski
If we hit the VM_BUG_ON(), we're detecting a genuinely bad situation, but we're very unlikely to get a useful call trace. Make it a warning instead. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Jiri Kosina <jikos@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/3b4e06bbb382ca54a93218407c93925ff5871546.1504847163.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-13Merge tag 'perf-urgent-for-mingo-4.14-20170912' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: - Fix TUI progress bar when delta from new total from that of the previous update is greater than the progress "step" (screen width progress bar block)) (Jiri Olsa) - Make tools/lib/api make DEBUG=1 build use -D_FORTIFY_SOURCE=2 not to cripple debuginfo, just like tools/perf/ does (Jiri Olsa) - Avoid leaking the 'perf.data' file to workloads started from the 'perf record' command line by using the O_CLOEXEC open flag (Jiri Olsa) - Fix building when libunwind's 'unwind.h' file is present in the include path, clashing with tools/perf/util/unwind.h (Milian Wolff) - Check per .perfconfig section entry flag, not just per section (Taeung Song) - Support running perf binaries with a dash in their name, needed to run perf as an AppImage (Milian Wolff) - Wait for the right child by using waitpid() when running workloads from 'perf stat', also to fix using perf as an AppImage (Milian Wolff) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-12Merge tag 'f2fs-for-4.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this round, we've mostly tuned f2fs to provide better user experience for Android. Especially, we've worked on atomic write feature again with SQLite community in order to support it officially. And we added or modified several facilities to analyze and enhance IO behaviors. Major changes include: - add app/fs io stat - add inode checksum feature - support project/journalled quota - enhance atomic write with new ioctl() which exposes feature set - enhance background gc/discard/fstrim flows with new gc_urgent mode - add F2FS_IOC_FS{GET,SET}XATTR - fix some quota flows" * tag 'f2fs-for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (63 commits) f2fs: hurry up to issue discard after io interruption f2fs: fix to show correct discard_granularity in sysfs f2fs: detect dirty inode in evict_inode f2fs: clear radix tree dirty tag of pages whose dirty flag is cleared f2fs: speed up gc_urgent mode with SSR f2fs: better to wait for fstrim completion f2fs: avoid race in between read xattr & write xattr f2fs: make get_lock_data_page to handle encrypted inode f2fs: use generic terms used for encrypted block management f2fs: introduce f2fs_encrypted_file for clean-up Revert "f2fs: add a new function get_ssr_cost" f2fs: constify super_operations f2fs: fix to wake up all sleeping flusher f2fs: avoid race in between atomic_read & atomic_inc f2fs: remove unneeded parameter of change_curseg f2fs: update i_flags correctly f2fs: don't check inode's checksum if it was dirtied or writebacked f2fs: don't need to update inode checksum for recovery f2fs: trigger fdatasync for non-atomic_write file f2fs: fix to avoid race in between aio and gc ...
2017-09-12Merge tag 'ceph-for-4.14-rc1' of git://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph updates from Ilya Dryomov: "The highlights include: - a large series of fixes and improvements to the snapshot-handling code (Zheng Yan) - individual read/write OSD requests passed down to libceph are now limited to 16M in size to avoid hitting OSD-side limits (Zheng Yan) - encode MStatfs v2 message to allow for more accurate space usage reporting (Douglas Fuller) - switch to the new writeback error tracking infrastructure (Jeff Layton)" * tag 'ceph-for-4.14-rc1' of git://github.com/ceph/ceph-client: (35 commits) ceph: stop on-going cached readdir if mds revokes FILE_SHARED cap ceph: wait on writeback after writing snapshot data ceph: fix capsnap dirty pages accounting ceph: ignore wbc->range_{start,end} when write back snapshot data ceph: fix "range cyclic" mode writepages ceph: cleanup local variables in ceph_writepages_start() ceph: optimize pagevec iterating in ceph_writepages_start() ceph: make writepage_nounlock() invalidate page that beyonds EOF ceph: properly get capsnap's size in get_oldest_context() ceph: remove stale check in ceph_invalidatepage() ceph: queue cap snap only when snap realm's context changes ceph: handle race between vmtruncate and queuing cap snap ceph: fix message order check in handle_cap_export() ceph: fix NULL pointer dereference in ceph_flush_snaps() ceph: adjust 36 checks for NULL pointers ceph: delete an unnecessary return statement in update_dentry_lease() ceph: ENOMEM pr_err in __get_or_create_frag() is redundant ceph: check negative offsets in ceph_llseek() ceph: more accurate statfs ceph: properly set snap follows for cap reconnect ...
2017-09-12xfs: XFS_IS_REALTIME_INODE() should be false if no rt device presentRichard Wareing
If using a kernel with CONFIG_XFS_RT=y and we set the RHINHERIT flag on a directory in a filesystem that does not have a realtime device and create a new file in that directory, it gets marked as a real time file. When data is written and a fsync is issued, the filesystem attempts to flush a non-existent rt device during the fsync process. This results in a crash dereferencing a null buftarg pointer in xfs_blkdev_issue_flush(): BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: xfs_blkdev_issue_flush+0xd/0x20 ..... Call Trace: xfs_file_fsync+0x188/0x1c0 vfs_fsync_range+0x3b/0xa0 do_fsync+0x3d/0x70 SyS_fsync+0x10/0x20 do_syscall_64+0x4d/0xb0 entry_SYSCALL64_slow_path+0x25/0x25 Setting RT inode flags does not require special privileges so any unprivileged user can cause this oops to occur. To reproduce, confirm kernel is compiled with CONFIG_XFS_RT=y and run: # mkfs.xfs -f /dev/pmem0 # mount /dev/pmem0 /mnt/test # mkdir /mnt/test/foo # xfs_io -c 'chattr +t' /mnt/test/foo # xfs_io -f -c 'pwrite 0 5m' -c fsync /mnt/test/foo/bar Or just run xfstests with MKFS_OPTIONS="-d rtinherit=1" and wait. Kernels built with CONFIG_XFS_RT=n are not exposed to this bug. Fixes: f538d4da8d52 ("[XFS] write barrier support") Cc: <stable@vger.kernel.org> Signed-off-by: Richard Wareing <rwareing@fb.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-12Merge tag 'dma-mapping-4.14' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds
Pull dma-mapping updates from Christoph Hellwig: - removal of the old dma_alloc_noncoherent interface - remove unused flags to dma_declare_coherent_memory - restrict OF DMA configuration to specific physical busses - use the iommu mailing list for dma-mapping questions and patches * tag 'dma-mapping-4.14' of git://git.infradead.org/users/hch/dma-mapping: dma-coherent: fix dma_declare_coherent_memory() logic error ARM: imx: mx31moboard: Remove unused 'dma' variable dma-coherent: remove an unused variable MAINTAINERS: use the iommu list for the dma-mapping subsystem dma-coherent: remove the DMA_MEMORY_MAP and DMA_MEMORY_IO flags dma-coherent: remove the DMA_MEMORY_INCLUDES_CHILDREN flag of: restrict DMA configuration dma-mapping: remove dma_alloc_noncoherent and dma_free_noncoherent i825xx: switch to switch to dma_alloc_attrs au1000_eth: switch to dma_alloc_attrs sgiseeq: switch to dma_alloc_attrs dma-mapping: reduce dma_mapping_error inline bloat
2017-09-12Merge tag 'uuid-for-4.14' of git://git.infradead.org/users/hch/uuidLinus Torvalds
Pull uuid updates from Christoph Hellwig: "Just a single conversion to the new UUID API for this merge window" * tag 'uuid-for-4.14' of git://git.infradead.org/users/hch/uuid: efi: switch to use new generic UUID API
2017-09-12Merge tag 'selinux-pr-20170831' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: "A relatively quiet period for SELinux, 11 patches with only two/three having any substantive changes. These noteworthy changes include another tweak to the NNP/nosuid handling, per-file labeling for cgroups, and an object class fix for AF_UNIX/SOCK_RAW sockets; the rest of the changes are minor tweaks or administrative updates (Stephen's email update explains the file explosion in the diffstat). Everything passes the selinux-testsuite" [ Also a couple of small patches from the security tree from Tetsuo Handa for Tomoyo and LSM cleanup. The separation of security policy updates wasn't all that clean - Linus ] * tag 'selinux-pr-20170831' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: constify nf_hook_ops selinux: allow per-file labeling for cgroupfs lsm_audit: update my email address selinux: update my email address MAINTAINERS: update the NetLabel and Labeled Networking information selinux: use GFP_NOWAIT in the AVC kmem_caches selinux: Generalize support for NNP/nosuid SELinux domain transitions selinux: genheaders should fail if too many permissions are defined selinux: update the selinux info in MAINTAINERS credits: update Paul Moore's info selinux: Assign proper class to PF_UNIX/SOCK_RAW sockets tomoyo: Update URLs in Documentation/admin-guide/LSM/tomoyo.rst LSM: Remove security_task_create() hook.
2017-09-12Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Two fixes: dead code removal, plus a SME memory encryption fix on 32-bit kernels that crashed Xen guests" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Remove unused and undefined __generic_processor_info() declaration x86/mm: Make the SME mask a u64
2017-09-12Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Three fixes: - fix a suspend/resume cpusets bug - fix a !CONFIG_NUMA_BALANCING bug - fix a kerneldoc warning" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/fair: Fix nuisance kernel-doc warning sched/cpuset/pm: Fix cpuset vs. suspend-resume bugs sched/fair: Fix wake_affine_llc() balancing rules