Age  Commit message  Author
2020-08-23Linux 5.9-rc2v5.9-rc2Linus Torvalds
2020-08-23Merge tag 'powerpc-5.9-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Add perf support for emitting extended registers for power10. - A fix for CPU hotplug on pseries, where on large/loaded systems we may not wait long enough for the CPU to be offlined, leading to crashes. - Addition of a raw cputable entry for Power10, which is not required to boot, but is required to make our PMU setup work correctly in guests. - Three fixes for the recent changes on 32-bit Book3S to move modules into their own segment for strict RWX. - A fix for a recent change in our powernv PCI code that could lead to crashes. - A change to our perf interrupt accounting to avoid soft lockups when using some events, found by syzkaller. - A change in the way we handle power loss events from the hypervisor on pseries. We no longer immediately shut down if we're told we're running on a UPS. - A few other minor fixes. Thanks to Alexey Kardashevskiy, Andreas Schwab, Aneesh Kumar K.V, Anju T Sudhakar, Athira Rajeev, Christophe Leroy, Frederic Barrat, Greg Kurz, Kajol Jain, Madhavan Srinivasan, Michael Neuling, Michael Roth, Nageswara R Sastry, Oliver O'Halloran, Thiago Jung Bauermann, Vaidyanathan Srinivasan, Vasant Hegde. 
* tag 'powerpc-5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/perf/hv-24x7: Move cpumask file to top folder of hv-24x7 driver powerpc/32s: Fix module loading failure when VMALLOC_END is over 0xf0000000 powerpc/pseries: Do not initiate shutdown when system is running on UPS powerpc/perf: Fix soft lockups due to missed interrupt accounting powerpc/powernv/pci: Fix possible crash when releasing DMA resources powerpc/pseries/hotplug-cpu: wait indefinitely for vCPU death powerpc/32s: Fix is_module_segment() when MODULES_VADDR is defined powerpc/kasan: Fix KASAN_SHADOW_START on BOOK3S_32 powerpc/fixmap: Fix the size of the early debug area powerpc/pkeys: Fix build error with PPC_MEM_KEYS disabled powerpc/kernel: Cleanup machine check function declarations powerpc: Add POWER10 raw mode cputable entry powerpc/perf: Add extended regs support for power10 platform powerpc/perf: Add support for outputting extended regs in perf intr_regs powerpc: Fix P10 PVR revision in /proc/cpuinfo for SMT4 cores
2020-08-23Merge tag 'x86-urgent-2020-08-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Thomas Gleixner: "A single fix for x86 which removes the RDPID usage from the paranoid entry path and unconditionally uses LSL to retrieve the CPU number. RDPID depends on MSR_TSC_AUX. KVM has an optimization to avoid expensive MSR read/writes on VMENTER/EXIT. It caches the MSR values and restores them either when leaving the run loop, on preemption or when going out to user space. MSR_TSC_AUX is part of that lazy MSR set, so after writing the guest value and before the lazy restore any exception using the paranoid entry will read the guest value and use it as CPU number to retrieve the GSBASE value for the current CPU when FSGSBASE is enabled. As RDPID is only used in that particular entry path, there is no reason to burden VMENTER/EXIT with two extra MSR writes. Remove the RDPID optimization, which is not even backed by numbers, from the paranoid entry path instead" * tag 'x86-urgent-2020-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/entry/64: Do not use RDPID in paranoid entry to accomodate KVM
2020-08-23Merge tag 'perf-urgent-2020-08-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 perf fix from Thomas Gleixner: "A single update for perf on x86 which adds support for the broken down bandwidth counters" * tag 'perf-urgent-2020-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/uncore: Add BW counters for GT, IA and IO breakdown
2020-08-23Merge tag 'efi-urgent-2020-08-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI fixes from Thomas Gleixner: - Enforce NX on RO data in mixed EFI mode - Destroy workqueue in an error handling path to prevent UAF - Stop argument parser at '--' which is the delimiter for init - Treat a NULL command line pointer as empty instead of dereferencing it unconditionally. - Handle an unterminated command line correctly - Clean up the 32bit code leftovers and remove obsolete documentation * tag 'efi-urgent-2020-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Documentation: efi: remove description of efi=old_map efi/x86: Move 32-bit code into efi_32.c efi/libstub: Handle unterminated cmdline efi/libstub: Handle NULL cmdline efi/libstub: Stop parsing arguments at "--" efi: add missed destroy_workqueue when efisubsys_init fails efi/x86: Mark kernel rodata non-executable for mixed mode
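Two of the EFI stub fixes above (stop at "--", treat a NULL command line as empty) can be illustrated with a small userspace model. This is only a sketch: model_count_kernel_args() is a hypothetical helper, not the libstub parser, and it does not model the unterminated command line case.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model: count the space-separated tokens that precede a
 * standalone "--". A NULL pointer is treated as an empty command line
 * instead of being dereferenced.
 */
static int model_count_kernel_args(const char *cmdline)
{
    const char *p = cmdline ? cmdline : "";
    int count = 0;

    while (*p) {
        size_t len;

        while (*p == ' ')       /* skip separators */
            p++;
        if (!*p)
            break;

        len = strcspn(p, " ");  /* length of the current token */
        if (len == 2 && p[0] == '-' && p[1] == '-')
            break;              /* everything after "--" belongs to init */

        count++;
        p += len;
    }
    return count;
}
```

A token such as "--foo" is still counted, because only a bare "--" acts as the delimiter in this model.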
2020-08-23Merge tag 'core-urgent-2020-08-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull entry fix from Thomas Gleixner: "A single bug fix for the common entry code. The transcription of the x86 version messed up the reload of the syscall number from pt_regs after ptrace and seccomp which breaks syscall number rewriting" * tag 'core-urgent-2020-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: core/entry: Respect syscall number rewrites
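The bug class described above can be shown with a minimal userspace model, assuming a tracer hook that may rewrite the syscall number stored in the register state. All names here (model_regs, model_tracer_rewrite, model_syscall_enter) are hypothetical and the value 39 is arbitrary; this is not the kernel's entry code.

```c
#include <stddef.h>

/* Hypothetical stand-in for the saved register state. */
struct model_regs {
    long orig_nr;   /* syscall number as seen by tracers */
};

typedef void (*model_tracer_fn)(struct model_regs *regs);

/* Example tracer that rewrites the syscall number (arbitrary value). */
static void model_tracer_rewrite(struct model_regs *regs)
{
    regs->orig_nr = 39;
}

/* Return the syscall number to dispatch after the tracer has run. */
static long model_syscall_enter(struct model_regs *regs, model_tracer_fn tracer)
{
    long nr = regs->orig_nr;

    if (tracer) {
        tracer(regs);
        /* Reload: the tracer may have rewritten the number. */
        nr = regs->orig_nr;
    }
    return nr;
}
```

Without the reload after the tracer call, the function would return the stale number and the rewrite would be lost, which is the behaviour the fix addresses.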
2020-08-23Merge tag 'edac_urgent_for_v5.9_rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC fix from Borislav Petkov: "A single fix correcting a reversed error severity determination check which led to a recoverable error getting marked as fatal, by Tony Luck" * tag 'edac_urgent_for_v5.9_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/{i7core,sb,pnd2,skx}: Fix error event severity
2020-08-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: "Nothing earth shattering here, lots of small fixes (f.e. missing RCU protection, bad ref counting, missing memset(), etc.) all over the place: 1) Use get_file_rcu() in task_file iterator, from Yonghong Song. 2) There are two ways to set remote source MAC addresses in macvlan driver, but only one of which validates things properly. Fix this. From Alvin Šipraga. 3) Missing of_node_put() in gianfar probing, from Sumera Priyadarsini. 4) Preserve device wanted feature bits across multiple netlink ethtool requests, from Maxim Mikityanskiy. 5) Fix rcu_sched stall in task and task_file bpf iterators, from Yonghong Song. 6) Avoid reset after device destroy in ena driver, from Shay Agroskin. 7) Missing memset() in netlink policy export reallocation path, from Johannes Berg. 8) Fix info leak in __smc_diag_dump(), from Peilin Ye. 9) Decapsulate ECN properly for ipv6 in ipv4 tunnels, from Mark Tomlinson. 10) Fix number of data stream negotiation in SCTP, from David Laight. 11) Fix double free in connection tracker action module, from Alaa Hleihel. 12) Don't allow empty NHA_GROUP attributes, from Nikolay Aleksandrov" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (46 commits) net: nexthop: don't allow empty NHA_GROUP bpf: Fix two typos in uapi/linux/bpf.h net: dsa: b53: check for timeout tipc: call rcu_read_lock() in tipc_aead_encrypt_done() net/sched: act_ct: Fix skb double-free in tcf_ct_handle_fragments() error flow net: sctp: Fix negotiation of the number of data streams. 
dt-bindings: net: renesas, ether: Improve schema validation gre6: Fix reception with IP6_TNL_F_RCV_DSCP_COPY hv_netvsc: Fix the queue_mapping in netvsc_vf_xmit() hv_netvsc: Remove "unlikely" from netvsc_select_queue bpf: selftests: global_funcs: Check err_str before strstr bpf: xdp: Fix XDP mode when no mode flags specified selftests/bpf: Remove test_align leftovers tools/resolve_btfids: Fix sections with wrong alignment net/smc: Prevent kernel-infoleak in __smc_diag_dump() sfc: fix build warnings on 32-bit net: phy: mscc: Fix a couple of spelling mistakes "spcified" -> "specified" libbpf: Fix map index used in error message net: gemini: Fix missing free_netdev() in error path of gemini_ethernet_port_probe() net: atlantic: Use readx_poll_timeout() for large timeout ...
2020-08-22Merge branch 'work.epoll' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull epoll fixes from Al Viro: "Fix reference counting and clean up exit paths" * 'work.epoll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: do_epoll_ctl(): clean the failure exits up a bit epoll: Keep a reference on files added to the check list
2020-08-22do_epoll_ctl(): clean the failure exits up a bitAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-08-22epoll: Keep a reference on files added to the check listMarc Zyngier
When adding a new fd to an epoll, and this new fd is itself an epoll fd, we recursively scan the fds attached to it to detect cycles, and add non-epoll files to a "check list" that gets subsequently parsed. However, this check list isn't completely safe when deletions can happen concurrently. To sidestep the issue, make sure that a struct file placed on the check list sees its f_count increased, ensuring that a concurrent deletion won't result in the file disappearing from under our feet. Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
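The idea of pinning an object while it sits on a temporary list can be sketched in plain C. This is a single-threaded userspace model under stated assumptions: struct model_file, check_list_add() and check_list_drain() are hypothetical names, and a plain int stands in for the kernel's atomic f_count.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct file; only the reference count matters. */
struct model_file {
    int f_count;                    /* number of live references */
    struct model_file *next_check;  /* link in the check list */
};

/* Take a reference, then push the file onto the check list. */
static void check_list_add(struct model_file **head, struct model_file *f)
{
    f->f_count++;                   /* pin the file while it is listed */
    f->next_check = *head;
    *head = f;
}

/* Pop every entry and drop the reference taken in check_list_add(). */
static int check_list_drain(struct model_file **head)
{
    int processed = 0;

    while (*head) {
        struct model_file *f = *head;

        *head = f->next_check;
        f->next_check = NULL;
        f->f_count--;               /* unpin */
        processed++;
    }
    return processed;
}
```

Because the list holds its own reference, a close that drops the caller's reference in the meantime leaves the count above zero until the list is drained.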
2020-08-22net: nexthop: don't allow empty NHA_GROUPNikolay Aleksandrov
Currently the nexthop code will use an empty NHA_GROUP attribute, but it requires at least 1 entry in order to function properly. Otherwise we end up dereferencing null or random pointers all over the place due to not having any nh_grp_entry members allocated; the nexthop code relies on having at least the first member present. Empty NHA_GROUP doesn't make any sense so just disallow it. Also add a WARN_ON for any future users of nexthop_create_group(). BUG: kernel NULL pointer dereference, address: 0000000000000080 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP CPU: 0 PID: 558 Comm: ip Not tainted 5.9.0-rc1+ #93 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014 RIP: 0010:fib_check_nexthop+0x4a/0xaa Code: 0f 84 83 00 00 00 48 c7 02 80 03 f7 81 c3 40 80 fe fe 75 12 b8 ea ff ff ff 48 85 d2 74 6b 48 c7 02 40 03 f7 81 c3 48 8b 40 10 <48> 8b 80 80 00 00 00 eb 36 80 78 1a 00 74 12 b8 ea ff ff ff 48 85 RSP: 0018:ffff88807983ba00 EFLAGS: 00010213 RAX: 0000000000000000 RBX: ffff88807983bc00 RCX: 0000000000000000 RDX: ffff88807983bc00 RSI: 0000000000000000 RDI: ffff88807bdd0a80 RBP: ffff88807983baf8 R08: 0000000000000dc0 R09: 000000000000040a R10: 0000000000000000 R11: ffff88807bdd0ae8 R12: 0000000000000000 R13: 0000000000000000 R14: ffff88807bea3100 R15: 0000000000000001 FS: 00007f10db393700(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000080 CR3: 000000007bd0f004 CR4: 00000000003706f0 Call Trace: fib_create_info+0x64d/0xaf7 fib_table_insert+0xf6/0x581 ? __vma_adjust+0x3b6/0x4d4 inet_rtm_newroute+0x56/0x70 rtnetlink_rcv_msg+0x1e3/0x20d ? rtnl_calcit.isra.0+0xb8/0xb8 netlink_rcv_skb+0x5b/0xac netlink_unicast+0xfa/0x17b netlink_sendmsg+0x334/0x353 sock_sendmsg_nosec+0xf/0x3f ____sys_sendmsg+0x1a0/0x1fc ? copy_msghdr_from_user+0x4c/0x61 ___sys_sendmsg+0x63/0x84 ? handle_mm_fault+0xa39/0x11b5 ?
sockfd_lookup_light+0x72/0x9a __sys_sendmsg+0x50/0x6e do_syscall_64+0x54/0xbe entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f10dacc0bb7 Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb cd 66 0f 1f 44 00 00 8b 05 9a 4b 2b 00 85 c0 75 2e 48 63 ff 48 63 d2 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 b1 f2 2a 00 f7 d8 64 89 02 48 RSP: 002b:00007ffcbe628bf8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007ffcbe628f80 RCX: 00007f10dacc0bb7 RDX: 0000000000000000 RSI: 00007ffcbe628c60 RDI: 0000000000000003 RBP: 000000005f41099c R08: 0000000000000001 R09: 0000000000000008 R10: 00000000000005e9 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00007ffcbe628d70 R15: 0000563a86c6e440 Modules linked in: CR2: 0000000000000080 CC: David Ahern <dsahern@gmail.com> Fixes: 430a049190de ("nexthop: Add support for nexthop groups") Reported-by: syzbot+a61aa19b0c14c8770bd9@syzkaller.appspotmail.com Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
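The validation described in this commit, rejecting a group attribute that carries no entries, can be sketched as a userspace check on the payload length. This is an illustrative model only: struct model_grp_entry and model_group_attr_check() are hypothetical, and the whole-entry length test is an assumption added for completeness rather than something stated in the commit message.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one nexthop group member. */
struct model_grp_entry {
    unsigned int id;
    unsigned char weight;
};

/*
 * Return 0 if the attribute payload holds at least one whole entry,
 * -EINVAL otherwise.
 */
static int model_group_attr_check(size_t payload_len)
{
    if (payload_len == 0)
        return -EINVAL;     /* empty group: nothing to point at */
    if (payload_len % sizeof(struct model_grp_entry) != 0)
        return -EINVAL;     /* trailing partial entry */
    return 0;
}
```

Rejecting the empty case at the attribute-parsing stage means later code can keep relying on the first member being present.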
2020-08-22Merge tag 'kbuild-fixes-v5.9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - move -Wsign-compare warning from W=2 to W=3 - fix the keyword _restrict to __restrict in genksyms - fix more bugs in qconf * tag 'kbuild-fixes-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kconfig: qconf: replace deprecated QString::sprintf() with QTextStream kconfig: qconf: remove redundant help in the info view kconfig: qconf: remove qInfo() to get back Qt4 support kconfig: qconf: remove unused colNr kconfig: qconf: fix the popup menu in the ConfigInfoView window kconfig: qconf: fix signal connection to invalid slots genksyms: keywords: Use __restrict not _restrict kbuild: remove redundant patterns in filter/filter-out extract-cert: add static to local data Makefile.extrawarn: Move sign-compare from W=2 to W=3
2020-08-22Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Allow booting of late secondary CPUs affected by erratum 1418040 (currently they are parked if none of the early CPUs are affected by this erratum). - Add the 32-bit vdso Makefile to the vdso_install rule so that 'make vdso_install' installs the 32-bit compat vdso when it is compiled. - Print a warning that untrusted guests without a CPU erratum workaround (Cortex-A57 832075) may deadlock the affected system. * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: ARM64: vdso32: Install vdso32 from vdso_install KVM: arm64: Print warning when cpu erratum can cause guests to deadlock arm64: Allow booting of late CPUs affected by erratum 1418040 arm64: Move handling of erratum 1418040 into C code
2020-08-22Merge tag 's390-5.9-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Vasily Gorbik: - a couple of fixes for storage key handling relevant for debugging - add cond_resched into potentially slow subchannels scanning loop - fixes for PF/VF linking and to ignore stale PCI configuration request events * tag 's390-5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/pci: fix PF/VF linking on hot plug s390/pci: re-introduce zpci_remove_device() s390/pci: fix zpci_bus_link_virtfn() s390/ptrace: fix storage key handling s390/runtime_instrumentation: fix storage key handling s390/pci: ignore stale configuration request event s390/cio: add cond_resched() in the slow_eval_known_fn() loop
2020-08-22Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: - PAE and PKU bugfixes for x86 - selftests fix for new binutils - MMU notifier fix for arm64 * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: arm64: Only reschedule if MMU_NOTIFIER_RANGE_BLOCKABLE is not set KVM: Pass MMU notifier range flags to kvm_unmap_hva_range() kvm: x86: Toggling CR4.PKE does not load PDPTEs in PAE mode kvm: x86: Toggling CR4.SMAP does not load PDPTEs in PAE mode KVM: x86: fix access code passed to gva_to_gpa selftests: kvm: Use a shorter encoding to clear RAX
2020-08-22Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "23 fixes in 5 drivers (qla2xxx, ufs, scsi_debug, fcoe, zfcp). The bulk of the changes are in qla2xxx and ufs and all are mostly small and definitely don't impact the core" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (23 commits) Revert "scsi: qla2xxx: Disable T10-DIF feature with FC-NVMe during probe" Revert "scsi: qla2xxx: Fix crash on qla2x00_mailbox_command" scsi: qla2xxx: Fix null pointer access during disconnect from subsystem scsi: qla2xxx: Check if FW supports MQ before enabling scsi: qla2xxx: Fix WARN_ON in qla_nvme_register_hba scsi: qla2xxx: Allow ql2xextended_error_logging special value 1 to be set anytime scsi: qla2xxx: Reduce noisy debug message scsi: qla2xxx: Fix login timeout scsi: qla2xxx: Indicate correct supported speeds for Mezz card scsi: qla2xxx: Flush I/O on zone disable scsi: qla2xxx: Flush all sessions on zone disable scsi: qla2xxx: Use MBX_TOV_SECONDS for mailbox command timeout values scsi: scsi_debug: Fix scp is NULL errors scsi: zfcp: Fix use-after-free in request timeout handlers scsi: ufs: No need to send Abort Task if the task in DB was cleared scsi: ufs: Clean up completed request without interrupt notification scsi: ufs: Improve interrupt handling for shared interrupts scsi: ufs: Fix interrupt error message for shared interrupts scsi: ufs-pci: Add quirk for broken auto-hibernate for Intel EHL scsi: ufs-mediatek: Fix incorrect time to wait link status ...
2020-08-22Merge tag 'devicetree-fixes-for-5.9-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fixes from Rob Herring: "Another set of DT fixes: - restore range parsing error check - workaround PCI range parsing with missing 'device_type' now required - correct description of 'phy-connection-type' - fix erroneous matching on 'snps,dw-pcie' by 'intel,lgm-pcie' schema - a couple of grammar and whitespace fixes - update Shawn Guo's email" * tag 'devicetree-fixes-for-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: dt-bindings: vendor-prefixes: Remove trailing whitespace dt-bindings: net: correct description of phy-connection-type dt-bindings: PCI: intel,lgm-pcie: Fix matching on all snps,dw-pcie instances of: address: Work around missing device_type property in pcie nodes dt: writing-schema: Miscellaneous grammar fixes dt-bindings: Use Shawn Guo's preferred e-mail for i.MX bindings of/address: check for invalid range.cpu_addr
2020-08-21dt-bindings: vendor-prefixes: Remove trailing whitespaceGeert Uytterhoeven
Fixes: f516fb704d02fff2 ("dt-bindings: Whitespace clean-ups in schema files") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20200819092058.1526-1-geert+renesas@glider.be Signed-off-by: Rob Herring <robh@kernel.org>
2020-08-21KVM: arm64: Only reschedule if MMU_NOTIFIER_RANGE_BLOCKABLE is not setWill Deacon
When an MMU notifier call results in unmapping a range that spans multiple PGDs, we end up calling into cond_resched_lock() when crossing a PGD boundary, since this avoids running into RCU stalls during VM teardown. Unfortunately, if the VM is destroyed as a result of OOM, then blocking is not permitted and the call to the scheduler triggers the following BUG(): | BUG: sleeping function called from invalid context at arch/arm64/kvm/mmu.c:394 | in_atomic(): 1, irqs_disabled(): 0, non_block: 1, pid: 36, name: oom_reaper | INFO: lockdep is turned off. | CPU: 3 PID: 36 Comm: oom_reaper Not tainted 5.8.0 #1 | Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 | Call trace: | dump_backtrace+0x0/0x284 | show_stack+0x1c/0x28 | dump_stack+0xf0/0x1a4 | ___might_sleep+0x2bc/0x2cc | unmap_stage2_range+0x160/0x1ac | kvm_unmap_hva_range+0x1a0/0x1c8 | kvm_mmu_notifier_invalidate_range_start+0x8c/0xf8 | __mmu_notifier_invalidate_range_start+0x218/0x31c | mmu_notifier_invalidate_range_start_nonblock+0x78/0xb0 | __oom_reap_task_mm+0x128/0x268 | oom_reap_task+0xac/0x298 | oom_reaper+0x178/0x17c | kthread+0x1e4/0x1fc | ret_from_fork+0x10/0x30 Use the new 'flags' argument to kvm_unmap_hva_range() to ensure that we only reschedule if MMU_NOTIFIER_RANGE_BLOCKABLE is set in the notifier flags. Cc: <stable@vger.kernel.org> Fixes: 8b3405e345b5 ("kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd") Cc: Marc Zyngier <maz@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Message-Id: <20200811102725.7121-3-will@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-08-21KVM: Pass MMU notifier range flags to kvm_unmap_hva_range()Will Deacon
The 'flags' field of 'struct mmu_notifier_range' is used to indicate whether invalidate_range_{start,end}() are permitted to block. In the case of kvm_mmu_notifier_invalidate_range_start(), this field is not forwarded on to the architecture-specific implementation of kvm_unmap_hva_range() and therefore the backend cannot sensibly decide whether or not to block. Add an extra 'flags' parameter to kvm_unmap_hva_range() so that architectures are aware as to whether or not they are permitted to block. Cc: <stable@vger.kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Message-Id: <20200811102725.7121-2-will@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
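Passing the notifier flags down lets the backend decide whether it may yield. A minimal userspace sketch of that decision follows, assuming a hypothetical MODEL_RANGE_BLOCKABLE bit in place of MMU_NOTIFIER_RANGE_BLOCKABLE and a counter in place of the real scheduler call; step must be nonzero.

```c
#include <stdbool.h>

/* Hypothetical stand-in for MMU_NOTIFIER_RANGE_BLOCKABLE. */
#define MODEL_RANGE_BLOCKABLE (1u << 0)

/* Counts how often the model would have yielded to the scheduler. */
static int model_resched_calls;

static void model_cond_resched(void)
{
    model_resched_calls++;
}

/* Unmap [start, end) in chunks of 'step', yielding only when allowed. */
static void model_unmap_range(unsigned long start, unsigned long end,
                              unsigned long step, unsigned int flags)
{
    bool may_block = flags & MODEL_RANGE_BLOCKABLE;
    unsigned long addr;

    for (addr = start; addr < end; addr += step) {
        /* ... unmap one chunk here ... */
        if (may_block && addr + step < end)
            model_cond_resched();   /* only between chunks, never in atomic context */
    }
}
```

With flags set to 0, as in the OOM reaper path from the arm64 commit, the loop runs to completion without ever calling the yield helper.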
2020-08-21dt-bindings: net: correct description of phy-connection-typeMadalin Bucur
The phy-connection-type parameter is described in ePAPR 1.1: Specifies interface type between the Ethernet device and a physical layer (PHY) device. The value of this property is specific to the implementation. Signed-off-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Link: https://lore.kernel.org/r/1597917724-11127-1-git-send-email-madalin.bucur@oss.nxp.com Signed-off-by: Rob Herring <robh@kernel.org>
2020-08-21Merge tag 'io_uring-5.9-2020-08-21' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring fixes from Jens Axboe: - Make sure the head link cancelation includes async work - Get rid of kiocb_wait_page_queue_init(), makes no sense to have it as a separate function since you moved it into io_uring itself - io_import_iovec cleanups (Pavel, me) - Use system_unbound_wq for ring exit work, to avoid spawning tons of these if we have tons of rings exiting at the same time - Fix req->flags overflow flag manipulation (Pavel) * tag 'io_uring-5.9-2020-08-21' of git://git.kernel.dk/linux-block: io_uring: kill extra iovec=NULL in import_iovec() io_uring: comment on kfree(iovec) checks io_uring: fix racy req->flags modification io_uring: use system_unbound_wq for ring exit work io_uring: cleanup io_import_iovec() of pre-mapped request io_uring: get rid of kiocb_wait_page_queue_init() io_uring: find and cancel head link async work on files exit
2020-08-21dt-bindings: PCI: intel,lgm-pcie: Fix matching on all snps,dw-pcie instancesRob Herring
The intel,lgm-pcie binding is matching on all snps,dw-pcie instances which is wrong. Add a custom 'select' entry to fix this. Fixes: e54ea45a4955 ("dt-bindings: PCI: intel: Add YAML schemas for the PCIe RC controller") Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: linux-pci@vger.kernel.org Reviewed-by: Dilip Kota <eswara.kota@linux.intel.com> Signed-off-by: Rob Herring <robh@kernel.org>
2020-08-21Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "11 patches. Subsystems affected by this: misc, mm/hugetlb, mm/vmalloc, mm/misc, romfs, relay, uprobes, squashfs, mm/cma, mm/pagealloc" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm, page_alloc: fix core hung in free_pcppages_bulk() mm: include CMA pages in lowmem_reserve at boot squashfs: avoid bio_alloc() failure with 1Mbyte blocks uprobes: __replace_page() avoid BUG in munlock_vma_page() kernel/relay.c: fix memleak on destroy relay channel romfs: fix uninitialized memory leak in romfs_dev_read() mm/rodata_test.c: fix missing function declaration mm/vunmap: add cond_resched() in vunmap_pmd_range khugepaged: adjust VM_BUG_ON_MM() in __khugepaged_enter() hugetlb_cgroup: convert comma to semicolon mailmap: add Andi Kleen
2020-08-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Alexei Starovoitov says: ==================== pull-request: bpf 2020-08-21 The following pull-request contains BPF updates for your *net* tree. We've added 11 non-merge commits during the last 5 day(s) which contain a total of 12 files changed, 78 insertions(+), 24 deletions(-). The main changes are: 1) three fixes in BPF task iterator logic, from Yonghong. 2) fix for compressed dwarf sections in vmlinux, from Jiri. 3) fix xdp attach regression, from Andrii. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-21Merge tag 'riscv-for-linus-5.9-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - The CLINT driver has been split in two: one to handle the M-mode CLINT (memory mapped and used on NOMMU systems) and one to handle the S-mode CLINT (via SBI). - The addition of SiFive's drivers to rv32_defconfig * tag 'riscv-for-linus-5.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Add SiFive drivers to rv32_defconfig dt-bindings: timer: Add CLINT bindings RISC-V: Remove CLINT related code from timer and arch clocksource/drivers: Add CLINT timer driver RISC-V: Add mechanism to provide custom IPI operations
2020-08-21Merge tag 'for-linus-5.9-rc2-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "One build fix and a minor fix for suppressing a useless warning when booting a Xen dom0 via UEFI" * tag 'for-linus-5.9-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: Fix build error when CONFIG_ACPI is not set/enabled: efi: avoid error message when booting under Xen
2020-08-21Merge tag 'pm-5.9-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix a few issues in the operating performance points (OPP) framework. Specifics: - Fix re-enabling of resources in dev_pm_opp_set_rate() (Rajendra Nayak) - Fix OPP table reference counting in error paths (Stephen Boyd)" * tag 'pm-5.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: opp: Enable resources again if they were disabled earlier opp: Put opp table in dev_pm_opp_set_rate() if _set_opp_bw() fails opp: Put opp table in dev_pm_opp_set_rate() for empty tables
2020-08-21bpf: Fix two typos in uapi/linux/bpf.hTobias Klauser
Also remove trailing whitespaces in bpf_skb_get_tunnel_key example code. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200821133642.18870-1-tklauser@distanz.ch
2020-08-21net: dsa: b53: check for timeoutTom Rix
clang static analysis reports this problem: b53_common.c:1583:13: warning: The left expression of the compound assignment is an uninitialized value. The computed value will also be garbage ent.port &= ~BIT(port); ~~~~~~~~ ^ ent is set by a successful call to b53_arl_read(). Unsuccessful calls are caught by a switch statement handling specific returns. b53_arl_read() calls b53_arl_op_wait() which fails with the unhandled -ETIMEDOUT. So add -ETIMEDOUT to the switch statement. Because b53_arl_op_wait() already prints out a message, do not add another one. Fixes: 1da6df85c6fb ("net: dsa: b53: Implement ARL add/del/dump operations") Signed-off-by: Tom Rix <trix@redhat.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
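The shape of the fix, adding a case for -ETIMEDOUT so that an unfilled entry is never used, can be sketched as a classification function. The enum, the function name and the handling of -ENOENT are assumptions made for illustration; this is not the driver's actual switch statement.

```c
#include <errno.h>

/* Hypothetical outcomes for the caller of a table read. */
enum model_arl_action {
    MODEL_ARL_USE_ENTRY,    /* read succeeded, entry is initialized */
    MODEL_ARL_NEW_ENTRY,    /* no match, caller builds a fresh entry */
    MODEL_ARL_FAIL          /* entry must not be touched */
};

static enum model_arl_action model_arl_classify(int ret)
{
    switch (ret) {
    case -ETIMEDOUT:
        /* The wait timed out, so the entry was never filled in. */
        return MODEL_ARL_FAIL;
    case -ENOENT:
        /* Assumed here to mean "not found". */
        return MODEL_ARL_NEW_ENTRY;
    default:
        return MODEL_ARL_USE_ENTRY;
    }
}
```

Before the extra case, a timeout would fall through to the default branch and the caller would operate on uninitialized data, which is what the static analyzer flagged.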
2020-08-21ARM64: vdso32: Install vdso32 from vdso_installStephen Boyd
Add the 32-bit vdso Makefile to the vdso_install rule so that 'make vdso_install' installs the 32-bit compat vdso when it is compiled. Fixes: a7f71a2c8903 ("arm64: compat: Add vDSO") Signed-off-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Will Deacon <will@kernel.org> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Link: https://lore.kernel.org/r/20200818014950.42492-1-swboyd@chromium.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-08-21Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "Improvements to ext4's block allocator performance for very large file systems, especially when the file system or its files are highly fragmented. There is a new mount option, prefetch_block_bitmaps which will pull in the block bitmaps and set up the in-memory buddy bitmaps when the file system is initially mounted. Beyond that, a lot of bug fixes and cleanups. In particular, a number of changes to make ext4 more robust in the face of write errors or file system corruptions" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (46 commits) ext4: limit the length of per-inode prealloc list ext4: reorganize if statement of ext4_mb_release_context() ext4: add mb_debug logging when there are lost chunks ext4: Fix comment typo "the the". jbd2: clean up checksum verification in do_one_pass() ext4: change to use fallthrough macro ext4: remove unused parameter of ext4_generic_delete_entry function mballoc: replace seq_printf with seq_puts ext4: optimize the implementation of ext4_mb_good_group() ext4: delete invalid comments near ext4_mb_check_limits() ext4: fix typos in ext4_mb_regular_allocator() comment ext4: fix checking of directory entry validity for inline directories fs: prevent BUG_ON in submit_bh_wbc() ext4: correctly restore system zone info when remount fails ext4: handle add_system_zone() failure in ext4_setup_system_zone() ext4: fold ext4_data_block_valid_rcu() into the caller ext4: check journal inode extents more carefully ext4: don't allow overlapping system zones ext4: handle error of ext4_setup_system_zone() on remount ext4: delete the invalid BUGON in ext4_mb_load_buddy_gfp() ...
2020-08-21afs: Fix NULL deref in afs_dynroot_depopulate()David Howells
If an error occurs during the construction of an afs superblock, it's possible that an error occurs after a superblock is created, but before we've created the root dentry. If the superblock has a dynamic root (ie. what's normally mounted on /afs), the afs_kill_super() will call afs_dynroot_depopulate() to unpin any created dentries - but this will oops if the root hasn't been created yet. Fix this by skipping that bit of code if there is no root dentry. This leads to an oops looking like: general protection fault, ... KASAN: null-ptr-deref in range [0x0000000000000068-0x000000000000006f] ... RIP: 0010:afs_dynroot_depopulate+0x25f/0x529 fs/afs/dynroot.c:385 ... Call Trace: afs_kill_super+0x13b/0x180 fs/afs/super.c:535 deactivate_locked_super+0x94/0x160 fs/super.c:335 afs_get_tree+0x1124/0x1460 fs/afs/super.c:598 vfs_get_tree+0x89/0x2f0 fs/super.c:1547 do_new_mount fs/namespace.c:2875 [inline] path_mount+0x1387/0x2070 fs/namespace.c:3192 do_mount fs/namespace.c:3205 [inline] __do_sys_mount fs/namespace.c:3413 [inline] __se_sys_mount fs/namespace.c:3390 [inline] __x64_sys_mount+0x27f/0x300 fs/namespace.c:3390 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 which is oopsing on this line: inode_lock(root->d_inode); presumably because sb->s_root was NULL. Fixes: 0da0b7fd73e4 ("afs: Display manually added cells in dynamic root mount") Reported-by: syzbot+c1eff8205244ae7e11a6@syzkaller.appspotmail.com Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
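The guard described in this commit, skipping the teardown work when the root was never created, can be modelled in a few lines. The types and the function below are hypothetical userspace stand-ins, not the afs code.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a dentry and a superblock. */
struct model_dentry {
    int pinned_children;
};

struct model_super {
    struct model_dentry *s_root;    /* NULL until the root is created */
};

/* Unpin all children; return how many were unpinned. */
static int model_depopulate(struct model_super *sb)
{
    struct model_dentry *root = sb->s_root;
    int unpinned;

    if (!root)          /* construction failed before the root existed */
        return 0;

    unpinned = root->pinned_children;
    root->pinned_children = 0;
    return unpinned;
}
```

The early return covers the error path in which the superblock exists but its root does not yet, which is the situation the oops above was hitting.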
2020-08-21Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Jason Gunthorpe: "One regression from 5.8 and a few bugs from earlier kernels: - Various spelling corrections in kernel prints - Bug fixes in hfi1 and bnxt_re - Revert a 5.8 patch in hns - Batch update for Mellanox and Cumulus maintainers emails" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: MAINTAINERS: Update Mellanox and Cumulus Network addresses to new domain Revert "RDMA/hns: Reserve one sge in order to avoid local length error" RDMA/hfi1: Correct an interlock issue for TID RDMA WRITE request RDMA/bnxt_re: Do not add user qps to flushlist RDMA/core: Fix spelling mistake "Could't" -> "Couldn't" RDMA/usnic: Fix spelling mistake "transistion" -> "transition" RDMA/hns: Fix spelling mistake "epmty" -> "empty"
2020-08-21Merge tag 'sound-5.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/soundLinus Torvalds
Pull sound fixes from Takashi Iwai:
 "A collection of small fixes over several drivers, but all are driver-specific and nothing looks scary.

  Slightly large changes are seen in ASoC qcom driver for the bugs that were revealed by the recent ASoC core change to report the invalid register access errors. Also ASoC fsl got a slight intensive change for the distortion fix.

  Others are only trivial fixes or device-specific quirks"

* tag 'sound-5.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (25 commits)
  ALSA: hda: avoid reset of sdo_limit
  ALSA: hda/realtek: Add quirk for Samsung Galaxy Book Ion
  ALSA: usb-audio: ignore broken processing/extension unit
  ASoC: intel: Fix memleak in sst_media_open
  ASoC: wm8994: Avoid attempts to read unreadable registers
  ASoC: msm8916-wcd-analog: fix register Interrupt offset
  ASoC: wm8994: Prevent access to invalid VU register bits on WM1811
  ALSA: hda/realtek: Add model alc298-samsung-headphone
  ALSA: usb-audio: Update documentation comment for MS2109 quirk
  ALSA: isa: fix spelling mistakes in the comments
  ALSA: usb-audio: Add capture support for Saffire 6 (USB 1.1)
  ALSA: hda/realtek: Add quirk for Samsung Galaxy Flex Book
  ASoC: q6routing: add dummy register read/write function
  ASoC: q6afe-dai: mark all widgets registers as SND_SOC_NOPM
  ASoC: Make soc_component_read() returning an error code again
  ASoC: amd: Replacing component->name with codec_dai->name.
  ASoC: fsl: Fix unused variable warning
  ASoC: tegra: tegra210_i2s: Fix compile warning with CONFIG_PM=n
  ASoC: tegra: tegra210_dmic: Fix compile warning with CONFIG_PM=n
  ASoC: tegra: tegra210_ahub: Fix compile warning with CONFIG_PM=n
  ...
2020-08-21Merge tag 'drm-fixes-2020-08-21' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie:
 "Regular fixes pull for rc2. Usual rc2 doesn't seem too busy, mainly i915 and amdgpu. I'd expect the usual uptick for rc3.

  amdgpu:
   - Fix allocation size
   - SR-IOV fixes
   - Vega20 SMU feature state caching fix
   - Fix custom pptable handling
   - Arcturus golden settings update
   - Several display fixes
   - Fixes for Navy Flounder
   - Misc display fixes
   - RAS fix

  amdkfd:
   - SDMA fix for renoir

  i915:
   - Fix device parameter usage for selftest mock i915 device
   - Fix LPSP capability debugfs NULL dereference
   - Fix buddy register pagemask table
   - Fix intel_atomic_check() non-negative return value
   - Fix selftests passing a random 0 into ilog2()
   - Fix TGL power well enable/disable ordering
   - Switch to PMU module refcounting
   - GVT fixes

  virtio:
   - Add missing dma_fence_put() in virtio_gpu_execbuffer_ioctl()
   - Fix memory leak in virtio_gpu_cleanup_object()"

* tag 'drm-fixes-2020-08-21' of git://anongit.freedesktop.org/drm/drm: (34 commits)
  Revert "drm/amdgpu: disable gfxoff for navy_flounder"
  drm/i915/tgl: Make sure TC-cold is blocked before enabling TC AUX power wells
  drm/i915/selftests: Avoid passing a random 0 into ilog2
  drm/i915: Fix wrong return value in intel_atomic_check()
  drm/i915: Update bw_buddy pagemask table
  drm/i915/display: Check for an LPSP encoder before dereferencing
  drm/i915: Copy default modparams to mock i915_device
  drm/i915: Provide the perf pmu.module
  drm/amd/display: fix pow() crashing when given base 0
  drm/amd/display: Reset scrambling on Test Pattern
  drm/amd/display: fix dcn3 wide timing dsc validation
  drm/amd/display: Fix DFPstate hang due to view port changed
  drm/amd/display: Assign correct left shift
  drm/amd/display: Call DMUB for eDP power control
  drm/amdkfd: fix the wrong sdma instance query for renoir
  drm/amdgpu: parse ta firmware for navy_flounder
  drm/amdgpu: fix NULL pointer access issue when unloading driver
  drm/amdgpu: fix uninit-value in arcturus_log_thermal_throttling_event()
  drm/amdgpu: disable gfxoff for navy_flounder
  drm/amdgpu/display: use GFP_ATOMIC in dcn20_validate_bandwidth_internal
  ...
2020-08-21mm, page_alloc: fix core hung in free_pcppages_bulk()Charan Teja Reddy
The following race is observed with repeated online and offline of memory blocks in the movable zone, with a delay between two successive online operations. The steps of the two processes, P1 and P2, interleave as follows:

  P1: Online the first memory block in the movable zone. The pcp struct values are initialized to their defaults, i.e., pcp->high = 0 and pcp->batch = 1.
  P2: Allocate pages from the movable zone.
  P1: Try to online the second memory block in the movable zone. It has entered online_pages() but has yet to call zone_pcp_update().
  P2: This process enters the exit path, so it tries to release the order-0 pages to the pcp lists through free_unref_page_commit(). As pcp->high = 0 and pcp->count = 1, it proceeds to call free_pcppages_bulk().
  P1: Update the pcp values; the new values are, say, pcp->high = 378 and pcp->batch = 63.
  P2: Read the pcp's batch value using READ_ONCE() and pass it to free_pcppages_bulk(); the values passed here are batch = 63, count = 1.
  P2: Since the number of pages in the pcp lists is less than ->batch, it gets stuck in the while (list_empty(list)) loop with interrupts disabled, which hangs the core.

Avoid this by ensuring free_pcppages_bulk() is called with the proper count of pcp list pages.

The race is fairly easy to reproduce without [1], because the pcps are not updated when the first memory block is onlined, so there is a wide enough race window for P2 between alloc+free and the update of the pcp struct values through onlining of the second memory block. With [1], the race still exists but is very narrow, as the pcp struct values are updated when the first memory block is onlined.

This is not limited to the movable zone; it could also happen with the normal zone (e.g., hotplug to a node that only has DMA memory, or no other memory yet).
[1]: https://patchwork.kernel.org/patch/11696389/ Fixes: 5f8dcc21211a ("page-allocator: split per-cpu list into one-list-per-migrate-type") Signed-off-by: Charan Teja Reddy <charante@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Vinayak Menon <vinmenon@codeaurora.org> Cc: <stable@vger.kernel.org> [2.6+] Link: http://lkml.kernel.org/r/1597150703-19003-1-git-send-email-charante@codeaurora.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
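The idea of calling the bulk-free routine "with proper count of pcp list pages" can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct pcp_model and free_bulk_model() are hypothetical names invented for illustration, the list is reduced to a counter, and none of this is the kernel's actual code. It shows why clamping the request to the pages actually present lets the loop terminate when a stale batch value of 63 meets a list holding one page.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a per-cpu page list. */
struct pcp_model {
    int count;  /* pages currently on the list */
    int high;   /* high watermark */
    int batch;  /* chunk size used when freeing in bulk */
};

/*
 * Free up to 'count' pages from the model list and return how many
 * were freed. The clamp is the point of the sketch: without it, a
 * request for 63 pages against a list holding 1 page would keep
 * looking for pages that do not exist.
 */
static int free_bulk_model(struct pcp_model *pcp, int count)
{
    int freed = 0;

    assert(pcp != NULL);

    /* Never try to free more pages than the list holds. */
    if (count > pcp->count)
        count = pcp->count;

    while (count > 0) {
        pcp->count--;
        count--;
        freed++;
    }
    return freed;
}
```

With the values from the race description (count = 1 on the list, batch = 63 requested), the model frees exactly one page and returns.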
2020-08-21mm: include CMA pages in lowmem_reserve at bootDoug Berger
The lowmem_reserve arrays provide a means of applying pressure against allocations from lower zones that were targeted at higher zones. Its values are a function of the number of pages managed by higher zones and are assigned by a call to the setup_per_zone_lowmem_reserve() function. The function is initially called at boot time by the function init_per_zone_wmark_min() and may be called later by accesses of the /proc/sys/vm/lowmem_reserve_ratio sysctl file. The function init_per_zone_wmark_min() was moved up from a module_init to a core_initcall to resolve a sequencing issue with khugepaged. Unfortunately this created a sequencing issue with CMA page accounting. The CMA pages are added to the managed page count of a zone when cma_init_reserved_areas() is called at boot also as a core_initcall. This makes it uncertain whether the CMA pages will be added to the managed page counts of their zones before or after the call to init_per_zone_wmark_min() as it becomes dependent on link order. With the current link order the pages are added to the managed count after the lowmem_reserve arrays are initialized at boot. This means the lowmem_reserve values at boot may be lower than the values used later if /proc/sys/vm/lowmem_reserve_ratio is accessed even if the ratio values are unchanged. In many cases the difference is not significant, but for example an ARM platform with 1GB of memory and the following memory layout cma: Reserved 256 MiB at 0x0000000030000000 Zone ranges: DMA [mem 0x0000000000000000-0x000000002fffffff] Normal empty HighMem [mem 0x0000000030000000-0x000000003fffffff] would result in 0 lowmem_reserve for the DMA zone. This would allow userspace to deplete the DMA zone easily. Funnily enough $ cat /proc/sys/vm/lowmem_reserve_ratio would fix up the situation because as a side effect it forces setup_per_zone_lowmem_reserve. 
This commit breaks the link order dependency by invoking init_per_zone_wmark_min() as a postcore_initcall so that the CMA pages have the chance to be properly accounted in their zone(s) and allowing the lowmem_reserve arrays to receive consistent values. Fixes: bc22af74f271 ("mm: update min_free_kbytes from khugepaged after core initialization") Signed-off-by: Doug Berger <opendmb@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Jason Baron <jbaron@akamai.com> Cc: David Rientjes <rientjes@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/1597423766-27849-1-git-send-email-opendmb@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
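The arithmetic behind the ARM example can be sketched as follows. This is a simplified model, not the kernel's implementation: it assumes 4 KiB pages, assumes the reserve for a lower zone is the managed page count of the higher zones divided by a per-zone ratio, and assumes a ratio of 256 for the DMA zone. The names mib_to_pages() and lowmem_reserve_model() are hypothetical.

```c
#include <assert.h>

#define MODEL_PAGE_SIZE 4096ul  /* assumption: 4 KiB pages */

/* Convert a size in MiB to a page count under the assumed page size. */
static unsigned long mib_to_pages(unsigned long mib)
{
    return mib * 1024ul * 1024ul / MODEL_PAGE_SIZE;
}

/*
 * Simplified model: the reserve held back in a lower zone is the number
 * of pages managed by the higher zones divided by the lower zone's ratio.
 */
static unsigned long lowmem_reserve_model(unsigned long higher_managed_pages,
                                          unsigned long ratio)
{
    assert(ratio > 0);
    return higher_managed_pages / ratio;
}
```

In the layout above, all 256 MiB of HighMem is CMA. If the reserve is computed before the CMA pages are counted, the higher zone manages 0 pages and the model yields a reserve of 0. Once the 65536 CMA pages are counted, the same call yields 256 pages under the assumed ratio.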
2020-08-21squashfs: avoid bio_alloc() failure with 1Mbyte blocksPhillip Lougher
This is a regression introduced by the patch "migrate from ll_rw_block usage to BIO". bio_alloc() is limited to 256 pages (1 Mbyte). This can cause a failure when reading 1 Mbyte block filesystems. The problem is that a datablock can be fully (or almost) uncompressed, requiring 256 pages, but, because blocks are not aligned to page boundaries, it may require 257 pages to read. bio_kmalloc() can handle 1024 pages, so use this for the edge condition. Fixes: 93e72b3c612a ("squashfs: migrate from ll_rw_block usage to BIO") Reported-by: Nicolas Prochazka <nicolas.prochazka@gmail.com> Reported-by: Tomoatsu Shimada <shimada@walbrix.com> Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Guenter Roeck <groeck@chromium.org> Cc: Philippe Liard <pliard@google.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Adrien Schildknecht <adrien+dev@schischi.me> Cc: Daniel Rosenberg <drosen@google.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200815035637.15319-1-phillip@squashfs.org.uk Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
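The 256-versus-257 page count can be checked with a short calculation. The sketch below assumes 4 KiB pages and treats the read as a plain byte range, which simplifies what squashfs actually does. The helper names pages_spanned() and needs_larger_bio() are hypothetical, and the 256-page limit is the figure quoted in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE     4096u  /* assumption: 4 KiB pages */
#define MODEL_BIO_ALLOC_MAX 256u   /* limit quoted in the commit message */

/* Number of pages touched by the byte range [offset, offset + length). */
static size_t pages_spanned(size_t offset, size_t length)
{
    assert(length > 0);

    size_t first = offset / MODEL_PAGE_SIZE;
    size_t last = (offset + length - 1) / MODEL_PAGE_SIZE;

    return last - first + 1;
}

/* True when the range needs more pages than the quoted limit allows. */
static bool needs_larger_bio(size_t offset, size_t length)
{
    return pages_spanned(offset, length) > MODEL_BIO_ALLOC_MAX;
}
```

A 1 Mbyte block that starts on a page boundary spans exactly 256 pages. The same block starting one byte into a page spills into a 257th page, which is the edge condition described above.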
2020-08-21uprobes: __replace_page() avoid BUG in munlock_vma_page()Hugh Dickins
syzbot crashed on the VM_BUG_ON_PAGE(PageTail) in munlock_vma_page(), when called from uprobes __replace_page(). Which of many ways to fix it? Settled on not calling when PageCompound (since Head and Tail are equals in this context, PageCompound the usual check in uprobes.c, and the prior use of FOLL_SPLIT_PMD will have cleared PageMlocked already). Fixes: 5a52c9df62b4 ("uprobe: use FOLL_SPLIT_PMD instead of FOLL_SPLIT") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: <stable@vger.kernel.org> [5.4+] Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008161338360.20413@eggly.anvils Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-21kernel/relay.c: fix memleak on destroy relay channelWei Yongjun
kmemleak reports a memory leak as follows:

  unreferenced object 0x607ee4e5f948 (size 8):
  comm "syz-executor.1", pid 2098, jiffies 4295031601 (age 288.468s)
  hex dump (first 8 bytes):
    00 00 00 00 00 00 00 00 ........
  backtrace:
    relay_open kernel/relay.c:583 [inline]
    relay_open+0xb6/0x970 kernel/relay.c:563
    do_blk_trace_setup+0x4a8/0xb20 kernel/trace/blktrace.c:557
    __blk_trace_setup+0xb6/0x150 kernel/trace/blktrace.c:597
    blk_trace_ioctl+0x146/0x280 kernel/trace/blktrace.c:738
    blkdev_ioctl+0xb2/0x6a0 block/ioctl.c:613
    block_ioctl+0xe5/0x120 fs/block_dev.c:1871
    vfs_ioctl fs/ioctl.c:48 [inline]
    __do_sys_ioctl fs/ioctl.c:753 [inline]
    __se_sys_ioctl fs/ioctl.c:739 [inline]
    __x64_sys_ioctl+0x170/0x1ce fs/ioctl.c:739
    do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

'chan->buf' is allocated in relay_open() by alloc_percpu() but is not freed when the relay channel is destroyed. Fix it by adding free_percpu() before returning from relay_destroy_channel().

Fixes: 017c59c042d0 ("relay: Use per CPU constructs for the relay channel buffer pointers") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: David Rientjes <rientjes@google.com> Cc: Michel Lespinasse <walken@google.com> Cc: Daniel Axtens <dja@axtens.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Akash Goel <akash.goel@intel.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200817122826.48518-1-weiyongjun1@huawei.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-21romfs: fix uninitialized memory leak in romfs_dev_read()Jann Horn
romfs has a superblock field that limits the size of the filesystem; data beyond that limit is never accessed. romfs_dev_read() fetches a caller-supplied number of bytes from the backing device. It returns 0 on success or an error code on failure; therefore, its API can't represent short reads, it's all-or-nothing. However, when romfs_dev_read() detects that the requested operation would cross the filesystem size limit, it currently silently truncates the requested number of bytes. This e.g. means that when the content of a file with size 0x1000 starts one byte before the filesystem size limit, ->readpage() will only fill a single byte of the supplied page while leaving the rest uninitialized, leaking that uninitialized memory to userspace. Fix it by returning an error code instead of truncating the read when the requested read operation would go beyond the end of the filesystem. Fixes: da4458bda237 ("NOMMU: Make it possible for RomFS to use MTD devices directly") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: David Howells <dhowells@redhat.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200818013202.2246365-1-jannh@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
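The all-or-nothing contract described above can be modeled in userspace. This is a minimal sketch, not the romfs code: struct romfs_model and dev_read_model() are hypothetical names, the backing device is a plain byte array, and the choice of -EIO as the error code is an assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a filesystem image with a size limit. */
struct romfs_model {
    const unsigned char *data;  /* backing bytes */
    size_t limit;               /* filesystem size limit in bytes */
};

/*
 * Read exactly 'buflen' bytes at 'pos' or fail. A request that would
 * cross the size limit is rejected instead of being silently shortened,
 * so the caller never receives a partially filled buffer.
 */
static int dev_read_model(const struct romfs_model *fs, size_t pos,
                          void *buf, size_t buflen)
{
    assert(fs != NULL && buf != NULL);

    /* Written as a subtraction so that pos + buflen cannot overflow. */
    if (pos >= fs->limit || buflen > fs->limit - pos)
        return -EIO;  /* assumption: error code chosen for illustration */

    memcpy(buf, fs->data + pos, buflen);
    return 0;
}
```

A read that fits inside the limit fills the whole buffer and returns 0. A read that starts one byte before the limit but asks for more than one byte returns an error and leaves the buffer untouched.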
2020-08-21mm/rodata_test.c: fix missing function declarationLeon Romanovsky
Compiling with CONFIG_DEBUG_RODATA_TEST set produces the following warning due to the missing include:

  mm/rodata_test.c:15:6: warning: no previous prototype for 'rodata_test' [-Wmissing-prototypes]
     15 | void rodata_test(void)
        |      ^~~~~~~~~~~

Fixes: 2959a5f726f6 ("mm: add arch-independent testcases for RODATA") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lkml.kernel.org/r/20200819080026.918134-1-leon@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-21mm/vunmap: add cond_resched() in vunmap_pmd_rangeAneesh Kumar K.V
Like zap_pte_range(), add a cond_resched() so that we can avoid softlockups as reported below. On a non-preemptible kernel with a large I/O map region (like the one we get when using persistent memory with sector mode), an unmap of the namespace can report the softlockups below.

  [22724.027334] watchdog: BUG: soft lockup - CPU#49 stuck for 23s! [ndctl:50777]
  NIP [c0000000000dc224] plpar_hcall+0x38/0x58
  LR [c0000000000d8898] pSeries_lpar_hpte_invalidate+0x68/0xb0
  Call Trace:
    flush_hash_page+0x114/0x200
    hpte_need_flush+0x2dc/0x540
    vunmap_page_range+0x538/0x6f0
    free_unmap_vmap_area+0x30/0x70
    remove_vm_area+0xfc/0x140
    __vunmap+0x68/0x270
    __iounmap.part.0+0x34/0x60
    memunmap+0x54/0x70
    release_nodes+0x28c/0x300
    device_release_driver_internal+0x16c/0x280
    unbind_store+0x124/0x170
    drv_attr_store+0x44/0x60
    sysfs_kf_write+0x64/0x90
    kernfs_fop_write+0x1b0/0x290
    __vfs_write+0x3c/0x70
    vfs_write+0xd8/0x260
    ksys_write+0xdc/0x130
    system_call+0x5c/0x70

Reported-by: Harish Sriram <harish@linux.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200807075933.310240-1-aneesh.kumar@linux.ibm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-21khugepaged: adjust VM_BUG_ON_MM() in __khugepaged_enter()Hugh Dickins
syzbot crashes on the VM_BUG_ON_MM(khugepaged_test_exit(mm), mm) in __khugepaged_enter(): yes, when one thread is about to dump core, has set core_state, and is waiting for others, another might do something calling __khugepaged_enter(), which now crashes because I lumped the core_state test (known as "mmget_still_valid") into khugepaged_test_exit(). I still think it's best to lump them together, so just in this exceptional case, check mm->mm_users directly instead of khugepaged_test_exit(). Fixes: bbe98f9cadff ("khugepaged: khugepaged_test_exit() check mmget_still_valid()") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Yang Shi <shy828301@gmail.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Song Liu <songliubraving@fb.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Eric Dumazet <edumazet@google.com> Cc: <stable@vger.kernel.org> [4.8+] Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008141503370.18085@eggly.anvils Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-21hugetlb_cgroup: convert comma to semicolonXu Wang
Replace a comma between expression statements by a semicolon. Fixes: faced7e0806cf4 ("mm: hugetlb controller for cgroups v2") Signed-off-by: Xu Wang <vulab@iscas.ac.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Tejun Heo <tj@kernel.org> Cc: Giuseppe Scrivano <gscrivan@redhat.com> Link: http://lkml.kernel.org/r/20200818064333.21759-1-vulab@iscas.ac.cn Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-21mailmap: add Andi KleenNick Desaulniers
I keep getting bounce back from the suse.de address. Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <keescook@chromium.org> Cc: Quentin Perret <qperret@qperret.net> Link: http://lkml.kernel.org/r/20200818203214.659955-1-ndesaulniers@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-21core/entry: Respect syscall number rewritesThomas Gleixner
The conversion of the x86 entry code to the generic version failed to reload the syscall number from pt_regs after ptrace and seccomp have run, both of which can modify the syscall number in pt_regs. It returns the original syscall number instead, which is obviously not the right thing to do. Reload the syscall number to fix that. Fixes: 142781e108b1 ("entry: Provide generic syscall entry functionality") Reported-by: Kyle Huey <me@kylehuey.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Kyle Huey <me@kylehuey.com> Tested-by: Kees Cook <keescook@chromium.org> Acked-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/87blj6ifo8.fsf@nanos.tec.linutronix.de
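The reload described above can be illustrated with a userspace model. This is a sketch only: struct regs_model, tracer_fn, rewrite_tracer_model() and syscall_enter_model() are hypothetical names, and the single tracer callback stands in for the ptrace and seccomp hooks. It is not the generic entry code itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the saved register state. */
struct regs_model {
    long syscall_nr;
};

/* Stand-in for a ptrace or seccomp hook that may rewrite the number. */
typedef void (*tracer_fn)(struct regs_model *regs);

/* Example hook: redirect whatever was requested to syscall 39. */
static void rewrite_tracer_model(struct regs_model *regs)
{
    regs->syscall_nr = 39;
}

/*
 * Model of the entry path: capture the number, let the hook run, then
 * reload the number from the register state. Returning the value read
 * before the hook ran would ignore any rewrite, which is the bug
 * described above.
 */
static long syscall_enter_model(struct regs_model *regs, tracer_fn tracer)
{
    long nr;

    assert(regs != NULL);

    nr = regs->syscall_nr;   /* value on entry */
    if (tracer != NULL)
        tracer(regs);        /* may modify regs->syscall_nr */
    nr = regs->syscall_nr;   /* reload so a rewrite is respected */

    return nr;
}
```

With the example hook installed, a request for syscall 1 comes back as 39. With no hook, the original number is returned unchanged.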
2020-08-21x86/entry/64: Do not use RDPID in paranoid entry to accomodate KVMSean Christopherson
KVM has an optimization to avoid expensive MSR reads/writes on VMENTER/EXIT. It caches the MSR values and restores them either when leaving the run loop, on preemption or when going out to user space. The affected MSRs are not required for kernel context operations.

This changed with the recently introduced mechanism to handle FSGSBASE in the paranoid entry code, which has to retrieve the kernel GSBASE value by accessing per CPU memory. The mechanism needs to retrieve the CPU number and uses either LSL or RDPID if the processor supports it.

Unfortunately RDPID uses MSR_TSC_AUX, which is in the list of cached and lazily restored MSRs, which means that between the point where the guest value is written and the point of restore, MSR_TSC_AUX contains a random number. If an NMI or any other exception which uses the paranoid entry path happens in such a context, then RDPID returns the random guest MSR_TSC_AUX value. As a consequence this reads from the wrong memory location to retrieve the kernel GSBASE value.

Kernel GS is used for all regular this_cpu_*() operations. If the GSBASE in the exception handler points to the per CPU memory of a different CPU then this has the obvious consequences of data corruption and crashes.

As the paranoid entry path is the only place which accesses MSR_TSC_AUX (via RDPID) and the fallback via LSL is not significantly slower, remove the RDPID alternative from the entry path and always use LSL. The alternative would be to write MSR_TSC_AUX on every VMENTER and VMEXIT, which would inflict massive overhead on that code path.
[ tglx: Rewrote changelog ] Fixes: eaad981291ee3 ("x86/entry/64: Introduce the FIND_PERCPU_BASE macro") Reported-by: Tom Lendacky <thomas.lendacky@amd.com> Debugged-by: Tom Lendacky <thomas.lendacky@amd.com> Suggested-by: Andy Lutomirski <luto@kernel.org> Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20200821105229.18938-1-pbonzini@redhat.com