summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2024-05-12Merge tag 'edac_urgent_for_v6.9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC fix from Borislav Petkov: - Fix a race condition when clearing error count bits and toggling the error interrupt throug the same register, in synopsys_edac * tag 'edac_urgent_for_v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/synopsys: Fix ECC status and IRQ control race condition
2024-05-10Merge tag 'drm-fixes-2024-05-11' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes from Dave Airlie: "This should be the last set of fixes for 6.9, i915, xe and amdgpu are the bulk here, one of the previous nouveau fixes turned up an issue, so reverting it, otherwise one core and a couple of meson fixes. core: - fix connector debugging output i915: - Automate CCS Mode setting during engine resets - Fix audio time stamp programming for DP - Fix parsing backlight BDB data xe: - Fix use zero-length element array - Move more from system wq to ordered private wq - Do not ignore return for drmm_mutex_init amdgpu: - DCN 3.5 fix - MST DSC fixes - S0i3 fix - S4 fix - HDP MMIO mapping fix - Fix a regression in visible vram handling amdkfd: - Spatial partition fix meson: - dw-hdmi: power-up fixes - dw-hdmi: add badngap setting for g12 nouveau: - revert SG_DEBUG fix that has a side effect" * tag 'drm-fixes-2024-05-11' of https://gitlab.freedesktop.org/drm/kernel: Revert "drm/nouveau/firmware: Fix SG_DEBUG error with nvkm_firmware_ctor()" drm/amdgpu: Fix comparison in amdgpu_res_cpu_visible drm/amdkfd: don't allow mapping the MMIO HDP page with large pages drm/xe: Use ordered WQ for G2H handler drm/xe/guc: Check error code when initializing the CT mutex drm/xe/ads: Use flexible-array Revert "drm/amdkfd: Add partition id field to location_id" dm/amd/pm: Fix problems with reboot/shutdown for some SMU 13.0.4/13.0.11 users drm/amd/display: MST DSC check for older devices drm/amd/display: Fix idle optimization checks for multi-display and dual eDP drm/amd/display: Fix DSC-re-computing drm/amd/display: Enable urgent latency adjustments for DCN35 drm/connector: Add \n to message about demoting connector force-probes drm/i915/bios: Fix parsing backlight BDB data drm/i915/audio: Fix audio time stamp programming for DP drm/i915/gt: Automate CCS Mode setting during engine resets drm/meson: dw-hdmi: add bandgap setting for g12 drm/meson: dw-hdmi: power up phy on device init
2024-05-11Revert "drm/nouveau/firmware: Fix SG_DEBUG error with nvkm_firmware_ctor()"Dave Airlie
This reverts commit 52a6947bf576b97ff8e14bb0a31c5eaf2d0d96e2. This causes loading failures in [ 0.367379] nouveau 0000:01:00.0: NVIDIA GP104 (134000a1) [ 0.474499] nouveau 0000:01:00.0: bios: version 86.04.50.80.13 [ 0.474620] nouveau 0000:01:00.0: pmu: firmware unavailable [ 0.474977] nouveau 0000:01:00.0: fb: 8192 MiB GDDR5 [ 0.484371] nouveau 0000:01:00.0: sec2(acr): mbox 00000001 00000000 [ 0.484377] nouveau 0000:01:00.0: sec2(acr):load: boot failed: -5 [ 0.484379] nouveau 0000:01:00.0: acr: init failed, -5 [ 0.484466] nouveau 0000:01:00.0: init failed with -5 [ 0.484468] nouveau: DRM-master:00000000:00000080: init failed with -5 [ 0.484470] nouveau 0000:01:00.0: DRM-master: Device allocation failed: -5 [ 0.485078] nouveau 0000:01:00.0: probe with driver nouveau failed with error -50 I tried tracking it down but ran out of time this week, will revisit next week. Reported-by: Dan Moulding <dan@danm.net> Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-05-11Merge tag 'drm-misc-fixes-2024-05-10' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: core: - fix connector debugging output meson: - dw-hdmi: power-up fixes - dw-hdmi: add badngap setting for g12 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240510072027.GA9131@linux.fritz.box
2024-05-10Merge tag 'gpio-fixes-for-v6.9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fixes from Bartosz Golaszewski: "Some last-minute fixes for this release from the GPIO subsystem. The first two address a regression in performance reported to me after the conversion to using SRCU in GPIOLIB that was merged during the v6.9 merge window. The second patch is not technically a fix but since after the first one we no longer need to use a per-descriptor SRCU struct, I think it's worth to simplify the code before it gets released on Sunday. The next two commits fix two memory issues: one use-after-free bug and one instance of possibly leaking kernel stack memory to user-space. Summary: - fix a performance regression in GPIO requesting and releasing after the conversion to SRCU - fix a use-after-free bug due to a race-condition - fix leaking stack memory to user-space in a GPIO uABI corner case" * tag 'gpio-fixes-for-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpiolib: cdev: fix uninitialised kfifo gpiolib: cdev: Fix use after free in lineinfo_changed_notify gpiolib: use a single SRCU struct for all GPIO descriptors gpiolib: fix the speed of descriptor label setting with SRCU
2024-05-11Merge tag 'amd-drm-fixes-6.9-2024-05-10' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.9-2024-05-10: amdgpu: - DCN 3.5 fix - MST DSC fixes - S0i3 fix - S4 fix - HDP MMIO mapping fix - Fix a regression in visible vram handling amdkfd: - Spatial partition fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240510171110.1394940-1-alexander.deucher@amd.com
2024-05-10Merge tag 'block-6.9-20240510' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe: - NVMe pull request via Keith: - nvme target fixes (Sagi, Dan, Maurizo) - new vendor quirk for broken MSI (Sean) - Virtual boundary fix for a regression in this merge window (Ming) * tag 'block-6.9-20240510' of git://git.kernel.dk/linux: nvmet-rdma: fix possible bad dereference when freeing rsps nvmet: prevent sprintf() overflow in nvmet_subsys_nsid_exists() nvmet: make nvmet_wq unbound nvmet-auth: return the error code to the nvmet_auth_ctrl_hash() callers nvme-pci: Add quirk for broken MSIs block: set default max segment size in case of virt_boundary
2024-05-10Merge tag 'spi-fix-v6.9-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "Two device specific fixes here, one avoiding glitches on chip select with the STM32 driver and one for incorrectly configured clocks on the Microchip QSPI controller" * tag 'spi-fix-v6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: microchip-core-qspi: fix setting spi bus clock rate spi: stm32: enable controller before asserting CS
2024-05-10Merge tag 'regulator-fix-v6.9-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "Two fixes here, one from Johan which fixes error handling when we attempt to create duplicate debugfs files and one for an incorrect specification of ramp_delay with the rtq2208" * tag 'regulator-fix-v6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: core: fix debugfs creation regression regulator: rtq2208: Fix the BUCK ramp_delay range to maximum of 16mVstep/us
2024-05-10Merge tag 'iommu-fixes-v6.9-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: - Fix offset miscalculation on ARM-SMMU driver - AMD IOMMU fix for initializing state of untrusted devices * tag 'iommu-fixes-v6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/arm-smmu: Use the correct type in nvidia_smmu_context_fault() iommu/amd: Enhance def_domain_type to handle untrusted device
2024-05-10drm/amdgpu: Fix comparison in amdgpu_res_cpu_visibleMichel Dänzer
It incorrectly claimed a resource isn't CPU visible if it's located at the very end of CPU visible VRAM. Fixes: a6ff969fe9cb ("drm/amdgpu: fix visible VRAM handling during faults") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3343 Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reported-and-Tested-by: Jeremy Day <jsday@noreason.ca> Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> CC: stable@vger.kernel.org
2024-05-10drm/amdkfd: don't allow mapping the MMIO HDP page with large pagesAlex Deucher
We don't get the right offset in that case. The GPU has an unused 4K area of the register BAR space into which you can remap registers. We remap the HDP flush registers into this space to allow userspace (CPU or GPU) to flush the HDP when it updates VRAM. However, on systems with >4K pages, we end up exposing PAGE_SIZE of MMIO space. Fixes: d8e408a82704 ("drm/amdkfd: Expose HDP registers to user space") Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-05-10gpiolib: cdev: fix uninitialised kfifoKent Gibson
If a line is requested with debounce, and that results in debouncing in software, and the line is subsequently reconfigured to enable edge detection then the allocation of the kfifo to contain edge events is overlooked. This results in events being written to and read from an uninitialised kfifo. Read events are returned to userspace. Initialise the kfifo in the case where the software debounce is already active. Fixes: 65cff7046406 ("gpiolib: cdev: support setting debounce") Signed-off-by: Kent Gibson <warthog618@gmail.com> Link: https://lore.kernel.org/r/20240510065342.36191-1-warthog618@gmail.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-05-10iommu/arm-smmu: Use the correct type in nvidia_smmu_context_fault()Jason Gunthorpe
This was missed because of the function pointer indirection. nvidia_smmu_context_fault() is also installed as a irq function, and the 'void *' was changed to a struct arm_smmu_domain. Since the iommu_domain is embedded at a non-zero offset this causes nvidia_smmu_context_fault() to miscompute the offset. Fixup the types. Unable to handle kernel NULL pointer dereference at virtual address 0000000000000120 Mem abort info: ESR = 0x0000000096000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x04: level 0 translation fault Data abort info: ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 CM = 0, WnR = 0, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=0000000107c9f000 [0000000000000120] pgd=0000000000000000, p4d=0000000000000000 Internal error: Oops: 0000000096000004 [#1] SMP Modules linked in: CPU: 1 PID: 47 Comm: kworker/u25:0 Not tainted 6.9.0-0.rc7.58.eln136.aarch64 #1 Hardware name: Unknown NVIDIA Jetson Orin NX/NVIDIA Jetson Orin NX, BIOS 3.1-32827747 03/19/2023 Workqueue: events_unbound deferred_probe_work_func pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : nvidia_smmu_context_fault+0x1c/0x158 lr : __free_irq+0x1d4/0x2e8 sp : ffff80008044b6f0 x29: ffff80008044b6f0 x28: ffff000080a60b18 x27: ffffd32b5172e970 x26: 0000000000000000 x25: ffff0000802f5aac x24: ffff0000802f5a30 x23: ffff0000802f5b60 x22: 0000000000000057 x21: 0000000000000000 x20: ffff0000802f5a00 x19: ffff000087d4cd80 x18: ffffffffffffffff x17: 6234362066666666 x16: 6630303078302d30 x15: ffff00008156d888 x14: 0000000000000000 x13: ffff0000801db910 x12: ffff00008156d6d0 x11: 0000000000000003 x10: ffff0000801db918 x9 : ffffd32b50f94d9c x8 : 1fffe0001032fda1 x7 : ffff00008197ed00 x6 : 000000000000000f x5 : 000000000000010e x4 : 000000000000010e x3 : 0000000000000000 x2 : ffffd32b51720cd8 x1 : ffff000087e6f700 x0 : 0000000000000057 Call trace: nvidia_smmu_context_fault+0x1c/0x158 __free_irq+0x1d4/0x2e8 free_irq+0x3c/0x80 devm_free_irq+0x64/0xa8 arm_smmu_domain_free+0xc4/0x158 iommu_domain_free+0x44/0xa0 iommu_deinit_device+0xd0/0xf8 __iommu_group_remove_device+0xcc/0xe0 iommu_bus_notifier+0x64/0xa8 notifier_call_chain+0x78/0x148 blocking_notifier_call_chain+0x4c/0x90 bus_notify+0x44/0x70 device_del+0x264/0x3e8 pci_remove_bus_device+0x84/0x120 pci_remove_root_bus+0x5c/0xc0 dw_pcie_host_deinit+0x38/0xe0 tegra_pcie_config_rp+0xc0/0x1f0 tegra_pcie_dw_probe+0x34c/0x700 platform_probe+0x70/0xe8 really_probe+0xc8/0x3a0 __driver_probe_device+0x84/0x160 driver_probe_device+0x44/0x130 __device_attach_driver+0xc4/0x170 bus_for_each_drv+0x90/0x100 __device_attach+0xa8/0x1c8 device_initial_probe+0x1c/0x30 bus_probe_device+0xb0/0xc0 deferred_probe_work_func+0xbc/0x120 process_one_work+0x194/0x490 worker_thread+0x284/0x3b0 kthread+0xf4/0x108 ret_from_fork+0x10/0x20 Code: a9b97bfd 910003fd a9025bf5 f85a0035 (b94122a1) Cc: stable@vger.kernel.org Fixes: e0976331ad11 ("iommu/arm-smmu: Pass arm_smmu_domain to internal functions") Reported-by: Jerry Snitselaar <jsnitsel@redhat.com> Closes: https://lore.kernel.org/all/jto5e3ili4auk6sbzpnojdvhppgwuegir7mpd755anfhwcbkfz@2u5gh7bxb4iv Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Jerry Snitselaar <jsnitsel@redhat.com> Acked-by: Jerry Snitselaar <jsnitsel@redhat.com> Link: https://lore.kernel.org/r/0-v1-24ce064de41f+4ac-nvidia_smmu_fault_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-05-10Merge tag 'drm-xe-fixes-2024-05-09' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes - Fix use zero-length element array - Move more from system wq to ordered private wq - Do not ignore return for drmm_mutex_init Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/c3rduifdp5wipkljdpuq4x6uowkc2uyzgdoft4txvp6mgvzjaj@7zw7c6uw4wrf
2024-05-09Merge tag 'hwmon-for-v6.9-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - pmbus/ucd9000: Increase chip access delay to avoid random access errors - corsair-cpro: Protect kernel code against parallel hidraw access from userspace * tag 'hwmon-for-v6.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (pmbus/ucd9000) Increase delay from 250 to 500us hwmon: (corsair-cpro) Protect ccp->wait_input_report with a spinlock hwmon: (corsair-cpro) Use complete_all() instead of complete() in ccp_raw_event() hwmon: (corsair-cpro) Use a separate buffer for sending commands
2024-05-09hwmon: (pmbus/ucd9000) Increase delay from 250 to 500usLakshmi Yadlapati
Following the failure observed with a delay of 250us, experiments were conducted with various delays. It was found that a delay of 350us effectively mitigated the issue. To provide a more optimal solution while still allowing a margin for stability, the delay is being adjusted to 500us. Signed-off-by: Lakshmi Yadlapati <lakshmiy@us.ibm.com> Link: https://lore.kernel.org/r/20240507194603.1305750-1-lakshmiy@us.ibm.com Fixes: 8d655e6523764 ("hwmon: (ucd90320) Add minimum delay between bus accesses") Reviewed-by: Eddie James <eajames@linux.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-05-09Merge tag 'net-6.9-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bluetooth and IPsec. The bridge patch is actually a follow-up to a recent fix in the same area. We have a pending v6.8 AF_UNIX regression; it should be solved soon, but not in time for this PR. Current release - regressions: - eth: ks8851: Queue RX packets in IRQ handler instead of disabling BHs - net: bridge: fix corrupted ethernet header on multicast-to-unicast Current release - new code bugs: - xfrm: fix possible bad pointer derferencing in error path Previous releases - regressionis: - core: fix out-of-bounds access in ops_init - ipv6: - fix potential uninit-value access in __ip6_make_skb() - fib6_rules: avoid possible NULL dereference in fib6_rule_action() - tcp: use refcount_inc_not_zero() in tcp_twsk_unique(). - rtnetlink: correct nested IFLA_VF_VLAN_LIST attribute validation - rxrpc: fix congestion control algorithm - bluetooth: - l2cap: fix slab-use-after-free in l2cap_connect() - msft: fix slab-use-after-free in msft_do_close() - eth: hns3: fix kernel crash when devlink reload during initialization - eth: dsa: mv88e6xxx: add phylink_get_caps for the mv88e6320/21 family Previous releases - always broken: - xfrm: preserve vlan tags for transport mode software GRO - tcp: defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV sockets - eth: hns3: keep using user config after hardware reset" * tag 'net-6.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits) net: dsa: mv88e6xxx: read cmode on mv88e6320/21 serdes only ports net: dsa: mv88e6xxx: add phylink_get_caps for the mv88e6320/21 family net: hns3: fix kernel crash when devlink reload during initialization net: hns3: fix port vlan filter not disabled issue net: hns3: use appropriate barrier function after setting a bit value net: hns3: release PTP resources if pf initialization failed net: hns3: change type of numa_node_mask as nodemask_t net: hns3: direct return when receive a unknown mailbox message net: hns3: using user configure after hardware reset net/smc: fix neighbour and rtable leak in smc_ib_find_route() ipv6: prevent NULL dereference in ip6_output() hsr: Simplify code for announcing HSR nodes timer setup ipv6: fib6_rules: avoid possible NULL dereference in fib6_rule_action() dt-bindings: net: mediatek: remove wrongly added clocks and SerDes rxrpc: Only transmit one ACK per jumbo packet received rxrpc: Fix congestion control algorithm selftests: test_bridge_neigh_suppress.sh: Fix failures due to duplicate MAC ipv6: Fix potential uninit-value access in __ip6_make_skb() net: phy: marvell-88q2xxx: add support for Rev B1 and B2 appletalk: Improve handling of broadcast packets ...
2024-05-09regulator: core: fix debugfs creation regressionJohan Hovold
regulator_get() may sometimes be called more than once for the same consumer device, something which before commit dbe954d8f163 ("regulator: core: Avoid debugfs: Directory ... already present! error") resulted in errors being logged. A couple of recent commits broke the handling of such cases so that attributes are now erroneously created in the debugfs root directory the second time a regulator is requested and the log is filled with errors like: debugfs: File 'uA_load' in directory '/' already present! debugfs: File 'min_uV' in directory '/' already present! debugfs: File 'max_uV' in directory '/' already present! debugfs: File 'constraint_flags' in directory '/' already present! on any further calls. Fixes: 2715bb11cfff ("regulator: core: Fix more error checking for debugfs_create_dir()") Fixes: 08880713ceec ("regulator: core: Streamline debugfs operations") Cc: stable@vger.kernel.org Cc: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20240509133304.8883-1-johan+linaro@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-09Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds
Pull dentry leak fix from Al Viro: "Dentry leak fix in the qibfs driver that I forgot to send a pull request for ;-/ My apologies - it actually sat in vfs.git#fixes for more than two months..." * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: qibfs: fix dentry leak
2024-05-09drm/xe: Use ordered WQ for G2H handlerMatthew Brost
System work queues are shared, use a dedicated work queue for G2H processing to avoid G2H processing getting block behind system tasks. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: <stable@vger.kernel.org> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240506034758.3697397-1-matthew.brost@intel.com (cherry picked from commit 50aec9665e0babd62b9eee4e613d9a1ef8d2b7de) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-05-09drm/xe/guc: Check error code when initializing the CT mutexDaniele Ceraolo Spurio
The initialization via drmm_mutex_init can fail, so we need to check the return code and escalate the failure. The mutex initialization has been moved after all the other init steps that can't fail, so we're always guaranteed to have those done and don't have to check in the cleanup code. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240321195512.274210-1-daniele.ceraolospurio@intel.com (cherry picked from commit b4abeb5545bb3ddcdda3c19067680ad0b2259be4) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-05-09drm/xe/ads: Use flexible-arrayLucas De Marchi
Zero-length arrays are deprecated and flexible arrays should be used instead: https://www.kernel.org/doc/html/v6.9-rc7/process/deprecated.html#zero-length-and-one-element-arrays Reported-by: kernel test robot <lkp@intel.com> Reported-by: Julia Lawall <julia.lawall@inria.fr> Closes: https://lore.kernel.org/r/202405051824.AmjAI5Pg-lkp@intel.com/ Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240506141917.205714-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit ee7284230644e21fef0e38fc5bf8f907b6bb7f7c) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-05-09gpiolib: cdev: Fix use after free in lineinfo_changed_notifyZhongqiu Han
The use-after-free issue occurs as follows: when the GPIO chip device file is being closed by invoking gpio_chrdev_release(), watched_lines is freed by bitmap_free(), but the unregistration of lineinfo_changed_nb notifier chain failed due to waiting write rwsem. Additionally, one of the GPIO chip's lines is also in the release process and holds the notifier chain's read rwsem. Consequently, a race condition leads to the use-after-free of watched_lines. Here is the typical stack when issue happened: [free] gpio_chrdev_release() --> bitmap_free(cdev->watched_lines) <-- freed --> blocking_notifier_chain_unregister() --> down_write(&nh->rwsem) <-- waiting rwsem --> __down_write_common() --> rwsem_down_write_slowpath() --> schedule_preempt_disabled() --> schedule() [use] st54spi_gpio_dev_release() --> gpio_free() --> gpiod_free() --> gpiod_free_commit() --> gpiod_line_state_notify() --> blocking_notifier_call_chain() --> down_read(&nh->rwsem); <-- held rwsem --> notifier_call_chain() --> lineinfo_changed_notify() --> test_bit(xxxx, cdev->watched_lines) <-- use after free The side effect of the use-after-free issue is that a GPIO line event is being generated for userspace where it shouldn't. However, since the chrdev is being closed, userspace won't have the chance to read that event anyway. To fix the issue, call the bitmap_free() function after the unregistration of lineinfo_changed_nb notifier chain. Fixes: 51c1064e82e7 ("gpiolib: add new ioctl() for monitoring changes in line info") Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com> Link: https://lore.kernel.org/r/20240505141156.2944912-1-quic_zhonhan@quicinc.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-05-09gpiolib: use a single SRCU struct for all GPIO descriptorsBartosz Golaszewski
We used a per-descriptor SRCU struct in order to not impose a wait with synchronize_srcu() for descriptor X on read-only operations of descriptor Y. Now that we no longer call synchronize_srcu() on descriptor label change but only when releasing descriptor resources, we can use a single SRCU structure for all GPIO descriptors in a given chip. Suggested-by: "Paul E. McKenney" <paulmck@kernel.org> Acked-by: "Paul E. McKenney" <paulmck@kernel.org> Link: https://lore.kernel.org/r/20240507172414.28513-1-brgl@bgdev.pl Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-05-09net: dsa: mv88e6xxx: read cmode on mv88e6320/21 serdes only portsSteffen Bätz
On the mv88e6320 and 6321 switch family, port 0/1 are serdes only ports. Modified the mv88e6352_get_port4_serdes_cmode function to pass a port number since the register set of the 6352 is equal on the 6320/21. Signed-off-by: Steffen Bätz <steffen@innosonix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Fabio Estevam <festevam@gmail.com> Link: https://lore.kernel.org/r/20240508072944.54880-3-steffen@innosonix.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-09net: dsa: mv88e6xxx: add phylink_get_caps for the mv88e6320/21 familySteffen Bätz
As of commit de5c9bf40c45 ("net: phylink: require supported_interfaces to be filled") Marvell 88e6320/21 switches fail to be probed: ... mv88e6085 30be0000.ethernet-1:00: phylink: error: empty supported_interfaces error creating PHYLINK: -22 ... The problem stems from the use of mv88e6185_phylink_get_caps() to get the device capabilities. Since there are serdes only ports 0/1 included, create a new dedicated phylink_get_caps for the 6320 and 6321 to properly support their set of capabilities. Fixes: de5c9bf40c45 ("net: phylink: require supported_interfaces to be filled") Signed-off-by: Steffen Bätz <steffen@innosonix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Fabio Estevam <festevam@gmail.com> Link: https://lore.kernel.org/r/20240508072944.54880-2-steffen@innosonix.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-09net: hns3: fix kernel crash when devlink reload during initializationYonglong Liu
The devlink reload process will access the hardware resources, but the register operation is done before the hardware is initialized. So, processing the devlink reload during initialization may lead to kernel crash. This patch fixes this by registering the devlink after hardware initialization. Fixes: cd6242991d2e ("net: hns3: add support for registering devlink for VF") Fixes: 93305b77ffcb ("net: hns3: fix kernel crash when devlink reload during pf initialization") Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-09net: hns3: fix port vlan filter not disabled issueYonglong Liu
According to hardware limitation, for device support modify VLAN filter state but not support bypass port VLAN filter, it should always disable the port VLAN filter. but the driver enables port VLAN filter when initializing, if there is no VLAN(except VLAN 0) id added, the driver will disable it in service task. In most time, it works fine. But there is a time window before the service task shceduled and net device being registered. So if user adds VLAN at this time, the driver will not update the VLAN filter state, and the port VLAN filter remains enabled. To fix the problem, if support modify VLAN filter state but not support bypass port VLAN filter, set the port vlan filter to "off". Fixes: 184cd221a863 ("net: hns3: disable port VLAN filter when support function level VLAN filter control") Fixes: 2ba306627f59 ("net: hns3: add support for modify VLAN filter state") Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-09net: hns3: use appropriate barrier function after setting a bit valuePeiyang Wang
There is a memory barrier in followed case. When set the port down, hclgevf_set_timmer will set DOWN in state. Meanwhile, the service task has different behaviour based on whether the state is DOWN. Thus, to make sure service task see DOWN, use smp_mb__after_atomic after calling set_bit(). CPU0 CPU1 ========================== =================================== hclgevf_set_timer_task() hclgevf_periodic_service_task() set_bit(DOWN,state) test_bit(DOWN,state) pf also has this issue. Fixes: ff200099d271 ("net: hns3: remove unnecessary work in hclgevf_main") Fixes: 1c6dfe6fc6f7 ("net: hns3: remove mailbox and reset work in hclge_main") Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-09net: hns3: release PTP resources if pf initialization failedPeiyang Wang
During the PF initialization process, hclge_update_port_info may return an error code for some reason. At this point, the ptp initialization has been completed. To void memory leaks, the resources that are applied by ptp should be released. Therefore, when hclge_update_port_info returns an error code, hclge_ptp_uninit is called to release the corresponding resources. Fixes: eaf83ae59e18 ("net: hns3: add querying fec ability from firmware") Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-09net: hns3: change type of numa_node_mask as nodemask_tPeiyang Wang
It provides nodemask_t to describe the numa node mask in kernel. To improve transportability, change the type of numa_node_mask as nodemask_t. Fixes: 38caee9d3ee8 ("net: hns3: Add support of the HNAE3 framework") Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-09net: hns3: direct return when receive a unknown mailbox messageJian Shen
Currently, the driver didn't return when receive a unknown mailbox message, and continue checking whether need to generate a response. It's unnecessary and may be incorrect. Fixes: bb5790b71bad ("net: hns3: refactor mailbox response scheme between PF and VF") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-09net: hns3: using user configure after hardware resetPeiyang Wang
When a reset occurring, it's supposed to recover user's configuration. Currently, the port info(speed, duplex and autoneg) is stored in hclge_mac and will be scheduled updated. Consider the case that reset was happened consecutively. During the first reset, the port info is configured with a temporary value cause the PHY is reset and looking for best link config. Second reset start and use pervious configuration which is not the user's. The specific process is as follows: +------+ +----+ +----+ | USER | | PF | | HW | +---+--+ +-+--+ +-+--+ | ethtool --reset | | +------------------->| reset command | | ethtool --reset +-------------------->| +------------------->| +---+ | +---+ | | | | |reset currently | | HW RESET | | |and wait to do | | | |<--+ | | | | send pervious cfg |<--+ | | (1000M FULL AN_ON) | | +-------------------->| | | read cfg(time task) | | | (10M HALF AN_OFF) +---+ | |<--------------------+ | cfg take effect | | reset command |<--+ | +-------------------->| | | +---+ | | send pervious cfg | | HW RESET | | (10M HALF AN_OFF) |<--+ | +-------------------->| | | read cfg(time task) | | | (10M HALF AN_OFF) +---+ | |<--------------------+ | cfg take effect | | | | | | read cfg(time task) |<--+ | | (10M HALF AN_OFF) | | |<--------------------+ | | | v v v To avoid aboved situation, this patch introduced req_speed, req_duplex, req_autoneg to store user's configuration and it only be used after hardware reset and to recover user's configuration Fixes: f5f2b3e4dcc0 ("net: hns3: add support for imp-controlled PHYs") Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-09spi: microchip-core-qspi: fix setting spi bus clock rateConor Dooley
Before ORing the new clock rate with the control register value read from the hardware, the existing clock rate needs to be masked off as otherwise the existing value will interfere with the new one. CC: stable@vger.kernel.org Fixes: 8596124c4c1b ("spi: microchip-core-qspi: Add support for microchip fpga qspi controllers") Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Tudor Ambarus <tudor.ambarus@linaro.org> Link: https://lore.kernel.org/r/20240508-fox-unpiloted-b97e1535627b@spud Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-08Revert "drm/amdkfd: Add partition id field to location_id"Lijo Lazar
This reverts commit c37ce764cd492f044dcdbb39616298f02b0dbc7f. RCCL library is currently not treating spatial partitions differently, hence this change is causing issues. Revert temporarily till RCCL implementation is ready for spatial partitions. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-08dm/amd/pm: Fix problems with reboot/shutdown for some SMU 13.0.4/13.0.11 usersMario Limonciello
Limit the workaround introduced by commit 31729e8c21ec ("drm/amd/pm: fixes a random hang in S4 for SMU v13.0.4/11") to only run in the s4 path. Cc: Tim Huang <Tim.Huang@amd.com> Fixes: 31729e8c21ec ("drm/amd/pm: fixes a random hang in S4 for SMU v13.0.4/11") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3351 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-08drm/amd/display: MST DSC check for older devicesAgustin Gutierrez
[Why] Some older MST hubs do not report DPCD registers according to specification. [How] This change re-applies commit c53655545141 ("drm/amd/display: dsc mst re-compute pbn for changes on hub"). With an additional check for these older MST devices. Reviewed-by: Swapnil Patel <swapnil.patel@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Agustin Gutierrez <agustin.gutierrez@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-08drm/amd/display: Fix idle optimization checks for multi-display and dual eDPNicholas Kazlauskas
[Why] Idle optimizations are blocked if there's more than one eDP connector on the board - blocking S0i3 and IPS2 for static screen. [How] Fix the checks to correctly detect number of active eDP. Also restrict the eDP support to panels that have correct feature support. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Charlene Liu <charlene.liu@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-08drm/amd/display: Fix DSC-re-computingAgustin Gutierrez
[Why] This fixes a bug introduced by commit c53655545141 ("drm/amd/display: dsc mst re-compute pbn for changes on hub"). The change caused light-up issues with a second display that required DSC on some MST docks. [How] Use Virtual DPCD for DSC caps in MST case. [Limitations] This change only affects MST DSC devices that follow specifications additional changes are required to check for old MST DSC devices such as ones which do not check for Virtual DPCD registers. Reviewed-by: Swapnil Patel <swapnil.patel@amd.com> Reviewed-by: Hersen Wu <hersenxs.wu@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Agustin Gutierrez <agustin.gutierrez@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-08drm/amd/display: Enable urgent latency adjustments for DCN35Nicholas Susanto
[Why] Underflow occurs when running Netflix in a 4k144 eDP + 4k60 HDMI FRL setup. It is caused by latency varying based on the DCFCLK/FCLK state. [How] Enable urgent latency adjustment and match the reference to existing ASIC that also see increased latency at low FCLK. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Nicholas Susanto <nicholas.susanto@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-08Merge tag 'soc-fixes-6.9-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "These are a couple of last minute fixes that came in over the previous week, addressing: - A pin configuration bug on a qualcomm board that caused issues with ethernet and mmc - Two minor code fixes for misleading console output in the microchip firmware driver - A build warning in the sifive cache driver" * tag 'soc-fixes-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: firmware: microchip: clarify that sizes and addresses are in hex firmware: microchip: don't unconditionally print validation success arm64: dts: qcom: sa8155p-adp: fix SDHC2 CD pin configuration cache: sifive_ccache: Silence unused variable warning
2024-05-08Merge tag 'pci-v6.9-fixes-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fixes from Bjorn Helgaas: - Update kernel-parameters doc to describe "pcie_aspm=off" more accurately (Bjorn Helgaas) - Restore the parent's (not the child's) ASPM state to the parent during resume, which fixes a reboot during resume (Kai-Heng Feng) * tag 'pci-v6.9-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: PCI/ASPM: Restore parent state to parent, child state to child PCI/ASPM: Clarify that pcie_aspm=off means leave ASPM untouched
2024-05-08nvmet-rdma: fix possible bad dereference when freeing rspsSagi Grimberg
It is possible that the host connected and saw a cm established event and started sending nvme capsules on the qp, however the ctrl did not yet see an established event. This is why the rsp_wait_list exists (for async handling of these cmds, we move them to a pending list). Furthermore, it is possible that the ctrl cm times out, resulting in a connect-error cm event. in this case we hit a bad deref [1] because in nvmet_rdma_free_rsps we assume that all the responses are in the free list. We are freeing the cmds array anyways, so don't even bother to remove the rsp from the free_list. It is also guaranteed that we are not racing anything when we are releasing the queue so no other context accessing this array should be running. [1]: -- Workqueue: nvmet-free-wq nvmet_rdma_free_queue_work [nvmet_rdma] [...] pc : nvmet_rdma_free_rsps+0x78/0xb8 [nvmet_rdma] lr : nvmet_rdma_free_queue_work+0x88/0x120 [nvmet_rdma] Call trace: nvmet_rdma_free_rsps+0x78/0xb8 [nvmet_rdma] nvmet_rdma_free_queue_work+0x88/0x120 [nvmet_rdma] process_one_work+0x1ec/0x4a0 worker_thread+0x48/0x490 kthread+0x158/0x160 ret_from_fork+0x10/0x18 -- Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-05-08nvmet: prevent sprintf() overflow in nvmet_subsys_nsid_exists()Dan Carpenter
The nsid value is a u32 that comes from nvmet_req_find_ns(). It's endian data and we're on an error path and both of those raise red flags. So let's make this safer. 1) Make the buffer large enough for any u32. 2) Remove the unnecessary initialization. 3) Use snprintf() instead of sprintf() for even more safety. 4) The sprintf() function returns the number of bytes printed, not counting the NUL terminator. It is impossible for the return value to be <= 0 so delete that. Fixes: 505363957fad ("nvmet: fix nvme status code when namespace is disabled") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-05-08net: phy: marvell-88q2xxx: add support for Rev B1 and B2Gregor Herburger
Different revisions of the Marvell 88q2xxx phy needs different init sequences. Add init sequence for Rev B1 and Rev B2. Rev B2 init sequence skips one register write. Tested-by: Dimitri Fedrau <dima.fedrau@gmail.com> Signed-off-by: Gregor Herburger <gregor.herburger@ew.tq-group.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-07drm/connector: Add \n to message about demoting connector force-probesDouglas Anderson
The debug print clearly lacks a \n at the end. Add it. Fixes: 8f86c82aba8b ("drm/connector: demote connector force-probes for non-master clients") Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Reviewed-by: Simon Ser <contact@emersion.fr> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240502153234.1.I2052f01c8d209d9ae9c300b87c6e4f60bd3cc99e@changeid
2024-05-07gpiolib: fix the speed of descriptor label setting with SRCUBartosz Golaszewski
Commit 1f2bcb8c8ccd ("gpio: protect the descriptor label with SRCU") caused a massive drop in performance of requesting GPIO lines due to the call to synchronize_srcu() on each label change. Rework the code to not wait until all read-only users are done with reading the label but instead atomically replace the label pointer and schedule its release after all read-only critical sections are done. To that end wrap the descriptor label in a struct that also contains the rcu_head struct required for deferring tasks using call_srcu() and stop using kstrdup_const() as we're required to allocate memory anyway. Just allocate enough for the label string and rcu_head in one go. Reported-by: Neil Armstrong <neil.armstrong@linaro.org> Closes: https://lore.kernel.org/linux-gpio/CAMRc=Mfig2oooDQYTqo23W3PXSdzhVO4p=G4+P8y1ppBOrkrJQ@mail.gmail.com/ Fixes: 1f2bcb8c8ccd ("gpio: protect the descriptor label with SRCU") Suggested-by: "Paul E. McKenney" <paulmck@kernel.org> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD Acked-by: "Paul E. McKenney" <paulmck@kernel.org> Link: https://lore.kernel.org/r/20240507121346.16969-1-brgl@bgdev.pl Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-05-07nvmet: make nvmet_wq unboundSagi Grimberg
When deleting many controllers one-by-one, it takes a very long time as these work elements may serialize as they are scheduled on the executing cpu instead of spreading. In general nvmet_wq can definitely be used for long standing work elements so its better to make it unbound regardless. Signed-off-by: Sagi Grimberg <sagi.grimberg@vastdata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-05-07nvmet-auth: return the error code to the nvmet_auth_ctrl_hash() callersMaurizio Lombardi
If nvmet_auth_ctrl_hash() fails, return the error code to its callers Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>