summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm
AgeCommit message (Collapse)Author
2024-07-08drm/bridge: adv7511: Fix Intermittent EDID failuresAdam Ford
In the process of adding support for shared IRQ pins, a scenario was accidentally created where adv7511_irq_process returned prematurely causing the EDID to fail randomly. Since the interrupt handler is broken up into two main helper functions, update both of them to treat the helper functions as IRQ handlers. These IRQ routines process their respective tasks as before, but if they determine that actual work was done, mark the respective IRQ status accordingly, and delay the check until everything has been processed. This should guarantee the helper functions don't return prematurely while still returning proper values of either IRQ_HANDLED or IRQ_NONE. Reported-by: Liu Ying <victor.liu@nxp.com> Fixes: f3d9683346d6 ("drm/bridge: adv7511: Allow IRQ to share GPIO pins") Signed-off-by: Adam Ford <aford173@gmail.com> Tested-by: Liu Ying <victor.liu@nxp.com> # i.MX8MP EVK ADV7535 EDID retrieval w/o IRQ Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240630221931.1650565-1-aford173@gmail.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> V3: Remove unnecessary declaration of ret by evaluating the return code of regmap_read directly. V2: Fix uninitialized cec_status Cut back a little on error handling to return either IRQ_NONE or IRQ_HANDLED.
2024-07-05Merge tag 'amd-drm-fixes-6.10-2024-07-03' of ↵Daniel Vetter
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.10-2024-07-03: amdgpu: - Freesync fixes - DML1 bandwidth fix - DCN 3.5 fixes - DML2 fix - Silence an UBSAN warning radeon: - GPUVM fix Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240703184723.1981997-1-alexander.deucher@amd.com
2024-07-04Merge tag 'drm-misc-fixes-2024-07-04' of ↵Daniel Vetter
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes drm-misc-fixes for v6.10-rc7: - Add panel quirks. - Firmware sysfb refcount fix. - Another null pointer mode deref fix for nouveau. - Panthor sync and uobj fixes. - Fix fbdev regression since v6.7. - Delay free imported bo in ttm to fix lockdep splat. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ffba0c63-2798-40b6-948d-361cd3b14e9f@linux.intel.com
2024-07-04Merge tag 'drm-xe-fixes-2024-07-04' of ↵Daniel Vetter
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - One copy/paste mistake fix. - One error path fix causing an error pointer dereference. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZoZ-wD66lgjiNh72@fedora
2024-07-04drm/xe/mcr: Avoid clobbering DSS steeringMatt Roper
A couple copy/paste mistakes in the code that selects steering targets for OADDRM and INSTANCE0 unintentionally clobbered the steering target for DSS ranges in some cases. The OADDRM/INSTANCE0 values were also not assigned as intended, although that mistake wound up being harmless since the desired values for those specific ranges were '0' which the kzalloc of the GT structure should have already taken care of implicitly. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240626210536.1620176-2-matthew.d.roper@intel.com (cherry picked from commit 4f82ac6102788112e599a6074d2c1f2afce923df) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-07-04drm/xe: fix error handling in xe_migrate_update_pgtablesMatthew Auld
Don't call drm_suballoc_free with sa_bo pointing to PTR_ERR. References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2120 Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240620102025.127699-2-matthew.auld@intel.com (cherry picked from commit ce6b63336f79ec5f3996de65f452330e395f99ae) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-07-04drm/ttm: Always take the bo delayed cleanup path for imported bosThomas Hellström
Bos can be put with multiple unrelated dma-resv locks held. But imported bos attempt to grab the bo dma-resv during dma-buf detach that typically happens during cleanup. That leads to lockde splats similar to the below and a potential ABBA deadlock. Fix this by always taking the delayed workqueue cleanup path for imported bos. Requesting stable fixes from when the Xe driver was introduced, since its usage of drm_exec and wide vm dma_resvs appear to be the first reliable trigger of this. [22982.116427] ============================================ [22982.116428] WARNING: possible recursive locking detected [22982.116429] 6.10.0-rc2+ #10 Tainted: G U W [22982.116430] -------------------------------------------- [22982.116430] glxgears:sh0/5785 is trying to acquire lock: [22982.116431] ffff8c2bafa539a8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: dma_buf_detach+0x3b/0xf0 [22982.116438] but task is already holding lock: [22982.116438] ffff8c2d9aba6da8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_exec_lock_obj+0x49/0x2b0 [drm_exec] [22982.116442] other info that might help us debug this: [22982.116442] Possible unsafe locking scenario: [22982.116443] CPU0 [22982.116444] ---- [22982.116444] lock(reservation_ww_class_mutex); [22982.116445] lock(reservation_ww_class_mutex); [22982.116447] *** DEADLOCK *** [22982.116447] May be due to missing lock nesting notation [22982.116448] 5 locks held by glxgears:sh0/5785: [22982.116449] #0: ffff8c2d9aba58c8 (&xef->vm.lock){+.+.}-{3:3}, at: xe_file_close+0xde/0x1c0 [xe] [22982.116507] #1: ffff8c2e28cc8480 (&vm->lock){++++}-{3:3}, at: xe_vm_close_and_put+0x161/0x9b0 [xe] [22982.116578] #2: ffff8c2e31982970 (&val->lock){.+.+}-{3:3}, at: xe_validation_ctx_init+0x6d/0x70 [xe] [22982.116647] #3: ffffacdc469478a8 (reservation_ww_class_acquire){+.+.}-{0:0}, at: xe_vma_destroy_unlocked+0x7f/0xe0 [xe] [22982.116716] #4: ffff8c2d9aba6da8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_exec_lock_obj+0x49/0x2b0 [drm_exec] [22982.116719] stack backtrace: [22982.116720] CPU: 8 PID: 5785 Comm: glxgears:sh0 Tainted: G U W 6.10.0-rc2+ #10 [22982.116721] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 2001 02/01/2023 [22982.116723] Call Trace: [22982.116724] <TASK> [22982.116725] dump_stack_lvl+0x77/0xb0 [22982.116727] __lock_acquire+0x1232/0x2160 [22982.116730] lock_acquire+0xcb/0x2d0 [22982.116732] ? dma_buf_detach+0x3b/0xf0 [22982.116734] ? __lock_acquire+0x417/0x2160 [22982.116736] __ww_mutex_lock.constprop.0+0xd0/0x13b0 [22982.116738] ? dma_buf_detach+0x3b/0xf0 [22982.116741] ? dma_buf_detach+0x3b/0xf0 [22982.116743] ? ww_mutex_lock+0x2b/0x90 [22982.116745] ww_mutex_lock+0x2b/0x90 [22982.116747] dma_buf_detach+0x3b/0xf0 [22982.116749] drm_prime_gem_destroy+0x2f/0x40 [drm] [22982.116775] xe_ttm_bo_destroy+0x32/0x220 [xe] [22982.116818] ? __mutex_unlock_slowpath+0x3a/0x290 [22982.116821] drm_exec_unlock_all+0xa1/0xd0 [drm_exec] [22982.116823] drm_exec_fini+0x12/0xb0 [drm_exec] [22982.116824] xe_validation_ctx_fini+0x15/0x40 [xe] [22982.116892] xe_vma_destroy_unlocked+0xb1/0xe0 [xe] [22982.116959] xe_vm_close_and_put+0x41a/0x9b0 [xe] [22982.117025] ? xa_find+0xe3/0x1e0 [22982.117028] xe_file_close+0x10a/0x1c0 [xe] [22982.117074] drm_file_free+0x22a/0x280 [drm] [22982.117099] drm_release_noglobal+0x22/0x70 [drm] [22982.117119] __fput+0xf1/0x2d0 [22982.117122] task_work_run+0x59/0x90 [22982.117125] do_exit+0x330/0xb40 [22982.117127] do_group_exit+0x36/0xa0 [22982.117129] get_signal+0xbd2/0xbe0 [22982.117131] arch_do_signal_or_restart+0x3e/0x240 [22982.117134] syscall_exit_to_user_mode+0x1e7/0x290 [22982.117137] do_syscall_64+0xa1/0x180 [22982.117139] ? lock_acquire+0xcb/0x2d0 [22982.117140] ? __set_task_comm+0x28/0x1e0 [22982.117141] ? find_held_lock+0x2b/0x80 [22982.117144] ? __set_task_comm+0xe1/0x1e0 [22982.117145] ? lock_release+0xca/0x290 [22982.117147] ? __do_sys_prctl+0x245/0xab0 [22982.117149] ? lockdep_hardirqs_on_prepare+0xde/0x190 [22982.117150] ? syscall_exit_to_user_mode+0xb0/0x290 [22982.117152] ? do_syscall_64+0xa1/0x180 [22982.117154] ? __lock_acquire+0x417/0x2160 [22982.117155] ? reacquire_held_locks+0xd1/0x1f0 [22982.117156] ? do_user_addr_fault+0x30c/0x790 [22982.117158] ? lock_acquire+0xcb/0x2d0 [22982.117160] ? find_held_lock+0x2b/0x80 [22982.117162] ? do_user_addr_fault+0x357/0x790 [22982.117163] ? lock_release+0xca/0x290 [22982.117164] ? do_user_addr_fault+0x361/0x790 [22982.117166] ? trace_hardirqs_off+0x4b/0xc0 [22982.117168] ? clear_bhb_loop+0x45/0xa0 [22982.117170] ? clear_bhb_loop+0x45/0xa0 [22982.117172] ? clear_bhb_loop+0x45/0xa0 [22982.117174] entry_SYSCALL_64_after_hwframe+0x76/0x7e [22982.117176] RIP: 0033:0x7f943d267169 [22982.117192] Code: Unable to access opcode bytes at 0x7f943d26713f. [22982.117193] RSP: 002b:00007f9430bffc80 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca [22982.117195] RAX: fffffffffffffe00 RBX: 0000000000000000 RCX: 00007f943d267169 [22982.117196] RDX: 0000000000000000 RSI: 0000000000000189 RDI: 00005622f89579d0 [22982.117197] RBP: 00007f9430bffcb0 R08: 0000000000000000 R09: 00000000ffffffff [22982.117198] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [22982.117199] R13: 0000000000000000 R14: 0000000000000000 R15: 00005622f89579d0 [22982.117202] </TASK> Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Christian König <christian.koenig@amd.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: dri-devel@lists.freedesktop.org Cc: intel-xe@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.8+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240628153848.4989-1-thomas.hellstrom@linux.intel.com
2024-07-03drm/fbdev-generic: Fix framebuffer on big endian devicesThomas Huth
Starting with kernel 6.7, the framebuffer text console is not working anymore with the virtio-gpu device on s390x hosts. Such big endian fb devices are usinga different pixel ordering than little endian devices, e.g. DRM_FORMAT_BGRX8888 instead of DRM_FORMAT_XRGB8888. This used to work fine as long as drm_client_buffer_addfb() was still calling drm_mode_addfb() which called drm_driver_legacy_fb_format() internally to get the right format. But drm_client_buffer_addfb() has recently been reworked to call drm_mode_addfb2() instead with the format value that has been passed to it as a parameter (see commit 6ae2ff23aa43 ("drm/client: Convert drm_client_buffer_addfb() to drm_mode_addfb2()"). That format parameter is determined in drm_fbdev_generic_helper_fb_probe() via the drm_mode_legacy_fb_format() function - which only generates formats suitable for little endian devices. So to fix this issue switch to drm_driver_legacy_fb_format() here instead to take the device endianness into consideration. Fixes: 6ae2ff23aa43 ("drm/client: Convert drm_client_buffer_addfb() to drm_mode_addfb2()") Closes: https://issues.redhat.com/browse/RHEL-45158 Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240627173530.460615-1-thuth@redhat.com
2024-07-03drm/panthor: Fix sync-only jobsBoris Brezillon
A sync-only job is meant to provide a synchronization point on a queue, so we can't return a NULL fence there, we have to add a signal operation to the command stream which executes after all other previously submitted jobs are done. v2: - Fixed a UAF bug - Added R-bs Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240703071640.231278-3-boris.brezillon@collabora.com
2024-07-03drm/panthor: Don't check the array stride on empty uobj arraysBoris Brezillon
The user is likely to leave all the drm_panthor_obj_array fields to zero when the array is empty, which will cause an EINVAL failure. v2: - Added R-bs Fixes: 4bdca1150792 ("drm/panthor: Add the driver frontend block") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240703071640.231278-2-boris.brezillon@collabora.com
2024-07-02drm/amdgpu/atomfirmware: silence UBSAN warningAlex Deucher
This is a variable sized array. Link: https://lists.freedesktop.org/archives/amd-gfx/2024-June/110420.html Tested-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-07-01drm/radeon: check bo_va->bo is non-NULL before using itPierre-Eric Pelloux-Prayer
The call to radeon_vm_clear_freed might clear bo_va->bo, so we have to check it before dereferencing it. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-01drm/amd/display: Fix array-index-out-of-bounds in dml2/FCLKChangeSupportRoman Li
[Why] Potential out of bounds access in dml2_calculate_rq_and_dlg_params() because the value of out_lowest_state_idx used as an index for FCLKChangeSupport array can be greater than 1. [How] Currently dml2 core specifies identical values for all FCLKChangeSupport elements. Always use index 0 in the condition to avoid out of bounds access. Acked-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Jerry Zuo <jerry.zuo@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-01drm/amd/display: Update efficiency bandwidth for dcn351Fangzhi Zuo
Fix 4k240 underflow on dcn351 Acked-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-01drm/amd/display: Fix refresh rate range for some panelTom Chung
[Why] Some of the panels does not have the refresh rate range info in base EDID and only have the refresh rate range info in DisplayID block. It will cause the max/min freesync refresh rate set to 0. [How] Try to parse the refresh rate range info from DisplayID if the max/min refresh rate is 0. Reviewed-by: Sun peng Li <sunpeng.li@amd.com> Signed-off-by: Jerry Zuo <jerry.zuo@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-01drm/amd/display: Account for cursor prefetch BW in DML1 mode supportAlvin Lee
[Description] We need to ensure to take into account cursor prefetch BW in mode support or we may pass ModeQuery but fail an actual flip which will cause a hang. Flip may fail because the cursor_pre_bw is populated during mode programming (and mode programming is never called prior to ModeQuery). Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com> Reviewed-by: Nevenko Stupar <nevenko.stupar@amd.com> Signed-off-by: Jerry Zuo <jerry.zuo@amd.com> Signed-off-by: Alvin Lee <alvin.lee2@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-01drm/amd/display: Add refresh rate range checkTom Chung
[Why] We only enable the VRR while monitor usable refresh rate range is greater than 10 Hz. But we did not check the range in DRM_EDID_FEATURE_CONTINUOUS_FREQ case. [How] Add a refresh rate range check before set the freesync_capable flag in DRM_EDID_FEATURE_CONTINUOUS_FREQ case. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Jerry Zuo <jerry.zuo@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-01drm/amd/display: Reset freesync config before update new stateTom Chung
[Why] Sometimes the new_crtc_state->vrr_infopacket did not sync up with the current state. It will affect the update_freesync_state_on_stream() does not update the state correctly. [How] Reset the freesync config before get_freesync_config_for_crtc() to make sure we have the correct new_crtc_state for VRR. Reviewed-by: Sun peng Li <sunpeng.li@amd.com> Signed-off-by: Jerry Zuo <jerry.zuo@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-01drm: panel-orientation-quirks: Add labels for both Valve Steam Deck revisionsMatthew Schwartz
This accounts for the existence of two Steam Deck revisions instead of a single revision Signed-off-by: Matthew Schwartz <mattschwartz@gwu.edu> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240628205822.348402-3-mattschwartz@gwu.edu
2024-07-01drm: panel-orientation-quirks: Add quirk for Valve GalileoJohn Schoenick
Valve's Steam Deck Galileo revision has a 800x1280 OLED panel Cc: stable@vger.kernel.org # 6.1+ Signed-off-by: John Schoenick <johns@valvesoftware.com> Signed-off-by: Matthew Schwartz <mattschwartz@gwu.edu> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240628205822.348402-2-mattschwartz@gwu.edu
2024-07-01drm/i915/display: For MTL+ platforms skip mg dp programmingImre Deak
For MTL+ platforms we use PICA chips for Type-C support and hence mg programming is not needed. Fixes issue with drm warn of TC port not being in legacy mode. Cc: stable@vger.kernel.org Signed-off-by: Mika Kahola <mika.kahola@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240625111840.597574-1-mika.kahola@intel.com (cherry picked from commit aaf9dc86bd806458f848c39057d59e5aa652a399) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-06-28drm/nouveau: fix null pointer dereference in nouveau_connector_get_modesMa Ke
In nouveau_connector_get_modes(), the return value of drm_mode_duplicate() is assigned to mode, which will lead to a possible NULL pointer dereference on failure of drm_mode_duplicate(). Add a check to avoid npd. Cc: stable@vger.kernel.org Fixes: 6ee738610f41 ("drm/nouveau: Add DRM driver for NVIDIA GPUs") Signed-off-by: Ma Ke <make24@iscas.ac.cn> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240627074204.3023776-1-make24@iscas.ac.cn
2024-06-28drm/drm_file: Fix pid refcounting raceJann Horn
<maarten.lankhorst@linux.intel.com>, Maxime Ripard <mripard@kernel.org>, Thomas Zimmermann <tzimmermann@suse.de> filp->pid is supposed to be a refcounted pointer; however, before this patch, drm_file_update_pid() only increments the refcount of a struct pid after storing a pointer to it in filp->pid and dropping the dev->filelist_mutex, making the following race possible: process A process B ========= ========= begin drm_file_update_pid mutex_lock(&dev->filelist_mutex) rcu_replace_pointer(filp->pid, <pid B>, 1) mutex_unlock(&dev->filelist_mutex) begin drm_file_update_pid mutex_lock(&dev->filelist_mutex) rcu_replace_pointer(filp->pid, <pid A>, 1) mutex_unlock(&dev->filelist_mutex) get_pid(<pid A>) synchronize_rcu() put_pid(<pid B>) *** pid B reaches refcount 0 and is freed here *** get_pid(<pid B>) *** UAF *** synchronize_rcu() put_pid(<pid A>) As far as I know, this race can only occur with CONFIG_PREEMPT_RCU=y because it requires RCU to detect a quiescent state in code that is not explicitly calling into the scheduler. This race leads to use-after-free of a "struct pid". It is probably somewhat hard to hit because process A has to pass through a synchronize_rcu() operation while process B is between mutex_unlock() and get_pid(). Fix it by ensuring that by the time a pointer to the current task's pid is stored in the file, an extra reference to the pid has been taken. This fix also removes the condition for synchronize_rcu(); I think that optimization is unnecessary complexity, since in that case we would usually have bailed out on the lockless check above. Fixes: 1c7a387ffef8 ("drm: Update file owner during use") Cc: <stable@vger.kernel.org> Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-06-28Merge tag 'drm-intel-fixes-2024-06-27' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes drm/i915 fixes for v6.10-rc6: - Fix potential UAF due to race on fence register revocation Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/87ikxudcpd.fsf@intel.com
2024-06-27Merge tag 'amd-drm-fixes-6.10-2024-06-26' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.10-2024-06-26: amdgpu: - SMU 14.x fix - vram info parsing fix - mode1 reset fix - LTTPR fix - Virtual display fix - Avoid spurious error in PSP init Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240626221408.2019633-1-alexander.deucher@amd.com
2024-06-27Merge tag 'drm-misc-fixes-2024-06-26' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes drm-misc-fixes for v6.10-rc6: - nouveau tv mode fixes. - Add KOE TX26D202VM0BWA timings. - Fix fb_info when vmalloc is used, regression from CONFIG_DRM_FBDEV_LEAK_PHYS_SMEM. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/2e596c1e-9389-43c2-a029-06fe741c44c3@linux.intel.com
2024-06-25drm/nouveau/dispnv04: fix null pointer dereference in nv17_tv_get_ld_modesMa Ke
In nv17_tv_get_ld_modes(), the return value of drm_mode_duplicate() is assigned to mode, which will lead to a possible NULL pointer dereference on failure of drm_mode_duplicate(). Add a check to avoid npd. Cc: stable@vger.kernel.org Signed-off-by: Ma Ke <make24@iscas.ac.cn> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240625081828.2620794-1-make24@iscas.ac.cn
2024-06-25drm/nouveau/dispnv04: fix null pointer dereference in nv17_tv_get_hd_modesMa Ke
In nv17_tv_get_hd_modes(), the return value of drm_mode_duplicate() is assigned to mode, which will lead to a possible NULL pointer dereference on failure of drm_mode_duplicate(). The same applies to drm_cvt_mode(). Add a check to avoid null pointer dereference. Cc: stable@vger.kernel.org Signed-off-by: Ma Ke <make24@iscas.ac.cn> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240625081029.2619437-1-make24@iscas.ac.cn
2024-06-25drm/amdgpu: Don't show false warning for reg listLijo Lazar
If reg list is already loaded on PSP 13.0.2 SOCs, psp will give TEE_ERR_CANCEL response on second time load. Avoid printing warn message for it. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-25drm/amdgpu: avoid using null object of framebufferJulia Zhang
Instead of using state->fb->obj[0] directly, get object from framebuffer by calling drm_gem_fb_get_obj() and return error code when object is null to avoid using null object of framebuffer. Reported-by: Fusheng Huang <fusheng.huang@ecarxgroup.com> Signed-off-by: Julia Zhang <Julia.Zhang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-06-25drm/amd/display: Send DP_TOTAL_LTTPR_CNT during detection if LTTPR is presentMichael Strauss
[WHY] New register field added in DP2.1 SCR, needed for auxless ALPM [HOW] Echo value read from 0xF0007 back to sink Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Michael Strauss <michael.strauss@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-25drm/amdgpu: Fix pci state save during mode-1 resetLijo Lazar
Cache the PCI state before bus master is disabled. The saved state is later used for other cases like restoring config space after mode-2 reset. Fixes: 5c03e5843e6b ("drm/amdgpu:add smu mode1/2 support for aldebaran") Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-25drm/amdgpu/atomfirmware: fix parsing of vram_infoAlex Deucher
v3.x changed the how vram width was encoded. The previous implementation actually worked correctly for most boards. Fix the implementation to work correctly everywhere. This fixes the vram width reported in the kernel log on some boards. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-06-25drm/amd/swsmu: add MALL init support workaround for smu_v14_0_1Li Ma
[Why] SMU firmware has not supported MALL PG. [How] Disable MALL PG and make it always on until SMU firmware is ready. Signed-off-by: Li Ma <li.ma@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-24drm/i915/gt: Fix potential UAF by revoke of fence registersJanusz Krzysztofik
CI has been sporadically reporting the following issue triggered by igt@i915_selftest@live@hangcheck on ADL-P and similar machines: <6> [414.049203] i915: Running intel_hangcheck_live_selftests/igt_reset_evict_fence ... <6> [414.068804] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled <6> [414.068812] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled <3> [414.070354] Unable to pin Y-tiled fence; err:-4 <3> [414.071282] i915_vma_revoke_fence:301 GEM_BUG_ON(!i915_active_is_idle(&fence->active)) ... <4>[ 609.603992] ------------[ cut here ]------------ <2>[ 609.603995] kernel BUG at drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c:301! <4>[ 609.604003] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI <4>[ 609.604006] CPU: 0 PID: 268 Comm: kworker/u64:3 Tainted: G U W 6.9.0-CI_DRM_14785-g1ba62f8cea9c+ #1 <4>[ 609.604008] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR4 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723 01/20/2023 <4>[ 609.604010] Workqueue: i915 __i915_gem_free_work [i915] <4>[ 609.604149] RIP: 0010:i915_vma_revoke_fence+0x187/0x1f0 [i915] ... <4>[ 609.604271] Call Trace: <4>[ 609.604273] <TASK> ... <4>[ 609.604716] __i915_vma_evict+0x2e9/0x550 [i915] <4>[ 609.604852] __i915_vma_unbind+0x7c/0x160 [i915] <4>[ 609.604977] force_unbind+0x24/0xa0 [i915] <4>[ 609.605098] i915_vma_destroy+0x2f/0xa0 [i915] <4>[ 609.605210] __i915_gem_object_pages_fini+0x51/0x2f0 [i915] <4>[ 609.605330] __i915_gem_free_objects.isra.0+0x6a/0xc0 [i915] <4>[ 609.605440] process_scheduled_works+0x351/0x690 ... In the past, there were similar failures reported by CI from other IGT tests, observed on other platforms. Before commit 63baf4f3d587 ("drm/i915/gt: Only wait for GPU activity before unbinding a GGTT fence"), i915_vma_revoke_fence() was waiting for idleness of vma->active via fence_update(). That commit introduced vma->fence->active in order for the fence_update() to be able to wait selectively on that one instead of vma->active since only idleness of fence registers was needed. But then, another commit 0d86ee35097a ("drm/i915/gt: Make fence revocation unequivocal") replaced the call to fence_update() in i915_vma_revoke_fence() with only fence_write(), and also added that GEM_BUG_ON(!i915_active_is_idle(&fence->active)) in front. No justification was provided on why we might then expect idleness of vma->fence->active without first waiting on it. The issue can be potentially caused by a race among revocation of fence registers on one side and sequential execution of signal callbacks invoked on completion of a request that was using them on the other, still processed in parallel to revocation of those fence registers. Fix it by waiting for idleness of vma->fence->active in i915_vma_revoke_fence(). Fixes: 0d86ee35097a ("drm/i915/gt: Make fence revocation unequivocal") Closes: https://gitlab.freedesktop.org/drm/intel/issues/10021 Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Cc: stable@vger.kernel.org # v5.8+ Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240603195446.297690-2-janusz.krzysztofik@linux.intel.com (cherry picked from commit 24bb052d3dd499c5956abad5f7d8e4fd07da7fb1) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-06-24drm/panel: simple: Add missing display timing flags for KOE TX26D202VM0BWALiu Ying
KOE TX26D202VM0BWA panel spec indicates the DE signal is active high in timing chart, so add DISPLAY_FLAGS_DE_HIGH flag in display timing flags. This aligns display_timing with panel_desc. Fixes: 8a07052440c2 ("drm/panel: simple: Add support for KOE TX26D202VM0BWA panel") Signed-off-by: Liu Ying <victor.liu@nxp.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20240624015612.341983-1-victor.liu@nxp.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240624015612.341983-1-victor.liu@nxp.com
2024-06-23Merge tag 'x86_urgent_for_v6.10_rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - An ARM-relevant fix to not free default RMIDs of a resource control group - A randconfig build fix for the VMware virtual GPU driver * tag 'x86_urgent_for_v6.10_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/resctrl: Don't try to free nonexistent RMIDs drm/vmwgfx: Fix missing HYPERVISOR_GUEST dependency
2024-06-21Merge tag 'drm-xe-fixes-2024-06-20' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - Fix for invalid register access Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZnPiE4ROqBowa1nS@fedora
2024-06-21Merge tag 'amd-drm-fixes-6.10-2024-06-19' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.10-2024-06-19: amdgpu: - Fix display idle optimization race - Fix GPUVM TLB flush locking scope - IPS fix - GFX 9.4.3 harvesting fix - Runtime pm fix for shared buffers - DCN 3.5.x fixes - USB4 fix - RISC-V clang fix - Silence UBSAN warnings - MES11 fix - PSP 14.0.x fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240619223233.3116457-1-alexander.deucher@amd.com
2024-06-20drm/xe/vf: Don't touch GuC irq registers if using memory irqsMichal Wajdeczko
On platforms where VFs are using memory based interrupts, we missed invalid access to no longer existing interrupt registers, as we keep them marked with XE_REG_OPTION_VF. To fix that just either setup memirq vectors in GuC or enable legacy interrupts. Fixes: aef4eb7c7dec ("drm/xe/vf: Setup memory based interrupts in GuC") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240617154736.685-1-michal.wajdeczko@intel.com (cherry picked from commit f0ccd2d805e55e12b430d5d6b9acd9f891af455e) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-06-19drm/amdgpu: init TA fw for psp v14Likun Gao
Add support to init TA firmware for psp v14. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19drm/amdgpu: cleanup MES11 command submissionChristian König
The approach of having a separate WB slot for each submission doesn't really work well and for example breaks GPU reset. Use a status query packet for the fence update instead since those should always succeed we can use the fence of the original packet to signal the state of the operation. While at it cleanup the coding style. Fixes: eef016ba8986 ("drm/amdgpu/mes11: Use a separate fence per transaction") Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19drm/amdgpu: fix UBSAN warning in kv_dpm.cAlex Deucher
Adds bounds check for sumo_vid_mapping_entry. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3392 Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-06-19drm/radeon: fix UBSAN warning in kv_dpm.cAlex Deucher
Adds bounds check for sumo_vid_mapping_entry. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-06-19drm/amd/display: Disable CONFIG_DRM_AMD_DC_FP for RISC-V with clangNathan Chancellor
Commit 77acc6b55ae4 ("riscv: add support for kernel-mode FPU") and commit a28e4b672f04 ("drm/amd/display: use ARCH_HAS_KERNEL_FPU_SUPPORT") enabled support for CONFIG_DRM_AMD_DC_FP with RISC-V. Unfortunately, this exposed -Wframe-larger-than warnings (which become fatal with CONFIG_WERROR=y) when building ARCH=riscv allmodconfig with clang: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:58:13: error: stack frame size (2448) exceeds limit (2048) in 'DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation' [-Werror,-Wframe-larger-than] 58 | static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation( | ^ 1 error generated. Many functions in this file use a large number of parameters, which must be passed on the stack at a certain pointer due to register exhaustion, which can cause high stack usage when inlining and issues with stack slot analysis get involved. While the compiler can and should do better (as GCC uses less than half the amount of stack space for the same function), it is not as simple as a fix as adjusting the functions not to take a large number of parameters. Unfortunately, modifying these files to avoid the problem is a difficult to justify approach because any revisions to the files in the kernel tree never make it back to the original source (so copies of the code for newer hardware revisions just reintroduce the issue) and the files are hard to read/modify due to being "gcc-parsable HW gospel, coming straight from HW engineers". Avoid building the problematic code for RISC-V by modifying the existing condition for arm64 that exists for the same reason. Factor out the logical not to make the condition a little more readable naturally. Fixes: a28e4b672f04 ("drm/amd/display: use ARCH_HAS_KERNEL_FPU_SUPPORT") Reported-by: Palmer Dabbelt <palmer@rivosinc.com> Closes: https://lore.kernel.org/20240530145741.7506-2-palmer@rivosinc.com/ Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19drm/amd/display: Attempt to avoid empty TUs when endpoint is DPIAMichael Strauss
[WHY] Empty SST TUs are illegal to transmit over a USB4 DP tunnel. Current policy is to configure stream encoder to pack 2 pixels per pclk even when ODM combine is not in use, allowing seamless dynamic ODM reconfiguration. However, in extreme edge cases where average pixel count per TU is less than 2, this can lead to unexpected empty TU generation during compliance testing. For example, VIC 1 with a 1xHBR3 link configuration will average 1.98 pix/TU. [HOW] Calculate average pixel count per TU, and block 2 pixels per clock if endpoint is a DPIA tunnel and pixel clock is low enough that we will never require 2:1 ODM combine. Cc: stable@vger.kernel.org # 6.6+ Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Michael Strauss <michael.strauss@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19drm/amd/display: change dram_clock_latency to 34us for dcn35Paul Hsieh
[Why & How] Current DRAM setting would cause underflow on customer platform. Modify dram_clock_change_latency_us from 11.72 to 34.0 us as per recommendation from HW team Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Signed-off-by: Paul Hsieh <paul.hsieh@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19drm/amd/display: Change dram_clock_latency to 34us for dcn351Daniel Miess
[Why] Intermittent underflow observed when using 4k144 display on dcn351 [How] Update dram_clock_change_latency_us from 11.72us to 34us Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Signed-off-by: Daniel Miess <daniel.miess@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-19drm/amdgpu: revert "take runtime pm reference when we attach a buffer" v2Christian König
This reverts commit b8c415e3bf98 ("drm/amdgpu: take runtime pm reference when we attach a buffer") and commit 425285d39afd ("drm/amdgpu: add amdgpu runpm usage trace for separate funcs"). Taking a runtime pm reference for DMA-buf is actually completely unnecessary and even dangerous. The problem is that calling pm_runtime_get_sync() from the DMA-buf callbacks is illegal because we have the reservation locked here which is also taken during resume. So this would deadlock. When the buffer is in GTT it is still accessible even when the GPU is powered down and when it is in VRAM the buffer gets migrated to GTT before powering down. The only use case which would make it mandatory to keep the runtime pm reference would be if we pin the buffer into VRAM, and that's not something we currently do. v2: improve the commit message Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> CC: stable@vger.kernel.org
2024-06-19drm/amdgpu: Indicate CU havest info to CPHarish Kasiviswanathan
To achieve full occupancy CP hardware needs to know if CUs in SE are symmetrically or asymmetrically harvested v2: Reset is_symmetric_cus for each loop Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>