summaryrefslogtreecommitdiff
path: root/drivers/accel/qaic/qaic_drv.c
AgeCommit message (Collapse)Author
2024-04-12accel/qaic: Add Sahara implementation for firmware loadingJeffrey Hugo
The AIC100 secondary bootloader uses the Sahara protocol for two purposes - loading the runtime firmware images from the host, and offloading crashdumps to the host. The crashdump functionality is only invoked when the AIC100 device encounters a crash and dumps are enabled. Also the collection of the dump is optional - the host can reject collecting the dump. The Sahara protocol contains many features and modes including firmware upload, crashdump download, and client commands. For simplicity, implement the parts of the protocol needed for loading firmware to the device. Fundamentally, the Sahara protocol is an embedded file transfer protocol. Both sides negotiate a connection through a simple exchange of hello messages. After handshaking through a hello message, the device either sends a message requesting images, or a message advertising the memory dump available for the host. For image transfer, the remote device issues a read data request that provides an image (by ID), an offset, and a length. The host has an internal mapping of image IDs to filenames. The host is expected to access the image and transfer the requested chunk to the device. The device can issue additional read requests, or signal that it has consumed enough data from this image with an end of image message. The host confirms the end of image, and the device can proceed with another image by starting over with the hello exchange again. Some images may be optional, and only provided as part of a provisioning flow. The host is not aware of this information, and thus should report an error to the device when an image is not available. The device will evaluate if the image is required or not, and take the appropriate action. Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240322034917.3522388-1-quic_jhugo@quicinc.com
2024-04-05accel/qaic: Add bootlog debugfsJeffrey Hugo
During the boot process of AIC100, the bootloaders (PBL and SBL) log messages to device RAM. During SBL, if the host opens the QAIC_LOGGING channel, SBL will offload the contents of the log buffer to the host, and stream any new messages that SBL logs. This log of the boot process can be very useful for an initial triage of any boot related issues. For example, if SBL rejects one of the runtime firmware images for a validation failure, SBL will log a reason why. Add the ability of the driver to open the logging channel, receive the messages, and store them. Also define a debugfs entry called "bootlog" by hooking into the DRM debugfs framework. When the bootlog debugfs entry is read, the current contents of the log that the host is caching is displayed to the user. The driver will retain the cache until it detects that the device has rebooted. At that point, the cache will be freed, and the driver will wait for a new log. With this scheme, the driver will only have a cache of the log from the current device boot. Note that if the driver initializes a device and it is already in the runtime state (QSM), no bootlog will be available through this mechanism because the driver and SBL have not communicated. Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240322175730.3855440-2-quic_jhugo@quicinc.com
2023-12-20accel/qaic: Order pci_remove() operations in reverse of probe()Jeffrey Hugo
In probe() we create the drm_device, and then register the MHI controller. In remove(), we should unregister the controller first, then remove the drm_device. Update the remove() operations to match. Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231208163457.1295993-8-quic_jhugo@quicinc.com
2023-12-20accel/qaic: Leverage DRM managed APIs to release resourcesPranjal Ramajor Asha Kanojiya
Offload the balancing of init and destroy calls to DRM managed APIs. mutex destroy for ->cntl_mutex is not called during device release and destroy workqueue is not called in error path of create_qdev(). So, use DRM managed APIs to manage the release of resources and avoid such problems. Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231208163457.1295993-7-quic_jhugo@quicinc.com
2023-12-01accel/qaic: Expand DRM device lifecycleCarl Vanderlip
Currently the QAIC DRM device registers itself when the MHI QAIC_CONTROL channel becomes available. This is when the device is able to process workloads. However, the DRM driver also provides the debugfs interface bootlog for the device. If the device fails to boot to the QSM (which brings up the MHI QAIC_CONTROL channel), the bootlog won't be available for debugging why it failed to boot. Change when the DRM device registers itself from when QAIC_CONTROL is available to when the card is first probed on the PCI bus. Additionally, make the DRM driver persist through reset/error cases so the driver doesn't have to be reloaded to access the card again. Send KOBJ_ONLINE/OFFLINE uevents so userspace can know when DRM device is ready to handle requests. Signed-off-by: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231117174337.20174-3-quic_jhugo@quicinc.com
2023-12-01accel/qaic: Increase number of in_reset statesCarl Vanderlip
'in_reset' holds the state of the device. As part of bringup, the device needs to be queried to check if it's in a valid state. Add a new state that indicates that the device is coming up, but not ready for users yet. Rename to 'dev_state' to better describe the variable. Signed-off-by: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231117174337.20174-2-quic_jhugo@quicinc.com
2023-10-27accel/qaic: Support MHI QAIC_TIMESYNC channelPranjal Ramajor Asha Kanojiya
Use QAIC_TIMESYNC MHI channel to send UTC time to device in SBL environment. Remove support for QAIC_TIMESYNC MHI channel in AMSS environment as it is not used in that environment. Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231016170114.5446-3-quic_jhugo@quicinc.com
2023-10-27accel/qaic: Add support for periodic timesyncAjit Pal Singh
Device and Host have a time synchronization mechanism that happens once during boot when device is in SBL mode. After that, in mission-mode there is no timesync. In an experiment after continuous operation, device time drifted w.r.t. host by approximately 3 seconds per day. This drift leads to mismatch in timestamp of device and Host logs. To correct this implement periodic timesync in driver. This timesync is carried out via QAIC_TIMESYNC_PERIODIC MHI channel. Signed-off-by: Ajit Pal Singh <quic_ajitpals@quicinc.com> Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231016170114.5446-2-quic_jhugo@quicinc.com
2023-10-27accel/qaic: Enable 1 MSI fallback modeCarl Vanderlip
Several virtualization use-cases either don't support 32 MultiMSIs (Xen/VMware) or have significant drawbacks to their use (KVM's vIOMMU, which is required to support 32 MSI, needs to allocate an alternate system memory space for each device using vIOMMU (e.g. 8GB VM mem and 2 cards => 8 + 2 * 8 = 24GB host memory required)). Support these cases by enabling a 1 MSI fallback mode. Whenever all 32 MSIs requested are not available, a second request for a single MSI is made. Its success is the initiator of single MSI mode. This mode causes all interrupts generated by the device to be directed to the 0th MSI (firmware >=v1.10 will do this as a response to the PCIe MSI capability configuration). Likewise, all interrupt handlers for the device are registered to the 0th MSI. Since the DBC interrupt handler checks if the DBC is in use or if there is any pending changes, the 'spurious' interrupts are disregarded. If there is work to be done, the standard threaded IRQ handler is dispatched. On every interrupt, the MHI handler wakes up its threaded interrupt handler, and attempts to wake any waiters for MHI state events. Performance is within +-0.6% for test cases that typify real world use. Larger differences ([-4,+132]%, avg +47%) exist for very simple tasks (e.g. addition) compiled for single NSPs. It is assumed that the small work and many interrupts typically cause contention (e.g. 16 NSPs vs 4 CPUs), as evidenced by the standard deviation between runs also decreasing (r=-0.48 between delta(Performace_test) and delta(StdDev_test/Avg_test)) Signed-off-by: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231016170036.5409-1-quic_jhugo@quicinc.com
2023-09-22accel/qaic: Add QAIC_DETACH_SLICE_BO IOCTLPranjal Ramajor Asha Kanojiya
Once a BO is attached with slicing configuration that BO can only be used for that particular setting. With this new feature user can detach slicing configuration off an already sliced BO and attach new slicing configuration using QAIC_ATTACH_SLICE_BO. This will support BO recycling. detach_slice_bo() detaches slicing configuration from a BO. This new helper function can also be used in release_dbc() as we are doing the exact same thing. Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> [jhugo: add documentation for new ioctl] Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230901172247.11410-8-quic_jhugo@quicinc.com
2023-09-15accel/qaic: Use devm_drm_dev_alloc() instead of drm_dev_alloc()Pranjal Ramajor Asha Kanojiya
Since drm_dev_alloc() is deprecated it is recommended to use devm_drm_dev_alloc() instead. Update the driver to start using devm_drm_dev_alloc(). Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230901161236.8371-1-quic_jhugo@quicinc.com
2023-09-15accel/qaic: Register for PCI driver at the beginning of module initPranjal Ramajor Asha Kanojiya
As qaic drivers base device is connected to host via PCI framework, it makes sense to register in PCI framework at the beginning of module init. Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230901161037.6124-1-quic_jhugo@quicinc.com
2023-06-26drm: Clear fd/handle callbacks in struct drm_driverThomas Zimmermann
Clear all assignments of struct drm_driver's fd/handle callbacks to drm_gem_prime_fd_to_handle() and drm_gem_prime_handle_to_fd(). These functions are called by default. Add a TODO item to convert vmwgfx to the defaults as well. v2: * remove TODO item (Zack) * also update amdgpu's amdgpu_partition_driver Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Simon Ser <contact@emersion.fr> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Jeffrey Hugo <quic_jhugo@quicinc.com> # qaic Link: https://patchwork.freedesktop.org/patch/msgid/20230620080252.16368-3-tzimmermann@suse.de
2023-06-09accel/qaic: Fix NULL pointer deref in qaic_destroy_drm_device()Jeffrey Hugo
If qaic_destroy_drm_device() is called before the device has fully initialized it will cause a NULL pointer dereference as the drm device has not yet been created. Fix this with a NULL check. Fixes: c501ca23a6a3 ("accel/qaic: Add uapi and core driver file") Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230602210440.8411-3-quic_jhugo@quicinc.com
2023-06-09accel/qaic: Free user handle on interrupted mutexCarl Vanderlip
After user handle is allocated, if mutex is interrupted, we do not free the user handle and return an error. Kref had been initialized, but not added to users list, so device teardown would also not call free_usr. Fixes: c501ca23a6a3 ("accel/qaic: Add uapi and core driver file") Signed-off-by: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230602210440.8411-2-quic_jhugo@quicinc.com
2023-05-16accel/qaic: silence some uninitialized variable warningsDan Carpenter
Smatch complains that these are not initialized if get_cntl_version() fails but we still print them in the debug message. Not the end of the world, but true enough. Let's just initialize them to a dummy value to make the checker happy. Fixes: c501ca23a6a3 ("accel/qaic: Add uapi and core driver file") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> [jhugo: Add fixes and reorder varable declarations for style] Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Link: https://patchwork.freedesktop.org/patch/msgid/d11ee378-7b06-4b5e-b56f-d66174be1ab3@kili.mountain
2023-04-13Revert "accel/qaic: Add mhi_qaic_cntl"Jeffrey Hugo
This reverts commit 566fc96198b4bb07ca6806386956669881225271. This exposes a userspace API that is still under debate. Revert the change before the uAPI gets exposed to avoid making a mistake. QAIC is otherwise still functional. Suggested-by: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1681307864-3782-1-git-send-email-quic_jhugo@quicinc.com
2023-04-06accel/qaic: Add uapi and core driver fileJeffrey Hugo
Add the QAIC driver uapi file and core driver file that binds to the PCIe device. The core driver file also creates the accel device and manages all the interconnections between the different parts of the driver. The driver can be built as a module. If so, it will be called "qaic.ko". Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Acked-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1679932497-30277-3-git-send-email-quic_jhugo@quicinc.com