Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the large set of driver core changes for 6.4-rc1.
Once again, a busy development cycle, with lots of changes happening
in the driver core in the quest to be able to move "struct bus" and
"struct class" into read-only memory, a task now complete with these
changes.
This will make the future rust interactions with the driver core more
"provably correct" as well as providing more obvious lifetime rules
for all busses and classes in the kernel.
The changes required for this did touch many individual classes and
busses as many callbacks were changed to take const * parameters
instead. All of these changes have been submitted to the various
subsystem maintainers, giving them plenty of time to review, and most
of them actually did so.
Other than those changes, included in here are a small set of other
things:
- kobject logging improvements
- cacheinfo improvements and updates
- obligatory fw_devlink updates and fixes
- documentation updates
- device property cleanups and const * changes
- firwmare loader dependency fixes.
All of these have been in linux-next for a while with no reported
problems"
* tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (120 commits)
device property: make device_property functions take const device *
driver core: update comments in device_rename()
driver core: Don't require dynamic_debug for initcall_debug probe timing
firmware_loader: rework crypto dependencies
firmware_loader: Strip off \n from customized path
zram: fix up permission for the hot_add sysfs file
cacheinfo: Add use_arch[|_cache]_info field/function
arch_topology: Remove early cacheinfo error message if -ENOENT
cacheinfo: Check cache properties are present in DT
cacheinfo: Check sib_leaf in cache_leaves_are_shared()
cacheinfo: Allow early level detection when DT/ACPI info is missing/broken
cacheinfo: Add arm64 early level initializer implementation
cacheinfo: Add arch specific early level initializer
tty: make tty_class a static const structure
driver core: class: remove struct class_interface * from callbacks
driver core: class: mark the struct class in struct class_interface constant
driver core: class: make class_register() take a const *
driver core: class: mark class_release() as taking a const *
driver core: remove incorrect comment for device_create*
MIPS: vpe-cmp: remove module owner pointer from struct class usage.
...
|
|
Rather than duplicating most of the code for constructing the initial
TX_DATA and subsequent TX_DATA_CONT packets, roll them into a single
loop.
Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Reviewed-by: Chris Lew <quic_clew@quicinc.com>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230418163018.785524-3-quic_bjorande@quicinc.com
|
|
As support for splitting transmission over several messages using
TX_DATA_CONT was introduced it does not immediately return the return
value of qcom_glink_tx().
The result is that in the intentless case (i.e. intent == NULL), the
code will continue to send all additional chunks. This is wasteful, and
it's possible that the send operation could incorrectly indicate
success, if the last chunk fits in the TX fifo.
Fix the condition.
Fixes: 8956927faed3 ("rpmsg: glink: Add TX_DATA_CONT command while sending")
Reviewed-by: Chris Lew <quic_clew@quicinc.com>
Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230418163018.785524-2-quic_bjorande@quicinc.com
|
|
In some implementations of the remote firmware, an intent request
acknowledgment is sent when it's determined if the intent allocation
will be fulfilled, but then the intent is queued after the
acknowledgment.
The result is that upon receiving a granted allocation request, the
search for the newly allocated intent will fail and an additional
request will be made. This will at best waste memory, but if the second
request is rejected the transaction will be incorrectly rejected.
Take the incoming intent into account in the wait to mitigate this
problem.
The above scenario can still happen, in the case that, on that same
channel, an unrelated intent is delivered prior to the request
acknowledgment and a separate process enters the send path and picks up
the intent. Given that there's no relationship between the
acknowledgment and the delivered (or to be delivered intent), there
doesn't seem to be a solution to this problem.
Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Reviewed-by: Chris Lew <quic_clew@quicinc.com>
[bjorn: Fixed spelling issues pointed out by checkpatch in commit message]
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230327144617.3134175-3-quic_bjorande@quicinc.com
|
|
Transition the intent request acknowledgement to use a wait queue so
that it's possible, in the next commit, to extend the wait to also wait
for an incoming intent.
Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Reviewed-by: Chris Lew <quic_clew@quicinc.com>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230327144617.3134175-2-quic_bjorande@quicinc.com
|
|
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is (mostly) ignored
and this typically results in resource leaks. To improve here there is a
quest to make the remove callback return void. In the first step of this
quest all drivers are converted to .remove_new() which already returns
void.
qcom_smd_remove() always returned zero, though that isn't completely
trivial to see. So explain that in a comment and convert to
.remove_new().
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230321154039.355098-4-u.kleine-koenig@pengutronix.de
|
|
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is (mostly) ignored
and this typically results in resource leaks. To improve here there is a
quest to make the remove callback return void. In the first step of this
quest all drivers are converted to .remove_new() which already returns
void.
Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230321154039.355098-3-u.kleine-koenig@pengutronix.de
|
|
This function returned zero unconditionally. Convert it to return no
value instead. This makes it more obvious what happens in the callers.
One caller is converted to return zero explicitly. The only other caller
(smd_subdev_stop() in drivers/remoteproc/qcom_common.c) already ignored
the return value before.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230321154039.355098-2-u.kleine-koenig@pengutronix.de
|
|
The module pointer in class_create() never actually did anything, and it
shouldn't have been requred to be set as a parameter even if it did
something. So just remove it and fix up all callers of the function in
the kernel tree at the same time.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20230313181843.1207845-4-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux
Pull rpmsg updates from Bjorn Andersson:
- rpmsg ctrl and char driver locking is ensure ordering in cases where
the communication link is being torn down in parallel with calls to
open(2) or poll(2)
- The glink driver is refactored, to move rpm/smem-specifics out of the
common logic and better suite further improvements, such as
transports without a mailbox controller. The handling of remoteproc
shutdown is improved, to fail clients immediately instead of having
them to wait for timeouts. A driver_override memory leak is corrected
and a few spelling improvements are introduced
- glink_ssr is transitioned off strlcpy() and "gpr" is added as a valid
child node of the glink-edge DT binding
* tag 'rpmsg-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
rpmsg: glink: Release driver_override
rpmsg: glink: Avoid infinite loop on intent for missing channel
rpmsg: glink: Fix GLINK command prefix
rpmsg: glink: Fix spelling of peek
rpmsg: glink: Cancel pending intent requests at removal
rpmsg: glink: Fail qcom_glink_tx() once remove has been initiated
rpmsg: glink: Move irq and mbox handling to transports
rpmsg: glink: rpm: Wrap driver context
rpmsg: glink: smem: Wrap driver context
rpmsg: glink: Extract tx kick operation
rpmsg: glink: Include types in qcom_glink_native.h
rpmsg: ctrl: Add lock to rpmsg_ctrldev_remove
rpmsg: char: Add lock to avoid race when rpmsg device is released
rpmsg: move from strlcpy with unused retval to strscpy
dt-bindings: remoteproc: qcom,glink-edge: add GPR node
|
|
Upon termination of the rpmsg_device, driver_override needs to be freed
to avoid leaking the potentially assigned string.
Fixes: 42cd402b8fd4 ("rpmsg: Fix kfree() of static memory on setting driver_override")
Fixes: 39e47767ec9b ("rpmsg: Add driver_override device attribute for rpmsg_device")
Reviewed-by: Chris Lew <quic_clew@quicinc.com>
Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230109223931.1706429-1-quic_bjorande@quicinc.com
|
|
In the event that an intent advertisement arrives on an unknown channel
the fifo is not advanced, resulting in the same message being handled
over and over.
Fixes: dacbb35e930f ("rpmsg: glink: Receive and store the remote intent buffers")
Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Reviewed-by: Chris Lew <quic_clew@quicinc.com>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230214234231.2069751-1-quic_bjorande@quicinc.com
|
|
The upstream GLINK driver was first introduced to communicate with the
RPM on MSM8996, presumably as an artifact from that era the command
defines was prefixed RPM_CMD, while they actually are GLINK_CMDs.
Let's rename these, to keep things tidy. No functional change.
Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Reviewed-by: Chris Lew <quic_clew@quicinc.com>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230214225933.2025595-1-quic_bjorande@quicinc.com
|
|
The code is peeking into the buffers, not peaking. Fix this throughout
the glink drivers.
Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Reviewed-by: Chris Lew <quic_clew@quicinc.com>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230214224746.1996130-1-quic_bjorande@quicinc.com
|
|
During removal of the glink edge interrupts are disabled and no more
incoming messages are being serviced. In addition to the remote endpoint
being defunct that means that any outstanding requests for intents will
not be serviced, and qcom_glink_request_intent() will blindly wait for
up to 10 seconds.
Mark the intent request as not granted and complete the intent request
completion to fail the waiting client immediately.
Once the current intent request is failed, any potential clients waiting
for the intent request mutex will not enter the same wait, as the
qcom_glink_tx() call will fail fast.
Reviewed-by: Chris Lew <quic_clew@quicinc.com>
Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230213155215.1237059-7-quic_bjorande@quicinc.com
|
|
Upon removing the glink edge, communication is at best one-way. This
means that the very common scenario of glink requesting intents will not
be possible to serve.
Typically a successful transmission results in the client waiting for a
response, with some timeout and a mechanism for aborting that timeout.
Because of this, once the glink edge is defunct once removal is
commenced it's better to fail transmissions fast.
Reviewed-by: Chris Lew <quic_clew@quicinc.com>
Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230213155215.1237059-6-quic_bjorande@quicinc.com
|
|
Not all GLINK transports uses an interrupt and a mailbox instance. The
interrupt for RPM needs to be IRQF_NOSUSPEND, while it seems reasonable
for the SMEM interrupt to use irq_set_wake. The glink struct device is
constructed in the SMEM and RPM drivers but torn down in the core
driver.
Move the interrupt and kick handling into the SMEM and RPM driver, to
improve this and facilitate further improvements.
Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Reviewed-by: Chris Lew <quic_clew@quicinc.com>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230213155215.1237059-5-quic_bjorande@quicinc.com
|
|
As with the SMEM driver update, wrap the RPM context in a struct to
facilitate the upcoming changes of moving IRQ and mailbox registration
to the driver.
Reviewed-by: Chris Lew <quic_clew@quicinc.com>
Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230213155215.1237059-4-quic_bjorande@quicinc.com
|
|
The Glink SMEM driver allocates a struct device and hangs two
devres-allocated pipe objects thereon. To facilitate the move of
interrupt and mailbox handling to the driver, introduce a wrapper object
capturing the device, glink reference and remote processor id.
The type of the remoteproc reference is updated, as these are
specifically targeting the SMEM implementation.
Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Reviewed-by: Chris Lew <quic_clew@quicinc.com>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230213155215.1237059-3-quic_bjorande@quicinc.com
|
|
Refactor out the tx kick operations to its own function, in preparation
for pushing the details to the individual transports.
Reviewed-by: Chris Lew <quic_clew@quicinc.com>
Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230213155215.1237059-2-quic_bjorande@quicinc.com
|
|
The uevent() callback in struct bus_type should not be modifying the
device that is passed into it, so mark it as a const * and propagate the
function signature changes out into all relevant subsystems that use
this callback.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20230111113018.459199-16-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Ensure that the used data types are available in qcom_glink_native.h, to
silence LSP warnings.
Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Reviewed-by: Chris Lew <quic_clew@quicinc.com>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230109223745.1706152-1-quic_bjorande@quicinc.com
|
|
Call to rpmsg_ctrldev_ioctl() and rpmsg_ctrldev_remove() must be synchronized.
In present code rpmsg_ctrldev_remove() is not protected with lock, therefore
new char device creation can succeed through rpmsg_ctrldev_ioctl() call. At the
same time call to rpmsg_ctrldev_remove() function for ctrl device removal will
free associated rpdev device. As char device creation already succeeded, user
space is free to issue open() call which maps to rpmsg_create_ept() in kernel.
rpmsg_create_ept() function tries to reference rpdev which has already been
freed through rpmsg_ctrldev_remove(). Issue is predominantly seen in aggressive
reboot tests where rpmsg_ctrldev_ioctl() and rpmsg_ctrldev_remove() can race with
each other.
Adding lock in rpmsg_ctrldev_remove() avoids any new char device creation
through rpmsg_ctrldev_ioctl() while remove call is already in progress.
Signed-off-by: Deepak Kumar Singh <quic_deesin@quicinc.com>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/1663584840-15762-3-git-send-email-quic_deesin@quicinc.com
|
|
When remote host goes down glink char device channel is freed and
associated rpdev is destroyed through rpmsg_chrdev_eptdev_destroy(),
At the same time user space apps can still try to open/poll rpmsg
char device which will result in calling rpmsg_create_ept()/rpmsg_poll().
These functions will try to reference rpdev which has already been freed
through rpmsg_chrdev_eptdev_destroy().
File operation functions and device removal function must be protected
with lock. This patch adds existing ept lock in remove function as well.
Signed-off-by: Deepak Kumar Singh <quic_deesin@quicinc.com>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/1663584840-15762-2-git-send-email-quic_deesin@quicinc.com
|
|
Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.
Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20220818210100.7277-1-wsa+renesas@sang-engineering.com
|
|
The rpmsg_dev_remove() in rpmsg_core is the place for releasing
this default endpoint.
So need to avoid destroying the default endpoint in
rpmsg_chrdev_eptdev_destroy(), this should be the same as
rpmsg_eptdev_release(). Otherwise there will be double destroy
issue that ept->refcount report warning:
refcount_t: underflow; use-after-free.
Call trace:
refcount_warn_saturate+0xf8/0x150
virtio_rpmsg_destroy_ept+0xd4/0xec
rpmsg_dev_remove+0x60/0x70
The issue can be reproduced by stopping remoteproc before
closing the /dev/rpmsgX.
Fixes: bea9b79c2d10 ("rpmsg: char: Add possibility to use default endpoint of the rpmsg device")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Reviewed-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com>
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1663725523-6514-1-git-send-email-shengjiu.wang@nxp.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
|
|
Return the value rpmsg_chrdev_eptdev_add() directly instead of storing it
in another redundant variable.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
Link: https://lore.kernel.org/r/20220826071954.252485-1-ye.xingchen@zte.com.cn
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
|
|
Fix the following coccicheck warning:
drivers/rpmsg/qcom_glink_native.c:1677:8-16:
WARNING: use scnprintf or sprintf
Signed-off-by: Xuezhi Zhang <zhangxuezhi1@coolpad.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220607120649.78436-1-zhangxuezhi1@coolpad.com
|
|
of_parse_phandle() returns a node pointer with refcount
incremented, we should use of_node_put() on it when done.
Fixes: 53e2822e56c7 ("rpmsg: Introduce Qualcomm SMD backend")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220511120737.57374-1-linmq006@gmail.com
|
|
Correct kerneldoc warnings like:
drivers/rpmsg/qcom_glink_ssr.c:45:
warning: expecting prototype for G(). Prototype was for GLINK_SSR_DO_CLEANUP() instead
Also fix meaning of 'flag' argument.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220519073330.7187-3-krzysztof.kozlowski@linaro.org
|
|
The qcom_glink.name is read from DTS but never used further, never
referenced, so drop it. This also fixes kerneldoc warning:
drivers/rpmsg/qcom_glink_native.c:125:
warning: Function parameter or member 'name' not described in 'qcom_glink'
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220519073330.7187-2-krzysztof.kozlowski@linaro.org
|
|
The use of strncpy() is considered deprecated for NUL-terminated
strings[1]. Replace strncpy() with strscpy_pad(), to keep existing
pad-behavior of strncpy, similarly to commit 08de420a8014 ("rpmsg:
glink: Replace strncpy() with strscpy_pad()"). This fixes W=1 warning:
In function ‘qcom_glink_rx_close’,
inlined from ‘qcom_glink_work’ at ../drivers/rpmsg/qcom_glink_native.c:1638:4:
drivers/rpmsg/qcom_glink_native.c:1549:17: warning: ‘strncpy’ specified bound 32 equals destination size [-Wstringop-truncation]
1549 | strncpy(chinfo.name, channel->name, sizeof(chinfo.name));
[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220519073330.7187-1-krzysztof.kozlowski@linaro.org
|
|
Replace strcpy() with strscpy_pad() for copying the rpmsg
device name in rpmsg_register_device_override().
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Saud Farooqui <farooqui_saud@hotmail.com>
Link: https://lore.kernel.org/r/PA4P189MB14210AA95DCA3715AFA7F4A68BB59@PA4P189MB1421.EURP189.PROD.OUTLOOK.COM
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
|
|
rpmsg_register_device_override need to call put_device to free vch when
driver_set_override fails.
Fix this by adding a put_device() to the error path.
Fixes: bb17d110cbf2 ("rpmsg: Fix calling device_lock() on non-initialized device")
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Link: https://lore.kernel.org/r/20220624024120.11576-1-hbh25y@gmail.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
|
|
The parameter associated to the announce_create and
announce_destroy ops functions is not an endpoint but a rpmsg device.
Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com>
Link: https://lore.kernel.org/r/20220425071723.774050-1-arnaud.pouliquen@foss.st.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
|
|
During execution of the worker that's used to register rpmsg devices
we are safely locking the channels mutex but, when creating a new
endpoint for such devices, we are registering a IPI on the SCP, which
then makes the SCP to trigger an interrupt, lock its own mutex and in
turn register more subdevices.
This creates a circular locking dependency situation, as the mtk_rpmsg
channels_lock will then depend on the SCP IPI lock.
[ 15.447736] ======================================================
[ 15.460158] WARNING: possible circular locking dependency detected
[ 15.460161] 5.17.0-next-20220324+ #399 Not tainted
[ 15.460165] ------------------------------------------------------
[ 15.460166] kworker/0:3/155 is trying to acquire lock:
[ 15.460170] ffff5b4d0eaf1308 (&scp->ipi_desc[i].lock){+.+.}-{4:4}, at: scp_ipi_lock+0x34/0x50 [mtk_scp_ipi]
[ 15.504958]
[] but task is already holding lock:
[ 15.504960] ffff5b4d0e8f1918 (&mtk_subdev->channels_lock){+.+.}-{4:4}, at: mtk_register_device_work_function+0x50/0x1cc [mtk_rpmsg]
[ 15.504978]
[] which lock already depends on the new lock.
[ 15.504980]
[] the existing dependency chain (in reverse order) is:
[ 15.504982]
[] -> #1 (&mtk_subdev->channels_lock){+.+.}-{4:4}:
[ 15.504990] lock_acquire+0x68/0x84
[ 15.504999] __mutex_lock+0xa4/0x3e0
[ 15.505007] mutex_lock_nested+0x40/0x70
[ 15.505012] mtk_rpmsg_ns_cb+0xe4/0x134 [mtk_rpmsg]
[ 15.641684] mtk_rpmsg_ipi_handler+0x38/0x64 [mtk_rpmsg]
[ 15.641693] scp_ipi_handler+0xbc/0x180 [mtk_scp]
[ 15.663905] mt8192_scp_irq_handler+0x44/0xa4 [mtk_scp]
[ 15.663915] scp_irq_handler+0x6c/0xa0 [mtk_scp]
[ 15.685779] irq_thread_fn+0x34/0xa0
[ 15.685785] irq_thread+0x18c/0x240
[ 15.685789] kthread+0x104/0x110
[ 15.709579] ret_from_fork+0x10/0x20
[ 15.709586]
[] -> #0 (&scp->ipi_desc[i].lock){+.+.}-{4:4}:
[ 15.731271] __lock_acquire+0x11e4/0x1910
[ 15.740367] lock_acquire.part.0+0xd8/0x220
[ 15.749813] lock_acquire+0x68/0x84
[ 15.757861] __mutex_lock+0xa4/0x3e0
[ 15.766084] mutex_lock_nested+0x40/0x70
[ 15.775006] scp_ipi_lock+0x34/0x50 [mtk_scp_ipi]
[ 15.785503] scp_ipi_register+0x40/0xa4 [mtk_scp_ipi]
[ 15.796697] scp_register_ipi+0x1c/0x30 [mtk_scp]
[ 15.807194] mtk_rpmsg_create_ept+0xa0/0x108 [mtk_rpmsg]
[ 15.818912] rpmsg_create_ept+0x44/0x60
[ 15.827660] cros_ec_rpmsg_probe+0x15c/0x1f0
[ 15.837282] rpmsg_dev_probe+0x128/0x1d0
[ 15.846203] really_probe.part.0+0xa4/0x2a0
[ 15.855649] __driver_probe_device+0xa0/0x150
[ 15.865443] driver_probe_device+0x48/0x150
[ 15.877157] __device_attach_driver+0xc0/0x12c
[ 15.889359] bus_for_each_drv+0x80/0xe0
[ 15.900330] __device_attach+0xe4/0x190
[ 15.911303] device_initial_probe+0x1c/0x2c
[ 15.922969] bus_probe_device+0xa8/0xb0
[ 15.933927] device_add+0x3a8/0x8a0
[ 15.944193] device_register+0x28/0x40
[ 15.954970] rpmsg_register_device+0x5c/0xa0
[ 15.966782] mtk_register_device_work_function+0x148/0x1cc [mtk_rpmsg]
[ 15.983146] process_one_work+0x294/0x664
[ 15.994458] worker_thread+0x7c/0x45c
[ 16.005069] kthread+0x104/0x110
[ 16.014789] ret_from_fork+0x10/0x20
[ 16.025201]
[] other info that might help us debug this:
[ 16.047769] Possible unsafe locking scenario:
[ 16.063942] CPU0 CPU1
[ 16.075166] ---- ----
[ 16.086376] lock(&mtk_subdev->channels_lock);
[ 16.097592] lock(&scp->ipi_desc[i].lock);
[ 16.113188] lock(&mtk_subdev->channels_lock);
[ 16.129482] lock(&scp->ipi_desc[i].lock);
[ 16.140020]
[] *** DEADLOCK ***
[ 16.158282] 4 locks held by kworker/0:3/155:
[ 16.168978] #0: ffff5b4d00008748 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1fc/0x664
[ 16.190017] #1: ffff80000953bdc8 ((work_completion)(&mtk_subdev->register_work)){+.+.}-{0:0}, at: process_one_work+0x1fc/0x664
[ 16.215269] #2: ffff5b4d0e8f1918 (&mtk_subdev->channels_lock){+.+.}-{4:4}, at: mtk_register_device_work_function+0x50/0x1cc [mtk_rpmsg]
[ 16.242131] #3: ffff5b4d05964190 (&dev->mutex){....}-{4:4}, at: __device_attach+0x44/0x190
To solve this, simply unlock the channels_lock mutex before calling
mtk_rpmsg_register_device() and relock it right after, as safety is
still ensured by the locking mechanism that happens right after
through SCP.
Fixes: 7017996951fd ("rpmsg: add rpmsg support for mt8183 SCP.")
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20220525091201.14210-1-angelogioacchino.delregno@collabora.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
|
|
There is no mutex protection for rpmsg_eptdev_open(),
especially for eptdev->ept read and write operation.
It may cause issues when multiple instances call
rpmsg_eptdev_open() in parallel,the return state
may be success or EBUSY.
Fixes: 964e8bedd5a1 ("rpmsg: char: Return an error if device already open")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://lore.kernel.org/r/1653104105-16779-1-git-send-email-shengjiu.wang@nxp.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the set of driver core changes for 5.19-rc1.
Lots of tiny driver core changes and cleanups happened this cycle, but
the two major things are:
- firmware_loader reorganization and additions including the ability
to have XZ compressed firmware images and the ability for userspace
to initiate the firmware load when it needs to, instead of being
always initiated by the kernel. FPGA devices specifically want this
ability to have their firmware changed over the lifetime of the
system boot, and this allows them to work without having to come up
with yet-another-custom-uapi interface for loading firmware for
them.
- physical location support added to sysfs so that devices that know
this information, can tell userspace where they are located in a
common way. Some ACPI devices already support this today, and more
bus types should support this in the future.
Smaller changes include:
- driver_override api cleanups and fixes
- error path cleanups and fixes
- get_abi script fixes
- deferred probe timeout changes.
It's that last change that I'm the most worried about. It has been
reported to cause boot problems for a number of systems, and I have a
tested patch series that resolves this issue. But I didn't get it
merged into my tree before 5.18-final came out, so it has not gotten
any linux-next testing.
I'll send the fixup patches (there are 2) as a follow-on series to this
pull request.
All have been tested in linux-next for weeks, with no reported issues
other than the above-mentioned boot time-outs"
* tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits)
driver core: fix deadlock in __device_attach
kernfs: Separate kernfs_pr_cont_buf and rename_lock.
topology: Remove unused cpu_cluster_mask()
driver core: Extend deferred probe timeout on driver registration
MAINTAINERS: add Russ Weight as a firmware loader maintainer
driver: base: fix UAF when driver_attach failed
test_firmware: fix end of loop test in upload_read_show()
driver core: location: Add "back" as a possible output for panel
driver core: location: Free struct acpi_pld_info *pld
driver core: Add "*" wildcard support to driver_async_probe cmdline param
driver core: location: Check for allocations failure
arch_topology: Trace the update thermal pressure
kernfs: Rename kernfs_put_open_node to kernfs_unlink_open_file.
export: fix string handling of namespace in EXPORT_SYMBOL_NS
rpmsg: use local 'dev' variable
rpmsg: Fix calling device_lock() on non-initialized device
firmware_loader: describe 'module' parameter of firmware_upload_register()
firmware_loader: Move definitions from sysfs_upload.h to sysfs.h
firmware_loader: Fix configs for sysfs split
selftests: firmware: Add firmware upload selftests
...
|
|
'&rpdev->dev' is already cached as local variable, so use it to simplify
the code.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220429195946.1061725-3-krzysztof.kozlowski@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
driver_set_override() helper uses device_lock() so it should not be
called before rpmsg_register_device() (which calls device_register()).
Effect can be seen with CONFIG_DEBUG_MUTEXES:
DEBUG_LOCKS_WARN_ON(lock->magic != lock)
WARNING: CPU: 3 PID: 57 at kernel/locking/mutex.c:582 __mutex_lock+0x1ec/0x430
...
Call trace:
__mutex_lock+0x1ec/0x430
mutex_lock_nested+0x44/0x50
driver_set_override+0x124/0x150
qcom_glink_native_probe+0x30c/0x3b0
glink_rpm_probe+0x274/0x350
platform_probe+0x6c/0xe0
really_probe+0x17c/0x3d0
__driver_probe_device+0x114/0x190
driver_probe_device+0x3c/0xf0
...
Refactor the rpmsg_register_device() function to use two-step device
registering (initialization + add) and call driver_set_override() in
proper moment.
This moves the code around, so while at it also NULL-ify the
rpdev->driver_override in error path to be sure it won't be kfree()
second time.
Fixes: 42cd402b8fd4 ("rpmsg: Fix kfree() of static memory on setting driver_override")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20220429195946.1061725-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
irq_of_parse_and_map() returns 0 on failure, so this should not be
passed further as error return code.
Fixes: 1a358d350664 ("rpmsg: qcom_smd: Fix irq_of_parse_and_map() return value")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220423093932.32136-1-krzysztof.kozlowski@linaro.org
|
|
Unregister the rpmsg_ctrl device instead of just freeing the
the virtio_rpmsg_channel structure.
This will properly unregister the device and call
virtio_rpmsg_release_device() that frees the structure.
Fixes: c486682ae1e2 ("rpmsg: virtio: Register the rpmsg_char device")
Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com>
Reviewed-by: Hangyu Hua <hbh25y@gmail.com>
Link: https://lore.kernel.org/r/20220426060536.15594-4-hbh25y@gmail.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
|
|
vch will be free in virtio_rpmsg_release_device() when
rpmsg_ctrldev_register_device() fails. There is no need to call
kfree() again.
Fixes: c486682ae1e2 ("rpmsg: virtio: Register the rpmsg_char device")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Tested-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com>
Link: https://lore.kernel.org/r/20220426060536.15594-3-hbh25y@gmail.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
|
|
vch will be free in virtio_rpmsg_release_device() when
rpmsg_ns_register_device() fails. There is no need to call kfree() again.
Fix this by changing error path from free_vch to free_ctrldev.
Fixes: c486682ae1e2 ("rpmsg: virtio: Register the rpmsg_char device")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Tested-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com>
Link: https://lore.kernel.org/r/20220426060536.15594-2-hbh25y@gmail.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
|
|
The irq_of_parse_and_map() returns 0 on failure, not a negative ERRNO.
Fixes: 53e2822e56c7 ("rpmsg: Introduce Qualcomm SMD backend")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220422105326.78713-1-krzysztof.kozlowski@linaro.org
|
|
The driver_override field from platform driver should not be initialized
from static memory (string literal) because the core later kfree() it,
for example when driver_override is set via sysfs.
Use dedicated helper to set driver_override properly.
Fixes: 950a7388f02b ("rpmsg: Turn name service into a stand alone driver")
Fixes: c0cdc19f84a4 ("rpmsg: Driver for user space endpoint interface")
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220419113435.246203-13-krzysztof.kozlowski@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Memory pointed by variable 'old' in field store macro is not modified,
so it can be made a pointer to const.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220419113435.246203-12-krzysztof.kozlowski@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Allow the user space application to create and release an rpmsg device
by adding RPMSG_CREATE_DEV_IOCTL and RPMSG_RELEASE_DEV_IOCTL ioctrls to
the /dev/rpmsg_ctrl interface
The RPMSG_CREATE_DEV_IOCTL Ioctl can be used to instantiate a local rpmsg
device.
Depending on the back-end implementation, the associated rpmsg driver is
probed and a NS announcement can be sent to the remote processor.
The RPMSG_RELEASE_DEV_IOCTL allows the user application to release a
rpmsg device created either by the remote processor or with the
RPMSG_CREATE_DEV_IOCTL call.
Depending on the back-end implementation, the associated rpmsg driver is
removed and a NS destroy rpmsg can be sent to the remote processor.
Suggested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220124102524.295783-12-arnaud.pouliquen@foss.st.com
|
|
For the rpmsg virtio backend, the current implementation of the rpmsg char
only allows to instantiate static(i.e. prefixed source and destination
addresses) end points, and only on the Linux user space initiative.
This patch defines the "rpmsg-raw" channel and registers it to the rpmsg bus.
This registration allows:
- To create the channel at the initiative of the remote processor
relying on the name service announcement mechanism. In other words the
/dev/rpmsgX interface is instantiate by the remote processor.
- To use the channel object instead of the endpoint, thus preventing the
user space from having the knowledge of the remote processor's
endpoint addresses.
- To rely on udev to be inform when a /dev/rpmsgX is created on remote
processor request, indicating that the remote processor is ready to
communicate.
Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220124102524.295783-11-arnaud.pouliquen@foss.st.com
|
|
Current implementation create/destroy a new endpoint on each
rpmsg_eptdev_open/rpmsg_eptdev_release calls.
For a rpmsg device created by the NS announcement a default endpoint is created.
In this case we have to reuse the default rpmsg device endpoint associated to
the channel instead of creating a new one.
This patch prepares the introduction of a rpmsg channel device for the
char device. The rpmsg channel device will require a default endpoint to
communicate to the remote processor.
Add the default_ept field in rpmsg_eptdev structure.This pointer
determines the behavior on rpmsg_eptdev_open and rpmsg_eptdev_release call.
- If default_ept == NULL:
Use the legacy behavior by creating a new endpoint each time
rpmsg_eptdev_open is called and release it when rpmsg_eptdev_release
is called on /dev/rpmsgX device open/close.
- If default_ept is set:
use the rpmsg device default endpoint for the communication.
Add protection in rpmsg_eptdev_ioctl to prevent to destroy a default endpoint.
Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220124102524.295783-10-arnaud.pouliquen@foss.st.com
|