summaryrefslogtreecommitdiff
path: root/include
diff options
context:
space:
mode:
authorMike Christie <michael.christie@oracle.com>2021-05-25 13:18:00 -0500
committerMartin K. Petersen <martin.petersen@oracle.com>2021-06-02 01:28:20 -0400
commit23d6fefbb3f6b1cc29794427588b470ed06ff64e (patch)
treec76fc23db6f4338819ec9e4c96d34b426c2fc627 /include
parent9e5fe1700896c85040943fdc0d3fee0dd3e0d36f (diff)
scsi: iscsi: Fix in-kernel conn failure handling
Commit 0ab710458da1 ("scsi: iscsi: Perform connection failure entirely in kernel space") has the following regressions/bugs that this patch fixes: 1. It can return cmds to upper layers like dm-multipath where that can retry them. After they are successful the fs/app can send new I/O to the same sectors, but we've left the cmds running in FW or in the net layer. We need to be calling ep_disconnect if userspace is not up. This patch only fixes the issue for offload drivers. iscsi_tcp will be fixed in separate commit because it doesn't have a ep_disconnect call. 2. The drivers that implement ep_disconnect expect that it's called before conn_stop. Besides crashes, if the cleanup_task callout is called before ep_disconnect it might free up driver/card resources for session1 then they could be allocated for session2. But because the driver's ep_disconnect is not called it has not cleaned up the firmware so the card is still using the resources for the original cmd. 3. The stop_conn_work_fn can run after userspace has done its recovery and we are happily using the session. We will then end up with various bugs depending on what is going on at the time. We may also run stop_conn_work_fn late after userspace has called stop_conn and ep_disconnect and is now going to call start/bind conn. If stop_conn_work_fn runs after bind but before start, we would leave the conn in a unbound but sort of started state where IO might be allowed even though the drivers have been set in a state where they no longer expect I/O. 4. Returning -EAGAIN in iscsi_if_destroy_conn if we haven't yet run the in kernel stop_conn function is breaking userspace. We should have been doing this for the caller. Link: https://lore.kernel.org/r/20210525181821.7617-8-michael.christie@oracle.com Fixes: 0ab710458da1 ("scsi: iscsi: Perform connection failure entirely in kernel space") Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Diffstat (limited to 'include')
-rw-r--r--include/scsi/scsi_transport_iscsi.h10
1 files changed, 9 insertions, 1 deletions
diff --git a/include/scsi/scsi_transport_iscsi.h b/include/scsi/scsi_transport_iscsi.h
index d36a72cf049f..3974329d4d02 100644
--- a/include/scsi/scsi_transport_iscsi.h
+++ b/include/scsi/scsi_transport_iscsi.h
@@ -197,15 +197,23 @@ enum iscsi_connection_state {
ISCSI_CONN_BOUND,
};
+#define ISCSI_CLS_CONN_BIT_CLEANUP 1
+
struct iscsi_cls_conn {
struct list_head conn_list; /* item in connlist */
- struct list_head conn_list_err; /* item in connlist_err */
void *dd_data; /* LLD private data */
struct iscsi_transport *transport;
uint32_t cid; /* connection id */
+ /*
+ * This protects the conn startup and binding/unbinding of the ep to
+ * the conn. Unbinding includes ep_disconnect and stop_conn.
+ */
struct mutex ep_mutex;
struct iscsi_endpoint *ep;
+ unsigned long flags;
+ struct work_struct cleanup_work;
+
struct device dev; /* sysfs transport/container device */
enum iscsi_connection_state state;
};