Age | Commit message (Collapse) | Author |
|
Add code in scsi_result_to_blk_status to translate a new error
DID_TRANSPORT_MARGINAL to the corresponding blk_status_t i.e
BLK_STS_TRANSPORT.
Add DID_TRANSPORT_MARGINAL case to scsi_decide_disposition().
Link: https://lore.kernel.org/r/1609969748-17684-2-git-send-email-muneendra.kumar@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Add the various module parameter toggles for adjusting the MQ
characteristics at boot/load time as well as a device attribute for
changing the client scsi channel request amount.
Link: https://lore.kernel.org/r/20210114203148.246656-22-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Turn on MQ by default and set sane values for the upper limit on hw queues
for the SCSI host, and number of hw SCSI channels to request from the
partner VIOS.
Link: https://lore.kernel.org/r/20210114203148.246656-21-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Grab the queue and list lock for each Sub-CRQ and add any uncompleted
events to the host purge list.
Link: https://lore.kernel.org/r/20210114203148.246656-20-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
In general the client needs to send Cancel MADs and task management
commands down the same channel as the command(s) intended to cancel or
abort. The client assigns cancel keys per LUN and thus must send a Cancel
down each channel commands were submitted for that LUN. Further, the client
then must wait for those cancel completions prior to submitting a LUN RESET
or ABORT TASK SET.
Add a cancel rsp iu syncronization field to the ibmvfc_queue struct such
that the cancel routine can sync the cancel response to each queue that
requires a cancel command. Build a list of each cancel event sent and wait
for the completion of each submitted cancel.
Link: https://lore.kernel.org/r/20210114203148.246656-19-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Add a helper routine for initializing a Cancel MAD. This will be useful for
a channelized client that needs to send Cancel commands down every channel
commands were sent for a particular LUN.
Link: https://lore.kernel.org/r/20210114203148.246656-18-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
If the ibmvfc client adapter requests channels it must submit a number of
Sub-CRQ handles matching the number of channels being requested. The VIOS
in its response will overwrite the actual number of channel resources
allocated which may be less than what was requested. The client then must
store the VIOS Sub-CRQ handle for each queue. This VIOS handle is needed as
a parameter with h_send_sub_crq().
Link: https://lore.kernel.org/r/20210114203148.246656-17-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
When the client has negotiated the use of channels all vfcFrames are
required to go down a Sub-CRQ channel or it is a protocoal violation. If
the adapter state is channelized submit vfcFrames to the appropriate
Sub-CRQ via the h_send_sub_crq() helper.
Link: https://lore.kernel.org/r/20210114203148.246656-16-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Extract the hwq id from a SCSI command and store it in the ibmvfc_event
structure to identify which Sub-CRQ to send the command down when channels
are being utilized.
Link: https://lore.kernel.org/r/20210114203148.246656-15-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Previous patches have plumbed the necessary Sub-CRQ interface and channel
negotiation MADs to fully channelize via hardware backed queues.
Advertise client support via NPIV Login capability IBMVFC_CAN_USE_CHANNELS
when the client bits have MQ enabled via vhost->mq_enabled, or when
channels were already in use during a subsequent NPIV Login. The later is
required because channel support is only renegotiated after a CRQ pair is
broken. Simple NPIV Logout/Logins require the client to continue to
advertise the channel capability until the CRQ pair between the client is
broken.
Link: https://lore.kernel.org/r/20210114203148.246656-14-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
New NPIV_ENQUIRY_CHANNEL and NPIV_SETUP_CHANNEL management datagrams (MADs)
were defined in a previous patchset. If the client advertises a desire to
use channels and the partner VIOS is channel capable then the client must
proceed with channel enquiry to determine the maximum number of channels
the VIOS is capable of providing, and registering SubCRQs via channel setup
with the VIOS immediately following NPIV Login. This handshaking should not
be performed for subsequent NPIV Logins unless the CRQ connection has been
reset.
Implement these two new MADs and issue them following a successful NPIV
login where the VIOS has set the SUPPORT_CHANNELS capability bit in the
NPIV Login response.
Link: https://lore.kernel.org/r/20210114203148.246656-13-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Create an irq mapping for the hw_irq number provided from phyp firmware.
Request an irq assigned our Sub-CRQ interrupt handler. Unmap these irqs at
Sub-CRQ teardown.
Link: https://lore.kernel.org/r/20210114203148.246656-12-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Simple handler that calls Sub-CRQ drain routine directly.
Link: https://lore.kernel.org/r/20210114203148.246656-11-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The logic for iterating over the Sub-CRQ responses is similiar to that of
the primary CRQ. Add the necessary handlers for processing those responses.
Link: https://lore.kernel.org/r/20210114203148.246656-10-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Each Sub-CRQ has its own interrupt. A hypercall is required to toggle the
IRQ state. Provide the necessary mechanism via a helper function.
Link: https://lore.kernel.org/r/20210114203148.246656-9-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Allocate a set of Sub-CRQs in advance. During channel setup the client and
VIOS negotiate the number of queues the VIOS supports and the number that
the client desires to request. Its possible that the final channel
resources allocated is less than requested, but the client is still
responsible for sending handles for every queue it is hoping for.
Also, provide deallocation cleanup routines.
Link: https://lore.kernel.org/r/20210114203148.246656-8-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Subordinate Command Response Queues (Sub CRQ) are used in conjunction with
the primary CRQ when more than one queue is needed by the virtual I/O
adapter. Recent phyp firmware versions support Sub CRQ's with ibmvfc
adapters. This feature is a prerequisite for supporting multiple hardware
backed submission queues in the vfc adapter.
The Sub CRQ command element differs from the standard CRQ in that it is
32bytes long as opposed to 16bytes for the latter. Despite this extra
16bytes the ibmvfc protocol will use the original CRQ command element
mapped to the first 16bytes of the Sub CRQ element initially.
Add definitions for the Sub CRQ command element and queue.
Link: https://lore.kernel.org/r/20210114203148.246656-7-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Sub-CRQs are registred with firmware via a hypercall. Abstract that
interface into a simpler helper function.
Link: https://lore.kernel.org/r/20210114203148.246656-6-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
With the upcoming addition of Sub-CRQs the event pool size may vary
per-queue.
Add a size parameter to ibmvfc_init_event_pool() such that different size
event pools can be requested by ibmvfc_alloc_queue().
Link: https://lore.kernel.org/r/20210114203148.246656-5-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The event pool and CRQ used to be separate entities of the adapter host
structure and as such were allocated and freed independently of each
other. Recent work as defined a generic queue structure with an event pool
specific to each queue. As such the event pool for each queue shouldn't be
allocated/freed independently, but instead performed as part of the queue
allocation/free routines.
Move the calls to ibmvfc_event_pool_{init|free} into
ibmvfc_{alloc|free}_queue respectively. The only functional change here is
that the CRQ cannot be released in ibmvfc_remove until after the event pool
has been successfully purged since releasing the queue will also free the
event pool.
Link: https://lore.kernel.org/r/20210114203148.246656-4-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The next patch in this series reworks the event pool allocation calls to
happen within the individual queue allocation routines instead of as
independent calls.
Move the init/free routines earlier in ibmvfc.c to prevent undefined
reference errors when calling these functions from the queue allocation
code. No functional change.
Link: https://lore.kernel.org/r/20210114203148.246656-3-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Introduce several new vhost fields for managing MQ state of the adapter as
well as initial defaults for MQ enablement.
Link: https://lore.kernel.org/r/20210114203148.246656-2-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
User layer may access sysfs nodes when system PM ops or error handling is
running. This can cause various problems. Rename eh_sem to host_sem and use
it to protect PM ops and error handling from user layer intervention.
Link: https://lore.kernel.org/r/1610594010-7254-3-git-send-email-cang@codeaurora.org
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Acked-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
During system resume/suspend, hba could be NULL. In this case, do not touch
eh_sem.
Fixes: 88a92d6ae4fe ("scsi: ufs: Serialize eh_work with system PM events and async scan")
Link: https://lore.kernel.org/r/1610594010-7254-2-git-send-email-cang@codeaurora.org
Acked-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The mail address intel-linux-scu@intel.com bounces. Remove it.
Link: https://lore.kernel.org/r/1610449890-198089-1-git-send-email-john.garry@huawei.com
Cc: Ahmed S. Darwish <a.darwish@linutronix.de>
Cc: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Acked-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The memory allocated with devm_kzalloc() is freed automatically no need to
explicitly call devm_kfree(). Delete it and save some instruction cycles.
Link: https://lore.kernel.org/r/20210112092128.19295-1-huobean@gmail.com
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Fix the following coccicheck warning:
./drivers/scsi/lpfc/lpfc_bsg.c:5392:5-29: WARNING: Comparison to bool
Link: https://lore.kernel.org/r/1610439893-64872-1-git-send-email-abaci-bugfix@linux.alibaba.com
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: YANG LI <abaci-bugfix@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Kernel stack violation when getting unit_descriptor/wb_buf_alloc_units from
rpmb LUN. The reason is that the unit descriptor length is different per
LU.
The length of Normal LU is 45 while the one of rpmb LU is 35.
int ufshcd_read_desc_param(struct ufs_hba *hba, ...)
{
param_offset=41;
param_size=4;
buff_len=45;
...
buff_len=35 by rpmb LU;
if (is_kmalloc) {
/* Make sure we don't copy more data than available */
if (param_offset + param_size > buff_len)
param_size = buff_len - param_offset;
--> param_size = 250;
memcpy(param_read_buf, &desc_buf[param_offset], param_size);
--> memcpy(param_read_buf, desc_buf+41, 250);
[ 141.868974][ T9174] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: wb_buf_alloc_units_show+0x11c/0x11c
}
}
Link: https://lore.kernel.org/r/20210111095927.1830311-1-jaegeuk@kernel.org
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Link: https://lore.kernel.org/r/20210111093134.1206-8-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Enable NVMe confirmation bit in PRLI.
Link: https://lore.kernel.org/r/20210111093134.1206-7-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Mailbox Ch/dump ram extend expects mb register 10 to be set. If not
set/clear, firmware can pick up garbage from previous invocation of this
mailbox. Example: mctp dump can set mb10. On subsequent flash read which
use mailbox cmd Ch, mb10 can retain previous value.
Link: https://lore.kernel.org/r/20210111093134.1206-6-njavali@marvell.com
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
FW needs to wait for an ABTS response before completing the I/O.
Link: https://lore.kernel.org/r/20210111093134.1206-5-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bikash Hazarika <bhazarika@marvell.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
This change will aid in debugging issues arising because of dropped frame,
DIF errors, queue full etc where debug level is not set.
Link: https://lore.kernel.org/r/20210111093134.1206-4-njavali@marvell.com
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Display error counters via debugfs node.
Link: https://lore.kernel.org/r/20210111093134.1206-3-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
initiator port
This statistics will help in debugging process and checking specific error
counts. It also provides a capability to isolate the port or bring it out
of isolation.
Link: https://lore.kernel.org/r/20210111093134.1206-2-njavali@marvell.com
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Fix the following coccicheck warning:
./drivers/scsi/qedf/qedf_main.c:3716:5-31: WARNING: Comparison to bool
Link: https://lore.kernel.org/r/1610357368-62866-1-git-send-email-abaci-bugfix@linux.alibaba.com
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Acked-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: YANG LI <abaci-bugfix@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Some comments in this driver don't comply with the preferred multi-line
comment style, as reported by 'scripts/checkpatch.pl':
WARNING: Block comments use * on subsequent lines
WARNING: Block comments use a trailing */ on a separate line
Fix those comments, along with the (unreported for some reason?) starts of
the multi-line comments not being /* on their own line...
Link: https://lore.kernel.org/r/08c231e5-d86f-9d0b-19ac-ad46fa0c0b58@omprussia.ru
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Some source lines (mostly the comments) in this driver end with spaces, as
reported by 'scripts/checkpatch.pl'. Trim these lines.
Link: https://lore.kernel.org/r/59829052-4932-4ea3-b504-857bbb19e6a0@omprussia.ru
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
This driver's original authors did pretty bad job of documenting the
Command Control Block (CCB) structure -- especially its 2nd byte, where the
bit numbers were completely left out. Sync up the 'struct ccb' comments to
the Adaptec AHA-154xA manual.
Link: https://lore.kernel.org/r/17a7be14-a9d2-9822-bb3e-1d7385f486b0@omprussia.ru
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Remove a redundant if clause in ufshcd_add_query_upiu_trace.
Link: https://lore.kernel.org/r/20210110084618.189371-1-avri.altman@wdc.com
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
iov_iter_bvec() initialises iterators well, no need to pre-zero it
beforehand as done in fd_execute_rw_aio(). Compilers can't optimise it out
and generate extra code for that (confirmed with assembly).
Link: https://lore.kernel.org/r/34cd22d6cec046e3adf402accb1453cc255b9042.1610207523.git.asml.silence@gmail.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Added a log message in SATA completion path to capture the status of failed
command. If the status does not match any expected status, another message
will be logged.
On IO failure with known status, the log message will be:
[ 1712.951735] pm80xx0:: mpi_sata_completion 2269: IO failed device_id 16385 status 0x1 tag XX
If the firmware returns unexpected status, a message of the following
format will be logged:
[ 1712.951735] pm80xx0:: mpi_sata_completion XXXX: Unknown status device_id XXXXX status 0xX tag XX
Link: https://lore.kernel.org/r/20210109123849.17098-8-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Vishakha Channapattan <vishakhavc@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
In check_fw_ready() we first wait for ILA to come up and then we wait for
RAAE to come up and IOPs and so on. This is a sequential check. Because of
this, ILA image seems to be not ready in the allocated time and so the
driver marks it as "not ready" and then moves on to other FW images.
ILA does become ready eventually, but is not checked again. The driver
concludes that FW is not ready when it actually is.
Instead of sequentially polling each image, we keep polling for all images
to be ready. The timeout for the polling has been set to the sum of what
was used for each individual image.
Link: https://lore.kernel.org/r/20210109123849.17098-7-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Bhavesh Jashnani <bjashnani@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The function pm80xx_get_fatal_dump() has two issues that result in the
fatal dump not being able to complete successfully.
1. Trying to collect fatal_logs from the application fails because we are
not shifting the MEMBASE-II register properly. Once we read 64K region
of data we have to shift the MEMBASE-II register and read the next
chunk. Only then would we be able to get complete data.
2. If a timeout occurs, our application will get stuck.
Link: https://lore.kernel.org/r/20210109123849.17098-6-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Tag was not freed in NVMD get/set data request failure scenario. This
caused a tag leak each time a request failed.
Link: https://lore.kernel.org/r/20210109123849.17098-5-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: akshatzen <akshatzen@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The driver initializes main configuration, general status, inbound queue
and outbound queue table addresses based on a value read from
MSGU_SCRATCH_PAD_0 register.
We should validate these addresses before dereferencing them.
Adds two validations:
1. Check if main configuration table offset lies within the pcibar
mapped
2. Check if first dword of main configuration table reads "PMCS"
There are two calls to init_pci_device_addresses() done during
pm8001_pci_probe() in this sequence:
1. First inside chip_soft_rst, where if init_pci_device_addresses fails we
will go ahead assuming MPI state is not ready and reset the device as
long as bootloader is okay. This gives chance to second call of
init_pci_device_addresses to set up the addresses after reset.
2. The second call is via pm80xx_chip_init, after soft reset is done and
firmware is checked to be ready. Once that is done we are safe to go
ahead and initialize default table values and use them.
Tests:
1. Enabled debugging logs and observed no issues during initialization,
with a controller with no issues:
pm80xx0:: pm8001_setup_msix 1034: pci_alloc_irq_vectors request ret:64 no of intr 64
pm80xx0:: init_pci_device_addresses 917: Scratchpad 0 Offset: 0x2000 value 0x40002000
pm80xx0:: init_pci_device_addresses 925: Scratchpad 0 PCI BAR: 0
pm80xx0:: init_pci_device_addresses 952: VALID main config signature 0x53434d50
pm80xx0:: init_pci_device_addresses 975: GST OFFSET 0xc4
pm80xx0:: init_pci_device_addresses 978: INBND OFFSET 0x20000128
pm80xx0:: init_pci_device_addresses 981: OBND OFFSET 0x24000928
pm80xx0:: init_pci_device_addresses 984: IVT OFFSET 0x8001408
pm80xx0:: init_pci_device_addresses 987: PSPA OFFSET 0x8001608
pm80xx0:: init_pci_device_addresses 991: addr - main cfg (ptrval) general status (ptrval)
pm80xx0:: init_pci_device_addresses 995: addr - inbnd (ptrval) obnd (ptrval)
pm80xx0:: init_pci_device_addresses 999: addr - pspa (ptrval) ivt (ptrval)
pm80xx0:: pm80xx_chip_soft_rst 1446: reset register before write : 0x0
pm80xx0:: pm80xx_chip_soft_rst 1478: reset register after write 0x40
pm80xx0:: pm80xx_chip_soft_rst 1544: SPCv soft reset Complete
pm80xx0:: init_pci_device_addresses 917: Scratchpad 0 Offset: 0x2000 value 0x40002000
pm80xx0:: init_pci_device_addresses 925: Scratchpad 0 PCI BAR: 0
pm80xx0:: init_pci_device_addresses 952: VALID main config signature 0x53434d50
pm80xx0:: init_pci_device_addresses 975: GST OFFSET 0xc4
pm80xx0:: init_pci_device_addresses 978: INBND OFFSET 0x20000128
pm80xx0:: init_pci_device_addresses 981: OBND OFFSET 0x24000928
pm80xx0:: init_pci_device_addresses 984: IVT OFFSET 0x8001408
pm80xx0:: init_pci_device_addresses 987: PSPA OFFSET 0x8001608
pm80xx0:: init_pci_device_addresses 991: addr - main cfg (ptrval) general status (ptrval)
pm80xx0:: init_pci_device_addresses 995: addr - inbnd (ptrval) obnd (ptrval)
pm80xx0:: init_pci_device_addresses 999: addr - pspa (ptrval) ivt (ptrval)
pm80xx0:: pm80xx_chip_init 1329: MPI initialize successful!
2. Tested controller with firmware known to have initialization issue and
observed no crashes with this fix:
pm80xx 0000:01:00.0: pm80xx: driver version 0.1.38
pm80xx 0000:01:00.0: Removing from 1:1 domain
pm80xx 0000:01:00.0: Requesting non-1:1 mappings
pm80xx0:: init_pci_device_addresses 948: BAD main config signature 0x0
pm80xx0:: mpi_uninit_check 1365: Failed to init pci addresses
pm80xx0:: pm80xx_chip_soft_rst 1435: MPI state is not ready scratch:0:8:62a01000:0
pm80xx0:: pm80xx_chip_soft_rst 1518: Firmware is not ready!
pm80xx0:: pm80xx_chip_soft_rst 1532: iButton Feature is not Available!!!
pm80xx0:: pm80xx_chip_init 1301: Firmware is not ready!
pm80xx0:: pm8001_pci_probe 1215: chip_init failed [ret: -16]
pm80xx: probe of 0000:01:00.0 failed with error -16
pm80xx 0000:07:00.0: pm80xx: driver version 0.1.38
pm80xx 0000:07:00.0: Removing from 1:1 domain
pm80xx 0000:07:00.0: Requesting non-1:1 mappings
scsi host6: pm80xx
pm80xx1:: pm8001_setup_sgpio 5568: failed sgpio_req timeout
pm80xx1:: mpi_phy_start_resp 3447: phy start resp status:0x0, phyid:0x0
pm80xx 0000:08:00.0: pm80xx: driver version 0.1.38
pm80xx 0000:08:00.0: Removing from 1:1 domain
pm80xx 0000:08:00.0: Requesting non-1:1 mappings
3. Without this fix we observe crash on the same controller:
pm80xx 0000:01:00.0: pm80xx: driver version 0.1.38
pm80xx 0000:01:00.0: Removing from 1:1 domain
pm80xx 0000:01:00.0: Requesting non-1:1 mappings
[<ffffffffc0451b3b>] pm80xx_chip_soft_rst+0x6b/0x4c0 [pm80xx]
[<ffffffffc043a933>] pm8001_pci_probe+0xa43/0x1630 [pm80xx]
RIP: 0010:pm80xx_chip_soft_rst+0x71/0x4c0 [pm80xx]
[<ffffffffc0451b3b>] ? pm80xx_chip_soft_rst+0x6b/0x4c0 [pm80xx]
[<ffffffffc043a933>] pm8001_pci_probe+0xa43/0x1630 [pm80xx]
pm80xx0:: mpi_uninit_check 1339: TIMEOUT:IBDB value/=2
pm80xx0:: pm80xx_chip_soft_rst 1387: MPI state is not ready scratch:0:8:62a01000:0
pm80xx0:: pm80xx_chip_soft_rst 1470: Firmware is not ready!
pm80xx0:: pm80xx_chip_soft_rst 1484: iButton Feature is not Available!!!
pm80xx0:: pm80xx_chip_init 1266: Firmware is not ready!
pm80xx0:: pm8001_pci_probe 1207: chip_init failed [ret: -16]
pm80xx: probe of 0000:01:00.0 failed with error -16
Link: https://lore.kernel.org/r/20210109123849.17098-4-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: akshatzen <akshatzen@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
When the controller runs into a fatal error, commands get stuck due to no
response. If the controller is in fatal error state, abort requests issued
to the controller get stuck too.
Check the controller state for fatal error conditions.
Link: https://lore.kernel.org/r/20210109123849.17098-3-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: akshatzen <akshatzen@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
We do not need to busy wait during mpi_init_check() since it is not being
invoked in atomic context. mpi_init_check() is being called from
pm8001_pci_resume(), pm8001_pci_probe(). Hence we are replacing udelay with
msleep.
Link: https://lore.kernel.org/r/20210109123849.17098-2-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: akshatzen <akshatzen@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
According to the spec (JESD220E chapter 7.2), while powering off/on the ufs
device, RST_n signal should be between VSS(Ground) and VCCQ/VCCQ2.
Link: https://lore.kernel.org/r/1610103385-45755-3-git-send-email-ziqichen@codeaurora.org
Acked-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Ziqi Chen <ziqichen@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
According to the spec (JESD220E chapter 7.2), while powering off/on the ufs
device, REF_CLK signal should be between VSS(Ground) and VCCQ/VCCQ2.
Link: https://lore.kernel.org/r/1610103385-45755-2-git-send-email-ziqichen@codeaurora.org
Reviewed-by: Can Guo <cang@codeaurora.org>
Acked-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Ziqi Chen <ziqichen@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|