summaryrefslogtreecommitdiff
path: root/drivers/scsi/lpfc/lpfc_sli4.h
AgeCommit message (Collapse)Author
2019-06-18scsi: lpfc: Fix poor use of hardware queues if fewer irq vectorsJames Smart
While fixing the resources per socket, realized the driver was not using hardware queues (up to 1 per cpu) if there were fewer interrupt vectors. The driver was only using the hardware queue assigned to the cpu with the vector. Rework the affinity map check to use the additional hardware queue elements that had been allocated. If the cpu count exceeds the hardware queue count - share, but choose what is shared with by: hyperthread peer, core peer, socket peer, or finally similar cpu in a different socket. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18scsi: lpfc: Fix oops when driver is loaded with 1 interrupt vectorJames Smart
The driver was coded expecting enough hardware queues and interrupt vectors such that at least there was one per socket. In the case where there were fewer than sockets, cpus were left unassigned thus null pointers. Rework the affinity mappings. Map settings for the cpu's that are in the irq cpu mask. For each cpu not in the mask, map to another cpu that does have a mask. Choice of the "other" cpu will attempt to map to the same cpu but differing hyperthread, or cpu within in same core, or cpu within same socket, or finally cpu in the base socket. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18scsi: lpfc: Rework misleading nvme not supported in firmware messageJames Smart
The driver unconditionally says fw doesn't support nvme when in truth it was a driver parameter settings that disabled nvme support. Rework the code validating nvme support to accurately report what condition is disabling nvme support. Save state on whether nvme fw supports nvme in case sysfs attributes change dynamically. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18scsi: lpfc: Fix nvmet handling of received ABTS for unmapped framesJames Smart
The driver currently is relying on firmware to match ABTSs to existing exchanges. This works fine as long as an exchange has been assigned to the io and work posted to it. However, for unmapped frames (rxid=0xFFFF), the driver has yet to assign an xri. The driver was blindly saying it couldn't match the ABTS and sending the BA_xxx. However, the command frame may have been in queues waiting on xri's before posting to the nvmet_fc layer. When xri's became available, the command frame would still be pushed to the transport and that io would execute, even though the io had been killed by ABTS. The initiator, seeing the io ABTS'd, would reuse the exchange for a different io which would be received on the target and pushed up. If the "zombie" io then came back down and started transmitting, the initiator would match the oxid and accept erroneous data. Bad things happened. Add tracking of active exchanges in the target to allow matching of a received ABTS against active or pending IO requests. If the ABTS is matched to a pending or active IO, the drive initiates cleanup and conditionally notifies the transport. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18scsi: lpfc: Separate CQ processing for nvmet_fc upcallsJames Smart
Currently the driver is notified of new command frame receipt by CQEs. As part of the CQE processing, the driver upcalls the nvmet_fc transport to deliver the command. nvmet_fc, as part of receiving the command builds out a context for it, where one of the first steps is to allocate memory for the io. When running with tests that do large ios (1MB), it was found on some systems, the total number of outstanding I/O's, at 1MB per, completely consumed the system's memory. Thus additional ios were getting blocked in the memory allocator. Given that this blocked the lpfc thread processing CQEs, there were lots of other commands that were received and which are then held up, and given CQEs are serially processed, the aggregate delays for an IO waiting behind the others became cummulative - enough so that the initiator hit timeouts for the ios. The basic fix is to avoid the direct upcall and instead schedule a work item for each io as it is received. This allows the cq processing to complete very quickly, and each io can then run or block on it's own. However, this general solution hurts latency when there are few ios. As such, implemented the fix such that the driver watches how many CQEs it has processed sequentially in one run. As long as the count is below a threshold, the direct nvmet_fc upcall will be made. Only when the count is exceeded will it revert to work scheduling. Given that debug of this showed a surprisingly long delay in cq processing, the io timer stats were updated to better reflect the processing of the different points. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-20scsi: lpfc: Fixup eq_clr_intr referencesJames Smart
Declaring interrupt clear routines as inline is bogus as they are used as an indirect pointer. Remove the inline references. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-20scsi: lpfc: Fix build errorJames Bottomley
You can't declare a function inline in a header if it doesn't have a body available to the compiler. So realistically you either don't declare it inline or you make it a static inline in the header. I think the latter applies in this case, so this should be the fix Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-19scsi: lpfc: Specify node affinity for queue memory allocationJames Smart
Change the SLI4 queue creation code to use NUMA node based memory allocation based on the cpu the queues will be related to. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-19scsi: lpfc: Reduce memory footprint for lpfc_queueJames Smart
Currently the driver maintains a sideband structure which has a pointer for each queue element. However, at 8 bytes per pointer, and up to 4k elements per queue, and 100s of queues, this can take up a lot of memory. Convert the driver to using an access routine that calculates the element address based on its index rather than using the pointer table. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-19scsi: lpfc: Add loopback testing to trunking modeJames Smart
When in trunking mode, the adapter can be placed into diagnostic mode and each link in the trunk tested via loopback. Add support to the driver to perform per-link loopback testing when in trunking mode. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-19scsi: lpfc: Fix link speed reporting for 4-link trunkJames Smart
Driver is using uint16_t and is encountering an overflow of the 16bits when calculating link speed. Fix by using a u32 type. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05scsi: lpfc: Update 12.2.0.0 file copyrights to 2019James Smart
For files modified as part of 12.2.0.0 patches, update copyright to 2019 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05scsi: lpfc: Resize cpu maps structures based on possible cpusJames Smart
The work done to date utilized the number of present cpus when sizing per-cpu structures. Structures should have been sized based on the max possible cpu count. Convert the driver over to possible cpu count for sizing allocation. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05scsi: lpfc: Rework EQ/CQ processing to address interrupt coalescingJames Smart
When driving high iop counts, auto_imax coalescing kicks in and drives the performance to extremely small iops levels. There are two issues: 1) auto_imax is enabled by default. The auto algorithm, when iops gets high, divides the iops by the hdwq count and uses that value to calculate EQ_Delay. The EQ_Delay is set uniformly on all EQs whether they have load or not. The EQ_delay is only manipulated every 5s (a long time). Thus there were large 5s swings of no interrupt delay followed by large/maximum delay, before repeating. 2) When processing a CQ, the driver got mixed up on the rate of when to ring the doorbell to keep the chip appraised of the eqe or cqe consumption as well as how how long to sit in the thread and process queue entries. Currently, the driver capped its work at 64 entries (very small) and exited/rearmed the CQ. Thus, on heavy loads, additional overheads were taken to exit and re-enter the interrupt handler. Worse, if in the large/maximum coalescing windows,k it could be a while before getting back to servicing. The issues are corrected by the following: - A change in defaults. Auto_imax is turned OFF and fcp_imax is set to 0. Thus all interrupts are immediate. - Cleanup of field names and their meanings. Existing names were non-intuitive or used for duplicate things. - Added max_proc_limit field, to control the length of time the handlers would service completions. - Reworked EQ handling: Added common routine that walks eq, applying notify interval and max processing limits. Use queue_claimed to claim ownership of the queue while processing. Always rearm the queue whenever the common routine is called. Rework queue element processing, namely to eliminate hba_index vs host_index. Only one index is necessary. The queue entry can be marked invalid and the host_index updated immediately after eqe processing. After rework, xx_release routines are now DB write functions. Renamed the routines as such. Moved lpfc_sli4_eq_flush(), which does similar action, to same area. Replaced the 2 individual loops that walk an eq with a call to the common routine. Slightly revised lpfc_sli4_hba_handle_eqe() calling syntax. Added per-cpu counters to detect interrupt rates and scale interrupt coalescing values. - Reworked CQ handling: Added common routine that walks cq, applying notify interval and max processing limits. Use queue_claimed to claim ownership of the queue while processing. Always rearm the queue whenever the common routine is called. Rework queue element processing, namely to eliminate hba_index vs host_index. Only one index is necessary. The queue entry can be marked invalid and the host_index updated immediately after cqe processing. After rework, xx_release routines are now DB write functions. Renamed the routines as such. Replaced the 3 individual loops that walk a cq with a call to the common routine. Redefined lpfc_sli4_sp_handle_mcqe() to commong handler definition with queue reference. Add increment for mbox completion to handler. - Added a new module/sysfs attribute: lpfc_cq_max_proc_limit To allow dynamic changing of the CQ max_proc_limit value being used. Although this leaves an EQ as an immediate interrupt, that interrupt will only occur if a CQ bound to it is in an armed state and has cqe's to process. By staying in the cq processing routine longer, high loads will avoid generating more interrupts as they will only rearm as the processing thread exits. The immediately interrupt is also beneficial to idle or lower-processing CQ's as they get serviced immediately without being penalized by sharing an EQ with a more loaded CQ. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05scsi: lpfc: cleanup: convert eq_delay to usdelayJames Smart
Review of the eq coalescing logic showed the code was a bit fragmented. Sometimes it would save/set via an interrupt max value, while in others it would do so via a usdelay. There were also two places changing eq delay, one place that issued mailbox commands, and another that changed via register writes if supported. Clean this up by: - Standardizing the operation of lpfc_modify_hba_eq_delay() routine so that it is always told of a us delay to impose. The routine then chooses the best way to set that - via register or via mbx. - Rather than two value types stored in eq->q_mode (usdelay if change via register, imax if change via mbox) - q_mode always contains usdelay. Before any value change, old vs new value is compared and only if different is a change done. - Revised the dmult calculation. dmult is not set based on overall imax divided by hardware queues - instead imax applies to a single cpu and the value will be replicated to all cpus. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queuesJames Smart
So far MSIX vector allocation assumed it would be 1:1 with hardware queues. However, there are several reasons why fewer MSIX vectors may be allocated than hardware queues such as the platform being out of vectors or adapter limits being less than cpu count. This patch reworks the MSIX/EQ relationships with the per-cpu hardware queues so they can function independently. MSIX vectors will be equitably split been cpu sockets/cores and then the per-cpu hardware queues will be mapped to the vectors most efficient for them. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05scsi: lpfc: Fix setting affinity hints to correlate with hardware queuesJames Smart
The desired affinity for the hardware queue behavior is for hdwq 0 to be affinitized with cpu 0, hdwq 1 to cpu 1, and so on. The implementation so far does not do this if the number of cpus is greater than the number of hardware queues (e.g. hardware queue allocation was administratively reduced or hardware queue resources could not scale to the cpu count). Correct the queue affinitization logic when queue count is less than cpu count. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05scsi: lpfc: Adapt partitioned XRI lists to efficient sharingJames Smart
The XRI get/put lists were partitioned per hardware queue. However, the adapter rarely had sufficient resources to give a large number of resources per queue. As such, it became common for a cpu to encounter a lack of XRI resource and request the upper io stack to retry after returning a BUSY condition. This occurred even though other cpus were idle and not using their resources. Create as efficient a scheme as possible to move resources to the cpus that need them. Each cpu maintains a small private pool which it allocates from for io. There is a watermark that the cpu attempts to keep in the private pool. The private pool, when empty, pulls from a global pool from the cpu. When the cpu's global pool is empty it will pull from other cpu's global pool. As there many cpu global pools (1 per cpu or hardware queue count) and as each cpu selects what cpu to pull from at different rates and at different times, it creates a radomizing effect that minimizes the number of cpu's that will contend with each other when the steal XRI's from another cpu's global pool. On io completion, a cpu will push the XRI back on to its private pool. A watermark level is maintained for the private pool such that when it is exceeded it will move XRI's to the CPU global pool so that other cpu's may allocate them. On NVME, as heartbeat commands are critical to get placed on the wire, a single expedite pool is maintained. When a heartbeat is to be sent, it will allocate an XRI from the expedite pool rather than the normal cpu private/global pools. On any io completion, if a reduction in the expedite pools is seen, it will be replenished before the XRI is placed on the cpu private pool. Statistics are added to aid understanding the XRI levels on each cpu and their behaviors. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05scsi: lpfc: Move SCSI and NVME Stats to hardware queue structuresJames Smart
Many io statistics were being sampled and saved using adapter-based data structures. This was creating a lot of contention and cache thrashing in the I/O path. Move the statistics to the hardware queue data structures. Given the per-queue data structures, use of atomic types is lessened. Add new sysfs and debugfs stat routines to collate the per hardware queue values and report at an adapter level. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05scsi: lpfc: Adapt cpucheck debugfs logic to Hardware QueuesJames Smart
Similar to the io execution path that reports cpu context information, the debugfs routines for cpu information needs to be aligned with new hardware queue implementation. Convert debugfs cnd nvme cpucheck statistics to report information per Hardware Queue. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05scsi: lpfc: Partition XRI buffer list across Hardware QueuesJames Smart
Once the IO buff allocations were made shared, there was a single XRI buffer list shared by all hardware queues. A single list isn't great for performance when shared across the per-cpu hardware queues. Create a separate XRI IO buffer get/put list for each Hardware Queue. As SGLs and associated IO buffers get allocated/posted to the firmware; round robin their assignment across all available hardware Queues so that there is an equitable assignment. Modify SCSI and NVME IO submit code paths to use the Hardware Queue logic for XRI allocation. Add a debugfs interface to display hardware queue statistics Added new empty_io_bufs counter to track if a cpu runs out of XRIs. Replace common_ variables/names with io_ to make meanings clearer. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05scsi: lpfc: Replace io_channels for nvme and fcp with general hdw_queues per cpuJames Smart
Currently, both nvme and fcp each have their own concept of an io_channel, which is a combination wq/cq and associated msix. Different cpus would share an io_channel. The driver is now moving to per-cpu wq/cq pairs and msix vectors. The driver will still use separate wq/cq pairs per protocol on each cpu, but the protocols will share the msix vector. Given the elimination of the nvme and fcp io channels, the module parameters will be removed. A new parameter, lpfc_hdw_queue is added which allows the wq/cq pair allocation per cpu to be overridden and allocated to lesser value. If lpfc_hdw_queue is zero, the number of pairs allocated will be based on the number of cpus. If non-zero, the parameter specifies the number of queues to allocate. At this time, the maximum non-zero value is 64. To manage this new paradigm, a new hardware queue structure is created to track queue activity and relationships. As MSIX vector allocation must be known before setting up the relationships, msix allocation now occurs before queue datastructures are allocated. If the number of vectors allocated is less than the desired hardware queues, the hardware queue counts will be reduced to the number of vectors Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05scsi: lpfc: Remove extra vector and SLI4 queue for ExpresslaneJames Smart
There is a extra queue and msix vector for expresslane. Now that the driver will be doing queues per cpu, this oddball queue is no longer needed. Expresslane will utilize the normal per-cpu queues. Updated debugfs sli4 queue output to go along with the change Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05scsi: lpfc: Implement common IO buffers between NVME and SCSIJames Smart
Currently, both NVME and SCSI get their IO buffers from separate pools. XRI's are associated 1:1 with IO buffers, so XRI's are also split between protocols. Eliminate the independent pools and use a single pool. Each buffer structure now has a common section and a protocol section. Per protocol routines for SGL initialization are removed and replaced by common routines. Initialization of the buffers is only done on the common area. All other fields, which are protocol specific, are initialized when the buffer is allocated for use in the per-protocol allocation routine. In the past, the SCSI side allocated IO buffers as part of slave_alloc calls until the maximum XRIs for SCSI was reached. As all XRIs are now common and may be used for either protocol, allocation for everything is done as part of adapter initialization and the scsi side has no action in slave alloc. As XRI's are no longer split, the lpfc_xri_split module parameter is removed. Adapters based on SLI3 will continue to use the older scsi_buf_list_get/put routines. All SLI4 adapters utilize the new IO buffer scheme Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-06scsi: lpfc: add Trunking supportJames Smart
Add trunking support to the driver. Trunking is found on more recent asics. In general, trunking appears as a single "port" to the driver and overall behavior doesn't differ. Link speed is reported as an aggregate value, while link speed control is done on a per-physical link basis with all links in the trunk symmetrical. Some commands returning port information are updated to additionally provide trunking information. And new ACQEs are generated to report physical link events relative to the trunk. This patch contains the following modifications: - Added link speed settings of 128GB and 256GB. - Added handling of trunk-related ACQEs, mainly logging and trapping of physical link statuses. - Added additional bsg interface to query trunk state by applications. - Augment link_state sysfs attribtute to display trunk link status Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-06scsi: lpfc: fcoe: Fix link down issue after 1000+ link bouncesJames Smart
On FCoE adapters, when running link bounce test in a loop, initiator failed to login with switch switch and required driver reload to recover. Switch reached a point where all subsequent FLOGIs would be LS_RJT'd. Further testing showed the condition to be related to not performing FCF discovery between FLOGI's. Fix by monitoring FLOGI failures and once a repeated error is seen repeat FCF discovery. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-09-11scsi: lpfc: add support to retrieve firmware logsJames Smart
This patch adds the ability to read firmware logs from the adapter. The driver registers a buffer with the adapter that is then written to by the adapter. The adapter posts CQEs to indicate content updates in the buffer. While the adapter is writing to the buffer in a circular fashion, an application will poll the driver to read the next amount of log data from the buffer. Driver log buffer size is configurable via the ras_fwlog_buffsize sysfs attribute. Verbosity to be used by firmware when logging to host memory is controlled through the ras_fwlog_level attribute. The ras_fwlog_func attribute enables or disables loggy by firmware. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-07-10scsi: lpfc: Revise copyright for new company languageJames Smart
Change references from "Broadcom Limited" to "Broadcom Inc." in the copyright message. Update copyright duration if not yet updated for 2018. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-07-10scsi: lpfc: Support duration field in Link Cable Beacon V1 commandJames Smart
Current implementation missed setting the duration field. Correct the code to set the field. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-22scsi: lpfc: Add if_type=6 support for cycling valid bitsJames Smart
Traditional SLI4 required the driver to clear Valid bits on EQEs and CQEs after consuming them. The new if_type=6 hardware will cycle the value for what is valid on each queue itteration. The driver no longer has to touch the valid bits. This also means all the cpu cache dirtying and perhaps flush/refill's done by the hardware in accessing the EQ/CQ elements is eliminated. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-22scsi: lpfc: Add push-to-adapter support to sli4James Smart
New if_type=6 adapters support an additional BAR that provides apertures to allow direct WQE to adapter push support - termed Direct Packet Push (DPP). WQ creation differs slightly to ask for a WQ to be DPP-ized. When submitting a WQE to a DPP WQ, it is submitted to the host memory for the WQ normally, but is also written by the host cpu directly to a BAR aperture. Write buffer coalescing in hardware is (hopefully) turned on, enabling single pci write operation support. The doorbell is thing rung to indicate the WQE is available and was pushed to the aperture. This patch: - Updates the WQ Create commands for the DPP options - Adds the bar mapping for if_type=6 DPP bar - Adds the WQE pushing to the DDP aperture received from WQ create - Adds a new module parameter to disable DPP operation if desired. Default is enabled. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-22scsi: lpfc: Add SLI-4 if_type=6 support to the code baseJames Smart
New hardware supports a SLI-4 interface, but with a new if_type variant of 6. If_type=6 has a different PCI BAR map, separate EQ/CQ doorbells, and some changes in doorbell formats. Add the changes for the if_type into headers, adapter initialization and control flows. Add new eq and cq handlers. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-22scsi: lpfc: Rework sli4 doorbell infrastructureJames Smart
Up until now, all SLI-4 devices had the same doorbells at the same bar locations. With newer hardware, there are now independent EQ and CQ doorbells and the bar locations differ. Prepare the code for new hardware by separating the eq/cq doorbell into separate components. The components can be set based on if_type. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-22scsi: lpfc: Rework lpfc to allow different sli4 cq and eq handlersJames Smart
Up until now, an SLI-4 device had no variance in the way it handled its EQs and CQs. With newer hardware, there are now differences in doorbells and some differences in how entries are valid. Prepare the code for new hardware by creating a sli4-based callout table that can be set based on if_type. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-12scsi: lpfc: Update 11.4.0.7 modified files for 2018 CopyrightJames Smart
Updated Copyright in files updated 11.4.0.7 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-12scsi: lpfc: Add WQ Full Logic for NVME TargetJames Smart
I/O conditions on the nvme target may have the driver submitting to a full hardware wq. The hardware wq is a shared resource among all nvme controllers. When the driver hit a full wq, it failed the io posting back to the nvme-fc transport, which then escalated it into errors. Correct by maintaining a sideband queue within the driver that is added to when the WQ full condition is hit, and drained from as soon as new WQ space opens up. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-12scsi: lpfc: Increase CQ and WQ sizes for SCSIJames Smart
Increased CQ and WQ sizes for SCSI FCP, matching those used for NVMe development. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-20scsi: lpfc: Increase SCSI CQ and WQ sizes.James Smart
Increased the sizes of the SCSI WQ's and CQ's so that SCSI operation is similar to that used by NVME. However, size increase restricted only to those newer adapters that can support the larger WQE size, thus bigger queue sizes. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04scsi: lpfc: Handle XRI_ABORTED_CQE in soft IRQJames Smart
XRI_ABORTED_CQE completions were not being handled in the fast path. They were being queued and deferred to the lpfc worker thread for processing. This is an artifact of the driver design prior to moving queue processing out of the isr and into a workq element. Now that queue processing is already in a deferred context, remove this artifact and process them directly. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04scsi: lpfc: Expand WQE capability of every NVME hardware queueJames Smart
Hardware queues are a fast staging area to push commands into the adapter. The adapter should drain them extremely quickly. However, under heavy io load, the host cpu is pushing commands faster than the drain rate of the adapter causing the driver to resource busy commands. Enlarge the hardware queue (wq & cq) to support a larger number of queue entries (4x the prior size) before backpressure. Enlarging the queue requires larger contiguous buffers (16k) per logical page for the hardware. This changed calling sequences that were expecting 4K page sizes that now must pass a parameter with the page sizes. It also required use of a new version of an adapter command that can vary the page size values. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02scsi: lpfc: Move CQ processing to a soft IRQDick Kennedy
Under heavy target nvme load duration, the lpfc irq handler is encountering cpu lockup warnings. Convert the driver to a shortened ISR handler which identifies the interrupting condition then schedules a workq thread to process the completion queue the interrupt was for. This moves all the real work into the workq element. As nvmet_fc upcalls are no longer in ISR context, don't set the feature flags Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24scsi: lpfc: Add Buffer to Buffer credit recovery supportJames Smart
Add Buffer to buffer credit recovery support to the driver. This is a negotiated feature with the peer that allows for both sides to detect dropped RRDY's and FC Frames and recover credit. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24scsi: lpfc: Fix MRQ > 1 context list handlingDick Kennedy
Various oops including cpu LOCKUPs were seen. For asynchronously received ius where the driver must assign exchange resources, the resources were on a single get (free) list and put list (finished, waiting to be put on get list). As all cpus are sharing the lists, an interrupt for a receive frame may have to wait for all the other cpus to place their done work onto the put list before it can acquire the lock to pull from the list. Fix by breaking the resource lists into per-cpu lists or at least more than 1 list with cpu's sharing the lists). A cpu would allocate from the free list for its own cpu, and put its done work on the its own put list - avoiding the contention. As cpu load may vary, when empty, a cpu may grab from another cpu, thereby changing resource distribution. But searching for a resource only occurs on 1 or a few cpus until a single resource can be allocated. if the condition reoccurs, it starts looking at a different cpu. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24scsi: lpfc: Limit amount of work processed in IRQDick Kennedy
Various oops being seen on being in the ISR too long and cpu lockups, when under heavy load. The amount of work being posted off of completion queues kept the ISR running almost all the time Correct the issue by limiting the amount of work per iteration. [mkp: typo] Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-19scsi: lpfc: Break up IO ctx list into a separate get and put listJames Smart
Since unsol rcv ISR and command cmpl ISR both access/lock this list, separate get/put lists will reduce contention. Replaced struct list_head lpfc_nvmet_ctx_list; with struct list_head lpfc_nvmet_ctx_get_list; struct list_head lpfc_nvmet_ctx_put_list; and all correpsonding locks and counters. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12scsi: lpfc: Add auto EQ delay logicJames Smart
Administrator intervention is currently required to get good numbers when switching from running latency tests to IOPS tests. The configured interrupt coalescing values will greatly effect the results of these tests. Currently, the driver has a single coalescing value set by values of the module attribute. This patch changes the driver to support auto-configuration of the coalescing value based on the total number of outstanding IOs and average number of CQEs processed per interrupt for an EQ. Values are checked every 5 seconds. The driver defaults to the automatic selection. Automatic selection can be disabled by the new lpfc_auto_imax module_parameter. Older hardware can only change interrupt coalescing by mailbox command. Newer hardware supports change via a register. The patch support both. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12scsi: lpfc: Fix System panic after loading the driverJames Smart
System panic with general protection fault during driver load The driver uses a static array sli4_hba.handler_name to store the irq handler names. If the io_channel_irqs exceeds the pre-allocated size (32+1), then the driver will overwrite other fields of sli4_hba. Fix: Dynamically allocate handler_name. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16scsi: lpfc: Cleanup entry_repost settings on SLI4 queuesJames Smart
Too many work items being processed in IRQ context take a lot of CPU time and cause problems. With a recent change, we get out of the ISR after hitting entry_repost work items on a queue. However, the actual values for entry repost are still high. EQ is 128 and CQ is 128, this could translate into processing 128 * 128 (16384) work items under IRQ context. Set entry_repost in the actual queue creation routine now. Limit EQ repost to 8 and CQ repost to 64 to further limit the amount of time spent in the IRQ. Fix fof IRQ routines as well. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16scsi: lpfc: Added recovery logic for running out of NVMET IO context resourcesJames Smart
Previous logic would just drop the IO. Added logic to queue the IO to wait for an IO context resource from an IO thats already in progress. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16scsi: lpfc: Separate NVMET RQ buffer posting from IO resources SGL/iocbq/contextJames Smart
Currently IO resources are mapped 1 to 1 with RQ buffers posted Added logic to separate RQE buffers from IO op resources (sgl/iocbq/context). During initialization, the driver will determine how many SGLs it will allocate for NVMET (based on what the firmware reports) and associate a NVMET IOCBq and NVMET context structure with each one. Now that hdr/data buffers are immediately reposted back to the RQ, 512 RQEs for each MRQ is sufficient. Also, since NVMET data buffers are now 128 bytes, lpfc_nvmet_mrq_post is not necessary anymore as we will always post the max (512) buffers per NVMET MRQ. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>