path: root/drivers/net/ipa/gsi_trans.c
author Linus Torvalds <torvalds@linux-foundation.org> 2022-08-03 16:29:08 -0700
committer Linus Torvalds <torvalds@linux-foundation.org> 2022-08-03 16:29:08 -0700
commit f86d1fbbe7858884d6754534a0afbb74fc30bc26 (patch)
tree f61796870edefbe77d495e9d719c68af1d14275b /drivers/net/ipa/gsi_trans.c
parent 526942b8134cc34d25d27f95dfff98b8ce2f6fcd (diff)
parent 7c6327c77d509e78bff76f2a4551fcfee851682e (diff)
Merge tag 'net-next-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking changes from Paolo Abeni:
 "Core:

   - Refactor the forward memory allocation to better cope with memory pressure with many open sockets, moving from a per socket cache to a per-CPU one
   - Replace rwlocks with RCU for better fairness in ping, raw sockets and IP multicast router.
   - Network-side support for IO uring zero-copy send.
   - A few skb drop reason improvements, including codegen the source file with string mapping instead of using macro magic.
   - Rename reference tracking helpers to a more consistent netdev_* schema.
   - Adapt u64_stats_t type to address load/store tearing issues.
   - Refine debug helper usage to reduce the log noise caused by bots.

  BPF:

   - Improve socket map performance, avoiding skb cloning on read operation.
   - Add support for 64 bits enum, to match types exposed by kernel.
   - Introduce support for sleepable uprobes program.
   - Introduce support for enum textual representation in libbpf.
   - New helpers to implement synproxy with eBPF/XDP.
   - Improve loop performances, inlining indirect calls when possible.
   - Removed all the deprecated libbpf APIs.
   - Implement new eBPF-based LSM flavor.
   - Add type match support, which allows accurate queries to the eBPF used types.
   - A few TCP congestion control framework usability improvements.
   - Add new infrastructure to manipulate CT entries via eBPF programs.
   - Allow for livepatch (KLP) and BPF trampolines to attach to the same kernel function.

  Protocols:

   - Introduce per network namespace lookup tables for unix sockets, increasing scalability and reducing contention.
   - Preparation work for Wi-Fi 7 Multi-Link Operation (MLO) support.
   - Add support to forcibly close TIME_WAIT TCP sockets via user-space tools.
   - Significant performance improvement for the TLS 1.3 receive path, both for zero-copy and not-zero-copy.
   - Support for changing the initial MPTCP subflow priority/backup status
   - Introduce virtually contiguous buffers for sockets over RDMA, to cope better with memory pressure.
   - Extend CAN ethtool support with timestamping capabilities
   - Refactor CAN build infrastructure to allow building only the needed features.

  Driver API:

   - Remove devlink mutex to allow parallel commands on multiple links.
   - Add support for pause stats in distributed switch.
   - Implement devlink helpers to query and flash line cards.
   - New helper for phy mode to register conversion.

  New hardware / drivers:

   - Ethernet DSA driver for the rockchip mt7531 on BPI-R2 Pro.
   - Ethernet DSA driver for the Renesas RZ/N1 A5PSW switch.
   - Ethernet DSA driver for the Microchip LAN937x switch.
   - Ethernet PHY driver for the Aquantia AQR113C EPHY.
   - CAN driver for the OBD-II ELM327 interface.
   - CAN driver for RZ/N1 SJA1000 CAN controller.
   - Bluetooth: Infineon CYW55572 Wi-Fi plus Bluetooth combo device.

  Drivers:

   - Intel Ethernet NICs:
      - i40e: add support for vlan pruning
      - i40e: add support for XDP fragmented packets
      - ice: improved vlan offload support
      - ice: add support for PPPoE offload

   - Mellanox Ethernet (mlx5)
      - refactor packet steering offload for performance and scalability
      - extend support for TC offload
      - refactor devlink code to clean-up the locking schema
      - support stacked vlans for bridge offloads
      - use TLS objects pool to improve connection rate

   - Netronome Ethernet NICs (nfp):
      - extend support for IPv6 fields mangling offload
      - add support for vepa mode in HW bridge
      - better support for virtio data path acceleration (VDPA)
      - enable TSO by default

   - Microsoft vNIC driver (mana)
      - add support for XDP redirect

   - Others Ethernet drivers:
      - bonding: add per-port priority support
      - microchip lan743x: extend phy support
      - Fungible funeth: support UDP segmentation offload and XDP xmit
      - Solarflare EF100: add support for virtual function representors
      - MediaTek SoC: add XDP support

   - Mellanox Ethernet/IB switch (mlxsw):
      - dropped support for unreleased H/W (XM router).
      - improved stats accuracy
      - unified bridge model conversion improving scalability (parts 1-6)
      - support for PTP in Spectrum-2 asics

   - Broadcom PHYs
      - add PTP support for BCM54210E
      - add support for the BCM53128 internal PHY

   - Marvell Ethernet switches (prestera):
      - implement support for multicast forwarding offload

   - Embedded Ethernet switches:
      - refactor OcteonTx MAC filter for better scalability
      - improve TC H/W offload for the Felix driver
      - refactor the Microchip ksz8 and ksz9477 drivers to share the probe code (parts 1, 2), add support for phylink mac configuration

   - Other WiFi:
      - Microchip wilc1000: disable WEP support and enable WPA3
      - Atheros ath10k: encapsulation offload support

  Old code removal:

   - Neterion vxge ethernet driver: this is untouched since more than 10 years"

* tag 'net-next-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1890 commits)
  doc: sfp-phylink: Fix a broken reference
  wireguard: selftests: support UML
  wireguard: allowedips: don't corrupt stack when detecting overflow
  wireguard: selftests: update config fragments
  wireguard: ratelimiter: use hrtimer in selftest
  net/mlx5e: xsk: Discard unaligned XSK frames on striding RQ
  net: usb: ax88179_178a: Bind only to vendor-specific interface
  selftests: net: fix IOAM test skip return code
  net: usb: make USB_RTL8153_ECM non user configurable
  net: marvell: prestera: remove reduntant code
  octeontx2-pf: Reduce minimum mtu size to 60
  net: devlink: Fix missing mutex_unlock() call
  net/tls: Remove redundant workqueue flush before destroy
  net: txgbe: Fix an error handling path in txgbe_probe()
  net: dsa: Fix spelling mistakes and cleanup code
  Documentation: devlink: add add devlink-selftests to the table of contents
  dccp: put dccp_qpolicy_full() and dccp_qpolicy_push() in the same lock
  net: ionic: fix error check for vlan flags in ionic_set_nic_features()
  net: ice: fix error NETIF_F_HW_VLAN_CTAG_FILTER check in ice_vsi_sync_fltr()
  nfp: flower: add support for tunnel offload without key ID
  ...

Diffstat (limited to 'drivers/net/ipa/gsi_trans.c')
-rw-r--r-- drivers/net/ipa/gsi_trans.c 197
1 files changed, 99 insertions, 98 deletions
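
Note on the new "committed" list: in the diff for this file, gsi_trans_move_pending() no longer moves a single transaction. It uses list_cut_position() and list_splice_tail() to move the given transaction and all of its predecessors from the committed list to the pending list when the doorbell is rung. The following is a minimal userspace sketch of that list operation only, not the driver code. The names struct node, list_init(), move_through() and list_count() are hypothetical stand-ins written for this illustration; they are not the kernel's list.h API, and locking is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modeled on the kernel's
 * struct list_head.  All names here are hypothetical, for illustration.
 */
struct node {
	struct node *prev, *next;
};

/* Make "head" an empty list (it points at itself) */
static void list_init(struct node *head)
{
	head->prev = head;
	head->next = head;
}

/* Append "entry" at the tail of the list "head" */
static void list_add_tail(struct node *entry, struct node *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Move everything from the front of "from" up to and including "last"
 * onto the tail of "to", preserving order.  "last" must be on "from".
 * This combines the effect of a cut at a position with a tail splice.
 */
static void move_through(struct node *from, struct node *last,
			 struct node *to)
{
	struct node *first = from->next;

	assert(first != from);	/* "from" must not be empty */

	/* Detach the span [first, last] from "from" */
	from->next = last->next;
	last->next->prev = from;

	/* Attach the span at the tail of "to" */
	first->prev = to->prev;
	to->prev->next = first;
	last->next = to;
	to->prev = last;
}

/* Count the entries on a list */
static size_t list_count(const struct node *head)
{
	const struct node *n;
	size_t count = 0;

	for (n = head->next; n != head; n = n->next)
		count++;

	return count;
}
```

With three entries a, b and c queued on a "committed" list in that order, calling move_through(&committed, &b, &pending) leaves only c on the committed list and places a followed by b on the pending list. This mirrors transactions that were committed without ringing the doorbell becoming pending together once a later commit rings it.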
diff --git a/drivers/net/ipa/gsi_trans.c b/drivers/net/ipa/gsi_trans.c
index 55f8fe7d2668..18e7e8c405be 100644
--- a/drivers/net/ipa/gsi_trans.c
+++ b/drivers/net/ipa/gsi_trans.c
@@ -214,26 +214,14 @@ void *gsi_trans_pool_alloc_dma(struct gsi_trans_pool *pool, dma_addr_t *addr)
return pool->base + offset;
}
-/* Return the pool element that immediately follows the one given.
- * This only works done if elements are allocated one at a time.
- */
-void *gsi_trans_pool_next(struct gsi_trans_pool *pool, void *element)
+/* Map a TRE ring entry index to the transaction it is associated with */
+static void gsi_trans_map(struct gsi_trans *trans, u32 index)
{
- void *end = pool->base + pool->count * pool->size;
-
- WARN_ON(element < pool->base);
- WARN_ON(element >= end);
- WARN_ON(pool->max_alloc != 1);
-
- element += pool->size;
+ struct gsi_channel *channel = &trans->gsi->channel[trans->channel_id];
- return element < end ? element : pool->base;
-}
+ /* The completion event will indicate the last TRE used */
+ index += trans->used_count - 1;
-/* Map a given ring entry index to the transaction associated with it */
-static void gsi_channel_trans_map(struct gsi_channel *channel, u32 index,
- struct gsi_trans *trans)
-{
/* Note: index *must* be used modulo the ring count here */
channel->trans_info.map[index % channel->tre_ring.count] = trans;
}
@@ -253,15 +241,31 @@ struct gsi_trans *gsi_channel_trans_complete(struct gsi_channel *channel)
struct gsi_trans, links);
}
-/* Move a transaction from the allocated list to the pending list */
+/* Move a transaction from the allocated list to the committed list */
+static void gsi_trans_move_committed(struct gsi_trans *trans)
+{
+ struct gsi_channel *channel = &trans->gsi->channel[trans->channel_id];
+ struct gsi_trans_info *trans_info = &channel->trans_info;
+
+ spin_lock_bh(&trans_info->spinlock);
+
+ list_move_tail(&trans->links, &trans_info->committed);
+
+ spin_unlock_bh(&trans_info->spinlock);
+}
+
+/* Move transactions from the committed list to the pending list */
static void gsi_trans_move_pending(struct gsi_trans *trans)
{
struct gsi_channel *channel = &trans->gsi->channel[trans->channel_id];
struct gsi_trans_info *trans_info = &channel->trans_info;
+ struct list_head list;
spin_lock_bh(&trans_info->spinlock);
- list_move_tail(&trans->links, &trans_info->pending);
+ /* Move this transaction and all predecessors to the pending list */
+ list_cut_position(&list, &trans_info->committed, &trans->links);
+ list_splice_tail(&list, &trans_info->pending);
spin_unlock_bh(&trans_info->spinlock);
}
@@ -340,7 +344,7 @@ struct gsi_trans *gsi_channel_trans_alloc(struct gsi *gsi, u32 channel_id,
struct gsi_trans_info *trans_info;
struct gsi_trans *trans;
- if (WARN_ON(tre_count > gsi_channel_trans_tre_max(gsi, channel_id)))
+ if (WARN_ON(tre_count > channel->trans_tre_max))
return NULL;
trans_info = &channel->trans_info;
@@ -351,14 +355,14 @@ struct gsi_trans *gsi_channel_trans_alloc(struct gsi *gsi, u32 channel_id,
if (!gsi_trans_tre_reserve(trans_info, tre_count))
return NULL;
- /* Allocate and initialize non-zero fields in the the transaction */
+ /* Allocate and initialize non-zero fields in the transaction */
trans = gsi_trans_pool_alloc(&trans_info->pool, 1);
trans->gsi = gsi;
trans->channel_id = channel_id;
- trans->tre_count = tre_count;
+ trans->rsvd_count = tre_count;
init_completion(&trans->completion);
- /* Allocate the scatterlist and (if requested) info entries. */
+ /* Allocate the scatterlist */
trans->sgl = gsi_trans_pool_alloc(&trans_info->sg_pool, tre_count);
sg_init_marker(trans->sgl, tre_count);
@@ -400,22 +404,23 @@ void gsi_trans_free(struct gsi_trans *trans)
if (!last)
return;
- ipa_gsi_trans_release(trans);
+ if (trans->used_count)
+ ipa_gsi_trans_release(trans);
/* Releasing the reserved TREs implicitly frees the sgl[] and
* (if present) info[] arrays, plus the transaction itself.
*/
- gsi_trans_tre_release(trans_info, trans->tre_count);
+ gsi_trans_tre_release(trans_info, trans->rsvd_count);
}
/* Add an immediate command to a transaction */
void gsi_trans_cmd_add(struct gsi_trans *trans, void *buf, u32 size,
dma_addr_t addr, enum ipa_cmd_opcode opcode)
{
- u32 which = trans->used++;
+ u32 which = trans->used_count++;
struct scatterlist *sg;
- WARN_ON(which >= trans->tre_count);
+ WARN_ON(which >= trans->rsvd_count);
/* Commands are quite different from data transfer requests.
* Their payloads come from a pool whose memory is allocated
@@ -446,9 +451,9 @@ int gsi_trans_page_add(struct gsi_trans *trans, struct page *page, u32 size,
struct scatterlist *sg = &trans->sgl[0];
int ret;
- if (WARN_ON(trans->tre_count != 1))
+ if (WARN_ON(trans->rsvd_count != 1))
return -EINVAL;
- if (WARN_ON(trans->used))
+ if (WARN_ON(trans->used_count))
return -EINVAL;
sg_set_page(sg, page, size, offset);
@@ -456,7 +461,7 @@ int gsi_trans_page_add(struct gsi_trans *trans, struct page *page, u32 size,
if (!ret)
return -ENOMEM;
- trans->used++; /* Transaction now owns the (DMA mapped) page */
+ trans->used_count++; /* Transaction now owns the (DMA mapped) page */
return 0;
}
@@ -465,25 +470,26 @@ int gsi_trans_page_add(struct gsi_trans *trans, struct page *page, u32 size,
int gsi_trans_skb_add(struct gsi_trans *trans, struct sk_buff *skb)
{
struct scatterlist *sg = &trans->sgl[0];
- u32 used;
+ u32 used_count;
int ret;
- if (WARN_ON(trans->tre_count != 1))
+ if (WARN_ON(trans->rsvd_count != 1))
return -EINVAL;
- if (WARN_ON(trans->used))
+ if (WARN_ON(trans->used_count))
return -EINVAL;
/* skb->len will not be 0 (checked early) */
ret = skb_to_sgvec(skb, sg, 0, skb->len);
if (ret < 0)
return ret;
- used = ret;
+ used_count = ret;
- ret = dma_map_sg(trans->gsi->dev, sg, used, trans->direction);
+ ret = dma_map_sg(trans->gsi->dev, sg, used_count, trans->direction);
if (!ret)
return -ENOMEM;
- trans->used += used; /* Transaction now owns the (DMA mapped) skb */
+ /* Transaction now owns the (DMA mapped) skb */
+ trans->used_count += used_count;
return 0;
}
@@ -549,7 +555,7 @@ static void gsi_trans_tre_fill(struct gsi_tre *dest_tre, dma_addr_t addr,
static void __gsi_trans_commit(struct gsi_trans *trans, bool ring_db)
{
struct gsi_channel *channel = &trans->gsi->channel[trans->channel_id];
- struct gsi_ring *ring = &channel->tre_ring;
+ struct gsi_ring *tre_ring = &channel->tre_ring;
enum ipa_cmd_opcode opcode = IPA_CMD_NONE;
bool bei = channel->toward_ipa;
struct gsi_tre *dest_tre;
@@ -559,7 +565,7 @@ static void __gsi_trans_commit(struct gsi_trans *trans, bool ring_db)
u32 avail;
u32 i;
- WARN_ON(!trans->used);
+ WARN_ON(!trans->used_count);
/* Consume the entries. If we cross the end of the ring while
* filling them we'll switch to the beginning to finish.
@@ -567,43 +573,39 @@ static void __gsi_trans_commit(struct gsi_trans *trans, bool ring_db)
* transfer request, whose opcode is IPA_CMD_NONE.
*/
cmd_opcode = channel->command ? &trans->cmd_opcode[0] : NULL;
- avail = ring->count - ring->index % ring->count;
- dest_tre = gsi_ring_virt(ring, ring->index);
- for_each_sg(trans->sgl, sg, trans->used, i) {
- bool last_tre = i == trans->used - 1;
+ avail = tre_ring->count - tre_ring->index % tre_ring->count;
+ dest_tre = gsi_ring_virt(tre_ring, tre_ring->index);
+ for_each_sg(trans->sgl, sg, trans->used_count, i) {
+ bool last_tre = i == trans->used_count - 1;
dma_addr_t addr = sg_dma_address(sg);
u32 len = sg_dma_len(sg);
byte_count += len;
if (!avail--)
- dest_tre = gsi_ring_virt(ring, 0);
+ dest_tre = gsi_ring_virt(tre_ring, 0);
if (cmd_opcode)
opcode = *cmd_opcode++;
gsi_trans_tre_fill(dest_tre, addr, len, last_tre, bei, opcode);
dest_tre++;
}
- ring->index += trans->used;
-
- if (channel->toward_ipa) {
- /* We record TX bytes when they are sent */
- trans->len = byte_count;
- trans->trans_count = channel->trans_count;
- trans->byte_count = channel->byte_count;
- channel->trans_count++;
- channel->byte_count += byte_count;
- }
+ /* Associate the TRE with the transaction */
+ gsi_trans_map(trans, tre_ring->index);
- /* Associate the last TRE with the transaction */
- gsi_channel_trans_map(channel, ring->index - 1, trans);
+ tre_ring->index += trans->used_count;
- gsi_trans_move_pending(trans);
+ trans->len = byte_count;
+ if (channel->toward_ipa)
+ gsi_trans_tx_committed(trans);
+
+ gsi_trans_move_committed(trans);
/* Ring doorbell if requested, or if all TREs are allocated */
if (ring_db || !atomic_read(&channel->trans_info.tre_avail)) {
/* Report what we're handing off to hardware for TX channels */
if (channel->toward_ipa)
- gsi_channel_tx_queued(channel);
+ gsi_trans_tx_queued(trans);
+ gsi_trans_move_pending(trans);
gsi_channel_doorbell(channel);
}
}
@@ -611,7 +613,7 @@ static void __gsi_trans_commit(struct gsi_trans *trans, bool ring_db)
/* Commit a GSI transaction */
void gsi_trans_commit(struct gsi_trans *trans, bool ring_db)
{
- if (trans->used)
+ if (trans->used_count)
__gsi_trans_commit(trans, ring_db);
else
gsi_trans_free(trans);
@@ -620,7 +622,7 @@ void gsi_trans_commit(struct gsi_trans *trans, bool ring_db)
/* Commit a GSI transaction and wait for it to complete */
void gsi_trans_commit_wait(struct gsi_trans *trans)
{
- if (!trans->used)
+ if (!trans->used_count)
goto out_trans_free;
refcount_inc(&trans->refcount);
@@ -638,7 +640,7 @@ void gsi_trans_complete(struct gsi_trans *trans)
{
/* If the entire SGL was mapped when added, unmap it now */
if (trans->direction != DMA_NONE)
- dma_unmap_sg(trans->gsi->dev, trans->sgl, trans->used,
+ dma_unmap_sg(trans->gsi->dev, trans->sgl, trans->used_count,
trans->direction);
ipa_gsi_trans_complete(trans);
@@ -675,7 +677,7 @@ void gsi_channel_trans_cancel_pending(struct gsi_channel *channel)
int gsi_trans_read_byte(struct gsi *gsi, u32 channel_id, dma_addr_t addr)
{
struct gsi_channel *channel = &gsi->channel[channel_id];
- struct gsi_ring *ring = &channel->tre_ring;
+ struct gsi_ring *tre_ring = &channel->tre_ring;
struct gsi_trans_info *trans_info;
struct gsi_tre *dest_tre;
@@ -685,12 +687,12 @@ int gsi_trans_read_byte(struct gsi *gsi, u32 channel_id, dma_addr_t addr)
if (!gsi_trans_tre_reserve(trans_info, 1))
return -EBUSY;
- /* Now fill the the reserved TRE and tell the hardware */
+ /* Now fill the reserved TRE and tell the hardware */
- dest_tre = gsi_ring_virt(ring, ring->index);
+ dest_tre = gsi_ring_virt(tre_ring, tre_ring->index);
gsi_trans_tre_fill(dest_tre, addr, 1, true, false, IPA_CMD_NONE);
- ring->index++;
+ tre_ring->index++;
gsi_channel_doorbell(channel);
return 0;
@@ -708,6 +710,7 @@ void gsi_trans_read_byte_done(struct gsi *gsi, u32 channel_id)
int gsi_channel_trans_init(struct gsi *gsi, u32 channel_id)
{
struct gsi_channel *channel = &gsi->channel[channel_id];
+ u32 tre_count = channel->tre_count;
struct gsi_trans_info *trans_info;
u32 tre_max;
int ret;
@@ -715,68 +718,66 @@ int gsi_channel_trans_init(struct gsi *gsi, u32 channel_id)
/* Ensure the size of a channel element is what's expected */
BUILD_BUG_ON(sizeof(struct gsi_tre) != GSI_RING_ELEMENT_SIZE);
- /* The map array is used to determine what transaction is associated
- * with a TRE that the hardware reports has completed. We need one
- * map entry per TRE.
- */
trans_info = &channel->trans_info;
- trans_info->map = kcalloc(channel->tre_count, sizeof(*trans_info->map),
- GFP_KERNEL);
- if (!trans_info->map)
- return -ENOMEM;
- /* We can't use more TREs than there are available in the ring.
- * This limits the number of transactions that can be oustanding.
- * Worst case is one TRE per transaction (but we actually limit
- * it to something a little less than that). We allocate resources
- * for transactions (including transaction structures) based on
- * this maximum number.
+ /* The tre_avail field is what ultimately limits the number of
+ * outstanding transactions and their resources. A transaction
+ * allocation succeeds only if the TREs available are sufficient
+ * for what the transaction might need.
*/
tre_max = gsi_channel_tre_max(channel->gsi, channel_id);
+ atomic_set(&trans_info->tre_avail, tre_max);
- /* Transactions are allocated one at a time. */
+ /* We can't use more TREs than the number available in the ring.
+ * This limits the number of transactions that can be outstanding.
+ * Worst case is one TRE per transaction (but we actually limit
+ * it to something a little less than that). By allocating a
+ * power-of-two number of transactions we can use an index
+ * modulo that number to determine the next one that's free.
+ * Transactions are allocated one at a time.
+ */
ret = gsi_trans_pool_init(&trans_info->pool, sizeof(struct gsi_trans),
tre_max, 1);
if (ret)
- goto err_kfree;
+ return -ENOMEM;
+
+ /* A completion event contains a pointer to the TRE that caused
+ * the event (which will be the last one used by the transaction).
+ * Each entry in this map records the transaction associated
+ * with a corresponding completed TRE.
+ */
+ trans_info->map = kcalloc(tre_count, sizeof(*trans_info->map),
+ GFP_KERNEL);
+ if (!trans_info->map) {
+ ret = -ENOMEM;
+ goto err_trans_free;
+ }
/* A transaction uses a scatterlist array to represent the data
* transfers implemented by the transaction. Each scatterlist
* element is used to fill a single TRE when the transaction is
* committed. So we need as many scatterlist elements as the
* maximum number of TREs that can be outstanding.
- *
- * All TREs in a transaction must fit within the channel's TLV FIFO.
- * A transaction on a channel can allocate as many TREs as that but
- * no more.
*/
ret = gsi_trans_pool_init(&trans_info->sg_pool,
sizeof(struct scatterlist),
- tre_max, channel->tlv_count);
+ tre_max, channel->trans_tre_max);
if (ret)
- goto err_trans_pool_exit;
-
- /* Finally, the tre_avail field is what ultimately limits the number
- * of outstanding transactions and their resources. A transaction
- * allocation succeeds only if the TREs available are sufficient for
- * what the transaction might need. Transaction resource pools are
- * sized based on the maximum number of outstanding TREs, so there
- * will always be resources available if there are TREs available.
- */
- atomic_set(&trans_info->tre_avail, tre_max);
+ goto err_map_free;
spin_lock_init(&trans_info->spinlock);
INIT_LIST_HEAD(&trans_info->alloc);
+ INIT_LIST_HEAD(&trans_info->committed);
INIT_LIST_HEAD(&trans_info->pending);
INIT_LIST_HEAD(&trans_info->complete);
INIT_LIST_HEAD(&trans_info->polled);
return 0;
-err_trans_pool_exit:
- gsi_trans_pool_exit(&trans_info->pool);
-err_kfree:
+err_map_free:
kfree(trans_info->map);
+err_trans_free:
+ gsi_trans_pool_exit(&trans_info->pool);
dev_err(gsi->dev, "error %d initializing channel %u transactions\n",
ret, channel_id);