author     Linus Torvalds <torvalds@linux-foundation.org>  2022-08-03 16:29:08 -0700
committer  Linus Torvalds <torvalds@linux-foundation.org>  2022-08-03 16:29:08 -0700
commit     f86d1fbbe7858884d6754534a0afbb74fc30bc26 (patch)
tree       f61796870edefbe77d495e9d719c68af1d14275b /drivers/net/ipa/gsi_trans.c
parent     526942b8134cc34d25d27f95dfff98b8ce2f6fcd (diff)
parent     7c6327c77d509e78bff76f2a4551fcfee851682e (diff)
Merge tag 'net-next-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking changes from Paolo Abeni:
"Core:
- Refactor the forward memory allocation to better cope with memory
pressure with many open sockets, moving from a per socket cache to
a per-CPU one
- Replace rwlocks with RCU for better fairness in ping, raw sockets
and IP multicast router.
- Network-side support for IO uring zero-copy send.
- A few skb drop reason improvements, including generating the source
file with the string mapping instead of using macro magic.
- Rename reference tracking helpers to a more consistent netdev_*
schema.
- Adapt u64_stats_t type to address load/store tearing issues.
- Refine debug helper usage to reduce the log noise caused by bots.
BPF:
- Improve socket map performance, avoiding skb cloning on read
operation.
- Add support for 64-bit enums, to match types exposed by the kernel.
- Introduce support for sleepable uprobe programs.
- Introduce support for enum textual representation in libbpf.
- New helpers to implement synproxy with eBPF/XDP.
- Improve loop performance, inlining indirect calls when possible.
- Remove all the deprecated libbpf APIs.
- Implement new eBPF-based LSM flavor.
- Add type match support, which allows accurate queries of the types
used by eBPF.
- A few TCP congestion control framework usability improvements.
- Add new infrastructure to manipulate CT entries via eBPF programs.
- Allow for livepatch (KLP) and BPF trampolines to attach to the same
kernel function.
Protocols:
- Introduce per network namespace lookup tables for unix sockets,
increasing scalability and reducing contention.
- Preparation work for Wi-Fi 7 Multi-Link Operation (MLO) support.
- Add support to forcibly close TIME_WAIT TCP sockets via user-space
tools.
- Significant performance improvement for the TLS 1.3 receive path,
both for zero-copy and not-zero-copy.
- Support for changing the initial MPTCP subflow priority/backup
status
- Introduce virtually contiguous buffers for sockets over RDMA, to
cope better with memory pressure.
- Extend CAN ethtool support with timestamping capabilities
- Refactor CAN build infrastructure to allow building only the needed
features.
Driver API:
- Remove devlink mutex to allow parallel commands on multiple links.
- Add support for pause stats in distributed switch.
- Implement devlink helpers to query and flash line cards.
- New helper for phy mode to register conversion.
New hardware / drivers:
- Ethernet DSA driver for the MediaTek MT7531 switch on the Rockchip-based BPI-R2 Pro.
- Ethernet DSA driver for the Renesas RZ/N1 A5PSW switch.
- Ethernet DSA driver for the Microchip LAN937x switch.
- Ethernet PHY driver for the Aquantia AQR113C EPHY.
- CAN driver for the OBD-II ELM327 interface.
- CAN driver for RZ/N1 SJA1000 CAN controller.
- Bluetooth: Infineon CYW55572 Wi-Fi plus Bluetooth combo device.
Drivers:
- Intel Ethernet NICs:
- i40e: add support for vlan pruning
- i40e: add support for XDP fragmented packets
- ice: improved vlan offload support
- ice: add support for PPPoE offload
- Mellanox Ethernet (mlx5)
- refactor packet steering offload for performance and scalability
- extend support for TC offload
- refactor devlink code to clean-up the locking schema
- support stacked vlans for bridge offloads
- use TLS objects pool to improve connection rate
- Netronome Ethernet NICs (nfp):
- extend support for IPv6 fields mangling offload
- add support for vepa mode in HW bridge
- better support for virtio data path acceleration (VDPA)
- enable TSO by default
- Microsoft vNIC driver (mana)
- add support for XDP redirect
- Others Ethernet drivers:
- bonding: add per-port priority support
- microchip lan743x: extend phy support
- Fungible funeth: support UDP segmentation offload and XDP xmit
- Solarflare EF100: add support for virtual function representors
- MediaTek SoC: add XDP support
- Mellanox Ethernet/IB switch (mlxsw):
- dropped support for unreleased H/W (XM router).
- improved stats accuracy
- unified bridge model conversion improving scalability (parts 1-6)
- support for PTP in Spectrum-2 ASICs
- Broadcom PHYs
- add PTP support for BCM54210E
- add support for the BCM53128 internal PHY
- Marvell Ethernet switches (prestera):
- implement support for multicast forwarding offload
- Embedded Ethernet switches:
- refactor OcteonTx MAC filter for better scalability
- improve TC H/W offload for the Felix driver
- refactor the Microchip ksz8 and ksz9477 drivers to share the
probe code (parts 1, 2), add support for phylink mac
configuration
- Other WiFi:
- Microchip wilc1000: disable WEP support and enable WPA3
- Atheros ath10k: encapsulation offload support
Old code removal:
- Neterion vxge ethernet driver: this has been untouched for more than 10 years"
* tag 'net-next-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1890 commits)
doc: sfp-phylink: Fix a broken reference
wireguard: selftests: support UML
wireguard: allowedips: don't corrupt stack when detecting overflow
wireguard: selftests: update config fragments
wireguard: ratelimiter: use hrtimer in selftest
net/mlx5e: xsk: Discard unaligned XSK frames on striding RQ
net: usb: ax88179_178a: Bind only to vendor-specific interface
selftests: net: fix IOAM test skip return code
net: usb: make USB_RTL8153_ECM non user configurable
net: marvell: prestera: remove reduntant code
octeontx2-pf: Reduce minimum mtu size to 60
net: devlink: Fix missing mutex_unlock() call
net/tls: Remove redundant workqueue flush before destroy
net: txgbe: Fix an error handling path in txgbe_probe()
net: dsa: Fix spelling mistakes and cleanup code
Documentation: devlink: add add devlink-selftests to the table of contents
dccp: put dccp_qpolicy_full() and dccp_qpolicy_push() in the same lock
net: ionic: fix error check for vlan flags in ionic_set_nic_features()
net: ice: fix error NETIF_F_HW_VLAN_CTAG_FILTER check in ice_vsi_sync_fltr()
nfp: flower: add support for tunnel offload without key ID
...
Diffstat (limited to 'drivers/net/ipa/gsi_trans.c')
-rw-r--r--  drivers/net/ipa/gsi_trans.c  197
1 file changed, 99 insertions, 98 deletions
diff --git a/drivers/net/ipa/gsi_trans.c b/drivers/net/ipa/gsi_trans.c
index 55f8fe7d2668..18e7e8c405be 100644
--- a/drivers/net/ipa/gsi_trans.c
+++ b/drivers/net/ipa/gsi_trans.c
@@ -214,26 +214,14 @@ void *gsi_trans_pool_alloc_dma(struct gsi_trans_pool *pool, dma_addr_t *addr)
 	return pool->base + offset;
 }
 
-/* Return the pool element that immediately follows the one given.
- * This only works done if elements are allocated one at a time.
- */
-void *gsi_trans_pool_next(struct gsi_trans_pool *pool, void *element)
+/* Map a TRE ring entry index to the transaction it is associated with */
+static void gsi_trans_map(struct gsi_trans *trans, u32 index)
 {
-	void *end = pool->base + pool->count * pool->size;
-
-	WARN_ON(element < pool->base);
-	WARN_ON(element >= end);
-	WARN_ON(pool->max_alloc != 1);
-
-	element += pool->size;
+	struct gsi_channel *channel = &trans->gsi->channel[trans->channel_id];
 
-	return element < end ? element : pool->base;
-}
+	/* The completion event will indicate the last TRE used */
+	index += trans->used_count - 1;
 
-/* Map a given ring entry index to the transaction associated with it */
-static void gsi_channel_trans_map(struct gsi_channel *channel, u32 index,
-				  struct gsi_trans *trans)
-{
 	/* Note: index *must* be used modulo the ring count here */
 	channel->trans_info.map[index % channel->tre_ring.count] = trans;
 }
@@ -253,15 +241,31 @@ struct gsi_trans *gsi_channel_trans_complete(struct gsi_channel *channel)
 					struct gsi_trans, links);
 }
 
-/* Move a transaction from the allocated list to the pending list */
+/* Move a transaction from the allocated list to the committed list */
+static void gsi_trans_move_committed(struct gsi_trans *trans)
+{
+	struct gsi_channel *channel = &trans->gsi->channel[trans->channel_id];
+	struct gsi_trans_info *trans_info = &channel->trans_info;
+
+	spin_lock_bh(&trans_info->spinlock);
+
+	list_move_tail(&trans->links, &trans_info->committed);
+
+	spin_unlock_bh(&trans_info->spinlock);
+}
+
+/* Move transactions from the committed list to the pending list */
 static void gsi_trans_move_pending(struct gsi_trans *trans)
 {
 	struct gsi_channel *channel = &trans->gsi->channel[trans->channel_id];
 	struct gsi_trans_info *trans_info = &channel->trans_info;
+	struct list_head list;
 
 	spin_lock_bh(&trans_info->spinlock);
 
-	list_move_tail(&trans->links, &trans_info->pending);
+	/* Move this transaction and all predecessors to the pending list */
+	list_cut_position(&list, &trans_info->committed, &trans->links);
+	list_splice_tail(&list, &trans_info->pending);
 
 	spin_unlock_bh(&trans_info->spinlock);
 }
@@ -340,7 +344,7 @@ struct gsi_trans *gsi_channel_trans_alloc(struct gsi *gsi, u32 channel_id,
 	struct gsi_trans_info *trans_info;
 	struct gsi_trans *trans;
 
-	if (WARN_ON(tre_count > gsi_channel_trans_tre_max(gsi, channel_id)))
+	if (WARN_ON(tre_count > channel->trans_tre_max))
 		return NULL;
 
 	trans_info = &channel->trans_info;
@@ -351,14 +355,14 @@ struct gsi_trans *gsi_channel_trans_alloc(struct gsi *gsi, u32 channel_id,
 	if (!gsi_trans_tre_reserve(trans_info, tre_count))
 		return NULL;
 
-	/* Allocate and initialize non-zero fields in the the transaction */
+	/* Allocate and initialize non-zero fields in the transaction */
 	trans = gsi_trans_pool_alloc(&trans_info->pool, 1);
 	trans->gsi = gsi;
 	trans->channel_id = channel_id;
-	trans->tre_count = tre_count;
+	trans->rsvd_count = tre_count;
 	init_completion(&trans->completion);
 
-	/* Allocate the scatterlist and (if requested) info entries. */
+	/* Allocate the scatterlist */
 	trans->sgl = gsi_trans_pool_alloc(&trans_info->sg_pool, tre_count);
 	sg_init_marker(trans->sgl, tre_count);
 
@@ -400,22 +404,23 @@ void gsi_trans_free(struct gsi_trans *trans)
 	if (!last)
 		return;
 
-	ipa_gsi_trans_release(trans);
+	if (trans->used_count)
+		ipa_gsi_trans_release(trans);
 
 	/* Releasing the reserved TREs implicitly frees the sgl[] and
 	 * (if present) info[] arrays, plus the transaction itself.
 	 */
-	gsi_trans_tre_release(trans_info, trans->tre_count);
+	gsi_trans_tre_release(trans_info, trans->rsvd_count);
 }
 
 /* Add an immediate command to a transaction */
 void gsi_trans_cmd_add(struct gsi_trans *trans, void *buf, u32 size,
 		       dma_addr_t addr, enum ipa_cmd_opcode opcode)
 {
-	u32 which = trans->used++;
+	u32 which = trans->used_count++;
 	struct scatterlist *sg;
 
-	WARN_ON(which >= trans->tre_count);
+	WARN_ON(which >= trans->rsvd_count);
 
 	/* Commands are quite different from data transfer requests.
 	 * Their payloads come from a pool whose memory is allocated
@@ -446,9 +451,9 @@ int gsi_trans_page_add(struct gsi_trans *trans, struct page *page, u32 size,
 	struct scatterlist *sg = &trans->sgl[0];
 	int ret;
 
-	if (WARN_ON(trans->tre_count != 1))
+	if (WARN_ON(trans->rsvd_count != 1))
 		return -EINVAL;
-	if (WARN_ON(trans->used))
+	if (WARN_ON(trans->used_count))
 		return -EINVAL;
 
 	sg_set_page(sg, page, size, offset);
@@ -456,7 +461,7 @@ int gsi_trans_page_add(struct gsi_trans *trans, struct page *page, u32 size,
 	if (!ret)
 		return -ENOMEM;
 
-	trans->used++;	/* Transaction now owns the (DMA mapped) page */
+	trans->used_count++;	/* Transaction now owns the (DMA mapped) page */
 
 	return 0;
 }
@@ -465,25 +470,26 @@ int gsi_trans_page_add(struct gsi_trans *trans, struct page *page, u32 size,
 int gsi_trans_skb_add(struct gsi_trans *trans, struct sk_buff *skb)
 {
 	struct scatterlist *sg = &trans->sgl[0];
-	u32 used;
+	u32 used_count;
 	int ret;
 
-	if (WARN_ON(trans->tre_count != 1))
+	if (WARN_ON(trans->rsvd_count != 1))
 		return -EINVAL;
-	if (WARN_ON(trans->used))
+	if (WARN_ON(trans->used_count))
 		return -EINVAL;
 
 	/* skb->len will not be 0 (checked early) */
 	ret = skb_to_sgvec(skb, sg, 0, skb->len);
 	if (ret < 0)
 		return ret;
-	used = ret;
+	used_count = ret;
 
-	ret = dma_map_sg(trans->gsi->dev, sg, used, trans->direction);
+	ret = dma_map_sg(trans->gsi->dev, sg, used_count, trans->direction);
 	if (!ret)
 		return -ENOMEM;
 
-	trans->used += used;	/* Transaction now owns the (DMA mapped) skb */
+	/* Transaction now owns the (DMA mapped) skb */
+	trans->used_count += used_count;
 
 	return 0;
 }
@@ -549,7 +555,7 @@ static void gsi_trans_tre_fill(struct gsi_tre *dest_tre, dma_addr_t addr,
 static void __gsi_trans_commit(struct gsi_trans *trans, bool ring_db)
 {
 	struct gsi_channel *channel = &trans->gsi->channel[trans->channel_id];
-	struct gsi_ring *ring = &channel->tre_ring;
+	struct gsi_ring *tre_ring = &channel->tre_ring;
 	enum ipa_cmd_opcode opcode = IPA_CMD_NONE;
 	bool bei = channel->toward_ipa;
 	struct gsi_tre *dest_tre;
@@ -559,7 +565,7 @@ static void __gsi_trans_commit(struct gsi_trans *trans, bool ring_db)
 	u32 avail;
 	u32 i;
 
-	WARN_ON(!trans->used);
+	WARN_ON(!trans->used_count);
 
 	/* Consume the entries. If we cross the end of the ring while
 	 * filling them we'll switch to the beginning to finish.
@@ -567,43 +573,39 @@ static void __gsi_trans_commit(struct gsi_trans *trans, bool ring_db)
 	 * transfer request, whose opcode is IPA_CMD_NONE.
	 */
 	cmd_opcode = channel->command ? &trans->cmd_opcode[0] : NULL;
-	avail = ring->count - ring->index % ring->count;
-	dest_tre = gsi_ring_virt(ring, ring->index);
-	for_each_sg(trans->sgl, sg, trans->used, i) {
-		bool last_tre = i == trans->used - 1;
+	avail = tre_ring->count - tre_ring->index % tre_ring->count;
+	dest_tre = gsi_ring_virt(tre_ring, tre_ring->index);
+	for_each_sg(trans->sgl, sg, trans->used_count, i) {
+		bool last_tre = i == trans->used_count - 1;
 		dma_addr_t addr = sg_dma_address(sg);
 		u32 len = sg_dma_len(sg);
 
 		byte_count += len;
 		if (!avail--)
-			dest_tre = gsi_ring_virt(ring, 0);
+			dest_tre = gsi_ring_virt(tre_ring, 0);
 		if (cmd_opcode)
 			opcode = *cmd_opcode++;
 
 		gsi_trans_tre_fill(dest_tre, addr, len, last_tre, bei, opcode);
 		dest_tre++;
 	}
-	ring->index += trans->used;
-
-	if (channel->toward_ipa) {
-		/* We record TX bytes when they are sent */
-		trans->len = byte_count;
-		trans->trans_count = channel->trans_count;
-		trans->byte_count = channel->byte_count;
-		channel->trans_count++;
-		channel->byte_count += byte_count;
-	}
+	/* Associate the TRE with the transaction */
+	gsi_trans_map(trans, tre_ring->index);
 
-	/* Associate the last TRE with the transaction */
-	gsi_channel_trans_map(channel, ring->index - 1, trans);
+	tre_ring->index += trans->used_count;
 
-	gsi_trans_move_pending(trans);
+	trans->len = byte_count;
+	if (channel->toward_ipa)
+		gsi_trans_tx_committed(trans);
+
+	gsi_trans_move_committed(trans);
 
 	/* Ring doorbell if requested, or if all TREs are allocated */
 	if (ring_db || !atomic_read(&channel->trans_info.tre_avail)) {
 		/* Report what we're handing off to hardware for TX channels */
 		if (channel->toward_ipa)
-			gsi_channel_tx_queued(channel);
+			gsi_trans_tx_queued(trans);
+		gsi_trans_move_pending(trans);
 		gsi_channel_doorbell(channel);
 	}
 }
@@ -611,7 +613,7 @@ static void __gsi_trans_commit(struct gsi_trans *trans, bool ring_db)
 /* Commit a GSI transaction */
 void gsi_trans_commit(struct gsi_trans *trans, bool ring_db)
 {
-	if (trans->used)
+	if (trans->used_count)
 		__gsi_trans_commit(trans, ring_db);
 	else
 		gsi_trans_free(trans);
@@ -620,7 +622,7 @@ void gsi_trans_commit(struct gsi_trans *trans, bool ring_db)
 /* Commit a GSI transaction and wait for it to complete */
 void gsi_trans_commit_wait(struct gsi_trans *trans)
 {
-	if (!trans->used)
+	if (!trans->used_count)
 		goto out_trans_free;
 
 	refcount_inc(&trans->refcount);
@@ -638,7 +640,7 @@ void gsi_trans_complete(struct gsi_trans *trans)
 {
 	/* If the entire SGL was mapped when added, unmap it now */
 	if (trans->direction != DMA_NONE)
-		dma_unmap_sg(trans->gsi->dev, trans->sgl, trans->used,
+		dma_unmap_sg(trans->gsi->dev, trans->sgl, trans->used_count,
 			     trans->direction);
 
 	ipa_gsi_trans_complete(trans);
@@ -675,7 +677,7 @@ void gsi_channel_trans_cancel_pending(struct gsi_channel *channel)
 int gsi_trans_read_byte(struct gsi *gsi, u32 channel_id, dma_addr_t addr)
 {
 	struct gsi_channel *channel = &gsi->channel[channel_id];
-	struct gsi_ring *ring = &channel->tre_ring;
+	struct gsi_ring *tre_ring = &channel->tre_ring;
 	struct gsi_trans_info *trans_info;
 	struct gsi_tre *dest_tre;
 
@@ -685,12 +687,12 @@ int gsi_trans_read_byte(struct gsi *gsi, u32 channel_id, dma_addr_t addr)
 	if (!gsi_trans_tre_reserve(trans_info, 1))
 		return -EBUSY;
 
-	/* Now fill the the reserved TRE and tell the hardware */
+	/* Now fill the reserved TRE and tell the hardware */
 
-	dest_tre = gsi_ring_virt(ring, ring->index);
+	dest_tre = gsi_ring_virt(tre_ring, tre_ring->index);
 	gsi_trans_tre_fill(dest_tre, addr, 1, true, false, IPA_CMD_NONE);
 
-	ring->index++;
+	tre_ring->index++;
 	gsi_channel_doorbell(channel);
 
 	return 0;
@@ -708,6 +710,7 @@ void gsi_trans_read_byte_done(struct gsi *gsi, u32 channel_id)
 int gsi_channel_trans_init(struct gsi *gsi, u32 channel_id)
 {
 	struct gsi_channel *channel = &gsi->channel[channel_id];
+	u32 tre_count = channel->tre_count;
 	struct gsi_trans_info *trans_info;
 	u32 tre_max;
 	int ret;
@@ -715,68 +718,66 @@ int gsi_channel_trans_init(struct gsi *gsi, u32 channel_id)
 	/* Ensure the size of a channel element is what's expected */
 	BUILD_BUG_ON(sizeof(struct gsi_tre) != GSI_RING_ELEMENT_SIZE);
 
-	/* The map array is used to determine what transaction is associated
-	 * with a TRE that the hardware reports has completed. We need one
-	 * map entry per TRE.
-	 */
 	trans_info = &channel->trans_info;
-	trans_info->map = kcalloc(channel->tre_count, sizeof(*trans_info->map),
-				  GFP_KERNEL);
-	if (!trans_info->map)
-		return -ENOMEM;
 
-	/* We can't use more TREs than there are available in the ring.
-	 * This limits the number of transactions that can be oustanding.
-	 * Worst case is one TRE per transaction (but we actually limit
-	 * it to something a little less than that). We allocate resources
-	 * for transactions (including transaction structures) based on
-	 * this maximum number.
+	/* The tre_avail field is what ultimately limits the number of
+	 * outstanding transactions and their resources. A transaction
+	 * allocation succeeds only if the TREs available are sufficient
+	 * for what the transaction might need.
	 */
 	tre_max = gsi_channel_tre_max(channel->gsi, channel_id);
+	atomic_set(&trans_info->tre_avail, tre_max);
 
-	/* Transactions are allocated one at a time. */
+	/* We can't use more TREs than the number available in the ring.
+	 * This limits the number of transactions that can be outstanding.
+	 * Worst case is one TRE per transaction (but we actually limit
+	 * it to something a little less than that). By allocating a
+	 * power-of-two number of transactions we can use an index
+	 * modulo that number to determine the next one that's free.
+	 * Transactions are allocated one at a time.
+	 */
 	ret = gsi_trans_pool_init(&trans_info->pool, sizeof(struct gsi_trans),
 				  tre_max, 1);
 	if (ret)
-		goto err_kfree;
+		return -ENOMEM;
+
+	/* A completion event contains a pointer to the TRE that caused
+	 * the event (which will be the last one used by the transaction).
+	 * Each entry in this map records the transaction associated
+	 * with a corresponding completed TRE.
+	 */
+	trans_info->map = kcalloc(tre_count, sizeof(*trans_info->map),
+				  GFP_KERNEL);
+	if (!trans_info->map) {
+		ret = -ENOMEM;
+		goto err_trans_free;
+	}
 
 	/* A transaction uses a scatterlist array to represent the data
 	 * transfers implemented by the transaction. Each scatterlist
 	 * element is used to fill a single TRE when the transaction is
 	 * committed. So we need as many scatterlist elements as the
 	 * maximum number of TREs that can be outstanding.
-	 *
-	 * All TREs in a transaction must fit within the channel's TLV FIFO.
-	 * A transaction on a channel can allocate as many TREs as that but
-	 * no more.
	 */
 	ret = gsi_trans_pool_init(&trans_info->sg_pool,
 				  sizeof(struct scatterlist),
-				  tre_max, channel->tlv_count);
+				  tre_max, channel->trans_tre_max);
 	if (ret)
-		goto err_trans_pool_exit;
-
-	/* Finally, the tre_avail field is what ultimately limits the number
-	 * of outstanding transactions and their resources. A transaction
-	 * allocation succeeds only if the TREs available are sufficient for
-	 * what the transaction might need. Transaction resource pools are
-	 * sized based on the maximum number of outstanding TREs, so there
-	 * will always be resources available if there are TREs available.
-	 */
-	atomic_set(&trans_info->tre_avail, tre_max);
+		goto err_map_free;
 
 	spin_lock_init(&trans_info->spinlock);
 	INIT_LIST_HEAD(&trans_info->alloc);
+	INIT_LIST_HEAD(&trans_info->committed);
 	INIT_LIST_HEAD(&trans_info->pending);
 	INIT_LIST_HEAD(&trans_info->complete);
 	INIT_LIST_HEAD(&trans_info->polled);
 
 	return 0;
 
-err_trans_pool_exit:
-	gsi_trans_pool_exit(&trans_info->pool);
-err_kfree:
+err_map_free:
 	kfree(trans_info->map);
+err_trans_free:
+	gsi_trans_pool_exit(&trans_info->pool);
 
 	dev_err(gsi->dev, "error %d initializing channel %u transactions\n",
 		ret, channel_id);
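
The key structural change in the diff above is that a committed transaction is
now recorded in the channel's map[] against the last TRE it consumes:
gsi_trans_map() adds used_count - 1 to the ring index and stores the
transaction at that slot modulo the ring size, so the TRE index reported by a
completion event resolves back to the owning transaction with a single modulo
lookup. The stand-alone sketch below mimics that arithmetic; it uses toy types
and sizes and is illustrative only, not kernel code.

#include <stdio.h>

#define TRE_RING_COUNT 8                /* stand-in for channel->tre_ring.count */

struct toy_trans {
        unsigned int id;
        unsigned int used_count;        /* TREs consumed by this transaction */
};

static struct toy_trans *map[TRE_RING_COUNT];   /* stand-in for trans_info.map */

/* Mirrors gsi_trans_map(): record the transaction at its last TRE's slot */
static void toy_trans_map(struct toy_trans *trans, unsigned int index)
{
        index += trans->used_count - 1;
        map[index % TRE_RING_COUNT] = trans;
}

int main(void)
{
        struct toy_trans a = { .id = 1, .used_count = 3 };
        struct toy_trans b = { .id = 2, .used_count = 4 };
        unsigned int ring_index = 6;    /* next free TRE slot; grows monotonically, wraps via modulo */

        toy_trans_map(&a, ring_index);  /* occupies TRE slots 6, 7, 0 */
        ring_index += a.used_count;
        toy_trans_map(&b, ring_index);  /* occupies TRE slots 1, 2, 3, 4 */

        /* A completion event naming TRE 0 resolves to transaction 1,
         * one naming TRE 4 resolves to transaction 2.
         */
        printf("TRE 0 -> trans %u\n", map[0 % TRE_RING_COUNT]->id);
        printf("TRE 4 -> trans %u\n", map[4 % TRE_RING_COUNT]->id);

        return 0;
}

The diff likewise splits commit into two stages: __gsi_trans_commit() moves the
transaction onto the new committed list, and only when the doorbell is actually
rung does gsi_trans_move_pending() move it, together with any committed
predecessors, onto the pending list.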