summaryrefslogtreecommitdiff
path: root/drivers/net/mv643xx_eth.c
AgeCommit message (Collapse)Author
2008-11-03mv643xx_eth: fix SMI bus access timeoutsLennert Buytenhek
The mv643xx_eth mii bus implementation uses wait_event_timeout() to wait for SMI completion interrupts. If wait_event_timeout() would return zero, mv643xx_eth would conclude that the SMI access timed out, but this is not necessarily true -- wait_event_timeout() can also return zero in the case where the SMI completion interrupt did happen in time but where it took longer than the requested timeout for the process performing the SMI access to be scheduled again. This would lead to occasional SMI access timeouts when the system would be under heavy load. The fix is to ignore the return value of wait_event_timeout(), and to re-check the SMI done bit after wait_event_timeout() returns to determine whether or not the SMI access timed out. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-10-08mv643xx_eth: include linux/ip.h to fix buildLennert Buytenhek
mv643xx_eth uses ip_hdr() (defined in linux/ip.h), but relied on another header file to include the needed header file indirectly. In latest net-next this indirect include chain is gone, so the driver fails to build. Include linux/ip.h explicitly to fix this. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08phylib: move to dynamic allocation of struct mii_busLennert Buytenhek
This patch introduces mdiobus_alloc() and mdiobus_free(), and makes all mdio bus drivers use these functions to allocate their struct mii_bus'es dynamically. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Andy Fleming <afleming@freescale.com>
2008-10-08phylib: rename mii_bus::dev to mii_bus::parentLennert Buytenhek
In preparation of giving mii_bus objects a device tree presence of their own, rename struct mii_bus's ->dev argument to ->parent, since having a 'struct device *dev' that points to our parent device conflicts with introducing a 'struct device dev' representing our own device. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Andy Fleming <afleming@freescale.com>
2008-10-01mv643xx_eth: hook up skb recyclingLennert Buytenhek
This gives a nice increase in the maximum loss-free packet forwarding rate in routing workloads. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-19mv643xx_eth: bump version to 1.4Lennert Buytenhek
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-19mv643xx_eth: convert to phylibLennert Buytenhek
Switch mv643xx_eth from using drivers/net/mii.c to using phylib. Since the mv643xx_eth hardware does all the link state handling and PHY polling, the driver will use phylib in the "Doing it all yourself" mode described in the phylib documentation. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Acked-by: Andy Fleming <afleming@freescale.com>
2008-09-19mv643xx_eth: enforce frequent hardware statistics pollingLennert Buytenhek
If we don't poll the hardware statistics counters at least once every ~34 seconds, overflow might occur without us noticing. So, set up a timer to poll the statistics counters at least once every 30 seconds. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-19mv643xx_eth: deal with unexpected ethernet header sizesLennert Buytenhek
When the IP header doesn't start 14, 18, 22 or 26 bytes into the packet (which are the only four cases that the hardware can deal with if asked to do IP checksumming on transmit), invoke the software checksum helper instead of letting the packet go out with a corrupt checksum inserted into the packet in the wrong place. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-19mv643xx_eth: fix receive checksummingLennert Buytenhek
We have to explicitly tell the hardware to include the pseudo-header when doing receive checksumming, otherwise hardware checksumming will fail for every received packet and we'll end up setting CHECKSUM_NONE on every received packet. While we're at it, when skb->ip_summed is set to CHECKSUM_UNNECESSARY on received packets, skb->csum is supposed to be undefined, and thus there is no need to set it. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-14mv643xx_eth: add support for chips without transmit bandwidth controlLennert Buytenhek
Add support for mv643xx_eth versions that have no transmit bandwidth control registers at all, such as the ethernet block found in the Marvell 88F6183 ARM SoC. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-14mv643xx_eth: avoid reading ->byte_cnt twice during receive processingLennert Buytenhek
Currently, the receive processing reads ->byte_cnt twice (once to update interface statistics and once to properly size the data area of the received skb), but since receive descriptors live in uncached memory, caching this value in a local variable saves one uncached access, and increases routing performance a tiny little bit more. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-14mv643xx_eth: shrink default receive and transmit queue sizesLennert Buytenhek
Since the size of the receive queue is directly related to the data cache footprint of the driver (between refilling a receive ring entry with a fresh skb and receiving a packet in that entry, queue_size - 1 other skbs will have been touched), shrink the default receive queue size to a saner number of entries, as 400 is definite overkill for almost all workloads. While we are at it, trim the default transmit queue size a bit as well. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-14mv643xx_eth: replace array of skbs awaiting transmit completion with a queueLennert Buytenhek
Get rid of the skb pointer array that we currently use for transmit reclaim, and replace it with an skb queue, to which skbuffs are appended when they are passed to the xmit function, and removed from the front and freed when we do transmit queue reclaim and hit a descriptor with the 'owned by device' bit clear and 'last descriptor' bit set. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-14mv643xx_eth: avoid dropping tx lock during transmit reclaimLennert Buytenhek
By moving DMA unmapping during transmit reclaim back under the netif tx lock, we avoid the situation where we read the DMA address and buffer length from the descriptor under the lock and then not do anything with that data after dropping the lock on platforms where the DMA unmapping routines are all NOPs (which is the case on all ARM platforms that mv643xx_eth is used on at least). This saves two uncached reads, which makes a small but measurable performance difference in routing benchmarks. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-14mv643xx_eth: switch to netif tx queue lock, get rid of private spinlockLennert Buytenhek
Since our ->hard_start_xmit() method is already called under spinlock protection (the netif tx queue lock), we can simply make that lock cover the private transmit state (descriptor ring indexes et al.) as well, which avoids having to use a private lock to protect that state. Since this was the last user of the driver-private spinlock, it can be killed off. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-14mv643xx_eth: move all work to the napi poll handlerLennert Buytenhek
Move link status handling, transmit reclaim and TX_END handling from the interrupt handler to the napi poll handler. This allows switching ->lock over to a non-IRQ-safe lock and removes all explicit interrupt disabling from the driver. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-05mv643xx_eth: transmit multiqueue supportLennert Buytenhek
As all the infrastructure for multiple transmit queues already exists in the driver, this patch is entirely trivial. The individual transmit queues are still serialised by the driver's per-port private spinlock, but that will disappear (i.e. be replaced by the per-subqueue ->_xmit_lock) in a subsequent patch. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-05mv643xx_eth: delete unused and uninteresting interrupt source mask bitsLennert Buytenhek
Delete a couple of unused and uninteresting interrupt source mask bits: - The receive resource underrun interrupt sources are uninteresting because if we are in out-of-memory mode, we are already dealing with the issue, and we don't need the hardware to remind us again that we are out of memory. - The LINK and PHY interrupt sources can be coalesced into one define, since we always use them together. - The transmit resource underrun interrupt source can be disabled since we never activate the head descriptor of a paged skb until the fragments are all activated, so transmit underrun during a packet should never happen. - The INT_EXT_TX_0 define is never used. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-05mv643xx_eth: get rid of netif_{stop,wake}_queue() calls on link down/upLennert Buytenhek
There is no need to call netif_{stop,wake}_queue() when the link goes down/up, as the networking already takes care of this internally. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-05mv643xx_eth: remove force_phy_addr fieldLennert Buytenhek
Currently, there are two different fields in the mv643xx_eth_platform_data struct that together describe the PHY address -- one field (phy_addr) has the address of the PHY, but if that address is zero, a second field (force_phy_addr) needs to be set to distinguish the actual address zero from a zero due to not having filled in the PHY address explicitly (which should mean 'use the default PHY address'). If we are a bit smarter about the encoding of the phy_addr field, we can avoid the need for a second field -- this patch does that. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-05mv643xx_eth: smi sharing is a per-unit property, not a per-port oneLennert Buytenhek
Which top-level unit's SMI interface to use should be a property of the top-level unit, not of the individual ports. This patch moves the ->shared_smi pointer from the per-port platform data to the global platform data. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-05mv643xx_eth: require contiguous receive and transmit queue numberingLennert Buytenhek
Simplify receive and transmit queue handling by requiring the set of queue numbers to be contiguous starting from zero. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-05mv643xx_eth: get rid of compile-time configurable transmit checksummingLennert Buytenhek
Get rid of the mv643xx_eth-internal MV643XX_ETH_CHECKSUM_OFFLOAD_TX compile-time option. Using transmit checksumming is the sane default, and anyone wanting to disable it should use ethtool(8) instead of recompiling their kernels. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-05mv643xx_eth: get rid of receive-side lockingLennert Buytenhek
By having the receive out-of-memory handling timer schedule the napi poll handler and then doing oom processing from the napi poll handler, all code that touches receive state moves to napi context, letting us get rid of all explicit locking in the receive paths since the only mutual exclusion we need anymore at that point is protection against reentering ourselves, which is provided by napi synchronisation. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-05mv643xx_eth: make napi unconditionalLennert Buytenhek
Make napi unconditional on the receive side, so that we can get rid of all the locking and local interrupt disabling in the receive path. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-05mv643xx_eth: use the SMI done interrupt to wait for SMI access completionLennert Buytenhek
If the platform code has passed us the IRQ number of the mv643xx_eth top-level error interrupt, use the error interrupt to wait for SMI access completion instead of polling the SMI busy bit, since SMI bus accesses can take up to tens of milliseconds. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-05mv643xx_eth: switch ->phy_lock from a spinlock to a mutexLennert Buytenhek
Since commit 81600eea98789da09a32de69ca9d3be8b9503c54 ("mv643xx_eth: use auto phy polling for configuring (R)(G)MII interface"), mv643xx_eth no longer does SMI accesses from interrupt context. The only other callers that do SMI accesses all do them from process context, which means we can switch the PHY lock from a spinlock to a mutex, and get rid of the extra locking in some ethtool methods. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-05mv643xx_eth: get rid of modulo operationsLennert Buytenhek
Get rid of the modulo operations that are currently used for computing successive TX/RX descriptor ring indexes. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-05mv643xx_eth: get rid of IRQF_SAMPLE_RANDOMLennert Buytenhek
Using IRQF_SAMPLE_RANDOM for the mv643xx_eth interrupt handler significantly increases interrupt processing overhead, so get rid of it. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-05mv643xx_eth: fix receive buffer DMA unmappingLennert Buytenhek
When tearing down a DMA mapping for a receive buffer, we should pass dma_unmap_single() the exact same address that dma_map_single() gave us when we originally set up the mapping. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-09-05mv643xx_eth: fix 'netdev_priv(dev) == dev->priv' assumptionLennert Buytenhek
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-08-24mv643xx_eth: bump version to 1.3Lennert Buytenhek
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-08-24mv643xx_eth: enforce multiple-of-8-bytes receive buffer size restrictionLennert Buytenhek
The mv643xx_eth hardware ignores the lower three bits of the buffer size field in receive descriptors, causing the reception of full-sized packets to fail at some MTUs. Fix this by rounding the size of allocated receive buffers up to a multiple of eight bytes. While we are at it, add a bit of extra space to each receive buffer so that we can handle multiple vlan tags on ingress. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-08-24mv643xx_eth: fix NULL pointer dereference in rxq_process()Lennert Buytenhek
When we are low on memory, the assumption that every descriptor in the receive ring will have an skbuff associated with it does not hold. rxq_process() was assuming that if the receive descriptor it is working on is not owned by the hardware, it can safely be processed and handed to the networking stack. But a descriptor in the receive ring not being owned by the hardware can also happen when we are low on memory and did not manage to refill the receive ring fully. This patch changes rxq_process()'s bailout condition from "the first receive descriptor to be processed is owned by the hardware" to "the first receive descriptor to be processed is owned by the hardware OR the number of valid receive descriptors in the ring is zero". Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-08-24mv643xx_eth: fix inconsistent lock semanticsLennert Buytenhek
Nicolas Pitre noted that mv643xx_eth_poll was incorrectly using non-IRQ-safe locks while checking whether to wake up the netdevice's transmit queue. Convert the locking to *_irq() variants, since we are running from softirq context where interrupts are enabled. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-08-24mv643xx_eth: fix double add_timer() on the receive oom timerLennert Buytenhek
Commit 12e4ab79cd828563dc090d2117dc8626b344bc8f ("mv643xx_eth: be more agressive about RX refill") changed the condition for the receive out-of-memory timer to be scheduled from "the receive ring is empty" to "the receive ring is not full". This can lead to a situation where the receive out-of-memory timer is pending because a previous rxq_refill() didn't manage to refill the receive ring entirely as a result of being out of memory, and rxq_refill() is then called again as a side effect of a packet receive interrupt, and that rxq_refill() call then again does not succeed to refill the entire receive ring with fresh empty skbuffs because we are still out of memory, and then tries to call add_timer() on the already scheduled out-of-memory timer. This patch fixes this issue by changing the add_timer() call in rxq_refill() to a mod_timer() call. If the OOM timer was not already scheduled, this will behave as before, whereas if it was already scheduled, this patch will push back its firing time a bit, which is safe because we've (unsuccessfully) attempted to refill the receive ring just before we do this. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-08-24mv643xx_eth: fix NAPI 'rotting packet' issueLennert Buytenhek
When a receive interrupt occurs, mv643xx_eth would first process the receive descriptors and then ACK the receive interrupt, instead of the other way round. This would leave a small race window between processing the last receive descriptor and clearing the receive interrupt status in which a new packet could come in, which would then 'rot' in the receive ring until the next receive interrupt would come in. Fix this by ACKing (clearing) the receive interrupt condition before processing the receive descriptors. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-07-24mv643xx_eth: bump version to 1.2Lennert Buytenhek
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-07-24mv643xx_eth: enable hardware TX checksumming with vlan tagsLennert Buytenhek
Although mv643xx_eth has no hardware support for inserting a vlan tag by twiddling some bits in the TX descriptor, it does support hardware TX checksumming on packets where the IP header starts {a limited set of values other than 14} bytes into the packet. This patch sets mv643xx_eth's ->vlan_features to NETIF_F_SG | NETIF_F_IP_CSUM, which prevents the stack from checksumming vlan'ed packets in software, and if vlan tags are present on a transmitted packet, notifies the hardware of this fact by toggling the right bits in the TX descriptor. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-07-24mv643xx_eth: print message on link status changeLennert Buytenhek
When there is a link status change (link or phy status interrupt), print a message notifying the user of the new link status. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-07-24mv643xx_eth: use auto phy polling for configuring (R)(G)MII interfaceLennert Buytenhek
The mv643xx_eth hardware has a provision for polling the PHY's MII management registers to obtain the (R)(G)MII interface speed (10/100/1000) and duplex (half/full) and pause (off/symmetric) settings to use to talk to the PHY. The driver currently does not make use of this feature. Instead, whenever there is a link status change event, it reads the current link parameters from the PHY, and programs those parameters into the mv643xx_eth MAC by hand. This patch switches the mv643xx_eth driver to letting the MAC auto-determine the (R)(G)MII link parameters by PHY polling, if there is a PHY present. For PHYless ports (when e.g. the (R)(G)MII interface is connected to a hardware switch), we keep hardcoding the MII interface parameters. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-07-24mv643xx_eth: print driver version on initLennert Buytenhek
Print the mv643xx_eth driver version on init to help debugging. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-07-24mv643xx_eth: use symbolic MII register addresses and valuesLennert Buytenhek
Instead of hardcoding MII register addresses and values, use the symbolic constants defined in linux/mii.h. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-07-24mv643xx_eth: use longer DMA burstsLennert Buytenhek
The mv643xx_eth driver is limiting DMA bursts to 32 bytes, while using the largest burst size (128 bytes) gives a couple percentage points performance improvement in throughput tests, and the docs say that the 128 byte default should not need to be changed, so use 128 byte bursts instead. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-07-24mv643xx_eth: also check TX_IN_PROGRESS when disabling transmit pathLennert Buytenhek
The recommended sequence for waiting for the transmit path to clear after disabling all of the transmit queues is to wait for the TX_FIFO_EMPTY bit in the Port Status register to become set as well as the TX_IN_PROGRESS bit to clear. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-07-24mv643xx_eth: don't fiddle with maximum receive packet size settingLennert Buytenhek
The maximum receive packet size field in the Port Serial Control register controls at what size received packets are flagged overlength in the receive descriptor, but it doesn't prevent overlength packets from being DMAd to memory and signaled to the host like other received packets. mv643xx_eth does not support receiving jumbo frames in 10/100 mode, but setting the packet threshold to larger than 1522 bytes in 10/100 mode won't cause breakage by itself. If we really want to enforce maximum packet size on the receiving end instead of on the sending end where it should be done, we can always just add a length check to the software receive handler instead of relying on the hardware to do the comparison for us. What's more, changing the maximum packet size field requires temporarily disabling the RX/TX paths. So once the link comes up in 10/100 Mb/s mode or 1000 Mb/s mode, we'd have to disable it again just to set the right maximum packet size field (1522 in 10/100 Mb/s mode or 9700 in 1000 Mb/s mode), just so that we can offload one comparison operation to hardware that we might as well do in software, assuming that we'd want to do it at all. Contrary to what the documentation suggests, there is no harm in just setting a 9700 byte maximum packet size in 10/100 mode, so use the maximum maximum packet size for all modes. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-07-24mv643xx_eth: fix transmit-reclaim-in-napi-pollLennert Buytenhek
The mv643xx_eth driver allows doing transmit reclaim from within the napi poll routine, but after doing reclaim, it would forget to check the free transmit descriptor count and wake up the transmit queue if the reclaim caused enough descriptors for a new packet to become available. This would cause the netdev watchdog to occasionally kick in during certain workloads with combined receive and transmit traffic. Fix this by adding a wakeup check identical to the one in the interrupt handler to the napi poll routine. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-07-24mv643xx_eth: prevent breakage when link goes down during transmitLennert Buytenhek
When the ethernet link goes down while mv643xx_eth is transmitting data, transmit DMA can stop before all queued transmit descriptors have been processed. But even the descriptors that _have_ been processed might not be properly marked as done before the transmit DMA unit shuts down. Then when the link comes up again, the hardware transmit pointer might have advanced while not all previous packet descriptors have been marked as transmitted, causing software transmit reclaim to hang waiting for the hardware to finish transmitting a descriptor that it has already skipped. This patch forcibly reclaims all packets on the transmit ring on a link down interrupt, and then resyncs the hardware transmit pointer to what the software's idea of the first free descriptor is. Also, we need to prevent re-waking the transmit queue if we get a 'transmit done' interrupt at the same time as a 'link down' interrupt, which this patch does as well. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
2008-07-24mv643xx_eth: fix TX hang erratum workaroundLennert Buytenhek
The previously merged TX hang erratum workaround ("mv643xx_eth: work around TX hang hardware issue") assumes that TX_END interrupts are delivered simultaneously with or after their corresponding TX interrupts, but this is not always true in practise. In particular, it appears that TX_END interrupts are issued as soon as descriptor fetch returns an invalid descriptor, which may happen before earlier descriptors have been fully transmitted and written back to memory as being done. This hardware behavior can lead to a situation where the current driver code mistakenly assumes that the MAC has given up transmitting before noticing the packets that it is in fact still currently working on, causing the driver to re-kick the transmit queue, which will only cause the MAC to re-fetch the invalid head descriptor, and generate another TX_END interrupt, et cetera, until the packets in the pipe finally finish transmitting and have their descriptors written back to memory, which will then finally break the loop. Fix this by having the erratum workaround not check the 'number of unfinished descriptor', but instead, to compare the software's idea of what the head descriptor pointer should be to the hardware's head descriptor pointer (which is updated on the same conditions as the TX_END interupt is generated on, i.e. possibly before all previous descriptors have been transmitted and written back). Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>