summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2010-05-17net: Use ip_route_input_noref() in input pathEric Dumazet
Use ip_route_input_noref() in ip fast path, to avoid two atomic ops per incoming packet. Note: loopback is excluded from this optimization in ip_rcv_finish() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17net: implements ip_route_input_noref()Eric Dumazet
ip_route_input() is the version returning a refcounted dst, while ip_route_input_noref() returns a non refcounted one. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17net: add a noref bit on skb dstEric Dumazet
Use low order bit of skb->_skb_dst to tell dst is not refcounted. Change _skb_dst to _skb_refdst to make sure all uses are catched. skb_dst() returns the dst, regardless of noref bit set or not, but with a lockdep check to make sure a noref dst is not given if current user is not rcu protected. New skb_dst_set_noref() helper to set an notrefcounted dst on a skb. (with lockdep check) skb_dst_drop() drops a reference only if skb dst was refcounted. skb_dst_force() helper is used to force a refcount on dst, when skb is queued and not anymore RCU protected. Use skb_dst_force() in __sk_add_backlog(), __dev_xmit_skb() if !IFF_XMIT_DST_RELEASE or skb enqueued on qdisc queue, in sock_queue_rcv_skb(), in __nf_queue(). Use skb_dst_force() in dev_requeue_skb(). Note: dst_use_noref() still dirties dst, we might transform it later to do one dirtying per jiffies. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17rps: avoid one atomic in enqueue_to_backlogEric Dumazet
If CONFIG_SMP=y, then we own a queue spinlock, we can avoid the atomic test_and_set_bit() from napi_schedule_prep(). We now have same number of atomic ops per netif_rx() calls than with pre-RPS kernel. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17drivers/net/usb/asix.c: Fix unaligned accessesNeil Jones
Using this driver can cause unaligned accesses in the IP layer This has been fixed by aligning the skb data correctly using the spare room left over by the 4 byte header inserted between packets by the device. Signed-off-by: Neil Jones <NeilJay@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17ibmveth: Add suspend/resume supportBrian King
Adds support for resuming from suspend for IBM virtual ethernet devices. We may have lost an interrupt over the suspend, so we just kick the interrupt handler to process anything that is outstanding. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17ipv6 addrlabel: permit deletion of labels assigned to removed devFlorian Westphal
as addrlabels with an interface index are left alone when the interface gets removed this results in addrlabels that can no longer be removed. Restrict validation of index to adding new addrlabels. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-16Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: include/linux/if_link.h
2010-05-16rtnetlink: make SR-IOV VF interface symmetricChris Wright
Now we have a set of nested attributes: IFLA_VFINFO_LIST (NESTED) IFLA_VF_INFO (NESTED) IFLA_VF_MAC IFLA_VF_VLAN IFLA_VF_TX_RATE This allows a single set to operate on multiple attributes if desired. Among other things, it means a dump can be replayed to set state. The current interface has yet to be released, so this seems like something to consider for 2.6.34. Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-16qeth: synchronize configuration interfaceFrank Blaschka
Synchronize access to the drivers configuration interface. Also do not allow configuration changes during online/offline transition. Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-16qeth: new message if OLM limit is reachedUrsula Braun
z/OS may activate Optimized Latency Mode (OLM) for a connection through an OSA Express3 adapter, which reduces the number of allowed concurrent connections, if adapter is used in shared mode. Create a meaningful message, if activation of an OSA-connection fails due to an active OLM-connection on the shared OSA-adapter. Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-16qeth: exploit HW TX checksummingFrank Blaschka
OSA supports HW TX checksumming in layer 3 mode. Enable this feature and remove software fallback used for TSO. Cleanup checksum bits to indicate OSA can do checksumming only for IPv4 TCP and UDP. Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-16sctp: delete active ICMP proto unreachable timer when free transportWei Yongjun
transport may be free before ICMP proto unreachable timer expire, so we should delete active ICMP proto unreachable timer when transport is going away. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-16net: congestion notifications are not dropped packetsEric Dumazet
vlan/macvlan start_xmit() can inform caller of congestion with NET_XMIT_CN return value. This doesnt mean packet was dropped. Increment normal stat counters instead of tx_dropped. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-16net: Introduce sk_route_nocapsEric Dumazet
TCP-MD5 sessions have intermittent failures, when route cache is invalidated. ip_queue_xmit() has to find a new route, calls sk_setup_caps(sk, &rt->u.dst), destroying the sk->sk_route_caps &= ~NETIF_F_GSO_MASK that MD5 desperately try to make all over its way (from tcp_transmit_skb() for example) So we send few bad packets, and everything is fine when tcp_transmit_skb() is called again for this socket. Since ip_queue_xmit() is at a lower level than TCP-MD5, I chose to use a socket field, sk_route_nocaps, containing bits to mask on sk_route_caps. Reported-by: Bhaskar Dutta <bhaskie@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-16tcp: fix MD5 (RFC2385) supportEric Dumazet
TCP MD5 support uses percpu data for temporary storage. It currently disables preemption so that same storage cannot be reclaimed by another thread on same cpu. We also have to make sure a softirq handler wont try to use also same context. Various bug reports demonstrated corruptions. Fix is to disable preemption and BH. Reported-by: Bhaskar Dutta <bhaskie@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-15net: Consistent skb timestampingEric Dumazet
With RPS inclusion, skb timestamping is not consistent in RX path. If netif_receive_skb() is used, its deferred after RPS dispatch. If netif_rx() is used, its done before RPS dispatch. This can give strange tcpdump timestamps results. I think timestamping should be done as soon as possible in the receive path, to get meaningful values (ie timestamps taken at the time packet was delivered by NIC driver to our stack), even if NAPI already can defer timestamping a bit (RPS can help to reduce the gap) Tom Herbert prefer to sample timestamps after RPS dispatch. In case sampling is expensive (HPET/acpi_pm on x86), this makes sense. Let admins switch from one mode to another, using a new sysctl, /proc/sys/net/core/netdev_tstamp_prequeue Its default value (1), means timestamps are taken as soon as possible, before backlog queueing, giving accurate timestamps. Setting a 0 value permits to sample timestamps when processing backlog, after RPS dispatch, to lower the load of the pre-RPS cpu. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-15xfrm: fix policy unreferencing on larval dropTimo Teras
I mistakenly had the error path to use num_pols to decide how many policies we need to drop (cruft from earlier patch set version which did not handle socket policies right). This is wrong since normally we do not keep explicit references (instead we hold reference to the cache entry which holds references to policies). drop_pols is set to num_pols if we are holding the references, so use that. Otherwise we eventually BUG_ON inside xfrm_policy_destroy due to premature policy deletion. Signed-off-by: Timo Teras <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-15net: adjust handle_macvlan to pass port struct to hookJiri Pirko
Now there's null check here and also again in the hook. Looking at bridge bits which are simmilar, port structure is rcu_dereferenced right away in handle_bridge and passed to hook. Looks nicer. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-15skge: use the DMA state API instead of the pci equivalentsFUJITA Tomonori
This replace the PCI DMA state API (include/linux/pci-dma.h) with the DMA equivalents since the PCI DMA state API will be obsolete. No functional change. For further information about the background: http://marc.info/?l=linux-netdev&m=127037540020276&w=2 Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-15net: reserve ports for applications using fixed port numbersAmerigo Wang
(Dropped the infiniband part, because Tetsuo modified the related code, I will send a separate patch for it once this is accepted.) This patch introduces /proc/sys/net/ipv4/ip_local_reserved_ports which allows users to reserve ports for third-party applications. The reserved ports will not be used by automatic port assignments (e.g. when calling connect() or bind() with port number 0). Explicit port allocation behavior is unchanged. Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-15sysctl: add proc_do_large_bitmapOctavian Purdila
The new function can be used to read/write large bitmaps via /proc. A comma separated range format is used for compact output and input (e.g. 1,3-4,10-10). Writing into the file will first reset the bitmap then update it based on the given input. Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-15sysctl: refactor integer handling proc codeAmerigo Wang
(Based on Octavian's work, and I modified a lot.) As we are about to add another integer handling proc function a little bit of cleanup is in order: add a few helper functions to improve code readability and decrease code duplication. In the process a bug is also fixed: if the user specifies a number with more then 20 digits it will be interpreted as two integers (e.g. 10000...13 will be interpreted as 100.... and 13). Behavior for EFAULT handling was changed as well. Previous to this patch, when an EFAULT error occurred in the middle of a write operation, although some of the elements were set, that was not acknowledged to the user (by shorting the write and returning the number of bytes accepted). EFAULT is now treated just like any other errors by acknowledging the amount of bytes accepted. Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-15Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/inaky/wimax
2010-05-15bridge: update sysfs link names if port device names have changedSimon Arlott
Links for each port are created in sysfs using the device name, but this could be changed after being added to the bridge. As well as being unable to remove interfaces after this occurs (because userspace tools don't recognise the new name, and the kernel won't recognise the old name), adding another interface with the old name to the bridge will cause an error trying to create the sysfs link. This fixes the problem by listening for NETDEV_CHANGENAME notifications and renaming the link. https://bugzilla.kernel.org/show_bug.cgi?id=12743 Signed-off-by: Simon Arlott <simon@fire.lp0.eu> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-15bridge: change console message interfacestephen hemminger
Use one set of macro's for all bridge messages. Note: can't use netdev_XXX macro's because bridge is purely virtual and has no device parent. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-15bridge: netpoll cleanupstephen hemminger
Move code around so that the ifdef for NETPOLL_CONTROLLER don't have to show up in main code path. The control functions should be in helpers that are only compiled if needed. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-15rndis_host: Poll status channel before control channelBen Hutchings
Some RNDIS devices don't respond on the control channel until polled on the status channel. In particular, this was reported to be the case for the 2Wire HomePortal 1000SW. This is roughly based on a patch by John Carr <john.carr@unrouted.co.uk> which is reported to be needed for use with some Windows Mobile devices and which is currently applied by Mandriva. Reported-by: Mark Glassberg <vzeeaxwl@myfairpoint.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Tested-by: Mark Glassberg <vzeeaxwl@myfairpoint.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-14netfilter: xt_TEE depends on NF_CONNTRACKRandy Dunlap
Fix xt_TEE build for the case of NF_CONNTRACK=m and NETFILTER_XT_TARGET_TEE=y: xt_TEE.c:(.text+0x6df5c): undefined reference to `nf_conntrack_untracked' 4x Built with all 4 m/y combinations. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-14Merge branch 'net-2.6' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
2010-05-14qlcnic: add idc debug registersSucheta Chakraborty
When ever driver changes the device state, it should write pci-func number and timestamp in debug registers. Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com> Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-14qlcnic: support quisce modeSucheta Chakraborty
Device can go to quiescent state, during which drivers should refrain from using the device. Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com> Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-14qlcnic: check device classSucheta Chakraborty
pci-func class can be other than ethernet in Qlogic CNA device. Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com> Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-14qlcnic: check IDC versionSucheta Chakraborty
Warn user if IDC version mismatch with different class of drivers. Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com> Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-14qlcnic: cleanup unused codeAmit Kumar Salecha
LRO ring, cut-thru mode and specific fw version are not valid to Qlogic CNA device. Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-14qlcnic: cleanup dma mask settingAmit Kumar Salecha
Device support 64 bit dma mask. Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-14qlcnic: fix caching window registerAmit Kumar Salecha
o Window register is not per pci-func, so caching can result in expected result. Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-14qlcnic: remove obsolete registerAmit Kumar Salecha
MSI_MODE, CAPABILITIES_FW and SCRATCHPAD registers are obsolete. Driver should not use them. Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-14qlcnic: fix context cleanupAmit Kumar Salecha
Before going for recovery, every pci-func should check fw state, irrespective of device state. This to avoid unnecssary sending of command for ctx destroy. Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-14sky2: version 1.28stephen hemminger
Version 1.28 Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-14sky2: Avoid allocating memory in sky2_resumeMike McCormack
Allocating memory can fail, and since we have the memory we need in sky2_resume when sky2_suspend is called, just stop the hardware without freeing the memory it's using. This avoids the possibility of failing because we can't allocate memory in sky2_resume(), and allows sharing code with sky2_restart(). Signed-off-by: Mike McCormack <mikem@ring3k.org> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-14sky2: Refactor down/up code out of sky2_restart()Mike McCormack
Code to bring down all sky2 interfaces and bring it up again can be reused in sky2_suspend and sky2_resume. Factor the code to bring the interfaces down into sky2_all_down and the up code into sky2_all_up. Signed-off-by: Mike McCormack <mikem@ring3k.org> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-14sky2: Shut off interrupts before NAPIMike McCormack
Interrupts should be masked, then synchronized, and finally NAPI should be disabled. Signed-off-by: Mike McCormack <mikem@ring3k.org> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-14sky2: Avoid race in sky2_change_mtuMike McCormack
netif_stop_queue does not ensure all in-progress transmits are complete, so use netif_tx_disable() instead. Secondly, make sure NAPI polls are disabled before stopping the tx queue, otherwise sky2_status_intr might trigger a TX queue wakeup between when we stop the queue and NAPI is disabled. Signed-off-by: Mike McCormack <mikem@ring3k.org> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-14sky2: Restore multicast after restartMike McCormack
Multicast settings will be lost on reset, so restore them. Signed-off-by: Mike McCormack <mikem@ring3k.org> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-14ixgb and e1000: Use new function for copybreak testsJoe Perches
There appears to be an off-by-1 defect in the maximum packet size copied when copybreak is speified in these modules. The copybreak module params are specified as: "Maximum size of packet that is copied to a new buffer on receive" The tests are changed from "< copybreak" to "<= copybreak" and moved into new static functions for readability. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-14e1000: cleanup unused parametersJesse Brandeburg
During the cleanup pass after the removal of e1000e hardware from e1000 some parameters were missed. Remove them because it is just dead code. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-14e1000: fix WARN_ON with mac-vlanJesse Brandeburg
When adding more than 14 mac-vlan adapters on e1000 the driver would fire a WARN_ON when adding the 15th. The WARN_ON in this case is completely un-necessary, as the code below the WARN_ON is directly handling the value the WARN_ON triggered on. CC: Jiri Pirko <jpirko@redhat.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-14drivers/net: Remove unnecessary returns from void function()sJoe Perches
This patch removes from drivers/net/ all the unnecessary return; statements that precede the last closing brace of void functions. It does not remove the returns that are immediately preceded by a label as gcc doesn't like that. It also does not remove null void functions with return. Done via: $ grep -rP --include=*.[ch] -l "return;\n}" net/ | \ xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }' with some cleanups by hand. Compile tested x86 allmodconfig only. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-13ixgbe: Refactor common code between 82598 & 82599 to accommodate new hardwareMallikarjuna R Chilakala
Some of the following MAC functions are moved from 82598 & 82599 specific hardware files to common.[ch] to accommodate new silicon changes. Also fixed some white space issues * get_san_mac_addr, check_link, set_vmdq, clear_vmdq, clear_vfta, * set_vfta, fc_enable, init_uta_tables Signed-off-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>