summaryrefslogtreecommitdiff
path: root/net/core
AgeCommit message (Collapse)Author
2006-12-02[NET]: Size listen hash tables using backlog hintEric Dumazet
We currently allocate a fixed size (TCP_SYNQ_HSIZE=512) slots hash table for each LISTEN socket, regardless of various parameters (listen backlog for example) On x86_64, this means order-1 allocations (might fail), even for 'small' sockets, expecting few connections. On the contrary, a huge server wanting a backlog of 50000 is slowed down a bit because of this fixed limit. This patch makes the sizing of listen hash table a dynamic parameter, depending of : - net.core.somaxconn tunable (default is 128) - net.ipv4.tcp_max_syn_backlog tunable (default : 256, 1024 or 128) - backlog value given by user application (2nd parameter of listen()) For large allocations (bigger than PAGE_SIZE), we use vmalloc() instead of kmalloc(). We still limit memory allocation with the two existing tunables (somaxconn & tcp_max_syn_backlog). So for standard setups, this patch actually reduce RAM usage. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02[NET] rules: Add support to invert selectorsThomas Graf
Introduces a new flag FIB_RULE_INVERT causing rules to apply if the specified selector doesn't match. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02[NET] rules: Protocol independant mark selectorThomas Graf
Move mark selector currently implemented per protocol into the protocol independant part. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02[NET]: Turn nfmark into generic markThomas Graf
nfmark is being used in various subsystems and has become the defacto mark field for all kinds of packets. Therefore it makes sense to rename it to `mark' and remove the dependency on CONFIG_NETFILTER. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02[BLUETOOTH] lockdep: annotate sk_lock nesting in AF_BLUETOOTHPeter Zijlstra
============================================= [ INFO: possible recursive locking detected ] 2.6.18-1.2726.fc6 #1
2006-12-02[PATCH] netdev: don't allow register_netdev with blank nameStephen Hemminger
This bit of old backwards compatibility cruft can be removed in 2.6.20. If there is still an device that calls register_netdev() with a zero or blank name, it will get -EINVAL from register_netdevice(). Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-11-07[NET]: Set truesize in pskb_copyHerbert Xu
Since pskb_copy tacks on the non-linear bits from the original skb, it needs to count them in the truesize field of the new skb. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-11-07[NETPOLL]: Compute checksum properly in netpoll_send_udp().Chris Lalancette
Signed-off-by: Chris Lalancette <clalance@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-11-05[NET]: __alloc_pages() failures reported due to fragmentationLarry Woodman
We have seen a couple of __alloc_pages() failures due to fragmentation, there is plenty of free memory but no large order pages available. I think the problem is in sock_alloc_send_pskb(), the gfp_mask includes __GFP_REPEAT but its never used/passed to the page allocator. Shouldnt the gfp_mask be passed to alloc_skb() ? Signed-off-by: Larry Woodman <lwoodman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-11-05[PKTGEN]: TCI endianness fixesAl Viro
open-coded variant there works only for little-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-30[NET]: Fix segmentation of linear packetsHerbert Xu
skb_segment fails to segment linear packets correctly because it tries to write all linear parts of the original skb into each segment. This will always panic as each segment only contains enough space for one MSS. This was not detected earlier because linear packets should be rare for GSO. In fact it still remains to be seen what exactly created the linear packets that triggered this bug. Basically the only time this should happen is if someone enables GSO emulation on an interface that does not support SG. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-21Merge branch 'we21-fix' of ↵Jeff Garzik
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into tmp
2006-10-19[NETPOLL]: initialize skb for UDPStephen Hemminger
Need to fully initialize skb to keep lower layers and queueing happy. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-19[PATCH] wireless: WE-20 compatibility for ESSID and NICKN ioctlsJohn W. Linville
WE-21 changed the ABI for the SIOC[SG]IW{ESSID,NICKN} ioctls by dropping NULL termination. This patch adds compatibility code so that WE-21 can work properly with WE-20 (and older) tools. Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-10-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/dtor/inputLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: fm801-gp - handle errors from pci_enable_device() Input: gameport core - handle errors returned by device_bind_driver() Input: serio core - handle errors returned by device_bind_driver() Lockdep: fix compile error in drivers/input/serio/serio.c Input: serio - add lockdep annotations Lockdep: add lockdep_set_class_and_subclass() and lockdep_set_subclass() Input: atkbd - supress "too many keys" error message Input: i8042 - supress ACK/NAKs when blinking during panic Input: add missing exports to fix modular build
2006-10-17[PATCH] rename net_random to random32Stephen Hemminger
Make net_random() more widely available by calling it random32 akpm: hopefully this will permit the removal of carta_random32. That needs confirmation from Stephane - this code looks somewhat more computationally expensive, and has a different (ie: callee-stateful) interface. [akpm@osdl.org: lots of build fixes, cleanups] Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Stephane Eranian <eranian@hpl.hp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-12[RTNETLINK]: Fix use of wrong skb in do_getlink()Patrick McHardy
skb is the netlink query, nskb is the reply message. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-11[NET]: File descriptor loss while receiving SCM_RIGHTSMiklos Szeredi
If more than one file descriptor was sent with an SCM_RIGHTS message, and on the receiving end, after installing a nonzero (but not all) file descritpors the process runs out of fds, then the already installed fds will be lost (userspace will have no way of knowing about them). The following patch makes sure, that at least the already installed fds are sent to userspace. It doesn't solve the issue of losing file descriptors in case of an EFAULT on the userspace buffer. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-11IPsec: propagate security module errors up from flow_cache_lookupJames Morris
When a security module is loaded (in this case, SELinux), the security_xfrm_policy_lookup() hook can return an access denied permission (or other error). We were not handling that correctly, and in fact inverting the return logic and propagating a false "ok" back up to xfrm_lookup(), which then allowed packets to pass as if they were not associated with an xfrm policy. The way I was seeing the problem was when connecting via IPsec to a confined service on an SELinux box (vsftpd), which did not have the appropriate SELinux policy permissions to send packets via IPsec. The first SYNACK would be blocked, because of an uncached lookup via flow_cache_lookup(), which would fail to resolve an xfrm policy because the SELinux policy is checked at that point via the resolver. However, retransmitted SYNACKs would then find a cached flow entry when calling into flow_cache_lookup() with a null xfrm policy, which is interpreted by xfrm_lookup() as the packet not having any associated policy and similarly to the first case, allowing it to pass without transformation. The solution presented here is to first ensure that errno values are correctly propagated all the way back up through the various call chains from security_xfrm_policy_lookup(), and handled correctly. Then, flow_cache_lookup() is modified, so that if the policy resolver fails (typically a permission denied via the security module), the flow cache entry is killed rather than having a null policy assigned (which indicates that the packet can pass freely). This also forces any future lookups for the same flow to consult the security module (e.g. SELinux) for current security policy (rather than, say, caching the error on the flow cache entry). Signed-off-by: James Morris <jmorris@namei.org>
2006-10-11Lockdep: add lockdep_set_class_and_subclass() and lockdep_set_subclass()Peter Zijlstra
This annotation makes it possible to assign a subclass on lock init. This annotation is meant to reduce the _nested() annotations by assigning a default subclass. One could do without this annotation and rely on lockdep_set_class() exclusively, but that would require a manual stack of struct lock_class_key objects. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2006-10-04Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/confighLinus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davej/configh: Remove all inclusions of <linux/config.h> Manually resolved trivial path conflicts due to removed files in the sound/oss/ subdirectory.
2006-10-04Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [XFRM]: BEET mode [TCP]: Kill warning in tcp_clean_rtx_queue(). [NET_SCHED]: Remove old estimator implementation [ATM]: [zatm] always *pcr in alloc_shaper() [ATM]: [ambassador] Change the return type to reflect reality [ATM]: kmalloc to kzalloc patches for drivers/atm [TIPC]: fix printk warning [XFRM]: Clearing xfrm_policy_count[] to zero during flush is incorrect. [XFRM] STATE: Use destination address for src hash. [NEIGH]: always use hash_mask under tbl lock [UDP]: Fix MSG_PROBE crash [UDP6]: Fix flowi clobbering [NET_SCHED]: Revert "HTB: fix incorrect use of RB_EMPTY_NODE" [NETFILTER]: ebt_mark: add or/and/xor action support to mark target [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function [NETFILTER]: Honour source routing for LVS-NAT [NETFILTER]: add type parameter to ip_route_me_harder [NETFILTER]: Kconfig: fix xt_physdev dependencies
2006-10-04[PATCH] slab: clean up leak tracking ifdefs a little bitChristoph Hellwig
- rename ____kmalloc to kmalloc_track_caller so that people have a chance to guess what it does just from it's name. Add a comment describing it for those who don't. Also move it after kmalloc in slab.h so people get less confused when they are just looking for kmalloc - move things around in slab.c a little to reduce the ifdef mess. [penberg@cs.helsinki.fi: Fix up reversed #ifdef] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Christoph Lameter <clameter@engr.sgi.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04Remove all inclusions of <linux/config.h>Dave Jones
kbuild explicitly includes this at build time. Signed-off-by: Dave Jones <davej@redhat.com>
2006-10-04[NEIGH]: always use hash_mask under tbl lockJulian Anastasov
Make sure hash_mask is protected with tbl->lock in all cases just like the hash_buckets. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28[ETHTOOL]: Remove some entries from non-root command list.David S. Miller
GWOL might provide passwords GSET, GLINK, and GSTATS might poke the hardware Based upon feedback from Jeff Garzik. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28[ETHTOOL]: let mortals use ethtoolStephen Hemminger
There is no reason to not allow non-admin users to query network statistics and settings. [ Removed PHYS_ID and GREGS based upon feedback from Auke Kok and Michael Chan -DaveM] Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28[NET]: Annotate dst_ops protocolAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28[NET]: is it Andy or Andi ??Steven Rostedt
I don't know of any Andy Kleen's but I do know a Andi Kleen. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28[NET_SCHED]: Fix fallout from dev->qdisc RCU changePatrick McHardy
The move of qdisc destruction to a rcu callback broke locking in the entire qdisc layer by invalidating previously valid assumptions about the context in which changes to the qdisc tree occur. The two assumptions were: - since changes only happen in process context, read_lock doesn't need bottem half protection. Now invalid since destruction of inner qdiscs, classifiers, actions and estimators happens in the RCU callback unless they're manually deleted, resulting in dead-locks when read_lock in process context is interrupted by write_lock_bh in bottem half context. - since changes only happen under the RTNL, no additional locking is necessary for data not used during packet processing (f.e. u32_list). Again, since destruction now happens in the RCU callback, this assumption is not valid anymore, causing races while using this data, which can result in corruption or use-after-free. Instead of "fixing" this by disabling bottem halfs everywhere and adding new locks/refcounting, this patch makes these assumptions valid again by moving destruction back to process context. Since only the dev->qdisc pointer is protected by RCU, but ->enqueue and the qdisc tree are still protected by dev->qdisc_lock, destruction of the tree can be performed immediately and only the final free needs to happen in the rcu callback to make sure dev_queue_xmit doesn't access already freed memory. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28[PKTGEN]: DSCP supportFrancesco Fondelli
Anyway, I've been asked to add support for managing DSCP codepoints, so one can test DiffServ capable routers. It's very simple code and is working for me. Signed-off-by: Francesco Fondelli <francesco.fondelli@gmail.com> Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28[PKTGEN]: vlan supportFrancesco Fondelli
The attached patch allows pktgen to produce 802.1Q and Q-in-Q tagged frames. I have used it for stress test a bridge and seems ok to me. Unfortunately I have no access to net-2.6.x git tree so the diff is against 2.6.17.13. Signed-off-by: Francesco Fondelli <francesco.fondelli@gmail.com> Acked-by: Steven Whitehouse <steve@chygwyn.com> Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28[RTNETLINK]: Possible dereference in net/core/rtnetlink.cEric Sesterhenn
another possible dereference spotted by coverity (#cid 1390). if the nlmsg_parse() call fails, we goto errout, where we call dev_put(), with dev still initialized to NULL. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-25[PATCH] WE-21 support (core API)John W. Linville
This is version 21 of the Wireless Extensions. Changelog : o finishes migrating the ESSID API (remove the +1) o netdev->get_wireless_stats is no more o long/short retry This is a redacted version of a patch originally submitted by Jean Tourrilhes. I removed most of the additions, in order to minimize future support requirements for nl80211 (or other WE successor). CC: Jean Tourrilhes <jt@hpl.hp.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-09-24Merge branch 'upstream-linus' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (217 commits) net/ieee80211: fix more crypto-related build breakage [PATCH] Spidernet: add ethtool -S (show statistics) [NET] GT96100: Delete bitrotting ethernet driver [PATCH] mv643xx_eth: restrict to 32-bit PPC_MULTIPLATFORM [PATCH] Cirrus Logic ep93xx ethernet driver r8169: the MMIO region of the 8167 stands behin BAR#1 e1000, ixgb: Remove pointless wrappers [PATCH] Remove powerpc specific parts of 3c509 driver [PATCH] s2io: Switch to pci_get_device [PATCH] gt96100: move to pci_get_device API [PATCH] ehea: bugfix for register access functions [PATCH] e1000 disable device on PCI error drivers/net/phy/fixed: #if 0 some incomplete code drivers/net: const-ify ethtool_ops declarations [PATCH] ethtool: allow const ethtool_ops [PATCH] sky2: big endian [PATCH] sky2: fiber support [PATCH] sky2: tx pause bug fix drivers/net: Trim trailing whitespace [PATCH] ehea: IBM eHEA Ethernet Device Driver ... Manually resolved conflicts in drivers/net/ixgb/ixgb_main.c and drivers/net/sky2.c related to CHECKSUM_HW/CHECKSUM_PARTIAL changes by commit 84fa7933a33f806bbbaae6775e87459b1ec584c0 that just happened to be next to unrelated changes in this update.
2006-09-22[IPV6] ADDRCONF: Convert addrconf_lock to RCU.YOSHIFUJI Hideaki
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22[IPV6] NDISC: Set per-entry is_router flag in Proxy NA.Ville Nuorvala
We have sent NA with router flag from the node-wide forwarding configuration. This is not appropriate for proxy NA, and it should be set according to each proxy entry's configuration. This is used by Mobile IPv6 home agent to support physical home link in acting as a proxy router for mobile node which is not a router, for example. Based on MIPL2 kernel patch. Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2006-09-22[RTNETLINK]: Fix netdevice name corruptionPatrick McHardy
When changing a device by ifindex without including a IFLA_IFNAME attribute, the ifname variable contains random garbage and is used to change the device name. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22[NET]: Fix sk->sk_filter field accessDmitry Mishin
Function sk_filter() is called from tcp_v{4,6}_rcv() functions with arg needlock = 0, while socket is not locked at that moment. In order to avoid this and similar issues in the future, use rcu for sk->sk_filter field read protection. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: Kirill Korotaev <dev@openvz.org>
2006-09-22[RTNETLINK]: Fix typo causing wrong skb to be freedThomas Graf
A typo introduced by myself which leads to freeing the skb containing the netlink message when it should free the newly allocated skb for the reply. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22[NETLINK]: Make use of NLA_STRING/NLA_NUL_STRING attribute validationThomas Graf
Converts existing NLA_STRING attributes to use the new validation features, saving a couple of temporary buffers. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22[NET]: Use SLAB_PANICAlexey Dobriyan
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22[NET] in6_pton: Kill errant printf statement.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22[NET]: Add common helper functions to convert IPv6/IPv4 address string to ↵YOSHIFUJI Hideaki
network address structure. These helpers can be used in netfilter, cifs etc. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2006-09-22[RTNETLINK]: Don't return error on no-metrics.David S. Miller
Instead just cancel the nested attribute and return 0. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22[IPv6] route: Convert FIB6 dumping to use new netlink apiThomas Graf
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22[NET] neighbour: reduce exportsStephen Hemminger
There are several symbols only used by rtnetlink and since it can not be a module, there is no reason to export them. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22[NET/IPV4/IPV6]: Change some sysctl variables to __read_mostlyBrian Haley
Change net/core, ipv4 and ipv6 sysctl variables to __read_mostly. Couldn't actually measure any performance increase while testing (.3% I consider noise), but seems like the right thing to do. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22[RTNETLINK]: Unexport rtnl socketThomas Graf
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22[NET] link: Convert notifications to use rtnl_notify()Thomas Graf
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>