Age  Commit message  Author
2011-12-12  tcp memory pressure controls  Glauber Costa
This patch introduces memory pressure controls for the tcp protocol. It uses the generic socket memory pressure code introduced in earlier patches, and fills in the necessary data in cg_proto struct. Signed-off-by: Glauber Costa <glommer@parallels.com> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtisu.com> CC: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12  socket: initial cgroup code.  Glauber Costa
The goal of this work is to move the memory pressure tcp controls to a cgroup, instead of just relying on global conditions. To avoid excessive overhead in the network fast paths, the code that accounts allocated memory to a cgroup is hidden inside a static_branch(). This branch is patched out until the first non-root cgroup is created. So when nobody is using cgroups, even if it is mounted, no significant performance penalty should be seen. This patch handles the generic part of the code, and has nothing tcp-specific. Signed-off-by: Glauber Costa <glommer@parallels.com> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtsu.com> CC: Kirill A. Shutemov <kirill@shutemov.name> CC: David S. Miller <davem@davemloft.net> CC: Eric W. Biederman <ebiederm@xmission.com> CC: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12  foundations of per-cgroup memory pressure controlling.  Glauber Costa
This patch replaces all uses of the struct sock fields memory_pressure, memory_allocated, sockets_allocated, and sysctl_mem with accessor macros. Those macros can receive either a socket argument or a mem_cgroup argument, depending on the context they live in. Since we're only doing a macro wrapping here, no performance impact at all is expected in the case where we don't have cgroups disabled. Signed-off-by: Glauber Costa <glommer@parallels.com> Reviewed-by: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com> CC: David S. Miller <davem@davemloft.net> CC: Eric W. Biederman <ebiederm@xmission.com> CC: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12  Basic kernel memory functionality for the Memory Controller  Glauber Costa
This patch lays down the foundation for the kernel memory component of the Memory Controller. As of today, I am only laying down the following files:
 * memory.independent_kmem_limit
 * memory.kmem.limit_in_bytes (currently ignored)
 * memory.kmem.usage_in_bytes (always zero)
Signed-off-by: Glauber Costa <glommer@parallels.com> CC: Kirill A. Shutemov <kirill@shutemov.name> CC: Paul Menage <paul@paulmenage.org> CC: Greg Thelen <gthelen@google.com> CC: Johannes Weiner <jweiner@redhat.com> CC: Michal Hocko <mhocko@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12  xen-netfront: delay gARP until backend switches to Connected  Laszlo Ersek
After a guest is live migrated, the xen-netfront driver emits a gratuitous ARP message, so that networking hardware on the target host's subnet can take notice, and public routing to the guest is re-established. However, if the packet appears on the backend interface before the backend is added to the target host's bridge, the packet is lost, and the migrated guest's peers become unable to talk to the guest.
A sufficient two-part condition to prevent the above is:
(1) ensure that the backend only moves to the Connected xenbus state after its hotplug scripts have completed, i.e. the netback interface has been added to the bridge; and
(2) ensure the frontend only queues the gARP when it sees the backend move to Connected.
These two together provide complete ordering. Sub-condition (1) is already satisfied by commit f942dc2552b8 in Linus' tree, based on commit 6b0b80ca7165 from [1]. In general, the full condition is sufficient, not necessary, because, according to [2], live migration has been working for a long time without satisfying sub-condition (2). However, after 6b0b80ca7165 was backported to the RHEL-5 host to ensure (1), (2) still proved necessary in the RHEL-6 guest. This patch intends to provide (2) for upstream. The Reviewed-by line comes from [3].
[1] git://xenbits.xen.org/people/ianc/linux-2.6.git#upstream/dom0/backend/netback-history
[2] http://old-list-archives.xen.org/xen-devel/2011-06/msg01969.html
[3] http://old-list-archives.xen.org/xen-devel/2011-07/msg00484.html
Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12  Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem  John W. Linville
2011-12-11  net: use IS_ENABLED(CONFIG_IPV6)  Eric Dumazet
Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
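As a rough illustration of what this substitution buys, the sketch below compares the old two-symbol preprocessor test with a single IS_ENABLED-style macro. All DEMO_* macros and the function names are hypothetical and only mimic the idea in user space; the kernel's own definition lives in its kconfig header and may be written differently. The configuration (IPv6 built as a module) is an assumption made for the demo.

```c
#include <assert.h>

/* Assumption for this demo: IPv6 was configured as a module, so only the
 * _MODULE variant of the symbol is defined. */
#define CONFIG_IPV6_MODULE 1

/* Hypothetical helpers: DEMO_IS_DEFINED(x) expands to 1 if x is defined
 * to 1, and to 0 if x is not defined at all. */
#define DEMO_PLACEHOLDER_1 0,
#define DEMO_TAKE_SECOND(ignored, val, ...) val
#define DEMO_IS_DEFINED_3(arg1_or_junk) DEMO_TAKE_SECOND(arg1_or_junk 1, 0, 0)
#define DEMO_IS_DEFINED_2(val) DEMO_IS_DEFINED_3(DEMO_PLACEHOLDER_##val)
#define DEMO_IS_DEFINED(x) DEMO_IS_DEFINED_2(x)

/* One test that covers both the built-in and the modular case. */
#define DEMO_IS_ENABLED(option) \
	(DEMO_IS_DEFINED(option) || DEMO_IS_DEFINED(option##_MODULE))

/* Old style: both symbols have to be spelled out by hand. */
int ipv6_enabled_old_style(void)
{
#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
	return 1;
#else
	return 0;
#endif
}

/* New style: a single macro, also usable in ordinary C expressions. */
int ipv6_enabled_new_style(void)
{
	return DEMO_IS_ENABLED(CONFIG_IPV6);
}

/* An option that is neither built-in nor modular evaluates to 0. */
int unset_option_enabled(void)
{
	return DEMO_IS_ENABLED(CONFIG_DEMO_UNSET);
}

/* Both styles must agree for any given configuration. */
void demo_self_check(void)
{
	assert(ipv6_enabled_old_style() == ipv6_enabled_new_style());
}
```

Because the macro form is an ordinary expression, it can sit inside an if () as well as inside #if, which is one reason a single-symbol test reads better than the paired defined() checks.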
2011-12-11  be2net: workaround to fix a bug in BE  Ajit Khaparde
Disable Tx vlan offloading in certain cases. Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-11  be2net: update some counters to display via ethtool  Ajit Khaparde
Update the pmem_fifo_overflow_drop and rx_priority_pause_frames counters. Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-10  udp_diag: Fix the !ipv6 case  Pavel Emelyanov
Wrap the udp6 lookup into the proper ifdef-s. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-10  udp_diag: Make it module when ipv6 is a module  Pavel Emelyanov
Eric Dumazet reported that when inet_diag is built-in, udp_diag also goes built-in, and when ipv6 is a module the udp6 lookup symbol is not found.
  LD .tmp_vmlinux1
  net/built-in.o: In function `udp_dump_one':
  udp_diag.c:(.text+0xa2b40): undefined reference to `__udp6_lib_lookup'
  make: *** [.tmp_vmlinux1] Erreur 1
Fix this by making the udp diag build mode depend on both -- inet diag and ipv6. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09  iwlwifi regression in 20111205 merge  Nikolay Martynov
It looks like the regression was introduced between 20111202 and 20111205 (linux-next tree). Symptoms: the connection to the AP seems to be established, but no data goes through it in any way. Tested on an Intel 5300. A peek at the changes showed that at least part of the code wasn't merged properly. It was originally committed into iwl_agn.c, but the code in question was moved to iwl-mac80211.c. This patch puts the code in place and my card works again. Signed-off-by: Nikolay Martynov <mar.kolya@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-09  wl12xx: silence tx_attr uninitialized warning in wl1271_tx_fill_hdr  John W. Linville
  CC [M] drivers/net/wireless/wl12xx/tx.o
  drivers/net/wireless/wl12xx/tx.c: In function ‘wl1271_tx_fill_hdr’:
  drivers/net/wireless/wl12xx/tx.c:288:6: warning: ‘tx_attr’ may be used uninitialized in this function
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-09  udp_diag: Wire the udp_diag module into kbuild  Pavel Emelyanov
Copy-s/tcp/udp/-paste from TCP bits. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09  udp_diag: Implement the dump-all functionality  Pavel Emelyanov
Do the same as TCP does -- iterate the given udp_table, filter sockets with bytecode and dump sockets into reply message. The same filtering as for TCP applies, though only some of the state bits really matter. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09  udp_diag: Implement the get_exact dumping functionality  Pavel Emelyanov
Do the same as TCP does -- look up a socket in the given udp_table, check the cookie, fill the reply message with the existing inet socket dumping helper and send it back. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09  udp_diag: Basic skeleton  Pavel Emelyanov
Introduce the transport level diag handler module for UDP (and UDP-lite) sockets and register (empty for now) callbacks in the inet_diag module. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09  udp: Export code sk lookup routines  Pavel Emelyanov
The UDP diag get_exact handler will require them to find a socket by provided net, [sd]addr-s, [sd]ports and device. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09  inet_diag: Generalize inet_diag dump and get_exact calls  Pavel Emelyanov
Introduce two callbacks in inet_diag_handler -- one for dumping all sockets (with filters) and the other one for dumping a single sk. Replace direct calls to icsk handlers with indirect calls to callbacks provided by handlers. Make existing TCP and DCCP handlers use provided helpers for icsk-s. The UDP diag module will provide its own. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
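The two-callback arrangement described here can be pictured with a small user-space sketch: a handler structure carries one function pointer for dumping all sockets and one for dumping a single socket, and generic code dispatches through them. The structure, field names, protocol table, and return values below are hypothetical stand-ins chosen for illustration, not the kernel's inet_diag_handler layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a per-protocol diag handler. */
struct demo_diag_handler {
	int proto;                     /* protocol number this handler serves */
	int (*dump)(int filter);       /* dump all sockets passing a filter */
	int (*dump_one)(int sock_id);  /* dump a single socket */
};

enum { DEMO_PROTO_TCP = 6, DEMO_PROTO_UDP = 17 };

/* The return values only encode which callback ran, to keep the demo small. */
static int tcp_dump(int filter)  { return 1000 + filter; }
static int tcp_dump_one(int id)  { return 2000 + id; }
static int udp_dump(int filter)  { return 3000 + filter; }
static int udp_dump_one(int id)  { return 4000 + id; }

static const struct demo_diag_handler demo_handlers[] = {
	{ DEMO_PROTO_TCP, tcp_dump, tcp_dump_one },
	{ DEMO_PROTO_UDP, udp_dump, udp_dump_one },
};

static const struct demo_diag_handler *demo_find_handler(int proto)
{
	size_t i;

	for (i = 0; i < sizeof(demo_handlers) / sizeof(demo_handlers[0]); i++)
		if (demo_handlers[i].proto == proto)
			return &demo_handlers[i];
	return NULL;
}

/* Generic entry points: no protocol-specific call appears here, only an
 * indirect call through the handler's callbacks. Returns -1 if no handler
 * is registered for the protocol. */
int demo_diag_dump(int proto, int filter)
{
	const struct demo_diag_handler *h = demo_find_handler(proto);

	if (!h)
		return -1;
	assert(h->dump != NULL);
	return h->dump(filter);
}

int demo_diag_dump_one(int proto, int sock_id)
{
	const struct demo_diag_handler *h = demo_find_handler(proto);

	if (!h)
		return -1;
	assert(h->dump_one != NULL);
	return h->dump_one(sock_id);
}
```

With this shape, adding a new transport only means adding a table entry with its own two callbacks; the generic entry points stay unchanged.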
2011-12-09  inet_diag: Introduce the inet socket dumping routine  Pavel Emelyanov
The existing inet_csk_diag_fill dumps the inet connection sock info into the netlink inet_diag_message. Prepare this routine to be able to dump only the inet_sock part of a socket if the icsk part is missing. This will be used by UDP diag module when dumping UDP sockets. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09  inet_diag: Introduce the byte-code run on an inet socket  Pavel Emelyanov
The upcoming UDP module will require exactly this ability, so just move the existing code to provide one. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09  inet_diag: Split inet_diag_get_exact into parts  Pavel Emelyanov
Similar to previous patch: the 1st part locks the inet handler and will get generalized and the 2nd one dumps icsk-s and will be used by TCP and DCCP handlers. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09  inet_diag: Split inet_diag_get_exact into parts  Pavel Emelyanov
The 1st part locks the inet handler and the 2nd one dumps the inet connection sock. In the next patches the 1st part will be generalized to call the socket dumping routine indirectly (i.e. TCP/UDP/DCCP) and the 2nd part will be used by TCP and DCCP handlers. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09  inet_diag: Export inet diag cookie checking routine  Pavel Emelyanov
The netlink diag subsystem stores sk address bits in the nl message as a "cookie" and uses it when dumping details about a particular socket. The same will be required for the udp diag module, so introduce a helper in the inet_diag module. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09  inet_diag: Reduce the number of args for bytecode run routine  Pavel Emelyanov
Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09  inet_diag: Remove indirect sizeof from inet diag handlers  Pavel Emelyanov
There's an info_size value stored on inet_diag_handler, but for existing code this value is effectively constant, so just use sizeof(struct tcp_info) where required. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09  sch_red: generalize accurate MAX_P support to RED/GRED/CHOKE  Eric Dumazet
Now that RED uses a Q0.32 number to store max_p (max probability), allow RED/GRED/CHOKE to use/report full resolution at config/dump time. Old tc binaries are not aware of the new attributes, and still set/get Plog. New tc binaries set/get both Plog and max_p for backward compatibility; they display "probability value" if they get max_p from new kernels.
  # tc -d qdisc show dev ...
  ...
  qdisc red 10: parent 1:1 limit 360Kb min 30Kb max 90Kb ecn ewma 5 probability 0.09 Scell_log 15
Make sure we avoid potential divides by 0 in reciprocal_value(), if (max_th - min_th) is big. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
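To make the Plog versus max_p distinction concrete, here is a minimal sketch of how a legacy Plog value (max_p = 1/2^Plog) could be mapped onto a Q0.32 fraction, and how a Q0.32 value turns back into the probability a tool might print. The function names are hypothetical, and the saturating treatment of Plog = 0 is an assumption of this sketch, since 1.0 itself does not fit into Q0.32.

```c
#include <assert.h>
#include <stdint.h>

/* Q0.32: an unsigned 32-bit value v represents the fraction v / 2^32. */

/* Map a legacy Plog (max_p = 1 / 2^Plog) to Q0.32.
 * Assumption: Plog = 0 saturates to the largest representable fraction. */
uint32_t demo_plog_to_max_p(unsigned int plog)
{
	assert(plog < 32);  /* a shift by 32 or more would be undefined in C */
	return UINT32_MAX >> plog;
}

/* Convert a Q0.32 fraction to a floating-point probability for display. */
double demo_max_p_to_double(uint32_t max_p)
{
	return max_p / 4294967296.0;  /* divide by 2^32 */
}
```

Plog can only express negative powers of two, whereas a Q0.32 field can hold intermediate values such as the 0.09 shown in the dump above, which is why reporting max_p directly gives full resolution.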
2011-12-09  Revert "net: netprio_cgroup: make net_prio_subsys static"  John Fastabend
This reverts commit 865d9f9f748fdc1943679ea65d9ee1dc55e4a6ae. This commit breaks the build with CONFIG_NETPRIO_CGROUP=y so revert it. It does build as a module though. The SUBSYS macro in the cgroup core code automatically defines a subsys structure as extern. Long term we should fix the macro. And I need to fully build test things. Tested with CONFIG_NETPRIO_CGROUP={y|m|n} with and without CONFIG_CGROUPS defined. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> CC: Neil Horman <nhorman@tuxdriver.com> Reported-By: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08  sock_diag: off by one checks  Dan Carpenter
These tests are off by one because sock_diag_handlers[] only has AF_MAX elements. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
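The class of bug being fixed can be sketched in a few lines: an array with N elements has valid indices 0 to N - 1, so the bounds test must reject index N itself. The table size, names, and return codes below are hypothetical and only illustrate the pattern; they are not the sock_diag code.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_AF_MAX 8  /* hypothetical number of address families */

/* Valid indices are 0 .. DEMO_AF_MAX - 1. */
static const void *demo_handlers[DEMO_AF_MAX];

/* Returns 0 on success, -1 if the family is out of range or already taken. */
int demo_register(unsigned int family, const void *handler)
{
	assert(handler != NULL);

	/* ">=" is the correct bound: index DEMO_AF_MAX is one past the end.
	 * A "family > DEMO_AF_MAX" test would let that index through and
	 * touch memory outside the array. */
	if (family >= DEMO_AF_MAX)
		return -1;
	if (demo_handlers[family] != NULL)
		return -1;

	demo_handlers[family] = handler;
	return 0;
}
```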
2011-12-08  net: netprio_cgroup: make net_prio_subsys static  John Fastabend
net_prio_subsys can be made static; this removes the sparse warning it was throwing. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08  bnx2x: properly initialize L5 features  Dmitry Kravkov
The code is missing initialization of NO_FCOE_FLAG and NO_ISCSI*FLAGS when CONFIG_CNIC is not selected. This causes a panic during driver load since commit 1d187b34daaecbb87aa523ba46b92930a388cb21, where NO_FCOE is tested unconditionally (outside the #ifdef BCM_CNIC structure) and accesses fp[FCOE_IDX], which is not allocated. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08  vlan: add 802.1q netpoll support  Benjamin LaHaise
Add netpoll support to 802.1q vlan devices. Based on the netpoll support in the bridging code. Tested on a forced_eth device with netconsole. Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08  sch_red: Adaptative RED AQM  Eric Dumazet
Adaptative RED AQM for linux, based on the paper by Sally Floyd, Ramakrishna Gummadi, and Scott Shenker, August 2001: http://icir.org/floyd/papers/adaptiveRed.pdf
The goal of Adaptative RED is to make max_p a dynamic value between 1% and 50% to reach the target average queue: min_th + (max_th - min_th) / 2
  Every 500 ms:
    if (avg > target and max_p <= 0.5)
      increase max_p: max_p += alpha;
    else if (avg < target and max_p >= 0.01)
      decrease max_p: max_p *= beta;
  target: [min_th + 0.4*(max_th - min_th), min_th + 0.6*(max_th - min_th)]
  alpha: min(0.01, max_p / 4)
  beta: 0.9
max_P is a Q0.32 fixed point number (unsigned, with 32 bits mantissa).
Changes against our RED implementation are: max_p is no longer a negative power of two (1/(2^Plog)), but a Q0.32 fixed point number, to allow the full range described in the Adaptative paper. To deliver a random number, we now use a reciprocal divide (that's really a multiply), but this operation is done once per marked/dropped packet when in the RED_BETWEEN_TRESH window, so the added cost (compared to the previous AND operation) is near zero. The dump operation gives the current max_p value in a new TCA_RED_MAX_P attribute.
Example on a 10Mbit link:
  tc qdisc add dev $DEV parent 1:1 handle 10: est 1sec 8sec red \
    limit 400000 min 30000 max 90000 avpkt 1000 \
    burst 55 ecn adaptative bandwidth 10Mbit
  # tc -s -d qdisc show dev eth3
  ...
  qdisc red 10: parent 1:1 limit 400000b min 30000b max 90000b ecn adaptative ewma 5 max_p=0.113335 Scell_log 15
   Sent 50414282 bytes 34504 pkt (dropped 35, overlimits 1392 requeues 0)
   rate 9749Kbit 831pps backlog 72056b 16p requeues 0
    marked 1357 early 35 pdrop 0 other 0
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
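The periodic adjustment described in this message can be sketched as a single update step on a Q0.32 value. This is a user-space illustration under stated assumptions, not the sch_red code: the function and macro names are hypothetical, the target band is taken as min_th plus 0.4 to 0.6 of (max_th - min_th) as in the cited paper, and beta = 0.9 is approximated with integer arithmetic as (max_p / 10) * 9.

```c
#include <assert.h>
#include <stdint.h>

/* Q0.32 constants: v represents v / 2^32. */
#define DEMO_MAX_P_MIN (UINT32_MAX / 100)  /* about 1% */
#define DEMO_MAX_P_MAX (UINT32_MAX / 2)    /* about 50% */

/* alpha = min(0.01, max_p / 4) */
static uint32_t demo_alpha(uint32_t max_p)
{
	uint32_t quarter = max_p / 4;

	return quarter < DEMO_MAX_P_MIN ? quarter : DEMO_MAX_P_MIN;
}

/* One adaptation step, meant to run every 500 ms.
 * avg, min_th and max_th share one unit (bytes in the tc example).
 * Returns the new max_p in Q0.32. */
uint32_t demo_adapt_max_p(uint32_t max_p, uint32_t avg,
			  uint32_t min_th, uint32_t max_th)
{
	uint32_t delta, target_min, target_max;

	assert(max_th >= min_th);

	/* Target band: min_th + 0.4 .. 0.6 of the threshold distance. */
	delta = (max_th - min_th) / 5;
	target_min = min_th + 2 * delta;
	target_max = min_th + 3 * delta;

	if (avg > target_max && max_p <= DEMO_MAX_P_MAX)
		return max_p + demo_alpha(max_p);  /* additive increase */
	if (avg < target_min && max_p >= DEMO_MAX_P_MIN)
		return (max_p / 10) * 9;           /* multiply by beta = 0.9 */
	return max_p;                              /* inside the band: keep */
}
```

With the thresholds from the example (min 30000, max 90000) the band in this sketch is 54000 to 66000 bytes: an average queue above it raises max_p by at most one percentage point per step, and one below it shrinks max_p by a tenth.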
2011-12-08  team: use vlan_vids_[add/del]_by_dev  Jiri Pirko
So far, when a vlan id was added to a team device before a port was added, this vid was not added to the port's vlan filter. Also, after removal, the vid stayed in the port device's vlan filter. Benefit from the new vlan functions to handle this work. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08  vlan: introduce functions to do mass addition/deletion of vids by another device  Jiri Pirko
Introduce functions handy to copy vlan ids from one driver's list to another. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08  vlan: introduce vid list with reference counting  Jiri Pirko
This allows keeping track of the vids that need to be in the rx vlan filters of devices even if they are used in bond/team etc. vlan_info, as vlan_group previously was, is allocated when the first vid is added and deallocated when the last vid is deleted. The vlan_group definition is moved to a private header. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
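The reference-counting idea can be illustrated with a small sketch: each vid carries a user count, the hardware filter would only be programmed on the first reference, and only cleared when the last reference goes away. For brevity this sketch uses a fixed table indexed by vid rather than a list, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_VLAN_N_VID 4096  /* 12-bit vlan id space */

/* Hypothetical per-device bookkeeping: one user count per vid. */
struct demo_vid_table {
	uint16_t refcount[DEMO_VLAN_N_VID];
};

/* Take a reference. Returns 1 if the vid became active with this call
 * (the point where a driver would program its rx filter), else 0. */
int demo_vid_get(struct demo_vid_table *t, uint16_t vid)
{
	assert(vid < DEMO_VLAN_N_VID);
	return t->refcount[vid]++ == 0;
}

/* Drop a reference. Returns 1 if this was the last user
 * (the point where a driver would clear its rx filter), else 0. */
int demo_vid_put(struct demo_vid_table *t, uint16_t vid)
{
	assert(vid < DEMO_VLAN_N_VID);
	assert(t->refcount[vid] > 0);
	return --t->refcount[vid] == 0;
}

/* Is at least one user still holding this vid? */
int demo_vid_active(const struct demo_vid_table *t, uint16_t vid)
{
	assert(vid < DEMO_VLAN_N_VID);
	return t->refcount[vid] > 0;
}
```

Two users of the same vid, for example a vlan device and a bond/team master, can then add and remove it independently without one removal pulling the filter entry out from under the other.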
2011-12-08  net: introduce vlan_vid_[add/del] and use them instead of direct [add/kill]_vid ndo calls  Jiri Pirko
This patch adds a wrapper for the ndo_vlan_rx_add_vid/ndo_vlan_rx_kill_vid functions. The check for the NETIF_F_HW_VLAN_FILTER feature is done in this wrapper. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
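A wrapper of this kind might look roughly like the sketch below: callers no longer test the feature flag themselves, and the driver callback's return value is passed through. The structure, the feature bit, and the example driver callback are hypothetical simplifications, not the kernel's net_device or its ndo table.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_F_HW_VLAN_FILTER 0x1u  /* hypothetical feature bit */

/* Hypothetical, heavily reduced device structure. */
struct demo_dev {
	unsigned int features;
	int (*rx_add_vid)(struct demo_dev *dev, unsigned short vid);
	int last_vid;  /* demo only: last vid the "hardware" accepted */
};

/* Wrapper: the feature check lives here, once, instead of at every call
 * site. Devices without hw filtering succeed trivially. */
int demo_vlan_vid_add(struct demo_dev *dev, unsigned short vid)
{
	assert(dev != NULL);

	if (!(dev->features & DEMO_F_HW_VLAN_FILTER) || !dev->rx_add_vid)
		return 0;
	return dev->rx_add_vid(dev, vid);  /* propagate the driver's result */
}

/* Example driver callback: records the vid, rejects the reserved 4095. */
int demo_driver_add_vid(struct demo_dev *dev, unsigned short vid)
{
	if (vid >= 4095)
		return -1;
	dev->last_vid = vid;
	return 0;
}
```

Returning the callback's result is what lets a caller notice that the hardware setup failed, which only works once the driver callbacks themselves return a value instead of void.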
2011-12-08  net: make vlan ndo_vlan_rx_[add/kill]_vid return error value  Jiri Pirko
Let the caller know the result of adding/removing a vlan id to/from the vlan filter. In some drivers I make those functions just return 0. But in those where it is possible to see whether the hw setup went correctly, the return value is set appropriately. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08  vlan: rename vlan_dev_info to vlan_dev_priv  Jiri Pirko
As this structure is priv, name it appropriately. Also, use the name "vlan" for pointers to it. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08  be2net: netpoll support  Ivan Vecera
Add missing netpoll support. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08  net/fec: make FEC driver buildable as module  Lothar Waßmann
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de> Tested-by: Shawn Guo <shawn.guo@linaro.org> Acked-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08  net/fec: fix the .remove code  Lothar Waßmann
The .remove code is broken in several ways.
 - mdiobus_unregister() is called twice for the same object in case of dual FEC
 - phy_disconnect() is being called when the PHY is already disconnected
 - the requested IRQ(s) are not freed
 - fec_stop() is being called with the interface already stopped
All of those lead to kernel crashes if the remove function is actually used. Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de> Tested-by: Shawn Guo <shawn.guo@linaro.org> Acked-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08  net/fec: preserve MII/RMII setting in fec_stop()  Lothar Waßmann
In addition to setting the ETHER_EN bit in FEC_ECNTRL, the MII/RMII setting in FEC_R_CNTRL needs to be preserved to keep the MII interface functional. Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de> Tested-by: Shawn Guo <shawn.guo@linaro.org> Acked-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08  net/fec: don't munge MAC address from platform data  Lothar Waßmann
When the MAC address is supplied via platform_data it should be OK as it is and should not be modified in case of a dual FEC setup. Also copying the MAC from platform_data to the single 'macaddr' variable will overwrite the MAC for the first interface in case of a dual FEC setup. Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de> Acked-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08  net/fec: don't request invalid IRQ  Lothar Waßmann
Prevent calling request_irq() with a known invalid IRQ number and preserve the return value of the platform_get_irq() function. Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de> Acked-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08  net/fec: prevent double restart of interface on FDX/HDX change  Lothar Waßmann
Upon detection of a FDX/HDX change the interface is restarted twice. Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de> Acked-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08  net/fec: set con_id in clk_get() call to NULL  Lothar Waßmann
The con_id is actually not needed for clk_get(). Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de> Acked-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08  net/fec: misc cleanups  Lothar Waßmann
 - remove some bogus whitespace
 - remove line wraps from printk messages
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de> Acked-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08  tg3: Update version to 3.122  Matt Carlson
This patch updates the tg3 version to 3.122. Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08  tg3: Return flowctrl config through ethtool  Matt Carlson
This patch changes the driver to return the flow control configuration rather than the flow control status through the ETHTOOL_GPAUSEPARAM ioctl. Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>