summaryrefslogtreecommitdiff
path: root/Documentation/sysctl
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/sysctl')
-rw-r--r--Documentation/sysctl/kernel.txt51
-rw-r--r--Documentation/sysctl/net.txt63
-rw-r--r--Documentation/sysctl/vm.txt30
3 files changed, 117 insertions, 27 deletions
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index ccd42589e124..9d4c1d18ad44 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -70,12 +70,12 @@ show up in /proc/sys/kernel:
- shmall
- shmmax [ sysv ipc ]
- shmmni
-- softlockup_thresh
- stop-a [ SPARC only ]
- sysrq ==> Documentation/sysrq.txt
- tainted
- threads-max
- unknown_nmi_panic
+- watchdog_thresh
- version
==============================================================
@@ -182,6 +182,7 @@ core_pattern is used to specify a core dumpfile pattern name.
%<NUL> '%' is dropped
%% output one '%'
%p pid
+ %P global pid (init PID namespace)
%u uid
%g gid
%d dump mode, matches PR_SET_DUMPABLE and
@@ -427,6 +428,32 @@ This file shows up if CONFIG_DEBUG_STACKOVERFLOW is enabled.
==============================================================
+perf_cpu_time_max_percent:
+
+Hints to the kernel how much CPU time it should be allowed to
+use to handle perf sampling events. If the perf subsystem
+is informed that its samples are exceeding this limit, it
+will drop its sampling frequency to attempt to reduce its CPU
+usage.
+
+Some perf sampling happens in NMIs. If these samples
+unexpectedly take too long to execute, the NMIs can become
+stacked up next to each other so much that nothing else is
+allowed to execute.
+
+0: disable the mechanism. Do not monitor or correct perf's
+ sampling rate no matter how CPU time it takes.
+
+1-100: attempt to throttle perf's sample rate to this
+ percentage of CPU. Note: the kernel calculates an
+ "expected" length of each sample event. 100 here means
+ 100% of that expected length. Even if this is set to
+ 100, you may still see sample throttling if this
+ length is exceeded. Set to 0 if you truly do not care
+ how much CPU is consumed.
+
+==============================================================
+
pid_max:
@@ -604,15 +631,6 @@ without users and with a dead originative process will be destroyed.
==============================================================
-softlockup_thresh:
-
-This value can be used to lower the softlockup tolerance threshold. The
-default threshold is 60 seconds. If a cpu is locked up for 60 seconds,
-the kernel complains. Valid values are 1-60 seconds. Setting this
-tunable to zero will disable the softlockup detection altogether.
-
-==============================================================
-
tainted:
Non-zero if the kernel has been tainted. Numeric values, which
@@ -648,3 +666,16 @@ that time, kernel debugging information is displayed on console.
NMI switch that most IA32 servers have fires unknown NMI up, for
example. If a system hangs up, try pressing the NMI switch.
+
+==============================================================
+
+watchdog_thresh:
+
+This value can be used to control the frequency of hrtimer and NMI
+events and the soft and hard lockup thresholds. The default threshold
+is 10 seconds.
+
+The softlockup threshold is (2 * watchdog_thresh). Setting this
+tunable to zero will disable lockup detection altogether.
+
+==============================================================
diff --git a/Documentation/sysctl/net.txt b/Documentation/sysctl/net.txt
index 98335b7a5337..9a0319a82470 100644
--- a/Documentation/sysctl/net.txt
+++ b/Documentation/sysctl/net.txt
@@ -1,4 +1,4 @@
-Documentation for /proc/sys/net/* kernel version 2.4.0-test11-pre4
+Documentation for /proc/sys/net/*
(c) 1999 Terrehon Bowden <terrehon@pacbell.net>
Bodo Bauer <bb@ricochet.net>
(c) 2000 Jorge Nerin <comandante@zaralinux.com>
@@ -9,10 +9,10 @@ For general info and legal blurb, please look in README.
==============================================================
This file contains the documentation for the sysctl files in
-/proc/sys/net and is valid for Linux kernel version 2.4.0-test11-pre4.
+/proc/sys/net
The interface to the networking parts of the kernel is located in
-/proc/sys/net. The following table shows all possible subdirectories.You may
+/proc/sys/net. The following table shows all possible subdirectories. You may
see only some of them, depending on your kernel's configuration.
@@ -26,7 +26,7 @@ Table : Subdirectories in /proc/sys/net
ipv4 IP version 4 x25 X.25 protocol
ipx IPX token-ring IBM token ring
bridge Bridging decnet DEC net
- ipv6 IP version 6
+ ipv6 IP version 6 tipc TIPC
..............................................................................
1. /proc/sys/net/core - Network core options
@@ -50,6 +50,43 @@ The maximum number of packets that kernel can handle on a NAPI interrupt,
it's a Per-CPU variable.
Default: 64
+default_qdisc
+--------------
+
+The default queuing discipline to use for network devices. This allows
+overriding the default queue discipline of pfifo_fast with an
+alternative. Since the default queuing discipline is created with the
+no additional parameters so is best suited to queuing disciplines that
+work well without configuration like stochastic fair queue (sfq),
+CoDel (codel) or fair queue CoDel (fq_codel). Don't use queuing disciplines
+like Hierarchical Token Bucket or Deficit Round Robin which require setting
+up classes and bandwidths.
+Default: pfifo_fast
+
+busy_read
+----------------
+Low latency busy poll timeout for socket reads. (needs CONFIG_NET_RX_BUSY_POLL)
+Approximate time in us to busy loop waiting for packets on the device queue.
+This sets the default value of the SO_BUSY_POLL socket option.
+Can be set or overridden per socket by setting socket option SO_BUSY_POLL,
+which is the preferred method of enabling. If you need to enable the feature
+globally via sysctl, a value of 50 is recommended.
+Will increase power usage.
+Default: 0 (off)
+
+busy_poll
+----------------
+Low latency busy poll timeout for poll and select. (needs CONFIG_NET_RX_BUSY_POLL)
+Approximate time in us to busy loop waiting for events.
+Recommended value depends on the number of sockets you poll on.
+For several sockets 50, for several hundreds 100.
+For more than that you probably want to use epoll.
+Note that only sockets with SO_BUSY_POLL set will be busy polled,
+so you want to either selectively set SO_BUSY_POLL on those sockets or set
+sysctl.net.busy_read globally.
+Will increase power usage.
+Default: 0 (off)
+
rmem_default
------------
@@ -93,8 +130,7 @@ netdev_budget
Maximum number of packets taken from all interfaces in one polling cycle (NAPI
poll). In one polling cycle interfaces which are registered to polling are
-probed in a round-robin manner. The limit of packets in one such probe can be
-set per-device via sysfs class/net/<device>/weight .
+probed in a round-robin manner.
netdev_max_backlog
------------------
@@ -201,3 +237,18 @@ IPX.
The /proc/net/ipx_route table holds a list of IPX routes. For each route it
gives the destination network, the router node (or Directly) and the network
address of the router (or Connected) for internal networks.
+
+6. TIPC
+-------------------------------------------------------
+
+The TIPC protocol now has a tunable for the receive memory, similar to the
+tcp_rmem - i.e. a vector of 3 INTEGERs: (min, default, max)
+
+ # cat /proc/sys/net/tipc/tipc_rmem
+ 4252725 34021800 68043600
+ #
+
+The max value is set to CONN_OVERLOAD_LIMIT, and the default and min values
+are scaled (shifted) versions of that same value. Note that the min value
+is not at this point in time used in any meaningful way, but the triplet is
+preserved in order to be consistent with things like tcp_rmem.
diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index dcc75a9ed919..79a797eb3e87 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -200,17 +200,25 @@ fragmentation index is <= extfrag_threshold. The default value is 500.
hugepages_treat_as_movable
-This parameter is only useful when kernelcore= is specified at boot time to
-create ZONE_MOVABLE for pages that may be reclaimed or migrated. Huge pages
-are not movable so are not normally allocated from ZONE_MOVABLE. A non-zero
-value written to hugepages_treat_as_movable allows huge pages to be allocated
-from ZONE_MOVABLE.
+This parameter controls whether we can allocate hugepages from ZONE_MOVABLE
+or not. If set to non-zero, hugepages can be allocated from ZONE_MOVABLE.
+ZONE_MOVABLE is created when kernel boot parameter kernelcore= is specified,
+so this parameter has no effect if used without kernelcore=.
-Once enabled, the ZONE_MOVABLE is treated as an area of memory the huge
-pages pool can easily grow or shrink within. Assuming that applications are
-not running that mlock() a lot of memory, it is likely the huge pages pool
-can grow to the size of ZONE_MOVABLE by repeatedly entering the desired value
-into nr_hugepages and triggering page reclaim.
+Hugepage migration is now available in some situations which depend on the
+architecture and/or the hugepage size. If a hugepage supports migration,
+allocation from ZONE_MOVABLE is always enabled for the hugepage regardless
+of the value of this parameter.
+IOW, this parameter affects only non-migratable hugepages.
+
+Assuming that hugepages are not migratable in your system, one usecase of
+this parameter is that users can make hugepage pool more extensible by
+enabling the allocation from ZONE_MOVABLE. This is because on ZONE_MOVABLE
+page reclaim/migration/compaction work more and you can get contiguous
+memory more likely. Note that using ZONE_MOVABLE for non-migratable
+hugepages can do harm to other features like memory hotremove (because
+memory hotremove expects that memory blocks on ZONE_MOVABLE are always
+removable,) so it's a trade-off responsible for the users.
==============================================================
@@ -510,7 +518,7 @@ Specify "[Dd]efault" to request automatic configuration. Autoconfiguration
will select "node" order in following case.
(1) if the DMA zone does not exist or
(2) if the DMA zone comprises greater than 50% of the available memory or
-(3) if any node's DMA zone comprises greater than 60% of its local memory and
+(3) if any node's DMA zone comprises greater than 70% of its local memory and
the amount of local memory is big enough.
Otherwise, "zone" order will be selected. Default order is recommended unless