summaryrefslogtreecommitdiff
path: root/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/cachetlb.txt6
-rw-r--r--Documentation/cpusets.txt24
-rw-r--r--Documentation/fb/00-INDEX46
-rw-r--r--Documentation/fb/uvesafb.txt188
-rw-r--r--Documentation/filesystems/Locking9
-rw-r--r--Documentation/filesystems/vfs.txt51
-rw-r--r--Documentation/kernel-parameters.txt10
-rw-r--r--Documentation/spi/spi-summary25
-rw-r--r--Documentation/spi/spidev_test.c6
-rw-r--r--Documentation/vm/numa_memory_policy.txt33
-rw-r--r--Documentation/x86_64/mm.txt1
11 files changed, 328 insertions, 71 deletions
diff --git a/Documentation/cachetlb.txt b/Documentation/cachetlb.txt
index 866b76139420..552cabac0608 100644
--- a/Documentation/cachetlb.txt
+++ b/Documentation/cachetlb.txt
@@ -133,12 +133,6 @@ changes occur:
The ia64 sn2 platform is one example of a platform
that uses this interface.
-8) void lazy_mmu_prot_update(pte_t pte)
- This interface is called whenever the protection on
- any user PTEs change. This interface provides a notification
- to architecture specific code to take appropriate action.
-
-
Next, we have the cache flushing interfaces. In general, when Linux
is changing an existing virtual-->physical mapping to a new value,
the sequence will be in one of the following forms:
diff --git a/Documentation/cpusets.txt b/Documentation/cpusets.txt
index f2c0a6842930..ec9de6917f01 100644
--- a/Documentation/cpusets.txt
+++ b/Documentation/cpusets.txt
@@ -35,7 +35,8 @@ CONTENTS:
----------------------
Cpusets provide a mechanism for assigning a set of CPUs and Memory
-Nodes to a set of tasks.
+Nodes to a set of tasks. In this document "Memory Node" refers to
+an on-line node that contains memory.
Cpusets constrain the CPU and Memory placement of tasks to only
the resources within a tasks current cpuset. They form a nested
@@ -86,9 +87,6 @@ This can be especially valuable on:
and a database), or
* NUMA systems running large HPC applications with demanding
performance characteristics.
- * Also cpu_exclusive cpusets are useful for servers running orthogonal
- workloads such as RT applications requiring low latency and HPC
- applications that are throughput sensitive
These subsets, or "soft partitions" must be able to be dynamically
adjusted, as the job mix changes, without impacting other concurrently
@@ -131,8 +129,6 @@ Cpusets extends these two mechanisms as follows:
- A cpuset may be marked exclusive, which ensures that no other
cpuset (except direct ancestors and descendents) may contain
any overlapping CPUs or Memory Nodes.
- Also a cpu_exclusive cpuset would be associated with a sched
- domain.
- You can list all the tasks (by pid) attached to any cpuset.
The implementation of cpusets requires a few, simple hooks
@@ -144,9 +140,6 @@ into the rest of the kernel, none in performance critical paths:
allowed in that tasks cpuset.
- in sched.c migrate_all_tasks(), to keep migrating tasks within
the CPUs allowed by their cpuset, if possible.
- - in sched.c, a new API partition_sched_domains for handling
- sched domain changes associated with cpu_exclusive cpusets
- and related changes in both sched.c and arch/ia64/kernel/domain.c
- in the mbind and set_mempolicy system calls, to mask the requested
Memory Nodes by what's allowed in that tasks cpuset.
- in page_alloc.c, to restrict memory to allowed nodes.
@@ -220,8 +213,8 @@ and name space for cpusets, with a minimum of additional kernel code.
The cpus and mems files in the root (top_cpuset) cpuset are
read-only. The cpus file automatically tracks the value of
cpu_online_map using a CPU hotplug notifier, and the mems file
-automatically tracks the value of node_online_map using the
-cpuset_track_online_nodes() hook.
+automatically tracks the value of node_states[N_MEMORY]--i.e.,
+nodes with memory--using the cpuset_track_online_nodes() hook.
1.4 What are exclusive cpusets ?
@@ -231,15 +224,6 @@ If a cpuset is cpu or mem exclusive, no other cpuset, other than
a direct ancestor or descendent, may share any of the same CPUs or
Memory Nodes.
-A cpuset that is cpu_exclusive has a scheduler (sched) domain
-associated with it. The sched domain consists of all CPUs in the
-current cpuset that are not part of any exclusive child cpusets.
-This ensures that the scheduler load balancing code only balances
-against the CPUs that are in the sched domain as defined above and
-not all of the CPUs in the system. This removes any overhead due to
-load balancing code trying to pull tasks outside of the cpu_exclusive
-cpuset only to be prevented by the tasks' cpus_allowed mask.
-
A cpuset that is mem_exclusive restricts kernel allocations for
page, buffer and other data commonly shared by the kernel across
multiple users. All cpusets, whether mem_exclusive or not, restrict
diff --git a/Documentation/fb/00-INDEX b/Documentation/fb/00-INDEX
index 92e89aeef52e..caabbd395e61 100644
--- a/Documentation/fb/00-INDEX
+++ b/Documentation/fb/00-INDEX
@@ -5,21 +5,49 @@ please mail me.
00-INDEX
- this file
+arkfb.txt
+ - info on the fbdev driver for ARK Logic chips.
+aty128fb.txt
+ - info on the ATI Rage128 frame buffer driver.
+cirrusfb.txt
+ - info on the driver for Cirrus Logic chipsets.
+cyblafb/
+ - directory with documentation files related to the cyblafb driver.
+deferred_io.txt
+ - an introduction to deferred IO.
+fbcon.txt
+ - intro to and usage guide for the framebuffer console (fbcon).
framebuffer.txt
- - introduction to frame buffer devices
+ - introduction to frame buffer devices.
+imacfb.txt
+ - info on the generic EFI platform driver for Intel based Macs.
+intel810.txt
+ - documentation for the Intel 810/815 framebuffer driver.
+intelfb.txt
+ - docs for Intel 830M/845G/852GM/855GM/865G/915G/945G fb driver.
internals.txt
- - quick overview of frame buffer device internals
+ - quick overview of frame buffer device internals.
+matroxfb.txt
+ - info on the Matrox framebuffer driver for Alpha, Intel and PPC.
modedb.txt
- - info on the video mode database
-aty128fb.txt
- - info on the ATI Rage128 frame buffer driver
-clgenfb.txt
- - info on the Cirrus Logic frame buffer driver
+ - info on the video mode database.
matroxfb.txt
- - info on the Matrox frame buffer driver
+ - info on the Matrox frame buffer driver.
pvr2fb.txt
- - info on the PowerVR 2 frame buffer driver
+ - info on the PowerVR 2 frame buffer driver.
+pxafb.txt
+ - info on the driver for the PXA25x LCD controller.
+s3fb.txt
+ - info on the fbdev driver for S3 Trio/Virge chips.
+sa1100fb.txt
+ - information about the driver for the SA-1100 LCD controller.
+sisfb.txt
+ - info on the framebuffer device driver for various SiS chips.
+sstfb.txt
+ - info on the frame buffer driver for 3dfx' Voodoo Graphics boards.
tgafb.txt
- info on the TGA (DECChip 21030) frame buffer driver
vesafb.txt
- info on the VESA frame buffer device
+vt8623fb.txt
+ - info on the fb driver for the graphics core in VIA VT8623 chipsets.
diff --git a/Documentation/fb/uvesafb.txt b/Documentation/fb/uvesafb.txt
new file mode 100644
index 000000000000..bcfc233a0080
--- /dev/null
+++ b/Documentation/fb/uvesafb.txt
@@ -0,0 +1,188 @@
+
+uvesafb - A Generic Driver for VBE2+ compliant video cards
+==========================================================
+
+1. Requirements
+---------------
+
+uvesafb should work with any video card that has a Video BIOS compliant
+with the VBE 2.0 standard.
+
+Unlike other drivers, uvesafb makes use of a userspace helper called
+v86d. v86d is used to run the x86 Video BIOS code in a simulated and
+controlled environment. This allows uvesafb to function on arches other
+than x86. Check the v86d documentation for a list of currently supported
+arches.
+
+v86d source code can be downloaded from the following website:
+ http://dev.gentoo.org/~spock/projects/uvesafb
+
+Please refer to the v86d documentation for detailed configuration and
+installation instructions.
+
+Note that the v86d userspace helper has to be available at all times in
+order for uvesafb to work properly. If you want to use uvesafb during
+early boot, you will have to include v86d into an initramfs image, and
+either compile it into the kernel or use it as an initrd.
+
+2. Caveats and limitations
+--------------------------
+
+uvesafb is a _generic_ driver which supports a wide variety of video
+cards, but which is ultimately limited by the Video BIOS interface.
+The most important limitations are:
+
+- Lack of any type of acceleration.
+- A strict and limited set of supported video modes. Often the native
+ or most optimal resolution/refresh rate for your setup will not work
+ with uvesafb, simply because the Video BIOS doesn't support the
+ video mode you want to use. This can be especially painful with
+ widescreen panels, where native video modes don't have the 4:3 aspect
+ ratio, which is what most BIOS-es are limited to.
+- Adjusting the refresh rate is only possible with a VBE 3.0 compliant
+ Video BIOS. Note that many nVidia Video BIOS-es claim to be VBE 3.0
+ compliant, while they simply ignore any refresh rate settings.
+
+3. Configuration
+----------------
+
+uvesafb can be compiled either as a module, or directly into the kernel.
+In both cases it supports the same set of configuration options, which
+are either given on the kernel command line or as module parameters, e.g.:
+
+ video=uvesafb:1024x768-32,mtrr:3,ywrap (compiled into the kernel)
+
+ # modprobe uvesafb mode=1024x768-32 mtrr=3 scroll=ywrap (module)
+
+Accepted options:
+
+ypan Enable display panning using the VESA protected mode
+ interface. The visible screen is just a window of the
+ video memory, console scrolling is done by changing the
+ start of the window. Available on x86 only.
+
+ywrap Same as ypan, but assumes your gfx board can wrap-around
+ the video memory (i.e. starts reading from top if it
+ reaches the end of video memory). Faster than ypan.
+ Available on x86 only.
+
+redraw Scroll by redrawing the affected part of the screen, this
+ is the safe (and slow) default.
+
+(If you're using uvesafb as a module, the above three options are
+ used a parameter of the scroll option, e.g. scroll=ypan.)
+
+vgapal Use the standard VGA registers for palette changes.
+
+pmipal Use the protected mode interface for palette changes.
+ This is the default if the protected mode interface is
+ available. Available on x86 only.
+
+mtrr:n Setup memory type range registers for the framebuffer
+ where n:
+ 0 - disabled (equivalent to nomtrr) (default)
+ 1 - uncachable
+ 2 - write-back
+ 3 - write-combining
+ 4 - write-through
+
+ If you see the following in dmesg, choose the type that matches
+ the old one. In this example, use "mtrr:2".
+...
+mtrr: type mismatch for e0000000,8000000 old: write-back new: write-combining
+...
+
+nomtrr Do not use memory type range registers.
+
+vremap:n
+ Remap 'n' MiB of video RAM. If 0 or not specified, remap memory
+ according to video mode.
+
+vtotal:n
+ If the video BIOS of your card incorrectly determines the total
+ amount of video RAM, use this option to override the BIOS (in MiB).
+
+<mode> The mode you want to set, in the standard modedb format. Refer to
+ modedb.txt for a detailed description. When uvesafb is compiled as
+ a module, the mode string should be provided as a value of the
+ 'mode' option.
+
+vbemode:x
+ Force the use of VBE mode x. The mode will only be set if it's
+ found in the VBE-provided list of supported modes.
+ NOTE: The mode number 'x' should be specified in VESA mode number
+ notation, not the Linux kernel one (eg. 257 instead of 769).
+ HINT: If you use this option because normal <mode> parameter does
+ not work for you and you use a X server, you'll probably want to
+ set the 'nocrtc' option to ensure that the video mode is properly
+ restored after console <-> X switches.
+
+nocrtc Do not use CRTC timings while setting the video mode. This option
+ has any effect only if the Video BIOS is VBE 3.0 compliant. Use it
+ if you have problems with modes set the standard way. Note that
+ using this option implies that any refresh rate adjustments will
+ be ignored and the refresh rate will stay at your BIOS default (60 Hz).
+
+noedid Do not try to fetch and use EDID-provided modes.
+
+noblank Disable hardware blanking.
+
+v86d:path
+ Set path to the v86d executable. This option is only available as
+ a module parameter, and not as a part of the video= string. If you
+ need to use it and have uvesafb built into the kernel, use
+ uvesafb.v86d="path".
+
+Additionally, the following parameters may be provided. They all override the
+EDID-provided values and BIOS defaults. Refer to your monitor's specs to get
+the correct values for maxhf, maxvf and maxclk for your hardware.
+
+maxhf:n Maximum horizontal frequency (in kHz).
+maxvf:n Maximum vertical frequency (in Hz).
+maxclk:n Maximum pixel clock (in MHz).
+
+4. The sysfs interface
+----------------------
+
+uvesafb provides several sysfs nodes for configurable parameters and
+additional information.
+
+Driver attributes:
+
+/sys/bus/platform/drivers/uvesafb
+ - v86d (default: /sbin/v86d)
+ Path to the v86d executable. v86d is started by uvesafb
+ if an instance of the daemon isn't already running.
+
+Device attributes:
+
+/sys/bus/platform/drivers/uvesafb/uvesafb.0
+ - nocrtc
+ Use the default refresh rate (60 Hz) if set to 1.
+
+ - oem_product_name
+ - oem_product_rev
+ - oem_string
+ - oem_vendor
+ Information about the card and its maker.
+
+ - vbe_modes
+ A list of video modes supported by the Video BIOS along with their
+ VBE mode numbers in hex.
+
+ - vbe_version
+ A BCD value indicating the implemented VBE standard.
+
+5. Miscellaneous
+----------------
+
+Uvesafb will set a video mode with the default refresh rate and timings
+from the Video BIOS if you set pixclock to 0 in fb_var_screeninfo.
+
+
+--
+ Michal Januszewski <spock@gentoo.org>
+ Last updated: 2007-06-16
+
+ Documentation of the uvesafb options is loosely based on vesafb.txt.
+
diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index f0f825808ca4..fe26cc978523 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -178,15 +178,18 @@ prototypes:
locking rules:
All except set_page_dirty may block
- BKL PageLocked(page)
+ BKL PageLocked(page) i_sem
writepage: no yes, unlocks (see below)
readpage: no yes, unlocks
sync_page: no maybe
writepages: no
set_page_dirty no no
readpages: no
-prepare_write: no yes
-commit_write: no yes
+prepare_write: no yes yes
+commit_write: no yes yes
+write_begin: no locks the page yes
+write_end: no yes, unlocks yes
+perform_write: no n/a yes
bmap: yes
invalidatepage: no yes
releasepage: no yes
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 045f3e055a28..6f8e16e3d6c0 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -537,6 +537,12 @@ struct address_space_operations {
struct list_head *pages, unsigned nr_pages);
int (*prepare_write)(struct file *, struct page *, unsigned, unsigned);
int (*commit_write)(struct file *, struct page *, unsigned, unsigned);
+ int (*write_begin)(struct file *, struct address_space *mapping,
+ loff_t pos, unsigned len, unsigned flags,
+ struct page **pagep, void **fsdata);
+ int (*write_end)(struct file *, struct address_space *mapping,
+ loff_t pos, unsigned len, unsigned copied,
+ struct page *page, void *fsdata);
sector_t (*bmap)(struct address_space *, sector_t);
int (*invalidatepage) (struct page *, unsigned long);
int (*releasepage) (struct page *, int);
@@ -615,11 +621,7 @@ struct address_space_operations {
any basic-blocks on storage, then those blocks should be
pre-read (if they haven't been read already) so that the
updated blocks can be written out properly.
- The page will be locked. If prepare_write wants to unlock the
- page it, like readpage, may do so and return
- AOP_TRUNCATED_PAGE.
- In this case the prepare_write will be retried one the lock is
- regained.
+ The page will be locked.
Note: the page _must not_ be marked uptodate in this function
(or anywhere else) unless it actually is uptodate right now. As
@@ -633,6 +635,45 @@ struct address_space_operations {
operations. It should avoid returning an error if possible -
errors should have been handled by prepare_write.
+ write_begin: This is intended as a replacement for prepare_write. The
+ key differences being that:
+ - it returns a locked page (in *pagep) rather than being
+ given a pre locked page;
+ - it must be able to cope with short writes (where the
+ length passed to write_begin is greater than the number
+ of bytes copied into the page).
+
+ Called by the generic buffered write code to ask the filesystem to
+ prepare to write len bytes at the given offset in the file. The
+ address_space should check that the write will be able to complete,
+ by allocating space if necessary and doing any other internal
+ housekeeping. If the write will update parts of any basic-blocks on
+ storage, then those blocks should be pre-read (if they haven't been
+ read already) so that the updated blocks can be written out properly.
+
+ The filesystem must return the locked pagecache page for the specified
+ offset, in *pagep, for the caller to write into.
+
+ flags is a field for AOP_FLAG_xxx flags, described in
+ include/linux/fs.h.
+
+ A void * may be returned in fsdata, which then gets passed into
+ write_end.
+
+ Returns 0 on success; < 0 on failure (which is the error code), in
+ which case write_end is not called.
+
+ write_end: After a successful write_begin, and data copy, write_end must
+ be called. len is the original len passed to write_begin, and copied
+ is the amount that was able to be copied (copied == len is always true
+ if write_begin was called with the AOP_FLAG_UNINTERRUPTIBLE flag).
+
+ The filesystem must take care of unlocking the page and releasing it
+ refcount, and updating i_size.
+
+ Returns < 0 on failure, otherwise the number of bytes (<= 'copied')
+ that were able to be copied into pagecache.
+
bmap: called by the VFS to map a logical block offset within object to
physical block number. This method is used by the FIBMAP
ioctl and for working with swap-files. To be able to swap to
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 085e4a095eaa..eb247997f679 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -349,6 +349,11 @@ and is between 256 and 4096 characters. It is defined in the file
blkmtd_bs=
blkmtd_count=
+ boot_delay= Milliseconds to delay each printk during boot.
+ Values larger than 10 seconds (10000) are changed to
+ no delay (0).
+ Format: integer
+
bttv.card= [HW,V4L] bttv (bt848 + bt878 based grabber cards)
bttv.radio= Most important insmod options are available as
kernel args too.
@@ -906,6 +911,11 @@ and is between 256 and 4096 characters. It is defined in the file
n must be a power of two. The default size
is set in the kernel config file.
+ logo.nologo [FB] Disables display of the built-in Linux logo.
+ This may be used to provide more screen space for
+ kernel log messages and is useful when debugging
+ kernel boot problems.
+
lp=0 [LP] Specify parallel ports to use, e.g,
lp=port[,port...] lp=none,parport0 (lp0 not configured, lp1 uses
lp=reset first parallel port). 'lp=0' disables the
diff --git a/Documentation/spi/spi-summary b/Documentation/spi/spi-summary
index 76ea6c837be5..8861e47e5a2d 100644
--- a/Documentation/spi/spi-summary
+++ b/Documentation/spi/spi-summary
@@ -156,21 +156,29 @@ using the driver model to connect controller and protocol drivers using
device tables provided by board specific initialization code. SPI
shows up in sysfs in several locations:
+ /sys/devices/.../CTLR ... physical node for a given SPI controller
+
/sys/devices/.../CTLR/spiB.C ... spi_device on bus "B",
chipselect C, accessed through CTLR.
+ /sys/bus/spi/devices/spiB.C ... symlink to that physical
+ .../CTLR/spiB.C device
+
/sys/devices/.../CTLR/spiB.C/modalias ... identifies the driver
that should be used with this device (for hotplug/coldplug)
- /sys/bus/spi/devices/spiB.C ... symlink to the physical
- spiB.C device
-
/sys/bus/spi/drivers/D ... driver for one or more spi*.* devices
- /sys/class/spi_master/spiB ... class device for the controller
- managing bus "B". All the spiB.* devices share the same
+ /sys/class/spi_master/spiB ... symlink (or actual device node) to
+ a logical node which could hold class related state for the
+ controller managing bus "B". All spiB.* devices share one
physical SPI bus segment, with SCLK, MOSI, and MISO.
+Note that the actual location of the controller's class state depends
+on whether you enabled CONFIG_SYSFS_DEPRECATED or not. At this time,
+the only class-specific state is the bus number ("B" in "spiB"), so
+those /sys/class entries are only useful to quickly identify busses.
+
How does board-specific init code declare SPI devices?
------------------------------------------------------
@@ -337,7 +345,8 @@ SPI protocol drivers somewhat resemble platform device drivers:
The driver core will autmatically attempt to bind this driver to any SPI
device whose board_info gave a modalias of "CHIP". Your probe() code
-might look like this unless you're creating a class_device:
+might look like this unless you're creating a device which is managing
+a bus (appearing under /sys/class/spi_master).
static int __devinit CHIP_probe(struct spi_device *spi)
{
@@ -442,7 +451,7 @@ An SPI controller will probably be registered on the platform_bus; write
a driver to bind to the device, whichever bus is involved.
The main task of this type of driver is to provide an "spi_master".
-Use spi_alloc_master() to allocate the master, and class_get_devdata()
+Use spi_alloc_master() to allocate the master, and spi_master_get_devdata()
to get the driver-private data allocated for that device.
struct spi_master *master;
@@ -452,7 +461,7 @@ to get the driver-private data allocated for that device.
if (!master)
return -ENODEV;
- c = class_get_devdata(&master->cdev);
+ c = spi_master_get_devdata(master);
The driver will initialize the fields of that spi_master, including the
bus number (maybe the same as the platform device ID) and three methods
diff --git a/Documentation/spi/spidev_test.c b/Documentation/spi/spidev_test.c
index 218e86215297..cf0e3ce0d526 100644
--- a/Documentation/spi/spidev_test.c
+++ b/Documentation/spi/spidev_test.c
@@ -29,7 +29,7 @@ static void pabort(const char *s)
abort();
}
-static char *device = "/dev/spidev1.1";
+static const char *device = "/dev/spidev1.1";
static uint8_t mode;
static uint8_t bits = 8;
static uint32_t speed = 500000;
@@ -69,7 +69,7 @@ static void transfer(int fd)
puts("");
}
-void print_usage(char *prog)
+void print_usage(const char *prog)
{
printf("Usage: %s [-DsbdlHOLC3]\n", prog);
puts(" -D --device device to use (default /dev/spidev1.1)\n"
@@ -88,7 +88,7 @@ void print_usage(char *prog)
void parse_opts(int argc, char *argv[])
{
while (1) {
- static struct option lopts[] = {
+ static const struct option lopts[] = {
{ "device", 1, 0, 'D' },
{ "speed", 1, 0, 's' },
{ "delay", 1, 0, 'd' },
diff --git a/Documentation/vm/numa_memory_policy.txt b/Documentation/vm/numa_memory_policy.txt
index 8242f52d0f22..dd4986497996 100644
--- a/Documentation/vm/numa_memory_policy.txt
+++ b/Documentation/vm/numa_memory_policy.txt
@@ -302,31 +302,30 @@ MEMORY POLICIES AND CPUSETS
Memory policies work within cpusets as described above. For memory policies
that require a node or set of nodes, the nodes are restricted to the set of
-nodes whose memories are allowed by the cpuset constraints. If the
-intersection of the set of nodes specified for the policy and the set of nodes
-allowed by the cpuset is the empty set, the policy is considered invalid and
-cannot be installed.
+nodes whose memories are allowed by the cpuset constraints. If the nodemask
+specified for the policy contains nodes that are not allowed by the cpuset, or
+the intersection of the set of nodes specified for the policy and the set of
+nodes with memory is the empty set, the policy is considered invalid
+and cannot be installed.
The interaction of memory policies and cpusets can be problematic for a
couple of reasons:
-1) the memory policy APIs take physical node id's as arguments. However, the
- memory policy APIs do not provide a way to determine what nodes are valid
- in the context where the application is running. An application MAY consult
- the cpuset file system [directly or via an out of tree, and not generally
- available, libcpuset API] to obtain this information, but then the
- application must be aware that it is running in a cpuset and use what are
- intended primarily as administrative APIs.
-
- However, as long as the policy specifies at least one node that is valid
- in the controlling cpuset, the policy can be used.
+1) the memory policy APIs take physical node id's as arguments. As mentioned
+ above, it is illegal to specify nodes that are not allowed in the cpuset.
+ The application must query the allowed nodes using the get_mempolicy()
+ API with the MPOL_F_MEMS_ALLOWED flag to determine the allowed nodes and
+ restrict itself to those nodes. However, the resources available to a
+ cpuset can be changed by the system administrator, or a workload manager
+ application, at any time. So, a task may still get errors attempting to
+ specify policy nodes, and must query the allowed memories again.
2) when tasks in two cpusets share access to a memory region, such as shared
memory segments created by shmget() of mmap() with the MAP_ANONYMOUS and
MAP_SHARED flags, and any of the tasks install shared policy on the region,
only nodes whose memories are allowed in both cpusets may be used in the
- policies. Again, obtaining this information requires "stepping outside"
- the memory policy APIs, as well as knowing in what cpusets other task might
- be attaching to the shared region, to use the cpuset information.
+ policies. Obtaining this information requires "stepping outside" the
+ memory policy APIs to use the cpuset information and requires that one
+ know in what cpusets other task might be attaching to the shared region.
Furthermore, if the cpusets' allowed memory sets are disjoint, "local"
allocation is the only valid policy.
diff --git a/Documentation/x86_64/mm.txt b/Documentation/x86_64/mm.txt
index f42798ed1c54..b89b6d2bebfa 100644
--- a/Documentation/x86_64/mm.txt
+++ b/Documentation/x86_64/mm.txt
@@ -9,6 +9,7 @@ ffff800000000000 - ffff80ffffffffff (=40 bits) guard hole
ffff810000000000 - ffffc0ffffffffff (=46 bits) direct mapping of all phys. memory
ffffc10000000000 - ffffc1ffffffffff (=40 bits) hole
ffffc20000000000 - ffffe1ffffffffff (=45 bits) vmalloc/ioremap space
+ffffe20000000000 - ffffe2ffffffffff (=40 bits) virtual memory map (1TB)
... unused hole ...
ffffffff80000000 - ffffffff82800000 (=40 MB) kernel text mapping, from phys 0
... unused hole ...