summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2014-06-06mm/kmemleak-test.c: use pr_fmt for loggingFabian Frederick
Signed-off-by: Fabian Frederick <fabf@skynet.be> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06fs/dlm/debug_fs.c: replace seq_printf by seq_putsFabian Frederick
Replace seq_printf where possible. This patch also fixes the following checkpatch warning "unnecessary whitespace before a quoted newline" Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Christine Caulfield <ccaulfie@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06fs/dlm/lockspace.c: convert simple_str to kstrFabian Frederick
Replace obsolete functions. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Christine Caulfield <ccaulfie@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06fs/dlm/config.c: convert simple_str to kstrFabian Frederick
Replace obsolete functions simple_strtoul/kstrtouint simple_strtol/kstrtoint (kstr __must_check requires the right function to be applied) Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Christine Caulfield <ccaulfie@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06mm: mark remap_file_pages() syscall as deprecatedKirill A. Shutemov
The remap_file_pages() system call is used to create a nonlinear mapping, that is, a mapping in which the pages of the file are mapped into a nonsequential order in memory. The advantage of using remap_file_pages() over using repeated calls to mmap(2) is that the former approach does not require the kernel to create additional VMA (Virtual Memory Area) data structures. Supporting of nonlinear mapping requires significant amount of non-trivial code in kernel virtual memory subsystem including hot paths. Also to get nonlinear mapping work kernel need a way to distinguish normal page table entries from entries with file offset (pte_file). Kernel reserves flag in PTE for this purpose. PTE flags are scarce resource especially on some CPU architectures. It would be nice to free up the flag for other usage. Fortunately, there are not many users of remap_file_pages() in the wild. It's only known that one enterprise RDBMS implementation uses the syscall on 32-bit systems to map files bigger than can linearly fit into 32-bit virtual address space. This use-case is not critical anymore since 64-bit systems are widely available. The plan is to deprecate the syscall and replace it with an emulation. The emulation will create new VMAs instead of nonlinear mappings. It's going to work slower for rare users of remap_file_pages() but ABI is preserved. One side effect of emulation (apart from performance) is that user can hit vm.max_map_count limit more easily due to additional VMAs. See comment for DEFAULT_MAX_MAP_COUNT for more details on the limit. [akpm@linux-foundation.org: fix spello] Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Dave Jones <davej@redhat.com> Cc: Armin Rigo <arigo@tunes.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06mm: memcontrol: remove unnecessary memcg argument from soft limit functionsJohannes Weiner
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Jianyu Zhan <nasa4836@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06mm: memcontrol: clean up memcg zoneinfo lookupJianyu Zhan
Memcg zoneinfo lookup sites have either the page, the zone, or the node id and zone index, but sites that only have the zone have to look up the node id and zone index themselves, whereas sites that already have those two integers use a function for a simple pointer chase. Provide mem_cgroup_zone_zoneinfo() that takes a zone pointer and let sites that already have node id and zone index - all for each node, for each zone iterators - use &memcg->nodeinfo[nid]->zoneinfo[zid]. Rename page_cgroup_zoneinfo() to mem_cgroup_page_zoneinfo() to match. Signed-off-by: Jianyu Zhan <nasa4836@gmail.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06mm/memblock.c: call kmemleak directly from memblock_(alloc|free)Catalin Marinas
Kmemleak could ignore memory blocks allocated via memblock_alloc() leading to false positives during scanning. This patch adds the corresponding callbacks and removes kmemleak_free_* calls in mm/nobootmem.c to avoid duplication. The kmemleak_alloc() in mm/nobootmem.c is kept since __alloc_memory_core_early() does not use memblock_alloc() directly. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06mm/mempool.c: update the kmemleak stack trace for mempool allocationsCatalin Marinas
When mempool_alloc() returns an existing pool object, kmemleak_alloc() is no longer called and the stack trace corresponds to the original object allocation. This patch updates the kmemleak allocation stack trace for such objects to make it more useful for debugging. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06lib/radix-tree.c: update the kmemleak stack trace for radix tree allocationsCatalin Marinas
Since radix_tree_preload() stack trace is not always useful for debugging an actual radix tree memory leak, this patch updates the kmemleak allocation stack trace in the radix_tree_node_alloc() function. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06mm: introduce kmemleak_update_trace()Catalin Marinas
The memory allocation stack trace is not always useful for debugging a memory leak (e.g. radix_tree_preload). This function, when called, updates the stack trace for an already allocated object. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06mm/kmemleak.c: use %u to print ->checksumJianpeng Ma
Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06vmscan: memcg: always use swappiness of the reclaimed memcgMichal Hocko
Memory reclaim always uses swappiness of the reclaim target memcg (origin of the memory pressure) or vm_swappiness for global memory reclaim. This behavior was consistent (except for difference between global and hard limit reclaim) because swappiness was enforced to be consistent within each memcg hierarchy. After "mm: memcontrol: remove hierarchy restrictions for swappiness and oom_control" each memcg can have its own swappiness independent of hierarchical parents, though, so the consistency guarantee is gone. This can lead to an unexpected behavior. Say that a group is explicitly configured to not swapout by memory.swappiness=0 but its memory gets swapped out anyway when the memory pressure comes from its parent with a It is also unexpected that the knob is meaningless without setting the hard limit which would trigger the reclaim and enforce the swappiness. There are setups where the hard limit is configured higher in the hierarchy by an administrator and children groups are under control of somebody else who is interested in the swapout behavior but not necessarily about the memory limit. From a semantic point of view swappiness is an attribute defining anon vs. file proportional scanning of LRU which is memcg specific (unlike charges which are propagated up the hierarchy) so it should be applied to the particular memcg's LRU regardless where the memory pressure comes from. This patch removes vmscan_swappiness() and stores the swappiness into the scan_control structure. mem_cgroup_swappiness is then used to provide the correct value before shrink_lruvec is called. The global vm_swappiness is used for the root memcg. [hughd@google.com: oopses immediately when booted with cgroup_disable=memory] Signed-off-by: Michal Hocko <mhocko@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06sysrq,rcu: suppress RCU stall warnings while sysrq runsRik van Riel
Some sysrq handlers can run for a long time, because they dump a lot of data onto a serial console. Having RCU stall warnings pop up in the middle of them only makes the problem worse. This patch temporarily disables RCU stall warnings while a sysrq request is handled. Signed-off-by: Rik van Riel <riel@redhat.com> Suggested-by: Paul McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Madper Xie <cxie@redhat.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Richard Weinberger <richard@nod.at> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06sysrq: rcu-ify __handle_sysrqRik van Riel
Echoing values into /proc/sysrq-trigger seems to be a popular way to get information out of the kernel. However, dumping information about thousands of processes, or hundreds of CPUs to serial console can result in IRQs being blocked for minutes, resulting in various kinds of cascade failures. The most common failure is due to interrupts being blocked for a very long time. This can lead to things like failed IO requests, and other things the system cannot easily recover from. This problem is easily fixable by making __handle_sysrq use RCU instead of spin_lock_irqsave. This leaves the warning that RCU grace periods have not elapsed for a long time, but the system will come back from that automatically. It also leaves sysrq-from-irq-context when the sysrq keys are pressed, but that is probably desired since people want that to work in situations where the system is already hosed. The callers of register_sysrq_key and unregister_sysrq_key appear to be capable of sleeping. Signed-off-by: Rik van Riel <riel@redhat.com> Reported-by: Madper Xie <cxie@redhat.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Richard Weinberger <richard@nod.at> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06fs/reiserfs/stree.c: remove obsolete __constantFabian Frederick
__constant_cpu_to_le32 converted to cpu_to_le32 Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06fs/reiserfs/bitmap.c: coding style fixesFabian Frederick
-Trivial code clean-up -Fix endif }; (coccinelle warning) Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06blackfin/ptrace: call find_vma with the mmap_sem heldDavidlohr Bueso
Performing vma lookups without taking the mm->mmap_sem is asking for trouble. While doing the search, the vma in question can be modified or even removed before returning to the caller. Take the lock (shared) in order to avoid races while iterating through the vmacache and/or rbtree. Signed-off-by: Davidlohr Bueso <davidlohr@hp.com> Cc: Steven Miao <realmz6@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06mm: convert use of typedef ctl_table to struct ctl_tableJoe Perches
This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06sysctl: convert use of typedef ctl_table to struct ctl_tableJoe Perches
This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06ipc: convert use of typedef ctl_table to struct ctl_tableJoe Perches
This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06key: convert use of typedef ctl_table to struct ctl_tableJoe Perches
This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06fs: convert use of typedef ctl_table to struct ctl_tableJoe Perches
This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06ntfs: convert use of typedef ctl_table to struct ctl_tableJoe Perches
This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06inotify: convert use of typedef ctl_table to struct ctl_tableJoe Perches
This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06nfs: convert use of typedef ctl_table to struct ctl_tableJoe Perches
This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06lockd: convert use of typedef ctl_table to struct ctl_tableJoe Perches
This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06fscache: convert use of typedef ctl_table to struct ctl_tableJoe Perches
This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06coda: convert use of typedef ctl_table to struct ctl_tableJoe Perches
This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06scsi: convert use of typedef ctl_table to struct ctl_tableJoe Perches
This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06parport: convert use of typedef ctl_table to struct ctl_tableJoe Perches
This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06random: convert use of typedef ctl_table to struct ctl_tableJoe Perches
This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06cdrom: convert use of typedef ctl_table to struct ctl_tableJoe Perches
This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06tile: convert use of typedef ctl_table to struct ctl_tableJoe Perches
This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06ia64: convert use of typedef ctl_table to struct ctl_tableJoe Perches
This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06arm: convert use of typedef ctl_table to struct ctl_tableJoe Perches
This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06kernel/seccomp.c: kernel-doc warning fixFabian Frederick
+ fix small typo Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06ipc/sem.c: add a printk_once for semctl(GETNCNT/GETZCNT)Manfred Spraul
The actual Linux implementation for semctl(GETNCNT) and semctl(GETZCNT) always (since 0.99.10) reported a thread as sleeping on all semaphores that are listed in the semop() call. The documented behavior (both in the Linux man page and in the Single Unix Specification) is that a task should be reported on exactly one semaphore: The semaphore that caused the thread to got to sleep. This patch adds a pr_info_once() that is triggered if a thread hits the relevant case. The code triggers slightly too often, otherwise it would be necessary to replicate the old code. As there are no known users of GETNCNT or GETZCNT, this is done to prevent unnecessary bloat. The task that triggered is reported with name (tsk->comm) and pid. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Acked-by: Davidlohr Bueso <davidlohr@hp.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06ipc/sem.c: make semctl(,,{GETNCNT,GETZCNT}) standard compliantManfred Spraul
SUSv4 clearly defines how semncnt and semzcnt must be calculated: A task waits on exactly one semaphore: The semaphore from the first operation in the sop array that cannot proceed. The Linux implementation never followed the standard, it tried to count all semaphores that might be the reason why a task sleeps. This patch fixes that. Note: a) The implementation assumes that GETNCNT and GETZCNT are rare operations, therefore the code counts them only on demand. (If they wouldn't be rare, then the non-compliance would have been found earlier) b) compared to the initial version of the patch, the BUG_ONs were removed and it was clarified that the new behavior conforms to SUS. Back-compatibility concerns: Manfred: : - there is no application in Fedora that uses GETNCNT or GETZCNT. : : - application that use only single-sop semop() are also safe, the : difference only affects complex apps. : : - portable application are also safe, the new behavior is standard : compliant. : : But that's it. The old behavior existed in Linux from 0.99.something : until now. Michael: : * These operations seem to be very little used. Grepping the public : source that is contained Fedora 20 source DVD, there appear to be no : uses. Of course, this says nothing about uses in private / : non-mainstream FOSS code, but it seems likely that the same pattern : is followed there. : : * The existing behavior is hard enough to understand that I suspect : that no one understood it well enough to rely on it anyway : (especially as that behavior contradicted both man page and POSIX). : : So, there's a chance of breakage, but I estimate that it's minute. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Davidlohr Bueso <davidlohr.bueso@hp.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06ipc/sem.c: store which operation blocks in perform_atomic_semop()Manfred Spraul
Preparation for the next patch: In the slow-path of perform_atomic_semop(), store a pointer to the operation that caused the operation to block. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Davidlohr Bueso <davidlohr.bueso@hp.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06ipc/sem.c: change perform_atomic_semop parametersManfred Spraul
Right now, perform_atomic_semop gets the content of sem_queue as individual fields. Changes that, instead pass a pointer to sem_queue. This is a preparation for the next patch: it uses sem_queue to store the reason why a task must sleep. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Davidlohr Bueso <davidlohr.bueso@hp.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06ipc/sem.c: remove code duplicationManfred Spraul
count_semzcnt and count_semncnt are more of less identical. The patch creates a single function that either counts the number of tasks waiting for zero or waiting due to a decrease operation. Compared to the initial version, the BUG_ONs were removed. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Davidlohr Bueso <davidlohr.bueso@hp.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06ipc/sem.c: bugfix for semctl(,,GETZCNT)Manfred Spraul
GETZCNT is supposed to return the number of threads that wait until a semaphore value becomes 0. The current implementation overlooks complex operations that contain both wait-for-zero operation and operations that alter at least one semaphore. The patch fixes that. It's intentionally copy&paste, this will be cleaned up in the next patch. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Davidlohr Bueso <davidlohr.bueso@hp.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06ipc,msg: document volatile r_msgDavidlohr Bueso
The need for volatile is not obvious, document it. Signed-off-by: Davidlohr Bueso <davidlohr@hp.com> Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Aswin Chandramouleeswaran <aswin@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06ipc,msg: move some msgq ns code aroundDavidlohr Bueso
Nothing big and no logical changes, just get rid of some redundant function declarations. Move msg_[init/exit]_ns down the end of the file. Signed-off-by: Davidlohr Bueso <davidlohr@hp.com> Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Aswin Chandramouleeswaran <aswin@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06ipc,msg: use current->state helpersDavidlohr Bueso
Call __set_current_state() instead of assigning the new state directly. Signed-off-by: Davidlohr Bueso <davidlohr@hp.com> Signed-off-by: Manfred Spraul <manfred@colorfullif.com> Cc: Aswin Chandramouleeswaran <aswin@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06ipc,shm: document new limits in the uapi headerDavidlohr Bueso
This is useful in the future and allows users to better understand the reasoning behind the changes. Also use UL as we're dealing with it anyways. Signed-off-by: Davidlohr Bueso <davidlohr@hp.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06ipc/shm.c: increase the defaults for SHMALL, SHMMAXManfred Spraul
System V shared memory a) can be abused to trigger out-of-memory conditions and the standard measures against out-of-memory do not work: - it is not possible to use setrlimit to limit the size of shm segments. - segments can exist without association with any processes, thus the oom-killer is unable to free that memory. b) is typically used for shared information - today often multiple GB. (e.g. database shared buffers) The current default is a maximum segment size of 32 MB and a maximum total size of 8 GB. This is often too much for a) and not enough for b), which means that lots of users must change the defaults. This patch increases the default limits (nearly) to the maximum, which is perfect for case b). The defaults are used after boot and as the initial value for each new namespace. Admins/distros that need a protection against a) should reduce the limits and/or enable shm_rmid_forced. Unix has historically required setting these limits for shared memory, and Linux inherited such behavior. The consequence of this is added complexity for users and administrators. One very common example are Database setup/installation documents and scripts, where users must manually calculate the values for these limits. This also requires (some) knowledge of how the underlying memory management works, thus causing, in many occasions, the limits to just be flat out wrong. Disabling these limits sooner could have saved companies a lot of time, headaches and money for support. But it's never too late, simplify users life now. Further notes: - The patch only changes default, overrides behave as before: # sysctl kernel.shmall=33554432 would recreate the previous limit for SHMMAX (for the current namespace). - Disabling sysv shm allocation is possible with: # sysctl kernel.shmall=0 (not a new feature, also per-namespace) - The limits are intentionally set to a value slightly less than ULONG_MAX, to avoid triggering overflows in user space apps. [not unreasonable, see http://marc.info/?l=linux-mm&m=139638334330127] Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Davidlohr Bueso <davidlohr@hp.com> Reported-by: Davidlohr Bueso <davidlohr@hp.com> Acked-by: Michael Kerrisk <mtk.manpages@gmail.com> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06ipc/shm.c: check for integer overflow during shmget.Manfred Spraul
SHMMAX is the upper limit for the size of a shared memory segment, counted in bytes. The actual allocation is that size, rounded up to the next full page. Add a check that prevents the creation of segments where the rounded up size causes an integer overflow. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Acked-by: Davidlohr Bueso <davidlohr@hp.com> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06ipc/shm.c: check for overflows of shm_totManfred Spraul
shm_tot counts the total number of pages used by shm segments. If SHMALL is ULONG_MAX (or nearly ULONG_MAX), then the number can overflow. Subsequent calls to shmctl(,SHM_INFO,) would return wrong values for shm_tot. The patch adds a detection for overflows. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Acked-by: Davidlohr Bueso <davidlohr@hp.com> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>