summaryrefslogtreecommitdiff
path: root/init
AgeCommit message (Collapse)Author
2017-05-06initramfs: avoid "label at end of compound statement" errorLinus Torvalds
Commit 17a9be317475 ("initramfs: Always do fput() and load modules after rootfs populate") introduced an error for the CONFIG_BLK_DEV_RAM=y case, because even though the code looks fine, the compiler really wants a statement after a label, or you'll get complaints: init/initramfs.c: In function 'populate_rootfs': init/initramfs.c:644:2: error: label at end of compound statement That commit moved the subsequent statements to outside the compound statement, leaving the label without any associated statements. Reported-by: Jörg Otte <jrg.otte@gmail.com> Fixes: 17a9be317475 ("initramfs: Always do fput() and load modules after rootfs populate") Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: stable@vger.kernel.org # if 17a9be317475 gets backported Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-05Merge tag 'initramfs-fix-4.12-rc1' of git://github.com/stffrdhrn/linuxLinus Torvalds
Pull initramfs fix from Stafford Horne: "This is a fix for an issue that has caused 4.11 to not boot on OpenRISC. I should have caught this during the 4.11 cycle but I had been busy on testing some other series of patches. I would have considered pushing it though a different path but Al Viro suggested submitting directly to you. Also, its just one as I havent really got anything else ready on my queue for 4.12. Summary: - Ensure fput() flush is done even for builtin initramfs" * tag 'initramfs-fix-4.12-rc1' of git://github.com/stffrdhrn/linux: initramfs: Always do fput() and load modules after rootfs populate
2017-05-05initramfs: Always do fput() and load modules after rootfs populateStafford Horne
In OpenRISC we do not have a bootloader passed initrd, but the built in initramfs does contain the /init and other binaries, including modules. The previous commit 08865514805d2 ("initramfs: finish fput() before accessing any binary from initramfs") made a change to only call fput() if the bootloader initrd was available, this caused intermittent crashes for OpenRISC. This patch changes the fput() to happen unconditionally if any rootfs is loaded. Also, I added some comments to make it a bit more clear why we call unpack_to_rootfs() multiple times. Fixes: 08865514805d2 ("initramfs: finish fput() before accessing any binary from initramfs") Cc: stable@vger.kernel.org Cc: Lokesh Vutla <lokeshvutla@ti.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-05-03Merge tag 'trace-v4.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "New features for this release: - Pretty much a full rewrite of the processing of function plugins. i.e. echo do_IRQ:stacktrace > set_ftrace_filter - The rewrite was needed to add plugins to be unique to tracing instances. i.e. mkdir instance/foo; cd instances/foo; echo do_IRQ:stacktrace > set_ftrace_filter The old way was written very hacky. This removes a lot of those hacks. - New "function-fork" tracing option. When set, pids in the set_ftrace_pid will have their children added when the processes with their pids listed in the set_ftrace_pid file forks. - Exposure of "maxactive" for kretprobe in kprobe_events - Allow for builtin init functions to be traced by the function tracer (via the kernel command line). Module init function tracing will come in the next release. - Added more selftests, and have selftests also test in an instance" * tag 'trace-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (60 commits) ring-buffer: Return reader page back into existing ring buffer selftests: ftrace: Allow some event trigger tests to run in an instance selftests: ftrace: Have some basic tests run in a tracing instance too selftests: ftrace: Have event tests also run in an tracing instance selftests: ftrace: Make func_event_triggers and func_traceonoff_triggers tests do instances selftests: ftrace: Allow some tests to be run in a tracing instance tracing/ftrace: Allow for instances to trigger their own stacktrace probes tracing/ftrace: Allow for the traceonoff probe be unique to instances tracing/ftrace: Enable snapshot function trigger to work with instances tracing/ftrace: Allow instances to have their own function probes tracing/ftrace: Add a better way to pass data via the probe functions ftrace: Dynamically create the probe ftrace_ops for the trace_array tracing: Pass the trace_array into ftrace_probe_ops functions tracing: Have the trace_array hold the list of registered func probes ftrace: If the hash for a probe fails to update then free what was initialized ftrace: Have the function probes call their own function ftrace: Have each function probe use its own ftrace_ops ftrace: Have unregister_ftrace_function_probe_func() return a value ftrace: Add helper function ftrace_hash_move_and_update_ops() ftrace: Remove data field from ftrace_func_probe structure ...
2017-05-02Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree updates from Jiri Kosina. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: tty: fix comment for __tty_alloc_driver() init/main: properly align the multi-line comment init/main: Fix double "the" in comment Fix dead URLs to ftp.kernel.org drivers: Clean up duplicated email address treewide: Fix typo in xml/driver-api/basics.xml tools/testing/selftests/powerpc: remove redundant CFLAGS in Makefile: "-Wall -O2 -Wall" -> "-O2 -Wall" selftests/timers: Spelling s/privledges/privileges/ HID: picoLCD: Spelling s/REPORT_WRTIE_MEMORY/REPORT_WRITE_MEMORY/ net: phy: dp83848: Fix Typo UBI: Fix typos Documentation: ftrace.txt: Correct nice value of 120 priority net: fec: Fix typo in error msg and comment treewide: Fix typos in printk
2017-04-24init/main: properly align the multi-line commentViresh Kumar
Add a tab before it to follow standard practices. Also add the missing full stop '.'. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-04-24init/main: Fix double "the" in commentViresh Kumar
s/the\ the/the Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-04-03ftrace: Have init/main.c call ftrace directly to free init memorySteven Rostedt (VMware)
Relying on free_reserved_area() to call ftrace to free init memory proved to not be sufficient. The issue is that on x86, when debug_pagealloc is enabled, the init memory is not freed, but simply set as not present. Since ftrace was uninformed of this, starting function tracing still tries to update pages that are not present according to the page tables, causing ftrace to bug, as well as killing the kernel itself. Instead of relying on free_reserved_area(), have init/main.c call ftrace directly just before it frees the init memory. Then it needs to use __init_begin and __init_end to know where the init memory location is. Looking at all archs (and testing what I can), it appears that this should work for each of them. Reported-by: kernel test robot <xiaolong.ye@intel.com> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-03-31mm: move mm_percpu_wq initialization earlierMichal Hocko
Yang Li has reported that drain_all_pages triggers a WARN_ON which means that this function is called earlier than the mm_percpu_wq is initialized on arm64 with CMA configured: WARNING: CPU: 2 PID: 1 at mm/page_alloc.c:2423 drain_all_pages+0x244/0x25c Modules linked in: CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.11.0-rc1-next-20170310-00027-g64dfbc5 #127 Hardware name: Freescale Layerscape 2088A RDB Board (DT) task: ffffffc07c4a6d00 task.stack: ffffffc07c4a8000 PC is at drain_all_pages+0x244/0x25c LR is at start_isolate_page_range+0x14c/0x1f0 [...] drain_all_pages+0x244/0x25c start_isolate_page_range+0x14c/0x1f0 alloc_contig_range+0xec/0x354 cma_alloc+0x100/0x1fc dma_alloc_from_contiguous+0x3c/0x44 atomic_pool_init+0x7c/0x208 arm64_dma_init+0x44/0x4c do_one_initcall+0x38/0x128 kernel_init_freeable+0x1a0/0x240 kernel_init+0x10/0xfc ret_from_fork+0x10/0x20 Fix this by moving the whole setup_vmstat which is an initcall right now to init_mm_internals which will be called right after the WQ subsystem is initialized. Link: http://lkml.kernel.org/r/20170315164021.28532-1-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Yang Li <pku.leo@gmail.com> Tested-by: Yang Li <pku.leo@gmail.com> Tested-by: Xiaolong Ye <xiaolong.ye@intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-03-24ftrace: Move ftrace_init() to right after memory initializationSteven Rostedt (VMware)
Initialize the ftrace records immediately after memory initialization, as that is all that is required for the records to be created. This will allow for future work to get function tracing started earlier in the boot process. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-03-24tracing: Split tracing initialization into two for early initializationSteven Rostedt (VMware)
Create an early_trace_init() function that will initialize the buffers and allow for ealier use of trace_printk(). This will also allow for future work to have function tracing start earlier at boot up. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-03-11Merge tag 'random_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random Pull random updates from Ted Ts'o: "Change get_random_{int,log} to use the CRNG used by /dev/urandom and getrandom(2). It's faster and arguably more secure than cut-down MD5 that we had been using. Also do some code cleanup" * tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random: random: move random_min_urandom_seed into CONFIG_SYSCTL ifdef block random: convert get_random_int/long into get_random_u32/u64 random: use chacha20 for get_random_int/long random: fix comment for unused random_min_urandom_seed random: remove variable limit random: remove stale urandom_init_wait random: remove stale maybe_reseed_primary_crng
2017-03-02sched/headers: Prepare to move _init() prototypes from <linux/sched.h> to ↵Ingo Molnar
<linux/sched/init.h> But first introduce a trivial header and update usage sites. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02sched/headers, vfs/execve: Prepare to move the do_execve*() prototypes from ↵Ingo Molnar
<linux/sched.h> to <linux/binfmts.h> But first update the usage sites with the new header dependency. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02sched/headers: Prepare to move 'init_task' and 'init_thread_union' from ↵Ingo Molnar
<linux/sched.h> to <linux/sched/task.h> Update all usage sites first. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02sched/headers: Prepare for new header dependencies before moving code to ↵Ingo Molnar
<linux/sched/task_stack.h> We are going to split <linux/sched/task_stack.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/task_stack.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02sched/headers: Prepare for new header dependencies before moving code to ↵Ingo Molnar
<linux/sched/task.h> We are going to split <linux/sched/task.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/task.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02sched/headers: Prepare for new header dependencies before moving code to ↵Ingo Molnar
<linux/sched/nmi.h> We are going to move softlockup APIs out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. <linux/nmi.h> already includes <linux/sched.h>. Include the <linux/nmi.h> header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-28Merge branch 'idr-4.11' of git://git.infradead.org/users/willy/linux-daxLinus Torvalds
Pull IDR rewrite from Matthew Wilcox: "The most significant part of the following is the patch to rewrite the IDR & IDA to be clients of the radix tree. But there's much more, including an enhancement of the IDA to be significantly more space efficient, an IDR & IDA test suite, some improvements to the IDR API (and driver changes to take advantage of those improvements), several improvements to the radix tree test suite and RCU annotations. The IDR & IDA rewrite had a good spin in linux-next and Andrew's tree for most of the last cycle. Coupled with the IDR test suite, I feel pretty confident that any remaining bugs are quite hard to hit. 0-day did a great job of watching my git tree and pointing out problems; as it hit them, I added new test-cases to be sure not to be caught the same way twice" Willy goes on to expand a bit on the IDR rewrite rationale: "The radix tree and the IDR use very similar data structures. Merging the two codebases lets us share the memory allocation pools, and results in a net deletion of 500 lines of code. It also opens up the possibility of exposing more of the features of the radix tree to users of the IDR (and I have some interesting patches along those lines waiting for 4.12) It also shrinks the size of the 'struct idr' from 40 bytes to 24 which will shrink a fair few data structures that embed an IDR" * 'idr-4.11' of git://git.infradead.org/users/willy/linux-dax: (32 commits) radix tree test suite: Add config option for map shift idr: Add missing __rcu annotations radix-tree: Fix __rcu annotations radix-tree: Add rcu_dereference and rcu_assign_pointer calls radix tree test suite: Run iteration tests for longer radix tree test suite: Fix split/join memory leaks radix tree test suite: Fix leaks in regression2.c radix tree test suite: Fix leaky tests radix tree test suite: Enable address sanitizer radix_tree_iter_resume: Fix out of bounds error radix-tree: Store a pointer to the root in each node radix-tree: Chain preallocated nodes through ->parent radix tree test suite: Dial down verbosity with -v radix tree test suite: Introduce kmalloc_verbose idr: Return the deleted entry from idr_remove radix tree test suite: Build separate binaries for some tests ida: Use exceptional entries for small IDAs ida: Move ida_bitmap to a percpu variable Reimplement IDR and IDA using the radix tree radix-tree: Add radix_tree_iter_delete ...
2017-02-27Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge yet more updates from Andrew Morton: - a few MM remainders - misc things - autofs updates - signals - affs updates - ipc - nilfs2 - spelling.txt updates * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (78 commits) mm, x86: fix HIGHMEM64 && PARAVIRT build config for native_pud_clear() mm: add arch-independent testcases for RODATA hfs: atomically read inode size mm: clarify mm_struct.mm_{users,count} documentation mm: use mmget_not_zero() helper mm: add new mmget() helper mm: add new mmgrab() helper checkpatch: warn when formats use %Z and suggest %z lib/vsprintf.c: remove %Z support scripts/spelling.txt: add some typo-words scripts/spelling.txt: add "followings" pattern and fix typo instances scripts/spelling.txt: add "therfore" pattern and fix typo instances scripts/spelling.txt: add "overwriten" pattern and fix typo instances scripts/spelling.txt: add "overwritting" pattern and fix typo instances scripts/spelling.txt: add "deintialize(d)" pattern and fix typo instances scripts/spelling.txt: add "disassocation" pattern and fix typo instances scripts/spelling.txt: add "omited" pattern and fix typo instances scripts/spelling.txt: add "explictely" pattern and fix typo instances scripts/spelling.txt: add "applys" pattern and fix typo instances scripts/spelling.txt: add "configuartion" pattern and fix typo instances ...
2017-02-27Merge branch 'for-4.11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "Several noteworthy changes. - Parav's rdma controller is finally merged. It is very straight forward and can limit the abosolute numbers of common rdma constructs used by different cgroups. - kernel/cgroup.c got too chubby and disorganized. Created kernel/cgroup/ subdirectory and moved all cgroup related files under kernel/ there and reorganized the core code. This hurts for backporting patches but was long overdue. - cgroup v2 process listing reimplemented so that it no longer depends on allocating a buffer large enough to cache the entire result to sort and uniq the output. v2 has always mangled the sort order to ensure that users don't depend on the sorted output, so this shouldn't surprise anybody. This makes the pid listing functions use the same iterators that are used internally, which have to have the same iterating capabilities anyway. - perf cgroup filtering now works automatically on cgroup v2. This patch was posted a long time ago but somehow fell through the cracks. - misc fixes asnd documentation updates" * 'for-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (27 commits) kernfs: fix locking around kernfs_ops->release() callback cgroup: drop the matching uid requirement on migration for cgroup v2 cgroup, perf_event: make perf_event controller work on cgroup2 hierarchy cgroup: misc cleanups cgroup: call subsys->*attach() only for subsystems which are actually affected by migration cgroup: track migration context in cgroup_mgctx cgroup: cosmetic update to cgroup_taskset_add() rdmacg: Fixed uninitialized current resource usage cgroup: Add missing cgroup-v2 PID controller documentation. rdmacg: Added documentation for rdmacg IB/core: added support to use rdma cgroup controller rdmacg: Added rdma cgroup controller cgroup: fix a comment typo cgroup: fix RCU related sparse warnings cgroup: move namespace code to kernel/cgroup/namespace.c cgroup: rename functions for consistency cgroup: move v1 mount functions to kernel/cgroup/cgroup-v1.c cgroup: separate out cgroup1_kf_syscall_ops cgroup: refactor mount path and clearly distinguish v1 and v2 paths cgroup: move cgroup v1 specific code to kernel/cgroup/cgroup-v1.c ...
2017-02-27mm: add arch-independent testcases for RODATAJinbum Park
This patch makes arch-independent testcases for RODATA. Both x86 and x86_64 already have testcases for RODATA, But they are arch-specific because using inline assembly directly. And cacheflush.h is not a suitable location for rodata-test related things. Since they were in cacheflush.h, If someone change the state of CONFIG_DEBUG_RODATA_TEST, It cause overhead of kernel build. To solve the above issues, write arch-independent testcases and move it to shared location. [jinb.park7@gmail.com: fix config dependency] Link: http://lkml.kernel.org/r/20170209131625.GA16954@pjb1027-Latitude-E5410 Link: http://lkml.kernel.org/r/20170129105436.GA9303@pjb1027-Latitude-E5410 Signed-off-by: Jinbum Park <jinb.park7@gmail.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Valentin Rothberg <valentinrothberg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27initramfs: finish fput() before accessing any binary from initramfsLokesh Vutla
Commit 4a9d4b024a31 ("switch fput to task_work_add") implements a schedule_work() for completing fput(), but did not guarantee calling __fput() after unpacking initramfs. Because of this, there is a possibility that during boot a driver can see ETXTBSY when it tries to load a binary from initramfs as fput() is still pending on that binary. This patch makes sure that fput() is completed after unpacking initramfs and removes the call to flush_delayed_fput() in kernel_init() which happens very late after unpacking initramfs. Link: http://lkml.kernel.org/r/20170201140540.22051-1-lokeshvutla@ti.com Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Reported-by: Murali Karicheri <m-karicheri2@ti.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Tero Kristo <t-kristo@ti.com> Cc: Sekhar Nori <nsekhar@ti.com> Cc: Nishanth Menon <nm@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-22Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge updates from Andrew Morton: "142 patches: - DAX updates - various misc bits - OCFS2 updates - most of MM" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (142 commits) mm/z3fold.c: limit first_num to the actual range of possible buddy indexes mm: fix <linux/pagemap.h> stray kernel-doc notation zram: remove obsolete sysfs attrs mm/memblock.c: remove unnecessary log and clean up oom-reaper: use madvise_dontneed() logic to decide if unmap the VMA mm: drop unused argument of zap_page_range() mm: drop zap_details::check_swap_entries mm: drop zap_details::ignore_dirty mm, page_alloc: warn_alloc nodemask is NULL when cpusets are disabled mm: help __GFP_NOFAIL allocations which do not trigger OOM killer mm, oom: do not enforce OOM killer for __GFP_NOFAIL automatically mm: consolidate GFP_NOFAIL checks in the allocator slowpath lib/show_mem.c: teach show_mem to work with the given nodemask arch, mm: remove arch specific show_mem mm, page_alloc: warn_alloc print nodemask mm, page_alloc: do not report all nodes in show_mem Revert "mm: bail out in shrink_inactive_list()" mm, vmscan: consider eligible zones in get_scan_count mm, vmscan: cleanup lru size claculations mm, vmscan: do not count freed pages as PGDEACTIVATE ...
2017-02-22Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk Pull printk updates from Petr Mladek: - Add Petr Mladek, Sergey Senozhatsky as printk maintainers, and Steven Rostedt as the printk reviewer. This idea came up after the discussion about printk issues at Kernel Summit. It was formulated and discussed at lkml[1]. - Extend a lock-less NMI per-cpu buffers idea to handle recursive printk() calls by Sergey Senozhatsky[2]. It is the first step in sanitizing printk as discussed at Kernel Summit. The change allows to see messages that would normally get ignored or would cause a deadlock. Also it allows to enable lockdep in printk(). This already paid off. The testing in linux-next helped to discover two old problems that were hidden before[3][4]. - Remove unused parameter by Sergey Senozhatsky. Clean up after a past change. [1] http://lkml.kernel.org/r/1481798878-31898-1-git-send-email-pmladek@suse.com [2] http://lkml.kernel.org/r/20161227141611.940-1-sergey.senozhatsky@gmail.com [3] http://lkml.kernel.org/r/20170215044332.30449-1-sergey.senozhatsky@gmail.com [4] http://lkml.kernel.org/r/20170217015932.11898-1-sergey.senozhatsky@gmail.com * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk: printk: drop call_console_drivers() unused param printk: convert the rest to printk-safe printk: remove zap_locks() function printk: use printk_safe buffers in printk printk: report lost messages in printk safe/nmi contexts printk: always use deferred printk when flush printk_safe lines printk: introduce per-cpu safe_print seq buffer printk: rename nmi.c and exported api printk: use vprintk_func in vprintk() MAINTAINERS: Add printk maintainers
2017-02-22slub: make sysfs directories for memcg sub-caches optionalTejun Heo
SLUB creates a per-cache directory under /sys/kernel/slab which hosts a bunch of debug files. Usually, there aren't that many caches on a system and this doesn't really matter; however, if memcg is in use, each cache can have per-cgroup sub-caches. SLUB creates the same directories for these sub-caches under /sys/kernel/slab/$CACHE/cgroup. Unfortunately, because there can be a lot of cgroups, active or draining, the product of the numbers of caches, cgroups and files in each directory can reach a very high number - hundreds of thousands is commonplace. Millions and beyond aren't difficult to reach either. What's under /sys/kernel/slab is primarily for debugging and the information and control on the a root cache already cover its sub-caches. While having a separate directory for each sub-cache can be helpful for development, it doesn't make much sense to pay this amount of overhead by default. This patch introduces a boot parameter slub_memcg_sysfs which determines whether to create sysfs directories for per-memcg sub-caches. It also adds CONFIG_SLUB_MEMCG_SYSFS_ON which determines the boot parameter's default value and defaults to 0. [akpm@linux-foundation.org: kset_unregister(NULL) is legal] Link: http://lkml.kernel.org/r/20170204145203.GB26958@mtj.duckdns.org Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-22Merge tag 'char-misc-4.11-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver updates from Greg KH: "Here is the big char/misc driver patchset for 4.11-rc1. Lots of different driver subsystems updated here: rework for the hyperv subsystem to handle new platforms better, mei and w1 and extcon driver updates, as well as a number of other "minor" driver updates. All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (169 commits) goldfish: Sanitize the broken interrupt handler x86/platform/goldfish: Prevent unconditional loading vmbus: replace modulus operation with subtraction vmbus: constify parameters where possible vmbus: expose hv_begin/end_read vmbus: remove conditional locking of vmbus_write vmbus: add direct isr callback mode vmbus: change to per channel tasklet vmbus: put related per-cpu variable together vmbus: callback is in softirq not workqueue binder: Add support for file-descriptor arrays binder: Add support for scatter-gather binder: Add extra size to allocator binder: Refactor binder_transact() binder: Support multiple /dev instances binder: Deal with contexts in debugfs binder: Support multiple context managers binder: Split flat_binder_object auxdisplay: ht16k33: remove private workqueue auxdisplay: ht16k33: rework input device initialization ...
2017-02-21Merge tag 'rodata-v4.11-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull rodata updates from Kees Cook: "This renames the (now inaccurate) DEBUG_RODATA and related SET_MODULE_RONX configs to the more sensible STRICT_KERNEL_RWX and STRICT_MODULE_RWX" * tag 'rodata-v4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: arch: Rename CONFIG_DEBUG_RODATA and CONFIG_DEBUG_MODULE_RONX arch: Move CONFIG_DEBUG_RODATA and CONFIG_SET_MODULE_RONX to be common
2017-02-21Merge tag 'extable-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux Pull exception table module split from Paul Gortmaker: "Final extable.h related changes. This completes the separation of exception table content from the module.h header file. This is achieved with the final commit that removes the one line back compatible change that sourced extable.h into the module.h file. The commits are unchanged since January, with the exception of a couple Acks that came in for the last two commits a bit later. The changes have been in linux-next for quite some time[1] and have got widespread arch coverage via toolchains I have and also from additional ones the kbuild bot has. Maintaners of the various arch were Cc'd during the postings to lkml[2] and informed that the intention was to take the remaining arch specific changes and lump them together with the final two non-arch specific changes and submit for this merge window. The ia64 diffstat stands out and probably warrants a mention. In an earlier review, Al Viro made a valid comment that the original header separation of content left something to be desired, and that it get fixed as a part of this change, hence the larger diffstat" * tag 'extable-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (21 commits) module.h: remove extable.h include now users have migrated core: migrate exception table users off module.h and onto extable.h cris: migrate exception table users off module.h and onto extable.h hexagon: migrate exception table users off module.h and onto extable.h microblaze: migrate exception table users off module.h and onto extable.h unicore32: migrate exception table users off module.h and onto extable.h score: migrate exception table users off module.h and onto extable.h metag: migrate exception table users off module.h and onto extable.h arc: migrate exception table users off module.h and onto extable.h nios2: migrate exception table users off module.h and onto extable.h sparc: migrate exception table users onto extable.h openrisc: migrate exception table users off module.h and onto extable.h frv: migrate exception table users off module.h and onto extable.h sh: migrate exception table users off module.h and onto extable.h xtensa: migrate exception table users off module.h and onto extable.h mn10300: migrate exception table users off module.h and onto extable.h alpha: migrate exception table users off module.h and onto extable.h arm: migrate exception table users off module.h and onto extable.h m32r: migrate exception table users off module.h and onto extable.h ia64: ensure exception table search users include extable.h ...
2017-02-20Merge branch 'locking-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in this cycle were: - Implement wraparound-safe refcount_t and kref_t types based on generic atomic primitives (Peter Zijlstra) - Improve and fix the ww_mutex code (Nicolai Hähnle) - Add self-tests to the ww_mutex code (Chris Wilson) - Optimize percpu-rwsems with the 'rcuwait' mechanism (Davidlohr Bueso) - Micro-optimize the current-task logic all around the core kernel (Davidlohr Bueso) - Tidy up after recent optimizations: remove stale code and APIs, clean up the code (Waiman Long) - ... plus misc fixes, updates and cleanups" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits) fork: Fix task_struct alignment locking/spinlock/debug: Remove spinlock lockup detection code lockdep: Fix incorrect condition to print bug msgs for MAX_LOCKDEP_CHAIN_HLOCKS lkdtm: Convert to refcount_t testing kref: Implement 'struct kref' using refcount_t refcount_t: Introduce a special purpose refcount type sched/wake_q: Clarify queue reinit comment sched/wait, rcuwait: Fix typo in comment locking/mutex: Fix lockdep_assert_held() fail locking/rtmutex: Flip unlikely() branch to likely() in __rt_mutex_slowlock() locking/rwsem: Reinit wake_q after use locking/rwsem: Remove unnecessary atomic_long_t casts jump_labels: Move header guard #endif down where it belongs locking/atomic, kref: Implement kref_put_lock() locking/ww_mutex: Turn off __must_check for now locking/atomic, kref: Avoid more abuse locking/atomic, kref: Use kref_get_unless_zero() more locking/atomic, kref: Kill kref_sub() locking/atomic, kref: Add kref_read() locking/atomic, kref: Add KREF_INIT() ...
2017-02-20Merge branch 'sched-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main changes in this (fairly busy) cycle were: - There was a class of scheduler bugs related to forgetting to update the rq-clock timestamp which can cause weird and hard to debug problems, so there's a new debug facility for this: which uncovered a whole lot of bugs which convinced us that we want to keep the debug facility. (Peter Zijlstra, Matt Fleming) - Various cputime related updates: eliminate cputime and use u64 nanoseconds directly, simplify and improve the arch interfaces, implement delayed accounting more widely, etc. - (Frederic Weisbecker) - Move code around for better structure plus cleanups (Ingo Molnar) - Move IO schedule accounting deeper into the scheduler plus related changes to improve the situation (Tejun Heo) - ... plus a round of sched/rt and sched/deadline fixes, plus other fixes, updats and cleanups" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (85 commits) sched/core: Remove unlikely() annotation from sched_move_task() sched/autogroup: Rename auto_group.[ch] to autogroup.[ch] sched/topology: Split out scheduler topology code from core.c into topology.c sched/core: Remove unnecessary #include headers sched/rq_clock: Consolidate the ordering of the rq_clock methods delayacct: Include <uapi/linux/taskstats.h> sched/core: Clean up comments sched/rt: Show the 'sched_rr_timeslice' SCHED_RR timeslice tuning knob in milliseconds sched/clock: Add dummy clear_sched_clock_stable() stub function sched/cputime: Remove generic asm headers sched/cputime: Remove unused nsec_to_cputime() s390, sched/cputime: Remove unused cputime definitions powerpc, sched/cputime: Remove unused cputime definitions s390, sched/cputime: Make arch_cpu_idle_time() to return nsecs ia64, sched/cputime: Remove unused cputime definitions ia64: Convert vtime to use nsec units directly ia64, sched/cputime: Move the nsecs based cputime headers to the last arch using it sched/cputime: Remove jiffies based cputime sched/cputime, vtime: Return nsecs instead of cputime_t to account sched/cputime: Complete nsec conversion of tick based accounting ...
2017-02-20Merge branch 'efi-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI updates from Ingo Molnar: "The main changes in this cycle were: - Changes to the EFI init code to establish whether secure boot authentication was performed at boot time. (Josh Boyer, David Howells) - Wire up the UEFI memory attributes table for x86. This eliminates any runtime memory regions that are both writable and executable, on recent firmware versions. (Sai Praneeth) - Move the BGRT init code to an earlier stage so that we can still use efi_mem_reserve(). (Dave Young) - Preserve debug symbols in the ARM/arm64 UEFI stub (Ard Biesheuvel) - Code deduplication work and various other cleanups (Lukas Wunner) - ... plus various other fixes and cleanups" * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi/libstub: Make file I/O chunking x86-specific efi: Print the secure boot status in x86 setup_arch() efi: Disable secure boot if shim is in insecure mode efi: Get and store the secure boot status efi: Add SHIM and image security database GUID definitions arm/efi: Allow invocation of arbitrary runtime services x86/efi: Allow invocation of arbitrary runtime services efi/libstub: Preserve .debug sections after absolute relocation check efi/x86: Add debug code to print cooked memmap efi/x86: Move the EFI BGRT init code to early init code efi: Use typed function pointers for the runtime services table efi/esrt: Fix typo in pr_err() message x86/efi: Add support for EFI_MEMORY_ATTRIBUTES_TABLE efi: Introduce the EFI_MEM_ATTR bit and set it from the memory attributes table efi: Make EFI_MEMORY_ATTRIBUTES_TABLE initialization common across all architectures x86/efi: Deduplicate efi_char16_printk() efi: Deduplicate efi_file_size() / _read() / _close()
2017-02-20Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The RCU changes in this cycle are: - Dynticks updates, consolidating open-coded counter accesses into a well-defined API - SRCU updates: Simplify algorithm, add formal verification - Documentation updates - Miscellaneous fixes - Torture-test updates Most of the diffstat comes from the relatively large documentation update" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits) srcu: Reduce probability of SRCU ->unlock_count[] counter overflow rcutorture: Add CBMC-based formal verification for SRCU srcu: Force full grace-period ordering srcu: Implement more-efficient reader counts rcu: Adjust FQS offline checks for exact online-CPU detection rcu: Check cond_resched_rcu_qs() state less often to reduce GP overhead rcu: Abstract extended quiescent state determination rcu: Abstract dynticks extended quiescent state enter/exit operations rcu: Add lockdep checks to synchronous expedited primitives rcu: Eliminate unused expedited_normal counter llist: Clarify comments about when locking is needed rcu: Fix comment in rcu_organize_nocb_kthreads() rcu: Enable RCU tracepoints by default to aid in debugging rcu: Make rcu_cpu_starting() use its "cpu" argument rcu: Add comment headers to expedited-grace-period counter functions rcu: Don't wake rcuc/X kthreads on NOCB CPUs rcu: Re-enable TASKS_RCU for User Mode Linux rcu: Once again use NMI-based stack traces in stall warnings rcu: Remove short-term CPU kicking rcu: Add long-term CPU kicking ...
2017-02-13Reimplement IDR and IDA using the radix treeMatthew Wilcox
The IDR is very similar to the radix tree. It has some functionality that the radix tree did not have (alloc next free, cyclic allocation, a callback-based for_each, destroy tree), which is readily implementable on top of the radix tree. A few small changes were needed in order to use a tag to represent nodes with free space below them. More extensive changes were needed to support storing NULL as a valid entry in an IDR. Plain radix trees still interpret NULL as a not-present entry. The IDA is reimplemented as a client of the newly enhanced radix tree. As in the current implementation, it uses a bitmap at the last level of the tree. Signed-off-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2017-02-09core: migrate exception table users off module.h and onto extable.hPaul Gortmaker
These files were including module.h for exception table related functions. We've now separated that content out into its own file "extable.h" so now move over to that and where possible, avoid all the extra header content in module.h that we don't really need to compile these non-modular files. Note: init/main.c still needs module.h for __init_or_module kernel/extable.c still needs module.h for is_module_text_address ...and so we don't get the benefit of removing module.h from the cpp feed for these two files, unlike the almost universal 1:1 exchange of module.h for extable.h we were able to do in the arch dirs. Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Jessica Yu <jeyu@redhat.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-08printk: rename nmi.c and exported apiSergey Senozhatsky
A preparation patch for printk_safe work. No functional change. - rename nmi.c to print_safe.c - add `printk_safe' prefix to some (which used both by printk-safe and printk-nmi) of the exported functions. Link: http://lkml.kernel.org/r/20161227141611.940-3-sergey.senozhatsky@gmail.com Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jan Kara <jack@suse.cz> Cc: Tejun Heo <tj@kernel.org> Cc: Calvin Owens <calvinowens@fb.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com>
2017-02-07arch: Rename CONFIG_DEBUG_RODATA and CONFIG_DEBUG_MODULE_RONXLaura Abbott
Both of these options are poorly named. The features they provide are necessary for system security and should not be considered debug only. Change the names to CONFIG_STRICT_KERNEL_RWX and CONFIG_STRICT_MODULE_RWX to better describe what these options do. Signed-off-by: Laura Abbott <labbott@redhat.com> Acked-by: Jessica Yu <jeyu@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org>
2017-02-07Merge tag 'v4.10-rc7' into efi/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-06Merge 4.10-rc7 into char-misc-nextGreg Kroah-Hartman
We want the hv and other fixes in here as well to handle merge and testing issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-03kbuild: modversions: add infrastructure for emitting relative CRCsArd Biesheuvel
This add the kbuild infrastructure that will allow architectures to emit vmlinux symbol CRCs as 32-bit offsets to another location in the kernel where the actual value is stored. This works around problems with CRCs being mistaken for relocatable symbols on kernels that self relocate at runtime (i.e., powerpc with CONFIG_RELOCATABLE=y) For the kbuild side of things, this comes down to the following: - introducing a Kconfig symbol MODULE_REL_CRCS - adding a -R switch to genksyms to instruct it to emit the CRC symbols as references into the .rodata section - making modpost distinguish such references from absolute CRC symbols by the section index (SHN_ABS) - making kallsyms disregard non-absolute symbols with a __crc_ prefix Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-01efi/x86: Move the EFI BGRT init code to early init codeDave Young
Before invoking the arch specific handler, efi_mem_reserve() reserves the given memory region through memblock. efi_bgrt_init() will call efi_mem_reserve() after mm_init(), at which time memblock is dead and should not be used anymore. The EFI BGRT code depends on ACPI initialization to get the BGRT ACPI table, so move parsing of the BGRT table to ACPI early boot code to ensure that efi_mem_reserve() in EFI BGRT code still use memblock safely. Tested-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-acpi@vger.kernel.org Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1485868902-20401-9-git-send-email-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-31Merge branch 'for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU changes from Paul E. McKenney: - Dynticks updates, consolidating open-coded counter accesses into a well-defined API - SRCU updates: Simplify algorithm, add formal verification - Documentation updates - Miscellaneous fixes - Torture-test updates Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-27random: use chacha20 for get_random_int/longJason A. Donenfeld
Now that our crng uses chacha20, we can rely on its speedy characteristics for replacing MD5, while simultaneously achieving a higher security guarantee. Before the idea was to use these functions if you wanted random integers that aren't stupidly insecure but aren't necessarily secure either, a vague gray zone, that hopefully was "good enough" for its users. With chacha20, we can strengthen this claim, since either we're using an rdrand-like instruction, or we're using the same crng as /dev/urandom. And it's faster than what was before. We could have chosen to replace this with a SipHash-derived function, which might be slightly faster, but at the cost of having yet another RNG construction in the kernel. By moving to chacha20, we have a single RNG to analyze and verify, and we also already get good performance improvements on all platforms. Implementation-wise, rather than use a generic buffer for both get_random_int/long and memcpy based on the size needs, we use a specific buffer for 32-bit reads and for 64-bit reads. This way, we're guaranteed to always have aligned accesses on all platforms. While slightly more verbose in C, the assembly this generates is a lot simpler than otherwise. Finally, on 32-bit platforms where longs and ints are the same size, we simply alias get_random_int to get_random_long. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Suggested-by: Theodore Ts'o <tytso@mit.edu> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-01-23rcu: Re-enable TASKS_RCU for User Mode LinuxPaul E. McKenney
Now that User Mode Linux supports arch_irqs_disabled_flags(), this commit re-enables TASKS_RCU for User Mode Linux. Reported-by: Richard Weinberger <richard@nod.at> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2017-01-19pc104: Introduce the PC104 Kconfig optionWilliam Breathitt Gray
PC/104 form factor devices serve a specific niche of embedded system users; most Linux users will not have PC/104 form factor devices. This patch introduces the PC104 Kconfig option, which should be used to filter PC/104 specific device drivers and options, so that only those users interested in PC/104 related options are exposed to them. Signed-off-by: William Breathitt Gray <vilhelm.gray@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-16rcu: update: Make RCU_EXPEDITE_BOOT be the defaultSebastian Andrzej Siewior
RCU_EXPEDITE_BOOT should speed up the boot process by enforcing synchronize_rcu_expedited() instead of synchronize_rcu() during the boot process. There should be no reason why one does not want this and there is no need worry about real time latency at this point. Therefore make it default. Note that users wishing to avoid expediting entirely, for example when bringing up new hardware possibly having flaky IPIs, can use the rcu_normal boot parameter to override boot-time expediting. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> [ paulmck: Reworded commit log. ] Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2017-01-14locking/atomic, kref: Add KREF_INIT()Peter Zijlstra
Since we need to change the implementation, stop exposing internals. Provide KREF_INIT() to allow static initialization of struct kref. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-14sched/clock: Delay switching sched_clock to stablePeter Zijlstra
Currently we switch to the stable sched_clock if we guess the TSC is usable, and then switch back to the unstable path if it turns out TSC isn't stable during SMP bringup after all. Delay switching to the stable path until after SMP bringup is complete. This way we'll avoid switching during the time we detect the worst of the TSC offences. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-11cgroup: move CONFIG_SOCK_CGROUP_DATA to init/KconfigArnd Bergmann
We now 'select SOCK_CGROUP_DATA' but Kconfig complains that this is not right when CONFIG_NET is disabled and there is no socket interface: warning: (CGROUP_BPF) selects SOCK_CGROUP_DATA which has unmet direct dependencies (NET) I don't know what the correct solution for this is, but simply removing the dependency on NET from SOCK_CGROUP_DATA by moving it out of the 'if NET' section avoids the warning and does not produce other build errors. Fixes: 483c4933ea09 ("cgroup: Fix CGROUP_BPF config") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-10rdmacg: Added rdma cgroup controllerParav Pandit
Added rdma cgroup controller that does accounting, limit enforcement on rdma/IB resources. Added rdma cgroup header file which defines its APIs to perform charging/uncharging functionality. It also defined APIs for RDMA/IB stack for device registration. Devices which are registered will participate in controller functions of accounting and limit enforcements. It define rdmacg_device structure to bind IB stack and RDMA cgroup controller. RDMA resources are tracked using resource pool. Resource pool is per device, per cgroup entity which allows setting up accounting limits on per device basis. Currently resources are defined by the RDMA cgroup. Resource pool is created/destroyed dynamically whenever charging/uncharging occurs respectively and whenever user configuration is done. Its a tradeoff of memory vs little more code space that creates resource pool object whenever necessary, instead of creating them during cgroup creation and device registration time. Signed-off-by: Parav Pandit <pandit.parav@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>