summaryrefslogtreecommitdiff
path: root/kernel/cgroup/rstat.c
AgeCommit message (Collapse)Author
2020-05-28cgroup: add cpu.stat file to root cgroupBoris Burkov
Currently, the root cgroup does not have a cpu.stat file. Add one which is consistent with /proc/stat to capture global cpu statistics that might not fall under cgroup accounting. We haven't done this in the past because the data are already presented in /proc/stat and we didn't want to add overhead from collecting root cgroup stats when cgroups are configured, but no cgroups have been created. By keeping the data consistent with /proc/stat, I think we avoid the first problem, while improving the usability of cgroups stats. We avoid the second problem by computing the contents of cpu.stat from existing data collected for /proc/stat anyway. Signed-off-by: Boris Burkov <boris@bur.io> Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org>
2020-04-09Revert "cgroup: Add memory barriers to plug cgroup_rstat_updated() race window"Tejun Heo
This reverts commit 9a9e97b2f1f2 ("cgroup: Add memory barriers to plug cgroup_rstat_updated() race window"). The commit was added in anticipation of memcg rstat conversion which needed synchronous accounting for the event counters (e.g. oom kill count). However, the conversion didn't get merged due to percpu memory overhead concern which couldn't be addressed at the time. Unfortunately, the patch's addition of smp_mb() to cgroup_rstat_updated() meant that every scheduling event now had to go through an additional full barrier and Mel Gorman noticed it as 1% regression in netperf UDP_STREAM test. There's no need to have this barrier in tree now and even if we need synchronous accounting in the future, the right thing to do is separating that out to a separate function so that hot paths which don't care about synchronous behavior don't have to pay the overhead of the full barrier. Let's revert. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Mel Gorman <mgorman@techsingularity.net> Link: http://lkml.kernel.org/r/20200409154413.GK3818@techsingularity.net Cc: v4.18+
2020-01-15cgroup: fix function name in commentChen Zhou
Function name cgroup_rstat_cpu_pop_upated() in comment should be cgroup_rstat_cpu_pop_updated(). Signed-off-by: Chen Zhou <chenzhou10@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2019-11-06cgroup: use cgroup->last_bstat instead of cgroup->bstat_pending for consistencyTejun Heo
cgroup->bstat_pending is used to determine the base stat delta to propagate to the parent. While correct, this is different from how percpu delta is determined for no good reason and the inconsistency makes the code more difficult to understand. This patch makes parent propagation delta calculation use the same method as percpu to global propagation. * cgroup_base_stat_accumulate() is renamed to cgroup_base_stat_add() and cgroup_base_stat_sub() is added. * percpu propagation calculation is updated to use the above helpers. * cgroup->bstat_pending is replaced with cgroup->last_bstat and updated to use the same calculation as percpu propagation. Signed-off-by: Tejun Heo <tj@kernel.org>
2019-05-21treewide: Add SPDX license identifier for missed filesThomas Gleixner
Add SPDX license identifiers to all files which: - Have no license information of any form - Have EXPORT_.*_SYMBOL_GPL inside which was used in the initial scan/conversion to ignore the file These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-15cgroup, rstat: Don't flush subtree root unless necessaryTejun Heo
cgroup_rstat_cpu_pop_updated() is used to traverse the updated cgroups on flush. While it was only visiting updated ones in the subtree, it was visiting @root unconditionally. We can easily check whether @root is updated or not by looking at its ->updated_next just as with the cgroups in the subtree. * Remove the unnecessary cgroup_parent() test. The system root cgroup is never updated and thus its ->updated_next is always NULL. No need to test whether cgroup_parent() exists in addition to ->updated_next. * Terminate traverse if ->updated_next is NULL. This can only happen for subtree @root and there's no reason to visit it if it's not marked updated. This reduces cpu consumption when reading a lot of rstat backed files. In a micro benchmark reading stat from ~1600 cgroups, the sys time was lowered by >40%. Signed-off-by: Tejun Heo <tj@kernel.org>
2018-04-26cgroup: Make cgroup_rstat_updated() ready for root cgroup usageTejun Heo
cgroup_rstat_updated() ensures that the cgroup's rstat is linked to the parent. If there's no parent, it never gets linked and the function ends up grabbing and releasing the cgroup_rstat_lock each time for no reason which can be expensive. This hasn't been a problem till now because nobody was calling the function for the root cgroup but rstat is gonna be exposed to controllers and use cases, so let's get ready. Make cgroup_rstat_updated() an no-op for the root cgroup. Signed-off-by: Tejun Heo <tj@kernel.org>
2018-04-26cgroup: Add memory barriers to plug cgroup_rstat_updated() race windowTejun Heo
cgroup_rstat_updated() has a small race window where an updated signaling can race with flush and could be lost till the next update. This wasn't a problem for the existing usages, but we plan to use rstat to track counters which need to be accurate. This patch plugs the race window by synchronizing cgroup_rstat_updated() and flush path with memory barriers around cgroup_rstat_cpu->updated_next pointer. Signed-off-by: Tejun Heo <tj@kernel.org>
2018-04-26cgroup: Add cgroup_subsys->css_rstat_flush()Tejun Heo
This patch adds cgroup_subsys->css_rstat_flush(). If a subsystem has this callback, its csses are linked on cgrp->css_rstat_list and rstat will call the function whenever the associated cgroup is flushed. Flush is also performed when such csses are released so that residual counts aren't lost. Combined with the rstat API previous patches factored out, this allows controllers to plug into rstat to manage their statistics in a scalable way. Signed-off-by: Tejun Heo <tj@kernel.org>
2018-04-26cgroup: Replace cgroup_rstat_mutex with a spinlockTejun Heo
Currently, rstat flush path is protected with a mutex which is fine as all the existing users are from interface file show path. However, rstat is being generalized for use by controllers and flushing from atomic contexts will be necessary. This patch replaces cgroup_rstat_mutex with a spinlock and adds a irq-safe flush function - cgroup_rstat_flush_irqsafe(). Explicit yield handling is added to the flush path so that other flush functions can yield to other threads and flushers. Signed-off-by: Tejun Heo <tj@kernel.org>
2018-04-26cgroup: Factor out and expose cgroup_rstat_*() interface functionsTejun Heo
cgroup_rstat is being generalized so that controllers can use it too. This patch factors out and exposes the following interface functions. * cgroup_rstat_updated(): Renamed from cgroup_rstat_cpu_updated() for consistency. * cgroup_rstat_flush_hold/release(): Factored out from base stat implementation. * cgroup_rstat_flush(): Verbatim expose. While at it, drop assert on cgroup_rstat_mutex in cgroup_base_stat_flush() as it crosses layers and make a minor comment update. v2: Added EXPORT_SYMBOL_GPL(cgroup_rstat_updated) to fix a build bug. Signed-off-by: Tejun Heo <tj@kernel.org>
2018-04-26cgroup: Reorganize kernel/cgroup/rstat.cTejun Heo
Currently, rstat.c has rstat and base stat implementations intermixed. Collect base stat implementation at the end of the file. Also, reorder the prototypes. This patch doesn't make any functional changes. Signed-off-by: Tejun Heo <tj@kernel.org>
2018-04-26cgroup: Distinguish base resource stat implementation from rstatTejun Heo
Base resource stat accounts universial (not specific to any controller) resource consumptions on top of rstat. Currently, its implementation is intermixed with rstat implementation making the code confusing to follow. This patch clarifies the distintion by doing the followings. * Encapsulate base resource stat counters, currently only cputime, in struct cgroup_base_stat. * Move prev_cputime into struct cgroup and initialize it with cgroup. * Rename the related functions so that they start with cgroup_base_stat. * Prefix the related variables and field names with b. This patch doesn't make any functional changes. Signed-off-by: Tejun Heo <tj@kernel.org>
2018-04-26cgroup: Rename stat to rstatTejun Heo
stat is too generic a name and ends up causing subtle confusions. It'll be made generic so that controllers can plug into it, which will make the problem worse. Let's rename it to something more specific - cgroup_rstat for cgroup recursive stat. This patch does the following renames. No other changes. * cpu_stat -> rstat_cpu * stat -> rstat * ?cstat -> ?rstatc Note that the renames are selective. The unrenamed are the ones which implement basic resource statistics on top of rstat. This will be further cleaned up in the following patches. Signed-off-by: Tejun Heo <tj@kernel.org>
2018-04-26cgroup: Rename kernel/cgroup/stat.c to kernel/cgroup/rstat.cTejun Heo
stat is too generic a name and ends up causing subtle confusions. It'll be made generic so that controllers can plug into it, which will make the problem worse. Let's rename it to something more specific - cgroup_rstat for cgroup recursive stat. First, rename kernel/cgroup/stat.c to kernel/cgroup/rstat.c. No content changes. Signed-off-by: Tejun Heo <tj@kernel.org>