diff options
author | Delyan Kratunov <delyank@meta.com> | 2022-10-21 19:36:38 +0000 |
---|---|---|
committer | Alexei Starovoitov <ast@kernel.org> | 2022-10-21 13:58:09 -0700 |
commit | eb814cf1adea0ce24413c26c22e9f1a556a45d34 (patch) | |
tree | 57f74ecffec4ad4400e21b78f6afe43007d9d3a9 /tools/testing/selftests/bpf/prog_tests/task_local_storage.c | |
parent | 12f96823a9d7724f819424f6232d9e46a56e80d1 (diff) |
selftests/bpf: fix task_local_storage/exit_creds rcu usage
BPF CI has revealed flakiness in the task_local_storage/exit_creds test.
The failure point in CI [1] is that null_ptr_count is equal to 0,
which indicates that the program hasn't run yet. This points to the
kern_sync_rcu (sys_membarrier -> synchronize_rcu underneath) not
waiting sufficiently.
Indeed, synchronize_rcu only waits for read-side sections that started
before the call. If the program execution starts *during* the
synchronize_rcu invocation (due to, say, preemption), the test won't
wait long enough.
As a speculative fix, make the synchornize_rcu calls in a loop until
an explicit run counter has gone up.
[1]: https://github.com/kernel-patches/bpf/actions/runs/3268263235/jobs/5374940791
Signed-off-by: Delyan Kratunov <delyank@meta.com>
Link: https://lore.kernel.org/r/156d4ef82275a074e8da8f4cffbd01b0c1466493.camel@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Diffstat (limited to 'tools/testing/selftests/bpf/prog_tests/task_local_storage.c')
-rw-r--r-- | tools/testing/selftests/bpf/prog_tests/task_local_storage.c | 18 |
1 files changed, 15 insertions, 3 deletions
diff --git a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c index 035c263aab1b..99a42a2b6e14 100644 --- a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c +++ b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c @@ -39,7 +39,8 @@ out: static void test_exit_creds(void) { struct task_local_storage_exit_creds *skel; - int err; + int err, run_count, sync_rcu_calls = 0; + const int MAX_SYNC_RCU_CALLS = 1000; skel = task_local_storage_exit_creds__open_and_load(); if (!ASSERT_OK_PTR(skel, "skel_open_and_load")) @@ -53,8 +54,19 @@ static void test_exit_creds(void) if (CHECK_FAIL(system("ls > /dev/null"))) goto out; - /* sync rcu to make sure exit_creds() is called for "ls" */ - kern_sync_rcu(); + /* kern_sync_rcu is not enough on its own as the read section we want + * to wait for may start after we enter synchronize_rcu, so our call + * won't wait for the section to finish. Loop on the run counter + * as well to ensure the program has run. + */ + do { + kern_sync_rcu(); + run_count = __atomic_load_n(&skel->bss->run_count, __ATOMIC_SEQ_CST); + } while (run_count == 0 && ++sync_rcu_calls < MAX_SYNC_RCU_CALLS); + + ASSERT_NEQ(sync_rcu_calls, MAX_SYNC_RCU_CALLS, + "sync_rcu count too high"); + ASSERT_NEQ(run_count, 0, "run_count"); ASSERT_EQ(skel->bss->valid_ptr_count, 0, "valid_ptr_count"); ASSERT_NEQ(skel->bss->null_ptr_count, 0, "null_ptr_count"); out: |