diff options
author | Martin KaFai Lau <kafai@fb.com> | 2022-05-11 17:06:05 -0700 |
---|---|---|
committer | Jakub Kicinski <kuba@kernel.org> | 2022-05-12 16:52:18 -0700 |
commit | cae3873c5b3a4fcd9706fb461ff4e91bdf1f0120 (patch) | |
tree | 54650bd157a612c462c1fb7856c1800babd1790b /net/dccp | |
parent | e8d0059000b20c4745c5b6a713f6adb269cff8ff (diff) |
net: inet: Retire port only listening_hash
The listen sk is currently stored in two hash tables,
listening_hash (hashed by port) and lhash2 (hashed by port and address).
After commit 0ee58dad5b06 ("net: tcp6: prefer listeners bound to an address")
and commit d9fbc7f6431f ("net: tcp: prefer listeners bound to an address"),
the TCP-SYN lookup fast path does not use listening_hash.
The commit 05c0b35709c5 ("tcp: seq_file: Replace listening_hash with lhash2")
also moved the seq_file (/proc/net/tcp) iteration usage from
listening_hash to lhash2.
There are still a few listening_hash usages left.
One of them is inet_reuseport_add_sock() which uses the listening_hash
to search a listen sk during the listen() system call. This turns
out to be very slow on use cases that listen on many different
VIPs at a popular port (e.g. 443). [ On top of the slowness in
adding to the tail in the IPv6 case ]. The latter patch has a
selftest to demonstrate this case.
This patch takes this chance to move all remaining listening_hash
usages to lhash2 and then retire listening_hash.
Since most changes need to be done together, it is hard to cut
the listening_hash to lhash2 switch into small patches. The
changes in this patch is highlighted here for the review
purpose.
1. Because of the listening_hash removal, lhash2 can use the
sk->sk_nulls_node instead of the icsk->icsk_listen_portaddr_node.
This will also keep the sk_unhashed() check to work as is
after stop adding sk to listening_hash.
The union is removed from inet_listen_hashbucket because
only nulls_head is needed.
2. icsk->icsk_listen_portaddr_node and its helpers are removed.
3. The current lhash2 users needs to iterate with sk_nulls_node
instead of icsk_listen_portaddr_node.
One case is in the inet[6]_lhash2_lookup().
Another case is the seq_file iterator in tcp_ipv4.c.
One thing to note is sk_nulls_next() is needed
because the old inet_lhash2_for_each_icsk_continue()
does a "next" first before iterating.
4. Move the remaining listening_hash usage to lhash2
inet_reuseport_add_sock() which this series is
trying to improve.
inet_diag.c and mptcp_diag.c are the final two
remaining use cases and is moved to lhash2 now also.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Diffstat (limited to 'net/dccp')
-rw-r--r-- | net/dccp/proto.c | 1 |
1 files changed, 0 insertions, 1 deletions
diff --git a/net/dccp/proto.c b/net/dccp/proto.c index 58421f94427e..eb8e128e43e8 100644 --- a/net/dccp/proto.c +++ b/net/dccp/proto.c @@ -1110,7 +1110,6 @@ static int __init dccp_init(void) BUILD_BUG_ON(sizeof(struct dccp_skb_cb) > sizeof_field(struct sk_buff, cb)); - inet_hashinfo_init(&dccp_hashinfo); rc = inet_hashinfo2_init_mod(&dccp_hashinfo); if (rc) goto out_fail; |