| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| | |\ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | | |
Daniel Borkmann says:
====================
pull-request: bpf-next 2022-01-24
We've added 80 non-merge commits during the last 14 day(s) which contain
a total of 128 files changed, 4990 insertions(+), 895 deletions(-).
The main changes are:
1) Add XDP multi-buffer support and implement it for the mvneta driver,
from Lorenzo Bianconi, Eelco Chaudron and Toke Høiland-Jørgensen.
2) Add unstable conntrack lookup helpers for BPF by using the BPF kfunc
infra, from Kumar Kartikeya Dwivedi.
3) Extend BPF cgroup programs to export custom ret value to userspace via
two helpers bpf_get_retval() and bpf_set_retval(), from YiFei Zhu.
4) Add support for AF_UNIX iterator batching, from Kuniyuki Iwashima.
5) Complete missing UAPI BPF helper description and change bpf_doc.py script
to enforce consistent & complete helper documentation, from Usama Arif.
6) Deprecate libbpf's legacy BPF map definitions and streamline XDP APIs to
follow tc-based APIs, from Andrii Nakryiko.
7) Support BPF_PROG_QUERY for BPF programs attached to sockmap, from Di Zhu.
8) Deprecate libbpf's bpf_map__def() API and replace users with proper getters
and setters, from Christy Lee.
9) Extend libbpf's btf__add_btf() with an additional hashmap for strings to
reduce overhead, from Kui-Feng Lee.
10) Fix bpftool and libbpf error handling related to libbpf's hashmap__new()
utility function, from Mauricio Vásquez.
11) Add support to BTF program names in bpftool's program dump, from Raman Shukhau.
12) Fix resolve_btfids build to pick up host flags, from Connor O'Brien.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (80 commits)
selftests, bpf: Do not yet switch to new libbpf XDP APIs
selftests, xsk: Fix rx_full stats test
bpf: Fix flexible_array.cocci warnings
xdp: disable XDP_REDIRECT for xdp frags
bpf: selftests: add CPUMAP/DEVMAP selftests for xdp frags
bpf: selftests: introduce bpf_xdp_{load,store}_bytes selftest
net: xdp: introduce bpf_xdp_pointer utility routine
bpf: generalise tail call map compatibility check
libbpf: Add SEC name for xdp frags programs
bpf: selftests: update xdp_adjust_tail selftest to include xdp frags
bpf: test_run: add xdp_shared_info pointer in bpf_test_finish signature
bpf: introduce frags support to bpf_prog_test_run_xdp()
bpf: move user_size out of bpf_test_init
bpf: add frags support to xdp copy helpers
bpf: add frags support to the bpf_xdp_adjust_tail() API
bpf: introduce bpf_xdp_get_buff_len helper
net: mvneta: enable jumbo frames if the loaded XDP program support frags
bpf: introduce BPF_F_XDP_HAS_FRAGS flag in prog_flags loading the ebpf program
net: mvneta: add frags support to XDP_TX
xdp: add frags support to xdp_return_{buff/frame}
...
====================
Link: https://lore.kernel.org/r/20220124221235.18993-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| | | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | | |
This change adds support for tail growing and shrinking for XDP frags.
When called on a non-linear packet with a grow request, it will work
on the last fragment of the packet. So the maximum grow size is the
last fragments tailroom, i.e. no new buffer will be allocated.
A XDP frags capable driver is expected to set frag_size in xdp_rxq_info
data structure to notify the XDP core the fragment size.
frag_size set to 0 is interpreted by the XDP core as tail growing is
not allowed.
Introduce __xdp_rxq_info_reg utility routine to initialize frag_size field.
When shrinking, it will work from the last fragment, all the way down to
the base buffer depending on the shrinking size. It's important to mention
that once you shrink down the fragment(s) are freed, so you can not grow
again to the original size.
Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Link: https://lore.kernel.org/r/eabda3485dda4f2f158b477729337327e609461d.1642758637.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
| | | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | | |
Introduce bpf_xdp_get_buff_len helper in order to return the xdp buffer
total size (linear and paged area)
Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/aac9ac3504c84026cf66a3c71b7c5ae89bc991be.1642758637.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
| | | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | | |
Take into account if the received xdp_buff/xdp_frame is non-linear
recycling/returning the frame memory to the allocator or into
xdp_frame_bulk.
Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/a961069febc868508ce1bdf5e53a343eb4e57cb2.1642758637.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
| | | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | | |
Introduce xdp_update_skb_shared_info routine to update frags array
metadata in skb_shared_info data structure converting to a skb from
a xdp_buff or xdp_frame.
According to the current skb_shared_info architecture in
xdp_frame/xdp_buff and to the xdp frags support, there is
no need to run skb_add_rx_frag() and reset frags array converting the buffer
to a skb since the frag array will be in the same position for xdp_buff/xdp_frame
and for the skb, we just need to update memory metadata.
Introduce XDP_FLAGS_PF_MEMALLOC flag in xdp_buff_flags in order to mark
the xdp_buff or xdp_frame as under memory-pressure if pages of the frags array
are under memory pressure. Doing so we can avoid looping over all fragments in
xdp_update_skb_shared_info routine. The driver is expected to set the
flag constructing the xdp_buffer using xdp_buff_set_frag_pfmemalloc
utility routine.
Rely on xdp_update_skb_shared_info in __xdp_build_skb_from_frame routine
converting the non-linear xdp_frame to a skb after performing a XDP_REDIRECT.
Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/bfd23fb8a8d7438724f7819c567cdf99ffd6226f.1642758637.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
| | | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | | |
Introduce flags field in xdp_frame and xdp_buffer data structures
to define additional buffer features. At the moment the only
supported buffer feature is frags bit (XDP_FLAGS_HAS_FRAGS).
frags bit is used to specify if this is a linear buffer
(XDP_FLAGS_HAS_FRAGS not set) or a frags frame (XDP_FLAGS_HAS_FRAGS
set). In the latter case the driver is expected to initialize the
skb_shared_info structure at the end of the first buffer to link together
subsequent buffers belonging to the same frame.
Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/e389f14f3a162c0a5bc6a2e1aa8dd01a90be117d.1642758637.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
| | | | |/ / / / / / / /
| | |/| | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | | |
This change adds conntrack lookup helpers using the unstable kfunc call
interface for the XDP and TC-BPF hooks. The primary usecase is
implementing a synproxy in XDP, see Maxim's patchset [0].
Export get_net_ns_by_id as nf_conntrack_bpf.c needs to call it.
This object is only built when CONFIG_DEBUG_INFO_BTF_MODULES is enabled.
[0]: https://lore.kernel.org/bpf/20211019144655.3483197-1-maximmi@nvidia.com
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220114163953.1455836-7-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
| |\ \ \ \ \ \ \ \ \ \ \
| |_|_|_|_|_|_|_|_|_|/
|/| | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux
Pull flexible-array transformations from Gustavo Silva:
"Treewide patch that replaces zero-length arrays with flexible-array
members.
This has been baking in linux-next for a whole development cycle"
* tag 'flexible-array-transformations-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
treewide: Replace zero-length arrays with flexible-array members
|
| | | |_|/ / / / / / /
| |/| | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
There is a regular need in the kernel to provide a way to declare
having a dynamically sized set of trailing elements in a structure.
Kernel code should always use “flexible array members”[1] for these
cases. The older style of one-element or zero-length arrays should
no longer be used[2].
This code was transformed with the help of Coccinelle:
(next-20220214$ spatch --jobs $(getconf _NPROCESSORS_ONLN) --sp-file script.cocci --include-headers --dir . > output.patch)
@@
identifier S, member, array;
type T1, T2;
@@
struct S {
...
T1 member;
T2 array[
- 0
];
};
UAPI and wireless changes were intentionally excluded from this patch
and will be sent out separately.
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.16/process/deprecated.html#zero-length-and-one-element-arrays
Link: https://github.com/KSPP/linux/issues/78
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
|
| |\ \ \ \ \ \ \ \ \ \
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | | |
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net coming late
in the 5.17-rc process:
1) Revert port remap to mitigate shadowing service ports, this is causing
problems in existing setups and this mitigation can be achieved with
explicit ruleset, eg.
... tcp sport < 16386 tcp dport >= 32768 masquerade random
This patches provided a built-in policy similar to the one described above.
2) Disable register tracking infrastructure in nf_tables. Florian reported
two issues:
- Existing expressions with no implemented .reduce interface
that causes data-store on register should cancel the tracking.
- Register clobbering might be possible storing data on registers that
are larger than 32-bits.
This might lead to generating incorrect ruleset bytecode. These two
issues are scheduled to be addressed in the next release cycle.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_tables: disable register tracking
Revert "netfilter: conntrack: tag conntracks picked up in local out hook"
Revert "netfilter: nat: force port remap to prevent shadowing well-known ports"
====================
Link: https://lore.kernel.org/r/20220312220315.64531-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| | | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | | |
This was a prerequisite for the ill-fated
"netfilter: nat: force port remap to prevent shadowing well-known ports".
As this has been reverted, this change can be backed out too.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
| | |_|_|_|_|_|_|_|_|/
|/| | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
When iterating over sockets using vsock_for_each_connected_socket, make
sure that a transport filters out sockets that don't belong to the
transport.
There actually was an issue caused by this; in a nested VM
configuration, destroying the nested VM (which often involves the
closing of /dev/vhost-vsock if there was h2g connections to the nested
VM) kills not only the h2g connections, but also all existing g2h
connections to the (outmost) host which are totally unrelated.
Tested: Executed the following steps on Cuttlefish (Android running on a
VM) [1]: (1) Enter into an `adb shell` session - to have a g2h
connection inside the VM, (2) open and then close /dev/vhost-vsock by
`exec 3< /dev/vhost-vsock && exec 3<&-`, (3) observe that the adb
session is not reset.
[1] https://android.googlesource.com/device/google/cuttlefish/
Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jiyong Park <jiyong@google.com>
Link: https://lore.kernel.org/r/20220311020017.1509316-1-jiyong@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| |\ \ \ \ \ \ \ \ \ \
| |_|_|_|_|_|_|_|_|/
|/| | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2022-03-09
1) Fix IPv6 PMTU discovery for xfrm interfaces.
From Lina Wang.
2) Revert failing for policies and states that are
configured with XFRMA_IF_ID 0. It broke a
user configuration. From Kai Lueke.
3) Fix a possible buffer overflow in the ESP output path.
4) Fix ESP GSO for tunnel and BEET mode on inter address
family tunnels.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
The maximum message size that can be send is bigger than
the maximum site that skb_page_frag_refill can allocate.
So it is possible to write beyond the allocated buffer.
Fix this by doing a fallback to COW in that case.
v2:
Avoid get get_order() costs as suggested by Linus Torvalds.
Fixes: cac2661c53f3 ("esp4: Avoid skb_cow_data whenever possible")
Fixes: 03e2a30f6a27 ("esp6: Avoid skb_cow_data whenever possible")
Reported-by: valis <sec@valis.email>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
| | |/ / / / / / / /
|/| | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
While investigating on why a synchronize_net() has been added recently
in ipv6_mc_down(), I found that igmp6_event_query() and igmp6_event_report()
might drop skbs in some cases.
Discussion about removing synchronize_net() from ipv6_mc_down()
will happen in a different thread.
Fixes: f185de28d9ae ("mld: add new workqueues for process mld events")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Taehee Yoo <ap420073@gmail.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220303173728.937869-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| |\ \ \ \ \ \ \ \ \
| |/ / / / / / / /
|/| | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
1) Use kfree_rcu(ptr, rcu) variant, using kfree_rcu(ptr) was not
intentional. From Eric Dumazet.
2) Use-after-free in netfilter hook core, from Eric Dumazet.
3) Missing rcu read lock side for netfilter egress hook,
from Florian Westphal.
4) nf_queue assume state->sk is full socket while it might not be.
Invoke sock_gen_put(), from Florian Westphal.
5) Add selftest to exercise the reported KASAN splat in 4)
6) Fix possible use-after-free in nf_queue in case sk_refcnt is 0.
Also from Florian.
7) Use input interface index only for hardware offload, not for
the software plane. This breaks tc ct action. Patch from Paul Blakey.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
net/sched: act_ct: Fix flow table lookup failure with no originating ifindex
netfilter: nf_queue: handle socket prefetch
netfilter: nf_queue: fix possible use-after-free
selftests: netfilter: add nfqueue TCP_NEW_SYN_RECV socket race test
netfilter: nf_queue: don't assume sk is full socket
netfilter: egress: silence egress hook lockdep splats
netfilter: fix use-after-free in __nf_register_net_hook()
netfilter: nf_tables: prefer kfree_rcu(ptr, rcu) variant
====================
Link: https://lore.kernel.org/r/20220301215337.378405-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| | | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
After cited commit optimizted hw insertion, flow table entries are
populated with ifindex information which was intended to only be used
for HW offload. This tuple ifindex is hashed in the flow table key, so
it must be filled for lookup to be successful. But tuple ifindex is only
relevant for the netfilter flowtables (nft), so it's not filled in
act_ct flow table lookup, resulting in lookup failure, and no SW
offload and no offload teardown for TCP connection FIN/RST packets.
To fix this, add new tc ifindex field to tuple, which will
only be used for offloading, not for lookup, as it will not be
part of the tuple hash.
Fixes: 9795ded7f924 ("net/sched: act_ct: Fill offloading tuple iifidx")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| | | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Eric Dumazet says:
The sock_hold() side seems suspect, because there is no guarantee
that sk_refcnt is not already 0.
On failure, we cannot queue the packet and need to indicate an
error. The packet will be dropped by the caller.
v2: split skb prefetch hunk into separate change
Fixes: 271b72c7fa82c ("udp: RCU handling for Unicast packets.")
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
|
| |\ \ \ \ \ \ \ \ \
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
1) Fix PMTU for IPv6 if the reported MTU minus the ESP overhead is
smaller than 1280. From Jiri Bohac.
2) Fix xfrm interface ID and inter address family tunneling when
migrating xfrm states. From Yan Yan.
3) Add missing xfrm intrerface ID initialization on xfrmi_changelink.
From Antony Antony.
4) Enforce validity of xfrm offload input flags so that userspace can't
send undefined flags to the offload driver.
From Leon Romanovsky.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
This reverts commit b515d2637276a3810d6595e10ab02c13bfd0b63a.
Commit b515d2637276a3810d6595e10ab02c13bfd0b63a ("xfrm: xfrm_state_mtu
should return at least 1280 for ipv6") in v5.14 breaks the TCP MSS
calculation in ipsec transport mode, resulting complete stalls of TCP
connections. This happens when the (P)MTU is 1280 or slighly larger.
The desired formula for the MSS is:
MSS = (MTU - ESP_overhead) - IP header - TCP header
However, the above commit clamps the (MTU - ESP_overhead) to a
minimum of 1280, turning the formula into
MSS = max(MTU - ESP overhead, 1280) - IP header - TCP header
With the (P)MTU near 1280, the calculated MSS is too large and the
resulting TCP packets never make it to the destination because they
are over the actual PMTU.
The above commit also causes suboptimal double fragmentation in
xfrm tunnel mode, as described in
https://lore.kernel.org/netdev/20210429202529.codhwpc7w6kbudug@dwarf.suse.cz/
The original problem the above commit was trying to fix is now fixed
by commit 6596a0229541270fb8d38d989f91b78838e5e9da ("xfrm: fix MTU
regression").
Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
| | | |_|/ / / / / /
| |/| | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
This patch enables distinguishing SAs and SPs based on if_id during
the xfrm_migrate flow. This ensures support for xfrm interfaces
throughout the SA/SP lifecycle.
When there are multiple existing SPs with the same direction,
the same xfrm_selector and different endpoint addresses,
xfrm_migrate might fail with ENODATA.
Specifically, the code path for performing xfrm_migrate is:
Stage 1: find policy to migrate with
xfrm_migrate_policy_find(sel, dir, type, net)
Stage 2: find and update state(s) with
xfrm_migrate_state_find(mp, net)
Stage 3: update endpoint address(es) of template(s) with
xfrm_policy_migrate(pol, m, num_migrate)
Currently "Stage 1" always returns the first xfrm_policy that
matches, and "Stage 3" looks for the xfrm_tmpl that matches the
old endpoint address. Thus if there are multiple xfrm_policy
with same selector, direction, type and net, "Stage 1" might
rertun a wrong xfrm_policy and "Stage 3" will fail with ENODATA
because it cannot find a xfrm_tmpl with the matching endpoint
address.
The fix is to allow userspace to pass an if_id and add if_id
to the matching rule in Stage 1 and Stage 2 since if_id is a
unique ID for xfrm_policy and xfrm_state. For compatibility,
if_id will only be checked if the attribute is set.
Tested with additions to Android's kernel unit test suite:
https://android-review.googlesource.com/c/kernel/tests/+/1668886
Signed-off-by: Yan Yan <evitayan@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
| | | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
When using hci_le_create_conn_sync it shall wait for the conn_timeout
since the connection complete may take longer than just 2 seconds.
Also fix the masking of HCI_EV_LE_ENHANCED_CONN_COMPLETE and
HCI_EV_LE_CONN_COMPLETE so they are never both set so we can predict
which one the controller will use in case of HCI_OP_LE_CREATE_CONN.
Fixes: 6cd29ec6ae5e3 ("Bluetooth: hci_sync: Wait for proper events when connecting LE")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
| | |_|_|_|_|_|_|/
|/| | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
Since bt_skb_sendmmsg can be used with the likes of SOCK_STREAM it
shall return the partial chunks it could allocate instead of freeing
everything as otherwise it can cause problems like bellow.
Fixes: 81be03e026dc ("Bluetooth: RFCOMM: Replace use of memcpy_from_msg with bt_skb_sendmmsg")
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Link: https://lore.kernel.org/r/d7206e12-1b99-c3be-84f4-df22af427ef5@molgen.mpg.de
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215594
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> (Nokia N9 (MeeGo/Harmattan)
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
| | |/ / / / / /
|/| | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Ipv6 ttl, label and tos fields are modified without first
pulling/pushing the ipv6 header, which would have updated
the hw csum (if available). This might cause csum validation
when sending the packet to the stack, as can be seen in
the trace below.
Fix this by updating skb->csum if available.
Trace resulted by ipv6 ttl dec and then sending packet
to conntrack [actions: set(ipv6(hlimit=63)),ct(zone=99)]:
[295241.900063] s_pf0vf2: hw csum failure
[295241.923191] Call Trace:
[295241.925728] <IRQ>
[295241.927836] dump_stack+0x5c/0x80
[295241.931240] __skb_checksum_complete+0xac/0xc0
[295241.935778] nf_conntrack_tcp_packet+0x398/0xba0 [nf_conntrack]
[295241.953030] nf_conntrack_in+0x498/0x5e0 [nf_conntrack]
[295241.958344] __ovs_ct_lookup+0xac/0x860 [openvswitch]
[295241.968532] ovs_ct_execute+0x4a7/0x7c0 [openvswitch]
[295241.979167] do_execute_actions+0x54a/0xaa0 [openvswitch]
[295242.001482] ovs_execute_actions+0x48/0x100 [openvswitch]
[295242.006966] ovs_dp_process_packet+0x96/0x1d0 [openvswitch]
[295242.012626] ovs_vport_receive+0x6c/0xc0 [openvswitch]
[295242.028763] netdev_frame_hook+0xc0/0x180 [openvswitch]
[295242.034074] __netif_receive_skb_core+0x2ca/0xcb0
[295242.047498] netif_receive_skb_internal+0x3e/0xc0
[295242.052291] napi_gro_receive+0xba/0xe0
[295242.056231] mlx5e_handle_rx_cqe_mpwrq_rep+0x12b/0x250 [mlx5_core]
[295242.062513] mlx5e_poll_rx_cq+0xa0f/0xa30 [mlx5_core]
[295242.067669] mlx5e_napi_poll+0xe1/0x6b0 [mlx5_core]
[295242.077958] net_rx_action+0x149/0x3b0
[295242.086762] __do_softirq+0xd7/0x2d6
[295242.090427] irq_exit+0xf7/0x100
[295242.093748] do_IRQ+0x7f/0xd0
[295242.096806] common_interrupt+0xf/0xf
[295242.100559] </IRQ>
[295242.102750] RIP: 0033:0x7f9022e88cbd
[295242.125246] RSP: 002b:00007f9022282b20 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffda
[295242.132900] RAX: 0000000000000005 RBX: 0000000000000010 RCX: 0000000000000000
[295242.140120] RDX: 00007f9022282ba8 RSI: 00007f9022282a30 RDI: 00007f9014005c30
[295242.147337] RBP: 00007f9014014d60 R08: 0000000000000020 R09: 00007f90254a8340
[295242.154557] R10: 00007f9022282a28 R11: 0000000000000246 R12: 0000000000000000
[295242.161775] R13: 00007f902308c000 R14: 000000000000002b R15: 00007f9022b71f40
Fixes: 3fdbd1ce11e5 ("openvswitch: add ipv6 'set' action")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Link: https://lore.kernel.org/r/20220223163416.24096-1-paulb@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| |\ \ \ \ \ \ \
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
This is fixing up the use without proper initialization in patch 5/5
-o-
Hi,
The following patchset contains Netfilter fixes for net:
1) Missing #ifdef CONFIG_IP6_NF_IPTABLES in recent xt_socket fix.
2) Fix incorrect flow action array size in nf_tables.
3) Unregister flowtable hooks from netns exit path.
4) Fix missing limit object release, from Florian Westphal.
5) Memleak in nf_tables object update path, also from Florian.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |_|_|_|/ /
| |/| | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
immediate verdict expression needs to allocate one slot in the flow offload
action array, however, immediate data expression does not need to do so.
fwd and dup expression need to allocate one slot, this is missing.
Add a new offload_action interface to report if this expression needs to
allocate one slot in the flow offload action array.
Fixes: be2861dc36d7 ("netfilter: nft_{fwd,dup}_netdev: add offload support")
Reported-and-tested-by: Nick Gregory <Nick.Gregory@Sophos.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| | | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
All functions defined as static inline in net/checksum.h are
meant to be inlined for performance reason.
But since commit ac7c3e4ff401 ("compiler: enable
CONFIG_OPTIMIZE_INLINING forcibly") the compiler is allowed to
uninline functions when it wants.
Fair enough in the general case, but for tiny performance critical
checksum helpers that's counter-productive.
The problem mainly arises when selecting CONFIG_CC_OPTIMISE_FOR_SIZE,
Those helpers being 'static inline' in header files you suddenly find
them duplicated many times in the resulting vmlinux.
Here is a typical exemple when building powerpc pmac32_defconfig
with CONFIG_CC_OPTIMISE_FOR_SIZE. csum_sub() appears 4 times:
c04a23cc <csum_sub>:
c04a23cc: 7c 84 20 f8 not r4,r4
c04a23d0: 7c 63 20 14 addc r3,r3,r4
c04a23d4: 7c 63 01 94 addze r3,r3
c04a23d8: 4e 80 00 20 blr
...
c04a2ce8: 4b ff f6 e5 bl c04a23cc <csum_sub>
...
c04a2d2c: 4b ff f6 a1 bl c04a23cc <csum_sub>
...
c04a2d54: 4b ff f6 79 bl c04a23cc <csum_sub>
...
c04a754c <csum_sub>:
c04a754c: 7c 84 20 f8 not r4,r4
c04a7550: 7c 63 20 14 addc r3,r3,r4
c04a7554: 7c 63 01 94 addze r3,r3
c04a7558: 4e 80 00 20 blr
...
c04ac930: 4b ff ac 1d bl c04a754c <csum_sub>
...
c04ad264: 4b ff a2 e9 bl c04a754c <csum_sub>
...
c04e3b08 <csum_sub>:
c04e3b08: 7c 84 20 f8 not r4,r4
c04e3b0c: 7c 63 20 14 addc r3,r3,r4
c04e3b10: 7c 63 01 94 addze r3,r3
c04e3b14: 4e 80 00 20 blr
...
c04e5788: 4b ff e3 81 bl c04e3b08 <csum_sub>
...
c04e65c8: 4b ff d5 41 bl c04e3b08 <csum_sub>
...
c0512d34 <csum_sub>:
c0512d34: 7c 84 20 f8 not r4,r4
c0512d38: 7c 63 20 14 addc r3,r3,r4
c0512d3c: 7c 63 01 94 addze r3,r3
c0512d40: 4e 80 00 20 blr
...
c0512dfc: 4b ff ff 39 bl c0512d34 <csum_sub>
...
c05138bc: 4b ff f4 79 bl c0512d34 <csum_sub>
...
Restore the expected behaviour by using __always_inline for all
functions defined in net/checksum.h
vmlinux size is even reduced by 256 bytes with this patch:
text data bss dec hex filename
6980022 2515362 194384 9689768 93daa8 vmlinux.before
6979862 2515266 194384 9689512 93d9a8 vmlinux.now
Fixes: ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING forcibly")
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |_|_|_|_|/
|/| | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
UDP sendmsg() can be lockless, this is causing all kinds
of data races.
This patch converts sk->sk_tskey to remove one of these races.
BUG: KCSAN: data-race in __ip_append_data / __ip_append_data
read to 0xffff8881035d4b6c of 4 bytes by task 8877 on cpu 1:
__ip_append_data+0x1c1/0x1de0 net/ipv4/ip_output.c:994
ip_make_skb+0x13f/0x2d0 net/ipv4/ip_output.c:1636
udp_sendmsg+0x12bd/0x14c0 net/ipv4/udp.c:1249
inet_sendmsg+0x5f/0x80 net/ipv4/af_inet.c:819
sock_sendmsg_nosec net/socket.c:705 [inline]
sock_sendmsg net/socket.c:725 [inline]
____sys_sendmsg+0x39a/0x510 net/socket.c:2413
___sys_sendmsg net/socket.c:2467 [inline]
__sys_sendmmsg+0x267/0x4c0 net/socket.c:2553
__do_sys_sendmmsg net/socket.c:2582 [inline]
__se_sys_sendmmsg net/socket.c:2579 [inline]
__x64_sys_sendmmsg+0x53/0x60 net/socket.c:2579
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
write to 0xffff8881035d4b6c of 4 bytes by task 8880 on cpu 0:
__ip_append_data+0x1d8/0x1de0 net/ipv4/ip_output.c:994
ip_make_skb+0x13f/0x2d0 net/ipv4/ip_output.c:1636
udp_sendmsg+0x12bd/0x14c0 net/ipv4/udp.c:1249
inet_sendmsg+0x5f/0x80 net/ipv4/af_inet.c:819
sock_sendmsg_nosec net/socket.c:705 [inline]
sock_sendmsg net/socket.c:725 [inline]
____sys_sendmsg+0x39a/0x510 net/socket.c:2413
___sys_sendmsg net/socket.c:2467 [inline]
__sys_sendmmsg+0x267/0x4c0 net/socket.c:2553
__do_sys_sendmmsg net/socket.c:2582 [inline]
__se_sys_sendmmsg net/socket.c:2579 [inline]
__x64_sys_sendmmsg+0x53/0x60 net/socket.c:2579
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
value changed: 0x0000054d -> 0x0000054e
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 8880 Comm: syz-executor.5 Not tainted 5.17.0-rc2-syzkaller-00167-gdcb85f85fa6f-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Fixes: 09c2d251b707 ("net-timestamp: add key to disambiguate concurrent datagrams")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Because fib6_info_hw_flags_set() is called without any synchronization,
all accesses to gi6->offload, fi->trap and fi->offload_failed
need some basic protection like READ_ONCE()/WRITE_ONCE().
BUG: KCSAN: data-race in fib6_info_hw_flags_set / fib6_purge_rt
read to 0xffff8881087d5886 of 1 bytes by task 13953 on cpu 0:
fib6_drop_pcpu_from net/ipv6/ip6_fib.c:1007 [inline]
fib6_purge_rt+0x4f/0x580 net/ipv6/ip6_fib.c:1033
fib6_del_route net/ipv6/ip6_fib.c:1983 [inline]
fib6_del+0x696/0x890 net/ipv6/ip6_fib.c:2028
__ip6_del_rt net/ipv6/route.c:3876 [inline]
ip6_del_rt+0x83/0x140 net/ipv6/route.c:3891
__ipv6_dev_ac_dec+0x2b5/0x370 net/ipv6/anycast.c:374
ipv6_dev_ac_dec net/ipv6/anycast.c:387 [inline]
__ipv6_sock_ac_close+0x141/0x200 net/ipv6/anycast.c:207
ipv6_sock_ac_close+0x79/0x90 net/ipv6/anycast.c:220
inet6_release+0x32/0x50 net/ipv6/af_inet6.c:476
__sock_release net/socket.c:650 [inline]
sock_close+0x6c/0x150 net/socket.c:1318
__fput+0x295/0x520 fs/file_table.c:280
____fput+0x11/0x20 fs/file_table.c:313
task_work_run+0x8e/0x110 kernel/task_work.c:164
tracehook_notify_resume include/linux/tracehook.h:189 [inline]
exit_to_user_mode_loop kernel/entry/common.c:175 [inline]
exit_to_user_mode_prepare+0x160/0x190 kernel/entry/common.c:207
__syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:300
do_syscall_64+0x50/0xd0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x44/0xae
write to 0xffff8881087d5886 of 1 bytes by task 1912 on cpu 1:
fib6_info_hw_flags_set+0x155/0x3b0 net/ipv6/route.c:6230
nsim_fib6_rt_hw_flags_set drivers/net/netdevsim/fib.c:668 [inline]
nsim_fib6_rt_add drivers/net/netdevsim/fib.c:691 [inline]
nsim_fib6_rt_insert drivers/net/netdevsim/fib.c:756 [inline]
nsim_fib6_event drivers/net/netdevsim/fib.c:853 [inline]
nsim_fib_event drivers/net/netdevsim/fib.c:886 [inline]
nsim_fib_event_work+0x284f/0x2cf0 drivers/net/netdevsim/fib.c:1477
process_one_work+0x3f6/0x960 kernel/workqueue.c:2307
worker_thread+0x616/0xa70 kernel/workqueue.c:2454
kthread+0x2c7/0x2e0 kernel/kthread.c:327
ret_from_fork+0x1f/0x30
value changed: 0x22 -> 0x2a
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 1912 Comm: kworker/1:3 Not tainted 5.16.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events nsim_fib_event_work
Fixes: 0c5fcf9e249e ("IPv6: Add "offload failed" indication to routes")
Fixes: bb3c4ab93e44 ("ipv6: Add "offload" and "trap" indications to routes")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Amit Cohen <amcohen@nvidia.com>
Cc: Ido Schimmel <idosch@nvidia.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Link: https://lore.kernel.org/r/20220216173217.3792411-2-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| | | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Ipv6 flowlabels historically require a reservation before use.
Optionally in exclusive mode (e.g., user-private).
Commit 59c820b2317f ("ipv6: elide flowlabel check if no exclusive
leases exist") introduced a fastpath that avoids this check when no
exclusive leases exist in the system, and thus any flowlabel use
will be granted.
That allows skipping the control operation to reserve a flowlabel
entirely. Though with a warning if the fast path fails:
This is an optimization. Robust applications still have to revert to
requesting leases if the fast path fails due to an exclusive lease.
Still, this is subtle. Better isolate network namespaces from each
other. Flowlabels are per-netns. Also record per-netns whether
exclusive leases are in use. Then behavior does not change based on
activity in other netns.
Changes
v2
- wrap in IS_ENABLED(CONFIG_IPV6) to avoid breakage if disabled
Fixes: 59c820b2317f ("ipv6: elide flowlabel check if no exclusive leases exist")
Link: https://lore.kernel.org/netdev/MWHPR2201MB1072BCCCFCE779E4094837ACD0329@MWHPR2201MB1072.namprd22.prod.outlook.com/
Reported-by: Congyu Liu <liu3101@purdue.edu>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Tested-by: Congyu Liu <liu3101@purdue.edu>
Link: https://lore.kernel.org/r/20220215160037.1976072-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| | | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
syzbot reported that two threads might write over agg_select_timer
at the same time. Make agg_select_timer atomic to fix the races.
BUG: KCSAN: data-race in bond_3ad_initiate_agg_selection / bond_3ad_state_machine_handler
read to 0xffff8881242aea90 of 4 bytes by task 1846 on cpu 1:
bond_3ad_state_machine_handler+0x99/0x2810 drivers/net/bonding/bond_3ad.c:2317
process_one_work+0x3f6/0x960 kernel/workqueue.c:2307
worker_thread+0x616/0xa70 kernel/workqueue.c:2454
kthread+0x1bf/0x1e0 kernel/kthread.c:377
ret_from_fork+0x1f/0x30
write to 0xffff8881242aea90 of 4 bytes by task 25910 on cpu 0:
bond_3ad_initiate_agg_selection+0x18/0x30 drivers/net/bonding/bond_3ad.c:1998
bond_open+0x658/0x6f0 drivers/net/bonding/bond_main.c:3967
__dev_open+0x274/0x3a0 net/core/dev.c:1407
dev_open+0x54/0x190 net/core/dev.c:1443
bond_enslave+0xcef/0x3000 drivers/net/bonding/bond_main.c:1937
do_set_master net/core/rtnetlink.c:2532 [inline]
do_setlink+0x94f/0x2500 net/core/rtnetlink.c:2736
__rtnl_newlink net/core/rtnetlink.c:3414 [inline]
rtnl_newlink+0xfeb/0x13e0 net/core/rtnetlink.c:3529
rtnetlink_rcv_msg+0x745/0x7e0 net/core/rtnetlink.c:5594
netlink_rcv_skb+0x14e/0x250 net/netlink/af_netlink.c:2494
rtnetlink_rcv+0x18/0x20 net/core/rtnetlink.c:5612
netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
netlink_unicast+0x602/0x6d0 net/netlink/af_netlink.c:1343
netlink_sendmsg+0x728/0x850 net/netlink/af_netlink.c:1919
sock_sendmsg_nosec net/socket.c:705 [inline]
sock_sendmsg net/socket.c:725 [inline]
____sys_sendmsg+0x39a/0x510 net/socket.c:2413
___sys_sendmsg net/socket.c:2467 [inline]
__sys_sendmsg+0x195/0x230 net/socket.c:2496
__do_sys_sendmsg net/socket.c:2505 [inline]
__se_sys_sendmsg net/socket.c:2503 [inline]
__x64_sys_sendmsg+0x42/0x50 net/socket.c:2503
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
value changed: 0x00000050 -> 0x0000004f
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 25910 Comm: syz-executor.1 Tainted: G W 5.17.0-rc4-syzkaller-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
mv88e6xxx is special among DSA drivers in that it requires the VTU to
contain the VID of the FDB entry it modifies in
mv88e6xxx_port_db_load_purge(), otherwise it will return -EOPNOTSUPP.
Sometimes due to races this is not always satisfied even if external
code does everything right (first deletes the FDB entries, then the
VLAN), because DSA commits to hardware FDB entries asynchronously since
commit c9eb3e0f8701 ("net: dsa: Add support for learning FDB through
notification").
Therefore, the mv88e6xxx driver must close this race condition by
itself, by asking DSA to flush the switchdev workqueue of any FDB
deletions in progress, prior to exiting a VLAN.
Fixes: c9eb3e0f8701 ("net: dsa: Add support for learning FDB through notification")
Reported-by: Rafael Richter <rafael.richter@gin.de>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| |/ / / / /
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Some time ago 8965779d2c0e ("ipv6,mcast: always hold idev->lock before mca_lock")
switched ipv6_get_lladdr() to __ipv6_get_lladdr(), which is rcu-unsafe
version. That was OK, because idev->lock was held for these codepaths.
In 88e2ca308094 ("mld: convert ifmcaddr6 to RCU") these external locks were
removed, so we probably need to restore the original rcu-safe call.
Otherwise, we occasionally get a machine crashed/stalled with the following
in dmesg:
[ 3405.966610][T230589] general protection fault, probably for non-canonical address 0xdead00000000008c: 0000 [#1] SMP NOPTI
[ 3405.982083][T230589] CPU: 44 PID: 230589 Comm: kworker/44:3 Tainted: G O 5.15.19-cloudflare-2022.2.1 #1
[ 3405.998061][T230589] Hardware name: SUPA-COOL-SERV
[ 3406.009552][T230589] Workqueue: mld mld_ifc_work
[ 3406.017224][T230589] RIP: 0010:__ipv6_get_lladdr+0x34/0x60
[ 3406.025780][T230589] Code: 57 10 48 83 c7 08 48 89 e5 48 39 d7 74 3e 48 8d 82 38 ff ff ff eb 13 48 8b 90 d0 00 00 00 48 8d 82 38 ff ff ff 48 39 d7 74 22 <66> 83 78 32 20 77 1b 75 e4 89 ca 23 50 2c 75 dd 48 8b 50 08 48 8b
[ 3406.055748][T230589] RSP: 0018:ffff94e4b3fc3d10 EFLAGS: 00010202
[ 3406.065617][T230589] RAX: dead00000000005a RBX: ffff94e4b3fc3d30 RCX: 0000000000000040
[ 3406.077477][T230589] RDX: dead000000000122 RSI: ffff94e4b3fc3d30 RDI: ffff8c3a31431008
[ 3406.089389][T230589] RBP: ffff94e4b3fc3d10 R08: 0000000000000000 R09: 0000000000000000
[ 3406.101445][T230589] R10: ffff8c3a31430000 R11: 000000000000000b R12: ffff8c2c37887100
[ 3406.113553][T230589] R13: ffff8c3a39537000 R14: 00000000000005dc R15: ffff8c3a31431000
[ 3406.125730][T230589] FS: 0000000000000000(0000) GS:ffff8c3b9fc80000(0000) knlGS:0000000000000000
[ 3406.138992][T230589] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3406.149895][T230589] CR2: 00007f0dfea1db60 CR3: 000000387b5f2000 CR4: 0000000000350ee0
[ 3406.162421][T230589] Call Trace:
[ 3406.170235][T230589] <TASK>
[ 3406.177736][T230589] mld_newpack+0xfe/0x1a0
[ 3406.186686][T230589] add_grhead+0x87/0xa0
[ 3406.195498][T230589] add_grec+0x485/0x4e0
[ 3406.204310][T230589] ? newidle_balance+0x126/0x3f0
[ 3406.214024][T230589] mld_ifc_work+0x15d/0x450
[ 3406.223279][T230589] process_one_work+0x1e6/0x380
[ 3406.232982][T230589] worker_thread+0x50/0x3a0
[ 3406.242371][T230589] ? rescuer_thread+0x360/0x360
[ 3406.252175][T230589] kthread+0x127/0x150
[ 3406.261197][T230589] ? set_kthread_struct+0x40/0x40
[ 3406.271287][T230589] ret_from_fork+0x22/0x30
[ 3406.280812][T230589] </TASK>
[ 3406.288937][T230589] Modules linked in: ... [last unloaded: kheaders]
[ 3406.476714][T230589] ---[ end trace 3525a7655f2f3b9e ]---
Fixes: 88e2ca308094 ("mld: convert ifmcaddr6 to RCU")
Reported-by: David Pinilla Caparros <dpini@cloudflare.com>
Signed-off-by: Ignat Korchagin <ignat@cloudflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
When uncloning an skb dst and its associated metadata, a new
dst+metadata is allocated and later replaces the old one in the skb.
This is helpful to have a non-shared dst+metadata attached to a specific
skb.
The issue is the uncloned dst+metadata is initialized with a refcount of
1, which is increased to 2 before attaching it to the skb. When
tun_dst_unclone returns, the dst+metadata is only referenced from a
single place (the skb) while its refcount is 2. Its refcount will never
drop to 0 (when the skb is consumed), leading to a memory leak.
Fix this by removing the call to dst_hold in tun_dst_unclone, as the
dst+metadata refcount is already 1.
Fixes: fc4099f17240 ("openvswitch: Fix egress tunnel info.")
Cc: Pravin B Shelar <pshelar@ovn.org>
Reported-by: Vlad Buslov <vladbu@nvidia.com>
Tested-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |_|_|/
|/| | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
When uncloning an skb dst and its associated metadata a new dst+metadata
is allocated and the tunnel information from the old metadata is copied
over there.
The issue is the tunnel metadata has references to cached dst, which are
copied along the way. When a dst+metadata refcount drops to 0 the
metadata is freed including the cached dst entries. As they are also
referenced in the initial dst+metadata, this ends up in UaFs.
In practice the above did not happen because of another issue, the
dst+metadata was never freed because its refcount never dropped to 0
(this will be fixed in a subsequent patch).
Fix this by initializing the dst cache after copying the tunnel
information from the old metadata to also unshare the dst cache.
Fixes: d71785ffc7e7 ("net: add dst_cache to ovs vxlan lwtunnel")
Cc: Paolo Abeni <pabeni@redhat.com>
Reported-by: Vlad Buslov <vladbu@nvidia.com>
Tested-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The previous commit d01ffb9eee4a ("ax25: add refcount in ax25_dev
to avoid UAF bugs") introduces refcount into ax25_dev, but there
are reference leak paths in ax25_ctl_ioctl(), ax25_fwd_ioctl(),
ax25_rt_add(), ax25_rt_del() and ax25_rt_opt().
This patch uses ax25_dev_put() and adjusts the position of
ax25_addr_ax25dev() to fix reference cout leaks of ax25_dev.
Fixes: d01ffb9eee4a ("ax25: add refcount in ax25_dev to avoid UAF bugs")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20220203150811.42256-1-duoming@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
neigh_managed_work
syzkaller was able to trigger a deadlock for NTF_MANAGED entries [0]:
kworker/0:16/14617 is trying to acquire lock:
ffffffff8d4dd370 (&tbl->lock){++-.}-{2:2}, at: ___neigh_create+0x9e1/0x2990 net/core/neighbour.c:652
[...]
but task is already holding lock:
ffffffff8d4dd370 (&tbl->lock){++-.}-{2:2}, at: neigh_managed_work+0x35/0x250 net/core/neighbour.c:1572
The neighbor entry turned to NUD_FAILED state, where __neigh_event_send()
triggered an immediate probe as per commit cd28ca0a3dd1 ("neigh: reduce
arp latency") via neigh_probe() given table lock was held.
One option to fix this situation is to defer the neigh_probe() back to
the neigh_timer_handler() similarly as pre cd28ca0a3dd1. For the case
of NTF_MANAGED, this deferral is acceptable given this only happens on
actual failure state and regular / expected state is NUD_VALID with the
entry already present.
The fix adds a parameter to __neigh_event_send() in order to communicate
whether immediate probe is allowed or disallowed. Existing call-sites
of neigh_event_send() default as-is to immediate probe. However, the
neigh_managed_work() disables it via use of neigh_event_send_probe().
[0] <TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_deadlock_bug kernel/locking/lockdep.c:2956 [inline]
check_deadlock kernel/locking/lockdep.c:2999 [inline]
validate_chain kernel/locking/lockdep.c:3788 [inline]
__lock_acquire.cold+0x149/0x3ab kernel/locking/lockdep.c:5027
lock_acquire kernel/locking/lockdep.c:5639 [inline]
lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5604
__raw_write_lock_bh include/linux/rwlock_api_smp.h:202 [inline]
_raw_write_lock_bh+0x2f/0x40 kernel/locking/spinlock.c:334
___neigh_create+0x9e1/0x2990 net/core/neighbour.c:652
ip6_finish_output2+0x1070/0x14f0 net/ipv6/ip6_output.c:123
__ip6_finish_output net/ipv6/ip6_output.c:191 [inline]
__ip6_finish_output+0x61e/0xe90 net/ipv6/ip6_output.c:170
ip6_finish_output+0x32/0x200 net/ipv6/ip6_output.c:201
NF_HOOK_COND include/linux/netfilter.h:296 [inline]
ip6_output+0x1e4/0x530 net/ipv6/ip6_output.c:224
dst_output include/net/dst.h:451 [inline]
NF_HOOK include/linux/netfilter.h:307 [inline]
ndisc_send_skb+0xa99/0x17f0 net/ipv6/ndisc.c:508
ndisc_send_ns+0x3a9/0x840 net/ipv6/ndisc.c:650
ndisc_solicit+0x2cd/0x4f0 net/ipv6/ndisc.c:742
neigh_probe+0xc2/0x110 net/core/neighbour.c:1040
__neigh_event_send+0x37d/0x1570 net/core/neighbour.c:1201
neigh_event_send include/net/neighbour.h:470 [inline]
neigh_managed_work+0x162/0x250 net/core/neighbour.c:1574
process_one_work+0x9ac/0x1650 kernel/workqueue.c:2307
worker_thread+0x657/0x1110 kernel/workqueue.c:2454
kthread+0x2e9/0x3a0 kernel/kthread.c:377
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
</TASK>
Fixes: 7482e3841d52 ("net, neigh: Add NTF_MANAGED flag for managed neighbor entries")
Reported-by: syzbot+5239d0e1778a500d477a@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Roopa Prabhu <roopa@nvidia.com>
Tested-by: syzbot+5239d0e1778a500d477a@syzkaller.appspotmail.com
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220201193942.5055-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| | |/ /
|/| |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
If we dereference ax25_dev after we call kfree(ax25_dev) in
ax25_dev_device_down(), it will lead to concurrency UAF bugs.
There are eight syscall functions suffer from UAF bugs, include
ax25_bind(), ax25_release(), ax25_connect(), ax25_ioctl(),
ax25_getname(), ax25_sendmsg(), ax25_getsockopt() and
ax25_info_show().
One of the concurrency UAF can be shown as below:
(USE) | (FREE)
| ax25_device_event
| ax25_dev_device_down
ax25_bind | ...
... | kfree(ax25_dev)
ax25_fillin_cb() | ...
ax25_fillin_cb_from_dev() |
... |
The root cause of UAF bugs is that kfree(ax25_dev) in
ax25_dev_device_down() is not protected by any locks.
When ax25_dev, which there are still pointers point to,
is released, the concurrency UAF bug will happen.
This patch introduces refcount into ax25_dev in order to
guarantee that there are no pointers point to it when ax25_dev
is released.
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
./include/net/route.h:373:48: warning: incorrect type in argument 2 (different base types)
./include/net/route.h:373:48: expected unsigned int [usertype] key
./include/net/route.h:373:48: got restricted __be32 [usertype] daddr
Fixes: 5c9f7c1dfc2e ("ipv4: Add helpers for neigh lookup for nexthop")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220127013404.1279313-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
ip_select_ident_segs() has been very conservative about using
the connected socket private generator only for packets with IP_DF
set, claiming it was needed for some VJ compression implementations.
As mentioned in this referenced document, this can be abused.
(Ref: Off-Path TCP Exploits of the Mixed IPID Assignment)
Before switching to pure random IPID generation and possibly hurt
some workloads, lets use the private inet socket generator.
Not only this will remove one vulnerability, this will also
improve performance of TCP flows using pmtudisc==IP_PMTUDISC_DONT
Fixes: 73f156a6e8c1 ("inetpeer: get rid of ip_id_count")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reported-by: Ray Che <xijiache@gmail.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| |/ /
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This reverts commit b75326c201242de9495ff98e5d5cff41d7fc0d9d.
This commit breaks Linux compatibility with USGv6 tests. The RFC this
commit was based on is actually an expired draft: no published RFC
currently allows the new behaviour it introduced.
Without full IETF endorsement, the flash renumbering scenario this
patch was supposed to enable is never going to work, as other IPv6
equipements on the same LAN will keep the 2 hours limit.
Fixes: b75326c20124 ("ipv6: Honor all IPv6 PIO Valid Lifetime values")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
bond_option_active_slave_get_rcu() should not be used in rtnl_mutex as it
use rcu_dereference(). Replace to rcu_dereference_rtnl() so we also can use
this function in rtnl protected context.
With this update, we can rmeove the rcu_read_lock/unlock in
bonding .ndo_eth_ioctl and .get_ts_info.
Reported-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Fixes: 94dd016ae538 ("bond: pass get_ts_info and SIOC[SG]HWTSTAMP ioctl to active device")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
struct fib6_node's fn_sernum field can be
read while other threads change it.
Add READ_ONCE()/WRITE_ONCE() annotations.
Do not change existing smp barriers in fib6_get_cookie_safe()
and __fib6_update_sernum_upto_root()
syzbot reported:
BUG: KCSAN: data-race in fib6_clean_node / inet6_csk_route_socket
write to 0xffff88813df62e2c of 4 bytes by task 1920 on cpu 1:
fib6_clean_node+0xc2/0x260 net/ipv6/ip6_fib.c:2178
fib6_walk_continue+0x38e/0x430 net/ipv6/ip6_fib.c:2112
fib6_walk net/ipv6/ip6_fib.c:2160 [inline]
fib6_clean_tree net/ipv6/ip6_fib.c:2240 [inline]
__fib6_clean_all+0x1a9/0x2e0 net/ipv6/ip6_fib.c:2256
fib6_flush_trees+0x6c/0x80 net/ipv6/ip6_fib.c:2281
rt_genid_bump_ipv6 include/net/net_namespace.h:488 [inline]
addrconf_dad_completed+0x57f/0x870 net/ipv6/addrconf.c:4230
addrconf_dad_work+0x908/0x1170
process_one_work+0x3f6/0x960 kernel/workqueue.c:2307
worker_thread+0x616/0xa70 kernel/workqueue.c:2454
kthread+0x1bf/0x1e0 kernel/kthread.c:359
ret_from_fork+0x1f/0x30
read to 0xffff88813df62e2c of 4 bytes by task 15701 on cpu 0:
fib6_get_cookie_safe include/net/ip6_fib.h:285 [inline]
rt6_get_cookie include/net/ip6_fib.h:306 [inline]
ip6_dst_store include/net/ip6_route.h:234 [inline]
inet6_csk_route_socket+0x352/0x3c0 net/ipv6/inet6_connection_sock.c:109
inet6_csk_xmit+0x91/0x1e0 net/ipv6/inet6_connection_sock.c:121
__tcp_transmit_skb+0x1323/0x1840 net/ipv4/tcp_output.c:1402
tcp_transmit_skb net/ipv4/tcp_output.c:1420 [inline]
tcp_write_xmit+0x1450/0x4460 net/ipv4/tcp_output.c:2680
__tcp_push_pending_frames+0x68/0x1c0 net/ipv4/tcp_output.c:2864
tcp_push+0x2d9/0x2f0 net/ipv4/tcp.c:725
mptcp_push_release net/mptcp/protocol.c:1491 [inline]
__mptcp_push_pending+0x46c/0x490 net/mptcp/protocol.c:1578
mptcp_sendmsg+0x9ec/0xa50 net/mptcp/protocol.c:1764
inet6_sendmsg+0x5f/0x80 net/ipv6/af_inet6.c:643
sock_sendmsg_nosec net/socket.c:705 [inline]
sock_sendmsg net/socket.c:725 [inline]
kernel_sendmsg+0x97/0xd0 net/socket.c:745
sock_no_sendpage+0x84/0xb0 net/core/sock.c:3086
inet_sendpage+0x9d/0xc0 net/ipv4/af_inet.c:834
kernel_sendpage+0x187/0x200 net/socket.c:3492
sock_sendpage+0x5a/0x70 net/socket.c:1007
pipe_to_sendpage+0x128/0x160 fs/splice.c:364
splice_from_pipe_feed fs/splice.c:418 [inline]
__splice_from_pipe+0x207/0x500 fs/splice.c:562
splice_from_pipe fs/splice.c:597 [inline]
generic_splice_sendpage+0x94/0xd0 fs/splice.c:746
do_splice_from fs/splice.c:767 [inline]
direct_splice_actor+0x80/0xa0 fs/splice.c:936
splice_direct_to_actor+0x345/0x650 fs/splice.c:891
do_splice_direct+0x106/0x190 fs/splice.c:979
do_sendfile+0x675/0xc40 fs/read_write.c:1245
__do_sys_sendfile64 fs/read_write.c:1310 [inline]
__se_sys_sendfile64 fs/read_write.c:1296 [inline]
__x64_sys_sendfile64+0x102/0x140 fs/read_write.c:1296
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
value changed: 0x0000026f -> 0x00000271
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 15701 Comm: syz-executor.2 Not tainted 5.16.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
The Fixes tag I chose is probably arbitrary, I do not think
we need to backport this patch to older kernels.
Fixes: c5cff8561d2d ("ipv6: add rcu grace period before freeing fib6_node")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Link: https://lore.kernel.org/r/20220120174112.1126644-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When compiling the kernel with CONFIG_INET disabled, the
sk_defer_free_flush() should be defined as a nop.
This resolves the following compilation error:
ld: net/core/sock.o: in function `sk_defer_free_flush':
./include/net/tcp.h:1378: undefined reference to `__sk_defer_free_flush'
Fixes: 79074a72d335 ("net: Flush deferred skb free on socket destroy")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220120123440.9088-1-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| |\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from netfilter, bpf.
Quite a handful of old regression fixes but most of those are
pre-5.16.
Current release - regressions:
- fix memory leaks in the skb free deferral scheme if upper layer
protocols are used, i.e. in-kernel TCP readers like TLS
Current release - new code bugs:
- nf_tables: fix NULL check typo in _clone() functions
- change the default to y for Vertexcom vendor Kconfig
- a couple of fixes to incorrect uses of ref tracking
- two fixes for constifying netdev->dev_addr
Previous releases - regressions:
- bpf:
- various verifier fixes mainly around register offset handling
when passed to helper functions
- fix mount source displayed for bpffs (none -> bpffs)
- bonding:
- fix extraction of ports for connection hash calculation
- fix bond_xmit_broadcast return value when some devices are down
- phy: marvell: add Marvell specific PHY loopback
- sch_api: don't skip qdisc attach on ingress, prevent ref leak
- htb: restore minimal packet size handling in rate control
- sfp: fix high power modules without diagnostic monitoring
- mscc: ocelot:
- don't let phylink re-enable TX PAUSE on the NPI port
- don't dereference NULL pointers with shared tc filters
- smsc95xx: correct reset handling for LAN9514
- cpsw: avoid alignment faults by taking NET_IP_ALIGN into account
- phy: micrel: use kszphy_suspend/_resume for irq aware devices,
avoid races with the interrupt
Previous releases - always broken:
- xdp: check prog type before updating BPF link
- smc: resolve various races around abnormal connection termination
- sit: allow encapsulated IPv6 traffic to be delivered locally
- axienet: fix init/reset handling, add missing barriers, read the
right status words, stop queues correctly
- add missing dev_put() in sock_timestamping_bind_phc()
Misc:
- ipv4: prevent accidentally passing RTO_ONLINK to
ip_route_output_key_hash() by sanitizing flags
- ipv4: avoid quadratic behavior in netns dismantle
- stmmac: dwmac-oxnas: add support for OX810SE
- fsl: xgmac_mdio: add workaround for erratum A-009885"
* tag 'net-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (92 commits)
ipv4: add net_hash_mix() dispersion to fib_info_laddrhash keys
ipv4: avoid quadratic behavior in netns dismantle
net/fsl: xgmac_mdio: Fix incorrect iounmap when removing module
powerpc/fsl/dts: Enable WA for erratum A-009885 on fman3l MDIO buses
dt-bindings: net: Document fsl,erratum-a009885
net/fsl: xgmac_mdio: Add workaround for erratum A-009885
net: mscc: ocelot: fix using match before it is set
net: phy: micrel: use kszphy_suspend()/kszphy_resume for irq aware devices
net: cpsw: avoid alignment faults by taking NET_IP_ALIGN into account
nfc: llcp: fix NULL error pointer dereference on sendmsg() after failed bind()
net: axienet: increase default TX ring size to 128
net: axienet: fix for TX busy handling
net: axienet: fix number of TX ring slots for available check
net: axienet: Fix TX ring slot available check
net: axienet: limit minimum TX ring size
net: axienet: add missing memory barriers
net: axienet: reset core on initialization prior to MDIO access
net: axienet: Wait for PhyRstCmplt after core reset
net: axienet: increase reset timeout
bpf, selftests: Add ringbuf memory type confusion test
...
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
commit 56b765b79e9a ("htb: improved accuracy at high rates") broke
"overhead X", "linklayer atm" and "mpu X" attributes.
"overhead X" and "linklayer atm" have already been fixed. This restores
the "mpu X" handling, as might be used by DOCSIS or Ethernet shaping:
tc class add ... htb rate X overhead 4 mpu 64
The code being fixed is used by htb, tbf and act_police. Cake has its
own mpu handling. qdisc_calculate_pkt_len still uses the size table
containing values adjusted for mpu by user space.
iproute2 tc has always passed mpu into the kernel via a tc_ratespec
structure, but the kernel never directly acted on it, merely stored it
so that it could be read back by `tc class show`.
Rather, tc would generate length-to-time tables that included the mpu
(and linklayer) in their construction, and the kernel used those tables.
Since v3.7, the tables were no longer used. Along with "mpu", this also
broke "overhead" and "linklayer" which were fixed in 01cb71d2d47b
("net_sched: restore "overhead xxx" handling", v3.10) and 8a8e3d84b171
("net_sched: restore "linklayer atm" handling", v3.11).
"overhead" was fixed by simply restoring use of tc_ratespec::overhead -
this had originally been used by the kernel but was initially omitted
from the new non-table-based calculations.
"linklayer" had been handled in the table like "mpu", but the mode was
not originally passed in tc_ratespec. The new implementation was made to
handle it by getting new versions of tc to pass the mode in an extended
tc_ratespec, and for older versions of tc the table contents were analysed
at load time to deduce linklayer.
As "mpu" has always been given to the kernel in tc_ratespec,
accompanying the mpu-based table, we can restore system functionality
with no userspace change by making the kernel act on the tc_ratespec
value.
Fixes: 56b765b79e9a ("htb: improved accuracy at high rates")
Signed-off-by: Kevin Bracey <kevin@bracey.fi>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: Vimalkumar <j.vimal@gmail.com>
Link: https://lore.kernel.org/r/20220112170210.1014351-1-kevin@bracey.fi
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Both fields can be read/written without synchronization,
add proper accessors and documentation.
Fixes: d5dd88794a13 ("inet: fix various use-after-free in defrags units")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
While struct tcf_exts has a net pointer, it is not refcounted
until tcf_exts_get_net() is called.
Fixes: dbdcda634ce3 ("net: sched: add netns refcount tracker to struct tcf_exts")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Link: https://lore.kernel.org/r/20220110094750.236478-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| |\ \
| |/
|/|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Pull 9p updates from Dominique Martinet:
"Fixes, split 9p_net_fd, and new reviewer:
- fix possible uninitialized memory usage for setattr
- fix fscache reading hole in a file just after it's been grown
- split net/9p/trans_fd.c in its own module like other transports.
The new transport module defaults to 9P_NET and is autoloaded if
required so users should not be impacted
- add Christian Schoenebeck to 9p reviewers
- some more trivial cleanup"
* tag '9p-for-5.17-rc1' of git://github.com/martinetd/linux:
9p: fix enodata when reading growing file
net/9p: show error message if user 'msize' cannot be satisfied
MAINTAINERS: 9p: add Christian Schoenebeck as reviewer
9p: only copy valid iattrs in 9P2000.L setattr implementation
9p: Use BUG_ON instead of if condition followed by BUG.
net/p9: load default transports
9p/xen: autoload when xenbus service is available
9p/trans_fd: split into dedicated module
fs: 9p: remove unneeded variable
9p/trans_virtio: Fix typo in the comment for p9_virtio_create()
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Now that all transports are split into modules it may happen that no
transports are registered when v9fs_get_default_trans() is called.
When that is the case try to load more transports from modules.
Link: https://lkml.kernel.org/r/20211103193823.111007-5-linux@weissschuh.net
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
[Dominique: constify v9fs_get_trans_by_name argument as per patch1v2]
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
|