This reverts commit 0ba9348532 which is
commit 8e56b063c8 upstream.
It breaks the Android ABI, so revert it for now. If it is needed in the
future, it can be brought back in an ABI-safe way.
Bug: 161946584
Change-Id: Ia03ea49365e6ce063194738b22f77d2a403ea3a4
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Changes in 5.10.198
NFS: Use the correct commit info in nfs_join_page_group()
NFS/pNFS: Report EINVAL errors from connect() to the server
SUNRPC: Mark the cred for revalidation if the server rejects it
tracing: Increase trace array ref count on enable and filter files
ata: ahci: Drop pointless VPRINTK() calls and convert the remaining ones
ata: libahci: clear pending interrupt status
ext4: remove the 'group' parameter of ext4_trim_extent
ext4: add new helper interface ext4_try_to_trim_range()
ext4: scope ret locally in ext4_try_to_trim_range()
ext4: change s_last_trim_minblks type to unsigned long
ext4: mark group as trimmed only if it was fully scanned
ext4: replace the traditional ternary conditional operator with with max()/min()
ext4: move setting of trimmed bit into ext4_try_to_trim_range()
ext4: do not let fstrim block system suspend
tracing: Have event inject files inc the trace array ref count
netfilter: nf_tables: integrate pipapo into commit protocol
netfilter: nf_tables: don't skip expired elements during walk
netfilter: nf_tables: GC transaction API to avoid race with control plane
netfilter: nf_tables: adapt set backend to use GC transaction API
netfilter: nft_set_hash: mark set element as dead when deleting from packet path
netfilter: nf_tables: remove busy mark and gc batch API
netfilter: nf_tables: don't fail inserts if duplicate has expired
netfilter: nf_tables: fix GC transaction races with netns and netlink event exit path
netfilter: nf_tables: GC transaction race with netns dismantle
netfilter: nf_tables: GC transaction race with abort path
netfilter: nf_tables: use correct lock to protect gc_list
netfilter: nf_tables: defer gc run if previous batch is still pending
netfilter: nft_set_rbtree: skip sync GC for new elements in this transaction
netfilter: nft_set_rbtree: use read spinlock to avoid datapath contention
netfilter: nft_set_pipapo: stop GC iteration if GC transaction allocation fails
netfilter: nft_set_hash: try later when GC hits EAGAIN on iteration
netfilter: nf_tables: fix memleak when more than 255 elements expired
ASoC: meson: spdifin: start hw on dai probe
netfilter: nf_tables: disallow element removal on anonymous sets
bpf: Avoid deadlock when using queue and stack maps from NMI
selftests/tls: Add {} to avoid static checker warning
selftests: tls: swap the TX and RX sockets in some tests
ASoC: imx-audmix: Fix return error with devm_clk_get()
i40e: Fix VF VLAN offloading when port VLAN is configured
ipv4: fix null-deref in ipv4_link_failure
powerpc/perf/hv-24x7: Update domain value check
dccp: fix dccp_v4_err()/dccp_v6_err() again
platform/x86: intel_scu_ipc: Check status after timeout in busy_loop()
platform/x86: intel_scu_ipc: Check status upon timeout in ipc_wait_for_interrupt()
platform/x86: intel_scu_ipc: Don't override scu in intel_scu_ipc_dev_simple_command()
platform/x86: intel_scu_ipc: Fail IPC send if still busy
x86/srso: Fix srso_show_state() side effect
x86/srso: Fix SBPB enablement for spec_rstack_overflow=off
net: hns3: only enable unicast promisc when mac table full
net: hns3: add 5ms delay before clear firmware reset irq source
net: bridge: use DEV_STATS_INC()
team: fix null-ptr-deref when team device type is changed
netfilter: ipset: Fix race between IPSET_CMD_CREATE and IPSET_CMD_SWAP
seqlock: avoid -Wshadow warnings
seqlock: Rename __seqprop() users
seqlock: Prefix internal seqcount_t-only macros with a "do_"
locking/seqlock: Do the lockdep annotation before locking in do_write_seqcount_begin_nested()
bnxt_en: Flush XDP for bnxt_poll_nitroa0()'s NAPI
net: rds: Fix possible NULL-pointer dereference
gpio: tb10x: Fix an error handling path in tb10x_gpio_probe()
i2c: mux: demux-pinctrl: check the return value of devm_kstrdup()
netfilter: nf_tables: unregister flowtable hooks on netns exit
netfilter: nf_tables: double hook unregistration in netns path
Input: i8042 - rename i8042-x86ia64io.h to i8042-acpipnpio.h
Input: i8042 - add quirk for TUXEDO Gemini 17 Gen1/Clevo PD70PN
mmc: renesas_sdhi: probe into TMIO after SCC parameters have been setup
mmc: renesas_sdhi: populate SCC pointer at the proper place
mmc: tmio: support custom irq masks
mmc: renesas_sdhi: register irqs before registering controller
media: venus: core: Add io base variables for each block
media: venus: hfi,pm,firmware: Convert to block relative addressing
media: venus: hfi: Define additional 6xx registers
media: venus: core: Add differentiator IS_V6(core)
media: venus: hfi: Add a 6xx boot logic
media: venus: hfi_venus: Write to VIDC_CTRL_INIT after unmasking interrupts
netfilter: use actual socket sk for REJECT action
netfilter: nft_exthdr: Support SCTP chunks
netfilter: nf_tables: add and use nft_sk helper
netfilter: nf_tables: add and use nft_thoff helper
netfilter: nft_exthdr: break evaluation if setting TCP option fails
netfilter: exthdr: add support for tcp option removal
netfilter: nft_exthdr: Fix non-linear header modification
ata: libata: Rename link flag ATA_LFLAG_NO_DB_DELAY
ata: ahci: Add support for AMD A85 FCH (Hudson D4)
ata: ahci: Rename board_ahci_mobile
ata: ahci: Add Elkhart Lake AHCI controller
btrfs: reset destination buffer when read_extent_buffer() gets invalid range
MIPS: Alchemy: only build mmc support helpers if au1xmmc is enabled
bus: ti-sysc: Use fsleep() instead of usleep_range() in sysc_reset()
bus: ti-sysc: Fix missing AM35xx SoC matching
clk: tegra: fix error return case for recalc_rate
ARM: dts: omap: correct indentation
ARM: dts: ti: omap: Fix bandgap thermal cells addressing for omap3/4
ARM: dts: motorola-mapphone: Configure lower temperature passive cooling
ARM: dts: motorola-mapphone: Add 1.2GHz OPP
ARM: dts: motorola-mapphone: Drop second ti,wlcore compatible value
ARM: dts: am335x: Guardian: Update beeper label
ARM: dts: Unify pwm-omap-dmtimer node names
ARM: dts: ti: omap: motorola-mapphone: Fix abe_clkctrl warning on boot
bus: ti-sysc: Fix SYSC_QUIRK_SWSUP_SIDLE_ACT handling for uart wake-up
power: supply: ucs1002: fix error code in ucs1002_get_property()
xtensa: add default definition for XCHAL_HAVE_DIV32
xtensa: iss/network: make functions static
xtensa: boot: don't add include-dirs
xtensa: boot/lib: fix function prototypes
gpio: pmic-eic-sprd: Add can_sleep flag for PMIC EIC chip
i2c: npcm7xx: Fix callback completion ordering
dma-debug: don't call __dma_entry_alloc_check_leak() under free_entries_lock
parisc: sba: Fix compile warning wrt list of SBA devices
parisc: iosapic.c: Fix sparse warnings
parisc: drivers: Fix sparse warning
parisc: irq: Make irq_stack_union static to avoid sparse warning
scsi: qedf: Add synchronization between I/O completions and abort
selftests/ftrace: Correctly enable event in instance-event.tc
ring-buffer: Avoid softlockup in ring_buffer_resize()
selftests: fix dependency checker script
ring-buffer: Do not attempt to read past "commit"
platform/mellanox: mlxbf-bootctl: add NET dependency into Kconfig
scsi: pm80xx: Use phy-specific SAS address when sending PHY_START command
scsi: pm80xx: Avoid leaking tags when processing OPC_INB_SET_CONTROLLER_CONFIG command
ata: libata-eh: do not clear ATA_PFLAG_EH_PENDING in ata_eh_reset()
spi: nxp-fspi: reset the FLSHxCR1 registers
bpf: Clarify error expectations from bpf_clone_redirect
media: vb2: frame_vector.c: replace WARN_ONCE with a comment
powerpc/watchpoints: Disable preemption in thread_change_pc()
ncsi: Propagate carrier gain/loss events to the NCSI controller
fbdev/sh7760fb: Depend on FB=y
perf build: Define YYNOMEM as YYNOABORT for bison < 3.81
sched/cpuacct: Fix user/system in shown cpuacct.usage*
sched/cpuacct: Fix charge percpu cpuusage
sched/cpuacct: Optimize away RCU read lock
cgroup: Fix suspicious rcu_dereference_check() usage warning
ACPI: Check StorageD3Enable _DSD property in ACPI code
nvme-pci: factor the iod mempool creation into a helper
nvme-pci: factor out a nvme_pci_alloc_dev helper
nvme-pci: do not set the NUMA node of device if it has none
watchdog: iTCO_wdt: No need to stop the timer in probe
watchdog: iTCO_wdt: Set NO_REBOOT if the watchdog is not already running
netfilter: nft_exthdr: Search chunks in SCTP packets only
netfilter: nft_exthdr: Fix for unsafe packet data read
nvme-pci: always return an ERR_PTR from nvme_pci_alloc_dev
smack: Record transmuting in smk_transmuted
smack: Retrieve transmuting information in smack_inode_getsecurity()
Smack:- Use overlay inode label in smack_inode_copy_up()
Revert "tty: n_gsm: fix UAF in gsm_cleanup_mux"
serial: 8250_port: Check IRQ data before use
nilfs2: fix potential use after free in nilfs_gccache_submit_read_data()
netfilter: nf_tables: disallow rule removal from chain binding
ALSA: hda: Disable power save for solving pop issue on Lenovo ThinkCentre M70q
ata: libata-scsi: ignore reserved bits for REPORT SUPPORTED OPERATION CODES
i2c: i801: unregister tco_pdev in i801_probe() error path
Revert "SUNRPC dont update timeout value on connection reset"
proc: nommu: /proc/<pid>/maps: release mmap read lock
ring-buffer: Update "shortest_full" in polling
btrfs: properly report 0 avail for very full file systems
bpf: Fix BTF_ID symbol generation collision
bpf: Fix BTF_ID symbol generation collision in tools/
net: thunderbolt: Fix TCPv6 GSO checksum calculation
ata: libata-core: Fix ata_port_request_pm() locking
ata: libata-core: Fix port and device removal
ata: libata-core: Do not register PM operations for SAS ports
ata: libata-sata: increase PMP SRST timeout to 10s
fs: binfmt_elf_efpic: fix personality for ELF-FDPIC
spi: spi-zynqmp-gqspi: Fix runtime PM imbalance in zynqmp_qspi_probe
spi: zynqmp-gqspi: fix clock imbalance on probe failure
NFS: Cleanup unused rpc_clnt variable
NFS: rename nfs_client_kset to nfs_kset
NFSv4: Fix a state manager thread deadlock regression
ring-buffer: remove obsolete comment for free_buffer_page()
ring-buffer: Fix bytes info in per_cpu buffer stats
drm/mediatek: Fix backport issue in mtk_drm_gem_prime_vmap()
rbd: move rbd_dev_refresh() definition
rbd: decouple header read-in from updating rbd_dev->header
rbd: decouple parent info read-in from updating rbd_dev
rbd: take header_rwsem in rbd_dev_refresh() only when updating
block: fix use-after-free of q->q_usage_counter
Revert "clk: imx: pll14xx: dynamically configure PLL for 393216000/361267200Hz"
Revert "PCI: qcom: Disable write access to read only registers for IP v2.3.3"
scsi: zfcp: Fix a double put in zfcp_port_enqueue()
qed/red_ll2: Fix undefined behavior bug in struct qed_ll2_info
wifi: mwifiex: Fix tlv_buf_left calculation
net: replace calls to sock->ops->connect() with kernel_connect()
net: prevent rewrite of msg_name in sock_sendmsg()
arm64: Add Cortex-A520 CPU part definition
ubi: Refuse attaching if mtd's erasesize is 0
wifi: iwlwifi: dbg_ini: fix structure packing
wifi: mwifiex: Fix oob check condition in mwifiex_process_rx_packet
bpf: Fix tr dereferencing
drivers/net: process the result of hdlc_open() and add call of hdlc_close() in uhdlc_close()
wifi: mt76: mt76x02: fix MT76x0 external LNA gain handling
regmap: rbtree: Fix wrong register marked as in-cache when creating new node
ima: Finish deprecation of IMA_TRUSTED_KEYRING Kconfig
scsi: target: core: Fix deadlock due to recursive locking
ima: rework CONFIG_IMA dependency block
NFSv4: Fix a nfs4_state_manager() race
modpost: add missing else to the "of" check
net: fix possible store tearing in neigh_periodic_work()
ipv4, ipv6: Fix handling of transhdrlen in __ip{,6}_append_data()
net: dsa: mv88e6xxx: Avoid EEPROM timeout when EEPROM is absent
net: usb: smsc75xx: Fix uninit-value access in __smsc75xx_read_reg
net: nfc: llcp: Add lock when modifying device list
net: ethernet: ti: am65-cpsw: Fix error code in am65_cpsw_nuss_init_tx_chns()
netfilter: handle the connecting collision properly in nf_conntrack_proto_sctp
netfilter: nf_tables: nft_set_rbtree: fix spurious insertion failure
net: stmmac: dwmac-stm32: fix resume on STM32 MCU
tipc: fix a potential deadlock on &tx->lock
tcp: fix quick-ack counting to count actual ACKs of new data
tcp: fix delayed ACKs for MSS boundary condition
sctp: update transport state when processing a dupcook packet
sctp: update hb timer immediately after users change hb_interval
cpupower: add Makefile dependencies for install targets
dm zoned: free dmz->ddev array in dmz_put_zoned_devices
RDMA/core: Require admin capabilities to set system parameters
of: dynamic: Fix potential memory leak in of_changeset_action()
IB/mlx4: Fix the size of a buffer in add_port_entries()
gpio: aspeed: fix the GPIO number passed to pinctrl_gpio_set_config()
gpio: pxa: disable pinctrl calls for MMP_GPIO
RDMA/cma: Initialize ib_sa_multicast structure to 0 when join
RDMA/cma: Fix truncation compilation warning in make_cma_ports
RDMA/uverbs: Fix typo of sizeof argument
RDMA/siw: Fix connection failure handling
RDMA/mlx5: Fix NULL string error
parisc: Restore __ldcw_align for PA-RISC 2.0 processors
netfilter: nf_tables: fix kdoc warnings after gc rework
netfilter: nftables: exthdr: fix 4-byte stack OOB write
mmc: renesas_sdhi: only reset SCC when its pointer is populated
xen/events: replace evtchn_rwlock with RCU
Linux 5.10.198
Change-Id: Iabfdf919ae63e41a565e523087d800ebc20e5448
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[ Upstream commit 7433b6d2af ]
Kyle Zeng reported that there is a race between IPSET_CMD_ADD and IPSET_CMD_SWAP
in netfilter/ip_set, which can lead to the invocation of `__ip_set_put` on a
wrong `set`, triggering the `BUG_ON(set->ref == 0);` check in it.
The race is caused by using the wrong reference counter, i.e. the ref counter instead
of ref_netlink.
Bug: 303172721
Fixes: 24e227896b ("netfilter: ipset: Add schedule point in call_ad().")
Reported-by: Kyle Zeng <zengyhkyle@gmail.com>
Closes: https://lore.kernel.org/netfilter-devel/ZPZqetxOmH+w%2Fmyc@westworld/#r
Tested-by: Kyle Zeng <zengyhkyle@gmail.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit ea5a61d588)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I33a6a6234830c600a4ebd62ed1fee3a48876b98d
[ Upstream commit f4f8a78031 ]
The opt_num field is controlled by user mode and is not currently
validated inside the kernel. An attacker can take advantage of this to
trigger an OOB read and potentially leak information.
BUG: KASAN: slab-out-of-bounds in nf_osf_match_one+0xbed/0xd10 net/netfilter/nfnetlink_osf.c:88
Read of size 2 at addr ffff88804bc64272 by task poc/6431
CPU: 1 PID: 6431 Comm: poc Not tainted 6.0.0-rc4 #1
Call Trace:
nf_osf_match_one+0xbed/0xd10 net/netfilter/nfnetlink_osf.c:88
nf_osf_find+0x186/0x2f0 net/netfilter/nfnetlink_osf.c:281
nft_osf_eval+0x37f/0x590 net/netfilter/nft_osf.c:47
expr_call_ops_eval net/netfilter/nf_tables_core.c:214
nft_do_chain+0x2b0/0x1490 net/netfilter/nf_tables_core.c:264
nft_do_chain_ipv4+0x17c/0x1f0 net/netfilter/nft_chain_filter.c:23
[..]
Also add validation to genre, subtype and version fields.
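The kind of check described above can be sketched in plain userspace C.
This is only an illustration, not the kernel patch: struct fingerprint,
MAX_OPTS, MAX_NAME_LEN and fingerprint_validate() are hypothetical names
standing in for the real nf_osf structures.

```c
#include <errno.h>
#include <string.h>

#define MAX_OPTS     40  /* hypothetical capacity of the option array */
#define MAX_NAME_LEN 32  /* hypothetical size of each string buffer */

/* Hypothetical stand-in for a fingerprint supplied by user mode. */
struct fingerprint {
	unsigned short opt_num;        /* number of valid entries in opt[] */
	unsigned short opt[MAX_OPTS];
	char genre[MAX_NAME_LEN];
	char version[MAX_NAME_LEN];
	char subtype[MAX_NAME_LEN];
};

/* Reject a fingerprint whose fields would make later readers go OOB. */
static int fingerprint_validate(const struct fingerprint *f)
{
	/* A count larger than the array would let a matcher read past opt[]. */
	if (f->opt_num > MAX_OPTS)
		return -EINVAL;

	/* Each string must contain a NUL inside its fixed-size buffer. */
	if (!memchr(f->genre, 0, MAX_NAME_LEN) ||
	    !memchr(f->version, 0, MAX_NAME_LEN) ||
	    !memchr(f->subtype, 0, MAX_NAME_LEN))
		return -EINVAL;

	return 0;
}
```

The point of the sketch is that the check runs once, when the fingerprint
is added, so the packet path can trust the count and the strings.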
Bug: 304913642
Fixes: 11eeef41d5 ("netfilter: passive OS fingerprint xtables match")
Reported-by: Lucas Leong <wmliang@infosec.exchange>
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 7bb8d52b42)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: If79c79e3f55de8c81b70c19661cb0084b02c3da2
commit e994764976 upstream.
sctp_mt_check doesn't validate the flag_count field. An attacker can
take advantage of that to trigger an OOB read and leak memory
information.
Add the field validation in the checkentry function.
Bug: 304913898
Fixes: 2e4e6a17af ("[NETFILTER] x_tables: Abstraction layer for {ip,ip6,arp}_tables")
Cc: stable@vger.kernel.org
Reported-by: Lucas Leong <wmliang@infosec.exchange>
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 4921f9349b)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: Ife4e69f6218fdaca2a8647b5ed00d875a5ed0d34
commit 69c5d284f6 upstream.
The xt_u32 module doesn't validate the fields in the xt_u32 structure.
An attacker may take advantage of this to trigger an OOB read by setting
the size fields to values beyond the array boundaries.
Add a checkentry function to validate the structure.
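A checkentry-style validation of nested size fields might look roughly
like the sketch below. It is a userspace model under assumptions: the
struct layout, the MAX_* limits and u32_rule_check() are hypothetical and
only mirror the idea of bounding every count before the match code
indexes the arrays.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_TESTS  11  /* hypothetical limits, for illustration only */
#define MAX_NUMS   11
#define MAX_VALUES 11

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical model of one test inside a user-supplied rule. */
struct u32_test {
	unsigned int location[MAX_NUMS];
	unsigned int value[MAX_VALUES];
	unsigned char nnums;    /* valid entries in location[] */
	unsigned char nvalues;  /* valid entries in value[] */
};

struct u32_rule {
	struct u32_test tests[MAX_TESTS];
	unsigned char ntests;   /* valid entries in tests[] */
};

/* Validate every size field before any of them is used as a loop bound. */
static int u32_rule_check(const struct u32_rule *rule)
{
	size_t i;

	if (rule->ntests > ARRAY_LEN(rule->tests))
		return -EINVAL;

	for (i = 0; i < rule->ntests; i++) {
		const struct u32_test *t = &rule->tests[i];

		if (t->nnums > ARRAY_LEN(t->location) ||
		    t->nvalues > ARRAY_LEN(t->value))
			return -EINVAL;
	}
	return 0;
}
```

The outer count is checked first so that the loop over the inner counts
cannot itself run past the end of tests[].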
This was originally reported by the ZDI project (ZDI-CAN-18408).
Bug: 304913716
Fixes: 1b50b8a371 ("[NETFILTER]: Add u32 match")
Cc: stable@vger.kernel.org
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 1c164c1e9e)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: Ic2ff70b303f55f9c3c5db24295bcb223ed7175a7
[ Upstream commit f15f29fd47 ]
Chain binding only requires the rule addition/insertion command within
the same transaction. Removal of rules from chain bindings within the
same transaction makes no sense; userspace does not utilize this
feature. Replace the nft_chain_is_bound() check with nft_chain_binding() in
rule deletion commands. The replace command implies a rule deletion, so
reject this command too.
Rule flush command can also safely rely on this nft_chain_binding()
check because unbound chains are not allowed since 62e1e94b24
("netfilter: nf_tables: reject unbound chain set before commit phase").
Bug: 302085977
Fixes: d0e2c7de92 ("netfilter: nf_tables: add NFT_CHAIN_BINDING")
Reported-by: Kevin Rich <kevinrich1337@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 5a03b42ae1)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I8b05dc37062824db4c2901000fdf701b38605d32
commit 1689f25924 upstream.
The overflow checks on the use refcount are incomplete.
Add helper functions to deal with object reference counter tracking.
Report -EMFILE in case UINT_MAX is reached.
nft_use_dec() splats if the reference counter underflows,
which should never happen.
Add nft_use_inc_restore() and nft_use_dec_restore() which are used
to restore reference counter from error and abort paths.
Use u32 in nft_flowtable and nft_object since helper functions cannot
work on bitfields.
Remove the few early incomplete checks now that the helper functions
are in place and used to check for refcount overflow.
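As a rough userspace illustration of the helper pattern described above
(not the kernel code itself): use_inc(), use_dec() and obj_get() are
hypothetical names, and where the kernel helper warns on underflow this
sketch simply reports it through the return value.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * The helpers take a pointer to the counter. C does not allow taking the
 * address of a bitfield, which is why a full 32-bit field is needed.
 */

/* Increment unless the counter is saturated; false means "would overflow". */
static bool use_inc(uint32_t *use)
{
	if (*use == UINT32_MAX)
		return false;
	(*use)++;
	return true;
}

/* Decrement unless the counter is already zero; false means "underflow". */
static bool use_dec(uint32_t *use)
{
	if (*use == 0)
		return false;
	(*use)--;
	return true;
}

/* Hypothetical caller: take a reference or fail with -EMFILE. */
static int obj_get(uint32_t *use)
{
	return use_inc(use) ? 0 : -EMFILE;
}
```

Funnelling every increment through one helper means the overflow check
cannot be forgotten at an individual call site.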
Bug: 302085977
Fixes: 96518518cc ("netfilter: add nftables")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 039ce5eb6b)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I0f2d48b1246de2421edd7d566ae966f02ef63b54
commit fd94d9dade upstream.
If priv->len is a multiple of 4, then dst[len / 4] can write past
the destination array which leads to stack corruption.
This construct is necessary to clean the remainder of the register
in case ->len is NOT a multiple of the register size, so make it
conditional just like nft_payload.c does.
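The conditional clearing can be modelled in userspace as below. This is a
minimal sketch, not the patched function: reg_store() and REG32_SIZE are
illustrative names, and dst is assumed to hold exactly (len + 3) / 4
32-bit words.

```c
#include <stdint.h>
#include <string.h>

#define REG32_SIZE 4u  /* bytes per 32-bit register */

/* Copy len bytes into 32-bit registers; dst holds (len + 3) / 4 words. */
static void reg_store(uint32_t *dst, const uint8_t *src, unsigned int len)
{
	/*
	 * Zero the last, partially filled word so stale bytes are not left
	 * behind. When len is a multiple of 4 there is no partial word and
	 * dst[len / 4] would be one element past the end of dst.
	 */
	if (len % REG32_SIZE)
		dst[len / REG32_SIZE] = 0;

	memcpy(dst, src, len);
}
```

With len == 8 the function touches only dst[0] and dst[1]; an
unconditional dst[len / 4] = 0 would also have written dst[2].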
The bug was added in 4.1 cycle and then copied/inherited when
tcp/sctp and ip option support was added.
Bug reported by Zero Day Initiative project (ZDI-CAN-21950,
ZDI-CAN-21951, ZDI-CAN-21961).
Fixes: 49499c3e6e ("netfilter: nf_tables: switch registers to 32 bit addressing")
Fixes: 935b7f6430 ("netfilter: nft_exthdr: add TCP option matching")
Fixes: 133dc203d7 ("netfilter: nft_exthdr: Support SCTP chunks")
Fixes: dbb5281a1f ("netfilter: nf_tables: add support for matching IPv4 options")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 087388278e ]
nft_rbtree_gc_elem() walks back and removes the end interval element that
comes before the expired element.
There is a small chance that we've cached this element as 'rbe_ge'.
If this happens, we hold and test a pointer that has been queued for
freeing.
It also causes spurious insertion failures:
$ cat test-testcases-sets-0044interval_overlap_0.1/testout.log
Error: Could not process rule: File exists
add element t s { 0 - 2 }
^^^^^^
Failed to insert 0 - 2 given:
table ip t {
set s {
type inet_service
flags interval,timeout
timeout 2s
gc-interval 2s
}
}
The set (rbtree) is empty. The 'failure' doesn't happen on next attempt.
Reason is that when we try to insert, the tree may hold an expired
element that collides with the range we're adding.
While we do evict/erase this element, we can trip over this check:
if (rbe_ge && nft_rbtree_interval_end(rbe_ge) && nft_rbtree_interval_end(new))
return -ENOTEMPTY;
rbe_ge was erased by the synchronous gc, so we should not have done this
check. The next attempt won't find it, so the retry results in a successful
insertion.
Restart in-kernel to avoid such spurious errors.
Such restarts are rare, unless userspace intentionally adds very large
numbers of elements with very short timeouts while setting a huge
gc interval.
Even in this case, this cannot loop forever: on each retry an existing
element has been removed.
As the caller is holding the transaction mutex, it's impossible
for a second entity to add more expiring elements to the tree.
After this it also becomes feasible to remove the async gc worker
and perform all garbage collection from the commit path.
Fixes: c9e6978e27 ("netfilter: nft_set_rbtree: Switch to node list walk for overlap detection")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8e56b063c8 ]
In Scenario A and B below, as the delayed INIT_ACK always changes the peer
vtag, SCTP ct with the incorrect vtag may cause packet loss.
Scenario A: INIT_ACK is delayed until the peer receives its own INIT_ACK
192.168.1.2 > 192.168.1.1: [INIT] [init tag: 1328086772]
192.168.1.1 > 192.168.1.2: [INIT] [init tag: 1414468151]
192.168.1.2 > 192.168.1.1: [INIT ACK] [init tag: 1328086772]
192.168.1.1 > 192.168.1.2: [INIT ACK] [init tag: 1650211246] *
192.168.1.2 > 192.168.1.1: [COOKIE ECHO]
192.168.1.1 > 192.168.1.2: [COOKIE ECHO]
192.168.1.2 > 192.168.1.1: [COOKIE ACK]
Scenario B: INIT_ACK is delayed until the peer completes its own handshake
192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 3922216408]
192.168.1.1 > 192.168.1.2: sctp (1) [INIT] [init tag: 144230885]
192.168.1.2 > 192.168.1.1: sctp (1) [INIT ACK] [init tag: 3922216408]
192.168.1.1 > 192.168.1.2: sctp (1) [COOKIE ECHO]
192.168.1.2 > 192.168.1.1: sctp (1) [COOKIE ACK]
192.168.1.1 > 192.168.1.2: sctp (1) [INIT ACK] [init tag: 3914796021] *
This patch fixes it as below:
In SCTP_CID_INIT processing:
- clear ct->proto.sctp.init[!dir] if ct->proto.sctp.init[dir] &&
ct->proto.sctp.init[!dir]. (Scenario E)
- set ct->proto.sctp.init[dir].
In SCTP_CID_INIT_ACK processing:
- drop it if !ct->proto.sctp.init[!dir] && ct->proto.sctp.vtag[!dir] &&
ct->proto.sctp.vtag[!dir] != ih->init_tag. (Scenario B, Scenario C)
- drop it if ct->proto.sctp.init[dir] && ct->proto.sctp.init[!dir] &&
ct->proto.sctp.vtag[!dir] != ih->init_tag. (Scenario A)
In SCTP_CID_COOKIE_ACK processing:
- clear ct->proto.sctp.init[dir] and ct->proto.sctp.init[!dir].
(Scenario D)
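The three rules above can be restated as a small userspace model. It
covers only the init[] bookkeeping and the INIT_ACK drop decision listed
in this message; struct sctp_ct_model and the function names are
hypothetical, and recording of vtag[] and the rest of the conntrack state
machine are deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced view of the per-connection SCTP state. */
struct sctp_ct_model {
	bool init[2];      /* INIT seen from direction 0 / 1 */
	uint32_t vtag[2];  /* verification tag recorded per direction */
};

/* SCTP_CID_INIT: forget a stale collision, then note INIT for this dir. */
static void on_init(struct sctp_ct_model *ct, int dir)
{
	if (ct->init[dir] && ct->init[!dir])
		ct->init[!dir] = false;
	ct->init[dir] = true;
}

/* SCTP_CID_INIT_ACK: true if the packet should be dropped. */
static bool init_ack_dropped(const struct sctp_ct_model *ct, int dir,
			     uint32_t init_tag)
{
	/* No INIT pending from the other side, tag mismatch (Scenario B, C). */
	if (!ct->init[!dir] && ct->vtag[!dir] && ct->vtag[!dir] != init_tag)
		return true;

	/* INIT collision in progress, tag mismatch (Scenario A). */
	if (ct->init[dir] && ct->init[!dir] && ct->vtag[!dir] != init_tag)
		return true;

	return false;
}

/* SCTP_CID_COOKIE_ACK: handshake done, clear both flags. */
static void on_cookie_ack(struct sctp_ct_model *ct)
{
	ct->init[0] = false;
	ct->init[1] = false;
}
```

Feeding Scenario A through this model, the delayed INIT_ACK carrying tag
1650211246 is dropped because it does not match the tag already recorded
for that direction.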
Also, it's important to allow the ct state to move forward with cookie_echo
and cookie_ack from the opposite dir for the collision scenarios.
There are also other Scenarios where it should allow the packet through,
addressed by the processing above:
Scenario C: new CT is created by INIT_ACK.
Scenario D: start INIT on the existing ESTABLISHED ct.
Scenario E: start INIT after the old collision on the existing ESTABLISHED
ct.
192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 3922216408]
192.168.1.1 > 192.168.1.2: sctp (1) [INIT] [init tag: 144230885]
(both sides are stopped, then start a new connection again hours later)
192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 242308742]
Fixes: 9fb9cbb108 ("[NETFILTER]: Add nf_conntrack subsystem.")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f15f29fd47 ]
Chain binding only requires the rule addition/insertion command within
the same transaction. Removal of rules from chain bindings within the
same transaction makes no sense; userspace does not utilize this
feature. Replace the nft_chain_is_bound() check with nft_chain_binding() in
rule deletion commands. The replace command implies a rule deletion, so
reject this command too.
Rule flush command can also safely rely on this nft_chain_binding()
check because unbound chains are not allowed since 62e1e94b24
("netfilter: nf_tables: reject unbound chain set before commit phase").
Fixes: d0e2c7de92 ("netfilter: nf_tables: add NFT_CHAIN_BINDING")
Reported-by: Kevin Rich <kevinrich1337@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cf6b5ffdce ]
While iterating through an SCTP packet's chunks, skb_header_pointer() is
called for the minimum expected chunk header size. If (that part of) the
skbuff is non-linear, the following memcpy() may read data past
temporary buffer '_sch'. Use skb_copy_bits() instead which does the
right thing in this situation.
Fixes: 133dc203d7 ("netfilter: nft_exthdr: Support SCTP chunks")
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5acc44f394 ]
Since user space does not generate a payload dependency, plain sctp
chunk matches cause searching in non-SCTP packets, too. Avoid this
potential mis-interpretation of packet data by checking pkt->tprot.
Fixes: 133dc203d7 ("netfilter: nft_exthdr: Support SCTP chunks")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 28427f368f ]
Fix skb_ensure_writable() size. Don't use nft_tcp_header_pointer() to
make it explicit that pointers point to the packet (not local buffer).
Fixes: 99d1712bc4 ("netfilter: exthdr: tcp option set support")
Fixes: 7890cbea66 ("netfilter: exthdr: add support for tcp option removal")
Cc: stable@vger.kernel.org
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7890cbea66 ]
This allows replacing a tcp option with nop padding to selectively disable
a particular tcp option.
Optstrip mode is chosen when userspace passes the exthdr expression with
neither a source nor a destination register attribute.
This is identical to xtables TCPOPTSTRIP extension.
The only difference is that TCPOPTSTRIP allows passing in a bitmap
of options to remove rather than a single number.
Unlike TCPOPTSTRIP this expression can be used multiple times
in the same rule to get the same effect.
We could add a new nested attribute later on in case there is a
use case for single-expression-multi-remove.
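The nop-padding idea can be illustrated on a plain byte buffer. This is a
simplified userspace sketch, not the expression's eval function:
strip_option() is a hypothetical name, the buffer is assumed to contain
only the TCP option area, and the checksum fix-up a real packet would
need is omitted.

```c
#include <stddef.h>
#include <stdint.h>

#define OPT_EOL 0  /* end of option list */
#define OPT_NOP 1  /* single-byte padding */

/*
 * Overwrite every option of the given kind with NOP bytes.
 * Returns the number of options stripped, or -1 if the list is malformed.
 */
static int strip_option(uint8_t *opts, size_t len, uint8_t kind)
{
	size_t i = 0;
	int stripped = 0;

	while (i < len) {
		size_t optlen, j;

		if (opts[i] == OPT_EOL)
			break;
		if (opts[i] == OPT_NOP) {  /* single-byte option */
			i++;
			continue;
		}
		if (i + 1 >= len)          /* length byte is missing */
			return -1;
		optlen = opts[i + 1];
		if (optlen < 2 || i + optlen > len)
			return -1;

		if (opts[i] == kind) {
			/* Same size, same offsets: only the content changes. */
			for (j = 0; j < optlen; j++)
				opts[i + j] = OPT_NOP;
			stripped++;
		}
		i += optlen;
	}
	return stripped;
}
```

Removing two different options corresponds to calling the function twice
with two kinds, much like using the expression twice in one rule.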
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Stable-dep-of: 28427f368f ("netfilter: nft_exthdr: Fix non-linear header modification")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 133dc203d7 ]
Chunks are SCTP header extensions similar in implementation to IPv6
extension headers or TCP options. Reusing exthdr expression to find and
extract field values from them is therefore pretty straightforward.
For now, this supports extracting data from chunks at a fixed offset
(and length) only - chunks themselves are an extensible data structure;
in order to make all fields available, a nested extension search is
needed.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Stable-dep-of: 28427f368f ("netfilter: nft_exthdr: Fix non-linear header modification")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 04295878be ]
True to the message of commit v5.10-rc1-105-g46d6c5ae953c, _do_
actually make use of state->sk when possible, such as in the REJECT
modules.
Reported-by: Minqiang Chen <ptpt52@gmail.com>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Stable-dep-of: 28427f368f ("netfilter: nft_exthdr: Fix non-linear header modification")
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit f9a43007d3 upstream.
[ This backport includes ab5e5c062f ("netfilter: nf_tables: use
kfree_rcu(ptr, rcu) to release hooks in clean_net path") ]
__nft_release_hooks() is called from pre_netns exit path which
unregisters the hooks, then the NETDEV_UNREGISTER event is triggered
which unregisters the hooks again.
[ 565.221461] WARNING: CPU: 18 PID: 193 at net/netfilter/core.c:495 __nf_unregister_net_hook+0x247/0x270
[...]
[ 565.246890] CPU: 18 PID: 193 Comm: kworker/u64:1 Tainted: G E 5.18.0-rc7+ #27
[ 565.253682] Workqueue: netns cleanup_net
[ 565.257059] RIP: 0010:__nf_unregister_net_hook+0x247/0x270
[...]
[ 565.297120] Call Trace:
[ 565.300900] <TASK>
[ 565.304683] nf_tables_flowtable_event+0x16a/0x220 [nf_tables]
[ 565.308518] raw_notifier_call_chain+0x63/0x80
[ 565.312386] unregister_netdevice_many+0x54f/0xb50
Unregister and destroy the netdev hook from netns pre_exit via kfree_rcu
so the NETDEV_UNREGISTER path sees unregistered hooks.
Fixes: 767d1216bf ("netfilter: nftables: fix possible UAF over chains from packet path in netns")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 6069da443b upstream.
Unregister flowtable hooks before they are released via
nf_tables_flowtable_destroy(), otherwise hook core reports UAF.
BUG: KASAN: use-after-free in nf_hook_entries_grow+0x5a7/0x700 net/netfilter/core.c:142 net/netfilter/core.c:142
Read of size 4 at addr ffff8880736f7438 by task syz-executor579/3666
CPU: 0 PID: 3666 Comm: syz-executor579 Not tainted 5.16.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
__dump_stack lib/dump_stack.c:88 [inline] lib/dump_stack.c:106
dump_stack_lvl+0x1dc/0x2d8 lib/dump_stack.c:106 lib/dump_stack.c:106
print_address_description+0x65/0x380 mm/kasan/report.c:247 mm/kasan/report.c:247
__kasan_report mm/kasan/report.c:433 [inline]
__kasan_report mm/kasan/report.c:433 [inline] mm/kasan/report.c:450
kasan_report+0x19a/0x1f0 mm/kasan/report.c:450 mm/kasan/report.c:450
nf_hook_entries_grow+0x5a7/0x700 net/netfilter/core.c:142 net/netfilter/core.c:142
__nf_register_net_hook+0x27e/0x8d0 net/netfilter/core.c:429 net/netfilter/core.c:429
nf_register_net_hook+0xaa/0x180 net/netfilter/core.c:571 net/netfilter/core.c:571
nft_register_flowtable_net_hooks+0x3c5/0x730 net/netfilter/nf_tables_api.c:7232 net/netfilter/nf_tables_api.c:7232
nf_tables_newflowtable+0x2022/0x2cf0 net/netfilter/nf_tables_api.c:7430 net/netfilter/nf_tables_api.c:7430
nfnetlink_rcv_batch net/netfilter/nfnetlink.c:513 [inline]
nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:634 [inline]
nfnetlink_rcv_batch net/netfilter/nfnetlink.c:513 [inline] net/netfilter/nfnetlink.c:652
nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:634 [inline] net/netfilter/nfnetlink.c:652
nfnetlink_rcv+0x10e6/0x2550 net/netfilter/nfnetlink.c:652 net/netfilter/nfnetlink.c:652
__nft_release_hook() calls nft_unregister_flowtable_net_hooks() which
only unregisters the hooks, then after RCU grace period, it is
guaranteed that no packets add new entries to the flowtable (no flow
offload rules and flowtable hooks are reachable from packet path), so it
is safe to call nf_flow_table_free() which cleans up the remaining
entries from the flowtable (both software and hardware) and it unbinds
the flow_block.
Fixes: ff4bf2f42a ("netfilter: nf_tables: add nft_unregister_flowtable_hook()")
Reported-by: syzbot+e918523f77e62790d6d9@syzkaller.appspotmail.com
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7433b6d2af ]
Kyle Zeng reported that there is a race between IPSET_CMD_ADD and IPSET_CMD_SWAP
in netfilter/ip_set, which can lead to the invocation of `__ip_set_put` on a
wrong `set`, triggering the `BUG_ON(set->ref == 0);` check in it.
The race is caused by using the wrong reference counter, i.e. the ref counter instead
of ref_netlink.
Fixes: 24e227896b ("netfilter: ipset: Add schedule point in call_ad().")
Reported-by: Kyle Zeng <zengyhkyle@gmail.com>
Closes: https://lore.kernel.org/netfilter-devel/ZPZqetxOmH+w%2Fmyc@westworld/#r
Tested-by: Kyle Zeng <zengyhkyle@gmail.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 23a3bfd4ba ]
Anonymous sets need to be populated once at creation and then they are
bound to rule since 938154b93b ("netfilter: nf_tables: reject unbound
anonymous set before commit phase"), otherwise transaction reports
EINVAL.
Userspace does not need to delete elements of anonymous sets that are
not yet bound, reject this with EOPNOTSUPP.
From flush command path, skip anonymous sets, they are expected to be
bound already. Otherwise, EINVAL is hit at the end of this transaction
for unbound sets.
Fixes: 96518518cc ("netfilter: add nftables")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit cf5000a778 upstream.
When more than 255 elements expired we're supposed to switch to a new gc
container structure.
This never happens: u8 type will wrap before reaching the boundary
and nft_trans_gc_space() always returns true.
This means we recycle the initial gc container structure and
lose track of the elements that came before.
While at it, don't deref 'gc' after we've passed it to call_rcu.
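The wrap-around described above can be reproduced outside the kernel. This is
a minimal userspace sketch, assuming a container capacity of 256 and a free
space check computed as capacity minus count. BATCH_CAPACITY, space_u8(),
space_wide() and the full_events_*() helpers are hypothetical names for
illustration, not the kernel's identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical capacity: a new container is needed once 255 is exceeded. */
#define BATCH_CAPACITY 256

/* Free slots as seen through an 8-bit counter: never below 1. */
int space_u8(uint8_t count)
{
	return BATCH_CAPACITY - count;
}

/* Free slots with a counter wide enough to reach the capacity. */
int space_wide(unsigned int count)
{
	return BATCH_CAPACITY - (int)count;
}

/* Add n elements; return how often the container was reported full. */
int full_events_u8(int n)
{
	uint8_t count = 0;
	int full = 0;

	assert(n >= 0);
	for (int i = 0; i < n; i++) {
		if (space_u8(count) == 0) {	/* never true with a u8 */
			full++;
			count = 0;
		}
		count++;	/* wraps from 255 back to 0 */
	}
	return full;
}

/* Same loop with the wide counter: the container fills up as intended. */
int full_events_wide(int n)
{
	unsigned int count = 0;
	int full = 0;

	assert(n >= 0);
	for (int i = 0; i < n; i++) {
		if (space_wide(count) == 0) {
			full++;
			count = 0;
		}
		count++;
	}
	return full;
}
```

With the 8-bit counter the "full" condition is never observed, so the same
container keeps being reused; with the wider counter it triggers every 256
elements.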
Fixes: 5f68718b34 ("netfilter: nf_tables: GC transaction API to avoid race with control plane")
Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit b079155faa upstream.
Skip GC run if iterator rewinds to the beginning with EAGAIN, otherwise GC
might collect the same element more than once.
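A rough userspace sketch of the idea, assuming a table walker that reports
-EAGAIN when it restarts from the beginning. struct walker, walk_next() and
gc_run() are hypothetical stand-ins for illustration, not the rhashtable API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_BATCH 16

struct elem {
	int id;
	int expired;
};

/* Hypothetical walker: rewinds once when pos reaches rewind_at (-1: never). */
struct walker {
	const struct elem *elems;
	size_t n;
	size_t pos;
	long rewind_at;
};

/* Returns 1 and sets *out, 0 at the end, -EAGAIN when the walk restarts. */
int walk_next(struct walker *w, const struct elem **out)
{
	assert(out != NULL);
	if (w->rewind_at >= 0 && w->pos == (size_t)w->rewind_at) {
		w->rewind_at = -1;	/* rewind only once */
		w->pos = 0;
		return -EAGAIN;
	}
	if (w->pos == w->n)
		return 0;
	*out = &w->elems[w->pos++];
	return 1;
}

/* Collect expired ids. Returns their count, or -EAGAIN if the run is skipped. */
int gc_run(struct walker *w, int *batch)
{
	const struct elem *e = NULL;
	int n = 0, ret;

	while ((ret = walk_next(w, &e)) != 0) {
		if (ret == -EAGAIN)
			return -EAGAIN;	/* abandon this run, retry later */
		if (e->expired && n < MAX_BATCH)
			batch[n++] = e->id;
	}
	return n;
}
```

Continuing after the rewind would visit the first elements a second time, so
the sketch gives up on the whole run instead of trying to deduplicate.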
Fixes: f6c383b8c3 ("netfilter: nf_tables: adapt set backend to use GC transaction API")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 6d365eabce upstream.
nft_trans_gc_queue_sync() enqueues the GC transaction and it allocates a
new one. If this allocation fails, then stop this GC sync run and retry
later.
Fixes: 5f68718b34 ("netfilter: nf_tables: GC transaction API to avoid race with control plane")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 96b33300fb upstream.
rbtree GC does not modify the datastructure, instead it collects expired
elements and it enqueues a GC transaction. Use a read spinlock instead
to avoid data contention while GC worker is running.
Fixes: f6c383b8c3 ("netfilter: nf_tables: adapt set backend to use GC transaction API")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 2ee52ae94b upstream.
New elements in this transaction might expire before such transaction
ends. Skip sync GC for such elements otherwise commit path might walk
over an already released object. Once transaction is finished, async GC
will collect such expired element.
Fixes: f6c383b8c3 ("netfilter: nf_tables: adapt set backend to use GC transaction API")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 8e51830e29 upstream.
Don't queue more gc work, else we may queue the same elements multiple
times.
If an element is flagged as dead, this can mean that either the previous
gc request was invalidated/discarded by a transaction or that the previous
request is still pending in the system work queue.
The latter will happen if the gc interval is set to a very low value,
e.g. 1ms, and system work queue is backlogged.
The set's refcount is 1 if no previous gc requests are queued, so add
a helper for this and skip the gc run if old requests are pending.
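The refcount check can be sketched in userspace as follows. struct gc_set and
the function names are hypothetical, and the sketch assumes that every queued
gc request holds exactly one extra reference on the set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a set: refs starts at 1 (the owner's reference). */
struct gc_set {
	int refs;
};

/* Each queued gc request takes one extra reference on the set. */
void gc_request_queue(struct gc_set *s)
{
	s->refs++;
}

/* A finished (or discarded) gc request drops its reference again. */
void gc_request_done(struct gc_set *s)
{
	assert(s->refs > 1);
	s->refs--;
}

/* Any refcount other than 1 means an earlier request is still around. */
bool gc_is_pending(const struct gc_set *s)
{
	return s->refs != 1;
}

/* Returns true if a new gc run was queued, false if it was skipped. */
bool gc_try_run(struct gc_set *s)
{
	if (gc_is_pending(s))
		return false;
	gc_request_queue(s);
	return true;
}
```

A second call to gc_try_run() before gc_request_done() is skipped, which is
the behaviour wanted for a very short gc interval on a backlogged work queue.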
Fixes: f6c383b8c3 ("netfilter: nf_tables: adapt set backend to use GC transaction API")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 8357bc946a upstream.
Use nf_tables_gc_list_lock spinlock, not nf_tables_destroy_list_lock to
protect the gc_list.
Fixes: 5f68718b34 ("netfilter: nf_tables: GC transaction API to avoid race with control plane")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 720344340f upstream.
Abort path is missing a synchronization point with GC transactions. Add
a GC sequence number so that any GC transaction losing the race is
discarded.
Fixes: 5f68718b34 ("netfilter: nf_tables: GC transaction API to avoid race with control plane")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 02c6c24402 upstream.
Use maybe_get_net() since GC workqueue might race with netns exit path.
Fixes: 5f68718b34 ("netfilter: nf_tables: GC transaction API to avoid race with control plane")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 6a33d8b73d upstream.
Netlink event path is missing a synchronization point with GC
transactions. Add GC sequence number update to netns release path and
netlink event path, any GC transaction losing race will be discarded.
Fixes: 5f68718b34 ("netfilter: nf_tables: GC transaction API to avoid race with control plane")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 7845914f45 upstream.
nftables selftests fail:
run-tests.sh testcases/sets/0044interval_overlap_0
Expected: 0-2 . 0-3, got:
W: [FAILED] ./testcases/sets/0044interval_overlap_0: got 1
Insertion must ignore duplicate but expired entries.
Moreover, there is a strange asymmetry in nft_pipapo_activate:
It refetches the current element, whereas the other ->activate callbacks
(bitmap, hash, rhash, rbtree) use elem->priv.
Same for .remove: other set implementations take elem->priv,
nft_pipapo_remove fetches elem->priv, then does a relookup,
remove this.
I suspect this was the reason for the change that prompted the
removal of the expired check in pipapo_get() in the first place,
but skipping expired elements there makes no sense to me, this helper
is used for normal get requests, insertions (duplicate check)
and deactivate callback.
In first two cases expired elements must be skipped.
For ->deactivate(), this gets called for DELSETELEM, so it
seems to me that expired elements should be skipped as well, i.e.
delete request should fail with -ENOENT error.
Fixes: 24138933b9 ("netfilter: nf_tables: don't skip expired elements during walk")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit a2dd0233cb upstream.
Ditch it, it has been replaced by the GC transaction API and it has no
clients anymore.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit c92db30304 upstream.
Set the NFT_SET_ELEM_DEAD_BIT flag on this element, instead of
performing element removal which might race with an ongoing transaction.
Enable gc when the dynamic flag is set, since dynset deletion requires
garbage collection after this patch.
Fixes: d0a8d877da ("netfilter: nft_dynset: support for element deletion")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit f6c383b8c3 upstream.
Use the GC transaction API to replace the old and buggy gc API and the
busy mark approach.
No set elements are removed from async garbage collection anymore,
instead the _DEAD bit is set so the set element is no longer visible from
the lookup path. Async GC enqueues transaction work that might be
aborted and retried later.
rbtree and pipapo set backends do not set the _DEAD bit from the
sync GC path since this runs in control plane path where mutex is held.
In this case, set elements are deactivated, removed and then released
via RCU callback, sync GC never fails.
Fixes: 3c4287f620 ("nf_tables: Add set type for arbitrary concatenation of ranges")
Fixes: 8d8540c4f5 ("netfilter: nft_set_rbtree: add timeout support")
Fixes: 9d0982927e ("netfilter: nft_hash: add support for timeouts")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 5f68718b34 upstream.
The set types rhashtable and rbtree use a GC worker to reclaim memory.
From system work queue, in periodic intervals, a scan of the table is
done.
The major caveat here is that the nft transaction mutex is not held.
This causes a race between control plane and GC when they attempt to
delete the same element.
We cannot grab the netlink mutex from the work queue, because the
control plane has to wait for the GC work queue in case the set is to be
removed, so we get following deadlock:
cpu 1                                cpu2
     GC work                            transaction comes in, lock nft mutex
       `acquire nft mutex // BLOCKS
                                        transaction asks to remove the set
                                        set destruction calls cancel_work_sync()
cancel_work_sync will now block forever, because it is waiting for the
mutex the caller already owns.
This patch adds a new API that deals with garbage collection in two
steps:
1) Lockless GC of expired elements sets the NFT_SET_ELEM_DEAD_BIT
so they are not visible via lookup. Annotate current GC sequence in
the GC transaction. Enqueue GC transaction work as soon as it is
full. If ruleset is updated, then GC transaction is aborted and
retried later.
2) GC work grabs the mutex. If GC sequence has changed then this GC
transaction lost race with control plane, abort it as it contains
stale references to objects and let GC try again later. If the
ruleset is intact, then this GC transaction deactivates and removes
the elements and it uses call_rcu() to destroy elements.
Note that no elements are removed from the GC lockless path; the _DEAD bit
is set and pointers are collected. GC catchall does not remove the
elements anymore either. There is a new set->dead flag that is set to
abort the GC transaction, to deal with the set->ops->destroy() path which
removes the remaining elements in the set from commit_release, where no
mutex is held.
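The two steps can be sketched as a single-threaded userspace model. This is
not the kernel API: struct ruleset, struct gc_trans, gc_collect(),
ruleset_update() and gc_work() are hypothetical names, locking is only
described in comments, and element release via call_rcu() is reduced to
setting a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BATCH 8

struct elem {
	bool expired;
	bool dead;	/* hidden from lookup, still linked */
	bool removed;	/* unlinked by the GC work */
};

struct ruleset {
	unsigned int gc_seq;	/* bumped by every control plane update */
};

struct gc_trans {
	unsigned int seq;	/* GC sequence seen at collection time */
	struct elem *priv[BATCH];
	size_t count;
};

/* Step 1 (no mutex): flag expired elements dead and collect pointers. */
void gc_collect(const struct ruleset *rs, struct elem *elems, size_t n,
		struct gc_trans *t)
{
	t->seq = rs->gc_seq;
	t->count = 0;
	for (size_t i = 0; i < n && t->count < BATCH; i++) {
		if (elems[i].expired && !elems[i].removed) {
			elems[i].dead = true;
			t->priv[t->count++] = &elems[i];
		}
	}
}

/* Control plane update: invalidates GC transactions collected earlier. */
void ruleset_update(struct ruleset *rs)
{
	rs->gc_seq++;
}

/* Step 2 (mutex held in the real design): remove, or abort if stale. */
bool gc_work(const struct ruleset *rs, struct gc_trans *t)
{
	assert(t->count <= BATCH);
	if (t->seq != rs->gc_seq) {
		t->count = 0;	/* lost the race: drop the stale pointers */
		return false;
	}
	for (size_t i = 0; i < t->count; i++)
		t->priv[i]->removed = true;
	t->count = 0;
	return true;
}
```

An aborted transaction leaves its elements flagged dead but still linked, so
a later collection pass picks them up again.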
To deal with GC when mutex is held, which allows safe deactivate and
removal, add sync GC API which releases the set element object via
call_rcu(). This is used by rbtree and pipapo backends which also
perform garbage collection from control plane path.
Since element removal from sets can happen from control plane and
element garbage collection/timeout, it is necessary to keep the set
structure alive until all elements have been deactivated and destroyed.
We cannot do a cancel_work_sync or flush_work in nft_set_destroy because
it's called with the transaction mutex held, but the aforementioned async
work queue might be blocked on the very mutex that nft_set_destroy()
callchain is sitting on.
This gives us the choice of ABBA deadlock or UaF.
To avoid both, add set->refs refcount_t member. The GC API can then
increment the set refcount and release it once the elements have been
free'd.
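A minimal userspace sketch of the lifetime rule, using a plain int where the
kernel uses refcount_t. struct set_sketch and all function names here are
hypothetical. The point is only that destruction drops the owner reference
without waiting for the GC work, and whoever drops the last reference frees
the set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct set_sketch {
	int refs;	/* 1 for the owner, +1 per in-flight GC transaction */
	bool dead;	/* set once destruction has started */
};

struct set_sketch *set_alloc(void)
{
	struct set_sketch *s = calloc(1, sizeof(*s));

	if (s)
		s->refs = 1;
	return s;
}

/* GC takes a reference before it queues work that points into the set. */
void set_get(struct set_sketch *s)
{
	assert(s->refs > 0);
	s->refs++;
}

/* Drop one reference; returns true if this call freed the set. */
bool set_put(struct set_sketch *s)
{
	assert(s->refs > 0);
	if (--s->refs > 0)
		return false;
	free(s);
	return true;
}

/* Control plane: mark the set dead and drop the owner reference, no waiting. */
bool set_destroy(struct set_sketch *s)
{
	s->dead = true;
	return set_put(s);
}

/* GC transaction end: a dead set is left alone, the reference is dropped. */
bool gc_trans_finish(struct set_sketch *s)
{
	return set_put(s);
}
```

Neither side blocks on the other, which avoids the ABBA deadlock, and the
memory stays valid until the last user is done, which avoids the UaF.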
Set backends are adapted to use the GC transaction API in a follow up
patch entitled:
("netfilter: nf_tables: use gc transaction API in set backends")
This is joint work with Florian Westphal.
Fixes: cfed7e1b1f ("netfilter: nf_tables: add set garbage collection helpers")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 24138933b9 upstream.
There is an asymmetry between commit/abort and preparation phase if the
following conditions are met:
1. set is a verdict map ("1.2.3.4 : jump foo")
2. timeouts are enabled
In this case, following sequence is problematic:
1. element E in set S refers to chain C
2. userspace requests removal of set S
3. kernel does a set walk to decrement chain->use count for all elements
from preparation phase
4. kernel does another set walk to remove elements from the commit phase
(or another walk to do a chain->use increment for all elements from
abort phase)
If E has already expired in 1), it will be ignored during list walk, so its use count
won't have been changed.
Then, when set is culled, ->destroy callback will zap the element via
nf_tables_set_elem_destroy(), but this function is only safe for
elements that have been deactivated earlier from the preparation phase:
lack of earlier deactivate removes the element but leaks the chain use
count, which results in a WARN splat when the chain gets removed later,
plus a leak of the nft_chain structure.
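The leak can be modelled with a toy verdict map in userspace. All names below
are hypothetical; the sketch only mirrors the sequence above: a preparation
walk that may skip expired elements, followed by a destroy step that releases
every element without touching chain use counts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct chain {
	int use;
};

struct elem {
	struct chain *target;	/* verdict map entry: jump to this chain */
	bool expired;
};

/* Each element holds one use count on the chain it jumps to. */
void set_bind(struct elem *elems, size_t n)
{
	for (size_t i = 0; i < n; i++)
		elems[i].target->use++;
}

/* Preparation phase walk: drop the use count for each visited element. */
void set_deactivate_walk(struct elem *elems, size_t n, bool skip_expired)
{
	for (size_t i = 0; i < n; i++) {
		if (skip_expired && elems[i].expired)
			continue;	/* never deactivated */
		assert(elems[i].target->use > 0);
		elems[i].target->use--;
	}
}

/* Destroy: releases all elements, expired or not, use counts untouched. */
void set_destroy(struct elem *elems, size_t n)
{
	for (size_t i = 0; i < n; i++)
		elems[i].target = NULL;
}

/* Remove a two-element set; return the use count left on the chain. */
int remove_set(struct chain *c, bool skip_expired)
{
	struct elem elems[2] = { { c, false }, { c, true } };

	set_bind(elems, 2);
	set_deactivate_walk(elems, 2, skip_expired);
	set_destroy(elems, 2);
	return c->use;
}
```

When the walk skips the expired element, the chain keeps a use count of 1
after the set is gone; when the walk visits every element, it drops to 0.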
Update pipapo_get() not to skip expired elements, otherwise flush
command reports bogus ENOENT errors.
Fixes: 3c4287f620 ("nf_tables: Add set type for arbitrary concatenation of ranges")
Fixes: 8d8540c4f5 ("netfilter: nft_set_rbtree: add timeout support")
Fixes: 9d0982927e ("netfilter: nft_hash: add support for timeouts")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 212ed75dc5 upstream.
The pipapo set backend follows copy-on-update approach, maintaining one
clone of the existing datastructure that is being updated. The clone
and current datastructures are swapped via rcu from the commit step.
The existing integration with the commit protocol is flawed because
there is no operation to clean up the clone if the transaction is
aborted. Moreover, the datastructure swap happens on set element
activation.
This patch adds two new operations for sets: commit and abort, these new
operations are invoked from the commit and abort steps, after the
transactions have been digested, and it updates the pipapo set backend
to use it.
This patch adds a new ->pending_update field to sets to maintain a list
of sets that require this new commit and abort operations.
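The commit and abort operations can be illustrated with a small userspace
copy-on-update container. This is a sketch under simplifying assumptions, not
the pipapo implementation: struct cow_set and its functions are hypothetical,
a dirty flag stands in for the pending update tracking, and the pointer swap
and the release of the old copy happen directly instead of via RCU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ELEMS 8

struct match_data {
	int elems[MAX_ELEMS];
	int n;
};

struct cow_set {
	struct match_data *live;	/* what the packet path reads */
	struct match_data *clone;	/* private copy receiving updates */
	bool dirty;			/* clone differs from live */
};

bool cow_set_init(struct cow_set *s)
{
	s->live = calloc(1, sizeof(*s->live));
	s->clone = calloc(1, sizeof(*s->clone));
	s->dirty = false;
	if (!s->live || !s->clone) {
		free(s->live);
		free(s->clone);
		return false;
	}
	return true;
}

/* Updates only ever touch the clone. */
bool cow_set_insert(struct cow_set *s, int v)
{
	assert(s->live && s->clone);
	if (s->clone->n == MAX_ELEMS)
		return false;
	s->clone->elems[s->clone->n++] = v;
	s->dirty = true;
	return true;
}

/* Commit: publish the clone, then prepare a fresh clone of the new state. */
bool cow_set_commit(struct cow_set *s)
{
	struct match_data *fresh, *old;

	if (!s->dirty)
		return true;
	fresh = malloc(sizeof(*fresh));
	if (!fresh)
		return false;
	memcpy(fresh, s->clone, sizeof(*fresh));
	old = s->live;
	s->live = s->clone;	/* kernel: pointer swap via RCU */
	s->clone = fresh;
	free(old);		/* kernel: deferred past a grace period */
	s->dirty = false;
	return true;
}

/* Abort: discard pending changes by copying live back over the clone. */
void cow_set_abort(struct cow_set *s)
{
	if (!s->dirty)
		return;
	memcpy(s->clone, s->live, sizeof(*s->clone));
	s->dirty = false;
}

void cow_set_free(struct cow_set *s)
{
	free(s->live);
	free(s->clone);
}
```

Without an abort operation the clone would keep the changes of a failed
transaction and publish them on the next successful commit.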
Fixes: 3c4287f620 ("nf_tables: Add set type for arbitrary concatenation of ranges")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Changes in 5.10.195
erofs: ensure that the post-EOF tails are all zeroed
ARM: pxa: remove use of symbol_get()
mmc: au1xmmc: force non-modular build and remove symbol_get usage
net: enetc: use EXPORT_SYMBOL_GPL for enetc_phc_index
rtc: ds1685: use EXPORT_SYMBOL_GPL for ds1685_rtc_poweroff
modules: only allow symbol_get of EXPORT_SYMBOL_GPL modules
USB: serial: option: add Quectel EM05G variant (0x030e)
USB: serial: option: add FOXCONN T99W368/T99W373 product
usb: dwc3: meson-g12a: do post init to fix broken usb after resumption
usb: chipidea: imx: improve logic if samsung,picophy-* parameter is 0
HID: wacom: remove the battery when the EKR is off
staging: rtl8712: fix race condition
Bluetooth: btsdio: fix use after free bug in btsdio_remove due to race condition
configfs: fix a race in configfs_lookup()
serial: qcom-geni: fix opp vote on shutdown
serial: sc16is7xx: fix broken port 0 uart init
serial: sc16is7xx: fix bug when first setting GPIO direction
firmware: stratix10-svc: Fix an NULL vs IS_ERR() bug in probe
fsi: master-ast-cf: Add MODULE_FIRMWARE macro
nilfs2: fix general protection fault in nilfs_lookup_dirty_data_buffers()
nilfs2: fix WARNING in mark_buffer_dirty due to discarded buffer reuse
pinctrl: amd: Don't show `Invalid config param` errors
ASoC: rt5682: Fix a problem with error handling in the io init function of the soundwire
ARM: dts: imx: update sdma node name format
ARM: dts: imx7s: Drop dma-apb interrupt-names
ARM: dts: imx: Adjust dma-apbh node name
ARM: dts: imx: Set default tuning step for imx7d usdhc
phy: qcom-snps-femto-v2: use qcom_snps_hsphy_suspend/resume error code
media: pulse8-cec: handle possible ping error
media: pci: cx23885: fix error handling for cx23885 ATSC boards
9p: virtio: make sure 'offs' is initialized in zc_request
ASoC: da7219: Flush pending AAD IRQ when suspending
ASoC: da7219: Check for failure reading AAD IRQ events
ethernet: atheros: fix return value check in atl1c_tso_csum()
vxlan: generalize vxlan_parse_gpe_hdr and remove unused args
m68k: Fix invalid .section syntax
s390/dasd: use correct number of retries for ERP requests
s390/dasd: fix hanging device after request requeue
fs/nls: make load_nls() take a const parameter
ASoc: codecs: ES8316: Fix DMIC config
ASoC: atmel: Fix the 8K sample parameter in I2SC master
platform/x86: intel: hid: Always call BTNL ACPI method
platform/x86: huawei-wmi: Silence ambient light sensor
drm/amd/display: Exit idle optimizations before attempt to access PHY
ovl: Always reevaluate the file signature for IMA
ata: pata_arasan_cf: Use dev_err_probe() instead dev_err() in data_xfer()
security: keys: perform capable check only on privileged operations
kprobes: Prohibit probing on CFI preamble symbol
clk: fixed-mmio: make COMMON_CLK_FIXED_MMIO depend on HAS_IOMEM
vmbus_testing: fix wrong python syntax for integer value comparison
net: usb: qmi_wwan: add Quectel EM05GV2
idmaengine: make FSL_EDMA and INTEL_IDMA64 depends on HAS_IOMEM
scsi: qedi: Fix potential deadlock on &qedi_percpu->p_work_lock
netlabel: fix shift wrapping bug in netlbl_catmap_setlong()
bnx2x: fix page fault following EEH recovery
sctp: handle invalid error codes without calling BUG()
scsi: storvsc: Always set no_report_opcodes
ALSA: seq: oss: Fix racy open/close of MIDI devices
tracing: Introduce pipe_cpumask to avoid race on trace_pipes
platform/mellanox: Fix mlxbf-tmfifo not handling all virtio CONSOLE notifications
net: Avoid address overwrite in kernel_connect
udf: Check consistency of Space Bitmap Descriptor
udf: Handle error when adding extent to a file
Revert "net: macsec: preserve ingress frame ordering"
reiserfs: Check the return value from __getblk()
eventfd: Export eventfd_ctx_do_read()
eventfd: prevent underflow for eventfd semaphores
fs: Fix error checking for d_hash_and_lookup()
tmpfs: verify {g,u}id mount options correctly
selftests/harness: Actually report SKIP for signal tests
refscale: Fix uninitalized use of wait_queue_head_t
OPP: Fix passing 0 to PTR_ERR in _opp_attach_genpd()
selftests/resctrl: Don't leak buffer in fill_cache()
selftests/resctrl: Unmount resctrl FS if child fails to run benchmark
selftests/resctrl: Close perf value read fd on errors
x86/decompressor: Don't rely on upper 32 bits of GPRs being preserved
perf/imx_ddr: don't enable counter0 if none of 4 counters are used
s390/pkey: fix/harmonize internal keyblob headers
s390/paes: fix PKEY_TYPE_EP11_AES handling for secure keyblobs
x86/efistub: Fix PCI ROM preservation in mixed mode
cpufreq: powernow-k8: Use related_cpus instead of cpus in driver.exit()
bpftool: Use a local bpf_perf_event_value to fix accessing its fields
bpf: Clear the probe_addr for uprobe
tcp: tcp_enter_quickack_mode() should be static
hwrng: nomadik - keep clock enabled while hwrng is registered
regmap: rbtree: Use alloc_flags for memory allocations
udp: re-score reuseport groups when connected sockets are present
bpf: reject unhashed sockets in bpf_sk_assign
wifi: mt76: testmode: add nla_policy for MT76_TM_ATTR_TX_LENGTH
spi: tegra20-sflash: fix to check return value of platform_get_irq() in tegra_sflash_probe()
can: gs_usb: gs_usb_receive_bulk_callback(): count RX overflow errors also in case of OOM
wifi: mwifiex: Fix OOB and integer underflow when rx packets
wifi: mwifiex: fix error recovery in PCIE buffer descriptor management
selftests/bpf: fix static assert compilation issue for test_cls_*.c
crypto: stm32 - Properly handle pm_runtime_get failing
crypto: api - Use work queue in crypto_destroy_instance
Bluetooth: nokia: fix value check in nokia_bluetooth_serdev_probe()
Bluetooth: Fix potential use-after-free when clear keys
net: tcp: fix unexcepted socket die when snd_wnd is 0
selftests/bpf: Clean up fmod_ret in bench_rename test script
ice: ice_aq_check_events: fix off-by-one check when filling buffer
crypto: caam - fix unchecked return value error
hwrng: iproc-rng200 - Implement suspend and resume calls
lwt: Fix return values of BPF xmit ops
lwt: Check LWTUNNEL_XMIT_CONTINUE strictly
fs: ocfs2: namei: check return value of ocfs2_add_entry()
wifi: mwifiex: fix memory leak in mwifiex_histogram_read()
wifi: mwifiex: Fix missed return in oob checks failed path
samples/bpf: fix broken map lookup probe
wifi: ath9k: fix races between ath9k_wmi_cmd and ath9k_wmi_ctrl_rx
wifi: ath9k: protect WMI command response buffer replacement with a lock
wifi: mwifiex: avoid possible NULL skb pointer dereference
Bluetooth: btusb: Do not call kfree_skb() under spin_lock_irqsave()
wifi: ath9k: use IS_ERR() with debugfs_create_dir()
net: arcnet: Do not call kfree_skb() under local_irq_disable()
mlxsw: i2c: Fix chunk size setting in output mailbox buffer
mlxsw: i2c: Limit single transaction buffer size
hwmon: (tmp513) Fix the channel number in tmp51x_is_visible()
net/sched: sch_hfsc: Ensure inner classes have fsc curve
netrom: Deny concurrent connect().
drm/bridge: tc358764: Fix debug print parameter order
quota: factor out dquot_write_dquot()
quota: rename dquot_active() to inode_quota_active()
quota: add new helper dquot_active()
quota: fix dqput() to follow the guarantees dquot_srcu should provide
ASoC: stac9766: fix build errors with REGMAP_AC97
soc: qcom: ocmem: Add OCMEM hardware version print
soc: qcom: ocmem: Fix NUM_PORTS & NUM_MACROS macros
arm64: dts: qcom: msm8996: Add missing interrupt to the USB2 controller
drm/amdgpu: avoid integer overflow warning in amdgpu_device_resize_fb_bar()
ARM: dts: BCM5301X: Harmonize EHCI/OHCI DT nodes name
ARM: dts: BCM53573: Describe on-SoC BCM53125 rev 4 switch
ARM: dts: BCM53573: Drop nonexistent #usb-cells
ARM: dts: BCM53573: Add cells sizes to PCIe node
ARM: dts: BCM53573: Use updated "spi-gpio" binding properties
drm/etnaviv: fix dumping of active MMU context
x86/mm: Fix PAT bit missing from page protection modify mask
ARM: dts: s3c64xx: align pinctrl with dtschema
ARM: dts: samsung: s3c6410-mini6410: correct ethernet reg addresses (split)
ARM: dts: s5pv210: adjust node names to DT spec
ARM: dts: s5pv210: add dummy 5V regulator for backlight on SMDKv210
ARM: dts: samsung: s5pv210-smdkv210: correct ethernet reg addresses (split)
drm: adv7511: Fix low refresh rate register for ADV7533/5
ARM: dts: BCM53573: Fix Ethernet info for Luxul devices
arm64: dts: qcom: sdm845: Add missing RPMh power domain to GCC
arm64: dts: qcom: sdm845: Fix the min frequency of "ice_core_clk"
drm/amdgpu: Update min() to min_t() in 'amdgpu_info_ioctl'
md/bitmap: don't set max_write_behind if there is no write mostly device
md/md-bitmap: hold 'reconfig_mutex' in backlog_store()
drm/tegra: Remove superfluous error messages around platform_get_irq()
drm/tegra: dpaux: Fix incorrect return value of platform_get_irq
of: unittest: fix null pointer dereferencing in of_unittest_find_node_by_name()
drm/armada: Fix off-by-one error in armada_overlay_get_property()
drm/panel: simple: Add missing connector type and pixel format for AUO T215HVN01
ima: Remove deprecated IMA_TRUSTED_KEYRING Kconfig
drm: xlnx: zynqmp_dpsub: Add missing check for dma_set_mask
drm/msm/mdp5: Don't leak some plane state
firmware: meson_sm: fix to avoid potential NULL pointer dereference
smackfs: Prevent underflow in smk_set_cipso()
drm/amd/pm: fix variable dereferenced issue in amdgpu_device_attr_create()
drm/msm/a2xx: Call adreno_gpu_init() earlier
audit: fix possible soft lockup in __audit_inode_child()
bus: ti-sysc: Fix build warning for 64-bit build
drm/mediatek: Fix potential memory leak if vmap() fail
bus: ti-sysc: Fix cast to enum warning
of: unittest: Fix overlay type in apply/revert check
ALSA: ac97: Fix possible error value of *rac97
ipmi:ssif: Add check for kstrdup
ipmi:ssif: Fix a memory leak when scanning for an adapter
drivers: clk: keystone: Fix parameter judgment in _of_pll_clk_init()
clk: sunxi-ng: Modify mismatched function name
clk: qcom: gcc-sc7180: use ARRAY_SIZE instead of specifying num_parents
clk: qcom: gcc-sc7180: Fix up gcc_sdcc2_apps_clk_src
ext4: correct grp validation in ext4_mb_good_group
clk: qcom: gcc-sm8250: use ARRAY_SIZE instead of specifying num_parents
clk: qcom: gcc-sm8250: Fix gcc_sdcc2_apps_clk_src
clk: qcom: reset: Use the correct type of sleep/delay based on length
PCI: Mark NVIDIA T4 GPUs to avoid bus reset
pinctrl: mcp23s08: check return value of devm_kasprintf()
PCI: pciehp: Use RMW accessors for changing LNKCTL
PCI/ASPM: Use RMW accessors for changing LNKCTL
clk: imx8mp: fix sai4 clock
clk: imx: composite-8m: fix clock pauses when set_rate would be a no-op
vfio/type1: fix cap_migration information leak
powerpc/fadump: reset dump area size if fadump memory reserve fails
powerpc/perf: Convert fsl_emb notifier to state machine callbacks
drm/amdgpu: Use RMW accessors for changing LNKCTL
drm/radeon: Use RMW accessors for changing LNKCTL
net/mlx5: Use RMW accessors for changing LNKCTL
wifi: ath10k: Use RMW accessors for changing LNKCTL
powerpc: Don't include lppaca.h in paca.h
powerpc/pseries: Rework lppaca_shared_proc() to avoid DEBUG_PREEMPT
nfs/blocklayout: Use the passed in gfp flags
powerpc/iommu: Fix notifiers being shared by PCI and VIO buses
jfs: validate max amount of blocks before allocation.
fs: lockd: avoid possible wrong NULL parameter
NFSD: da_addr_body field missing in some GETDEVICEINFO replies
NFS: Guard against READDIR loop when entry names exceed MAXNAMELEN
NFSv4.2: fix handling of COPY ERR_OFFLOAD_NO_REQ
media: ad5820: Drop unsupported ad5823 from i2c_ and of_device_id tables
media: i2c: tvp5150: check return value of devm_kasprintf()
media: v4l2-core: Fix a potential resource leak in v4l2_fwnode_parse_link()
drivers: usb: smsusb: fix error handling code in smsusb_init_device
media: dib7000p: Fix potential division by zero
media: dvb-usb: m920x: Fix a potential memory leak in m920x_i2c_xfer()
media: cx24120: Add retval check for cx24120_message_send()
scsi: hisi_sas: Print SAS address for v3 hw erroneous completion print
scsi: libsas: Introduce more SAM status code aliases in enum exec_status
scsi: hisi_sas: Modify v3 HW SSP underflow error processing
scsi: hisi_sas: Modify v3 HW SATA completion error processing
scsi: hisi_sas: Fix warnings detected by sparse
scsi: hisi_sas: Fix normally completed I/O analysed as failed
media: rkvdec: increase max supported height for H.264
media: mediatek: vcodec: Return NULL if no vdec_fb is found
usb: phy: mxs: fix getting wrong state with mxs_phy_is_otg_host()
scsi: RDMA/srp: Fix residual handling
scsi: iscsi: Rename iscsi_set_param() to iscsi_if_set_param()
scsi: iscsi: Add length check for nlattr payload
scsi: iscsi: Add strlen() check in iscsi_if_set{_host}_param()
scsi: be2iscsi: Add length check when parsing nlattrs
scsi: qla4xxx: Add length check when parsing nlattrs
serial: sprd: Assign sprd_port after initialized to avoid wrong access
serial: sprd: Fix DMA buffer leak issue
x86/APM: drop the duplicate APM_MINOR_DEV macro
scsi: qedf: Do not touch __user pointer in qedf_dbg_stop_io_on_error_cmd_read() directly
scsi: qedf: Do not touch __user pointer in qedf_dbg_debug_cmd_read() directly
scsi: qedf: Do not touch __user pointer in qedf_dbg_fp_int_cmd_read() directly
coresight: tmc: Explicit type conversions to prevent integer overflow
dma-buf/sync_file: Fix docs syntax
driver core: test_async: fix an error code
IB/uverbs: Fix an potential error pointer dereference
fsi: aspeed: Reset master errors after CFAM reset
iommu/qcom: Disable and reset context bank before programming
iommu/vt-d: Fix to flush cache of PASID directory table
media: go7007: Remove redundant if statement
USB: gadget: f_mass_storage: Fix unused variable warning
media: ov5640: Enable MIPI interface in ov5640_set_power_mipi()
media: i2c: ov2680: Set V4L2_CTRL_FLAG_MODIFY_LAYOUT on flips
media: ov2680: Remove auto-gain and auto-exposure controls
media: ov2680: Fix ov2680_bayer_order()
media: ov2680: Fix vflip / hflip set functions
media: ov2680: Fix regulators being left enabled on ov2680_power_on() errors
cgroup:namespace: Remove unused cgroup_namespaces_init()
scsi: core: Use 32-bit hostnum in scsi_host_lookup()
scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock
serial: tegra: handle clk prepare error in tegra_uart_hw_init()
amba: bus: fix refcount leak
Revert "IB/isert: Fix incorrect release of isert connection"
RDMA/siw: Balance the reference of cep->kref in the error path
RDMA/siw: Correct wrong debug message
HID: logitech-dj: Fix error handling in logi_dj_recv_switch_to_dj_mode()
HID: multitouch: Correct devm device reference for hidinput input_dev name
x86/speculation: Mark all Skylake CPUs as vulnerable to GDS
tracing: Fix race issue between cpu buffer write and swap
mtd: rawnand: brcmnand: Fix mtd oobsize
phy/rockchip: inno-hdmi: use correct vco_div_5 macro on rk3328
phy/rockchip: inno-hdmi: round fractal pixclock in rk3328 recalc_rate
phy/rockchip: inno-hdmi: do not power on rk3328 post pll on reg write
rpmsg: glink: Add check for kstrdup
mtd: spi-nor: Check bus width while setting QE bit
mtd: rawnand: fsmc: handle clk prepare error in fsmc_nand_resume()
um: Fix hostaudio build errors
dmaengine: ste_dma40: Add missing IRQ check in d40_probe
cpufreq: Fix the race condition while updating the transition_task of policy
virtio_ring: fix avail_wrap_counter in virtqueue_add_packed
igmp: limit igmpv3_newpack() packet size to IP_MAX_MTU
netfilter: ipset: add the missing IP_SET_HASH_WITH_NET0 macro for ip_set_hash_netportnet.c
netfilter: xt_u32: validate user space input
netfilter: xt_sctp: validate the flag_info count
skbuff: skb_segment, Call zero copy functions before using skbuff frags
igb: set max size RX buffer when store bad packet is enabled
PM / devfreq: Fix leak in devfreq_dev_release()
ALSA: pcm: Fix missing fixup call in compat hw_refine ioctl
printk: ringbuffer: Fix truncating buffer size min_t cast
scsi: core: Fix the scsi_set_resid() documentation
ipmi_si: fix a memleak in try_smi_init()
ARM: OMAP2+: Fix -Warray-bounds warning in _pwrdm_state_switch()
backlight/gpio_backlight: Compare against struct fb_info.device
backlight/bd6107: Compare against struct fb_info.device
backlight/lv5207lp: Compare against struct fb_info.device
xtensa: PMU: fix base address for the newer hardware
arm64: csum: Fix OoB access in IP checksum code for negative lengths
media: dvb: symbol fixup for dvb_attach()
Revert "scsi: qla2xxx: Fix buffer overrun"
scsi: mpt3sas: Perform additional retries if doorbell read returns 0
ntb: Drop packets when qp link is down
ntb: Clean up tx tail index on link down
ntb: Fix calculation ntb_transport_tx_free_entry()
Revert "PCI: Mark NVIDIA T4 GPUs to avoid bus reset"
procfs: block chmod on /proc/thread-self/comm
parisc: Fix /proc/cpuinfo output for lscpu
dlm: fix plock lookup when using multiple lockspaces
dccp: Fix out of bounds access in DCCP error handler
X.509: if signature is unsupported skip validation
net: handle ARPHRD_PPP in dev_is_mac_header_xmit()
fsverity: skip PKCS#7 parser when keyring is empty
pstore/ram: Check start of empty przs during init
s390/ipl: add missing secure/has_secure file to ipl type 'unknown'
crypto: stm32 - fix loop iterating through scatterlist for DMA
cpufreq: brcmstb-avs-cpufreq: Fix -Warray-bounds bug
usb: typec: bus: verify partner exists in typec_altmode_attention
USB: core: Unite old scheme and new scheme descriptor reads
USB: core: Change usb_get_device_descriptor() API
USB: core: Fix race by not overwriting udev->descriptor in hub_port_init()
USB: core: Fix oversight in SuperSpeed initialization
usb: typec: tcpci: clear the fault status bit
tracing: Zero the pipe cpumask on alloc to avoid spurious -EBUSY
md/md-bitmap: remove unnecessary local variable in backlog_store()
udf: initialize newblock to 0
net/ipv6: SKB symmetric hash should incorporate transport ports
io_uring: always lock in io_apoll_task_func
io_uring: break out of iowq iopoll on teardown
io_uring: break iopolling on signal
scsi: qla2xxx: Fix deletion race condition
scsi: qla2xxx: fix inconsistent TMF timeout
scsi: qla2xxx: Fix erroneous link up failure
scsi: qla2xxx: Turn off noisy message log
scsi: qla2xxx: Remove unsupported ql2xenabledif option
fbdev/ep93xx-fb: Do not assign to struct fb_info.dev
drm/ast: Fix DRAM init on AST2200
lib/test_meminit: allocate pages up to order MAX_ORDER
parisc: led: Fix LAN receive and transmit LEDs
parisc: led: Reduce CPU overhead for disk & lan LED computation
pinctrl: cherryview: fix address_space_handler() argument
dt-bindings: clock: xlnx,versal-clk: drop select:false
clk: imx: pll14xx: dynamically configure PLL for 393216000/361267200Hz
clk: qcom: gcc-mdm9615: use proper parent for pll0_vote clock
soc: qcom: qmi_encdec: Restrict string length in decode
NFS: Fix a potential data corruption
NFSv4/pnfs: minor fix for cleanup path in nfs4_get_device_info
kconfig: fix possible buffer overflow
backlight: gpio_backlight: Drop output GPIO direction check for initial power state
perf annotate bpf: Don't enclose non-debug code with an assert()
x86/virt: Drop unnecessary check on extended CPUID level in cpu_has_svm()
perf top: Don't pass an ERR_PTR() directly to perf_session__delete()
watchdog: intel-mid_wdt: add MODULE_ALIAS() to allow auto-load
pwm: lpc32xx: Remove handling of PWM channels
net/sched: fq_pie: avoid stalls in fq_pie_timer()
sctp: annotate data-races around sk->sk_wmem_queued
ipv4: annotate data-races around fi->fib_dead
net: read sk->sk_family once in sk_mc_loop()
drm/i915/gvt: Save/restore HW status to support GVT suspend/resume
drm/i915/gvt: Drop unused helper intel_vgpu_reset_gtt()
ipv4: ignore dst hint for multipath routes
igb: disable virtualization features on 82580
veth: Fixing transmit return status for dropped packets
net: ipv6/addrconf: avoid integer underflow in ipv6_create_tempaddr
af_unix: Fix data-races around user->unix_inflight.
af_unix: Fix data-race around unix_tot_inflight.
af_unix: Fix data-races around sk->sk_shutdown.
af_unix: Fix data race around sk->sk_err.
net: sched: sch_qfq: Fix UAF in qfq_dequeue()
kcm: Destroy mutex in kcm_exit_net()
igc: Change IGC_MIN to allow set rx/tx value between 64 and 80
igbvf: Change IGBVF_MIN to allow set rx/tx value between 64 and 80
igb: Change IGB_MIN to allow set rx/tx value between 64 and 80
s390/zcrypt: don't leak memory if dev_set_name() fails
idr: fix param name in idr_alloc_cyclic() doc
ip_tunnels: use DEV_STATS_INC()
net: dsa: sja1105: fix bandwidth discrepancy between tc-cbs software and offload
net: dsa: sja1105: fix -ENOSPC when replacing the same tc-cbs too many times
netfilter: nfnetlink_osf: avoid OOB read
net: hns3: fix the port information display when sfp is absent
sh: boards: Fix CEU buffer size passed to dma_declare_coherent_memory()
ext4: add correct group descriptors and reserved GDT blocks to system zone
ata: sata_gemini: Add missing MODULE_DESCRIPTION
ata: pata_ftide010: Add missing MODULE_DESCRIPTION
fuse: nlookup missing decrement in fuse_direntplus_link
btrfs: don't start transaction when joining with TRANS_JOIN_NOSTART
btrfs: use the correct superblock to compare fsid in btrfs_validate_super
mtd: rawnand: brcmnand: Fix crash during the panic_write
mtd: rawnand: brcmnand: Fix potential out-of-bounds access in oob write
mtd: rawnand: brcmnand: Fix potential false time out warning
drm/amd/display: prevent potential division by zero errors
perf hists browser: Fix hierarchy mode header
perf tools: Handle old data in PERF_RECORD_ATTR
perf hists browser: Fix the number of entries for 'e' key
ACPI: APEI: explicit init of HEST and GHES in apci_init()
arm64: sdei: abort running SDEI handlers during crash
scsi: qla2xxx: If fcport is undergoing deletion complete I/O with retry
scsi: qla2xxx: Consolidate zio threshold setting for both FCP & NVMe
scsi: qla2xxx: Fix crash in PCIe error handling
scsi: qla2xxx: Flush mailbox commands on chip reset
ARM: dts: samsung: exynos4210-i9100: Fix LCD screen's physical size
ARM: dts: BCM5301X: Extend RAM to full 256MB for Linksys EA6500 V2
bus: mhi: host: Skip MHI reset if device is in RDDM
net: ipv4: fix one memleak in __inet_del_ifa()
selftests/kselftest/runner/run_one(): allow running non-executable files
kselftest/runner.sh: Propagate SIGTERM to runner child
net/smc: use smc_lgr_list.lock to protect smc_lgr_list.list iterate in smcr_port_add
net: ethernet: mvpp2_main: fix possible OOB write in mvpp2_ethtool_get_rxnfc()
net: ethernet: mtk_eth_soc: fix possible NULL pointer dereference in mtk_hwlro_get_fdir_all()
hsr: Fix uninit-value access in fill_frame_info()
r8152: check budget for r8152_poll()
kcm: Fix memory leak in error path of kcm_sendmsg()
platform/mellanox: mlxbf-tmfifo: Drop the Rx packet if no more descriptors
platform/mellanox: mlxbf-tmfifo: Drop jumbo frames
net/tls: do not free tls_rec on async operation in bpf_exec_tx_verdict()
ipv6: fix ip6_sock_set_addr_preferences() typo
ixgbe: fix timestamp configuration code
kcm: Fix error handling for SOCK_DGRAM in kcm_sendmsg().
drm/amd/display: Fix a bug when searching for insert_above_mpcc
parisc: Drop loops_per_jiffy from per_cpu struct
Linux 5.10.195
Change-Id: I4eef618f573b6d4201e05c9cf56088d77d712d97
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[ Upstream commit f4f8a78031 ]
The opt_num field is controlled by user mode and is not currently
validated inside the kernel. An attacker can take advantage of this to
trigger an OOB read and potentially leak information.
BUG: KASAN: slab-out-of-bounds in nf_osf_match_one+0xbed/0xd10 net/netfilter/nfnetlink_osf.c:88
Read of size 2 at addr ffff88804bc64272 by task poc/6431
CPU: 1 PID: 6431 Comm: poc Not tainted 6.0.0-rc4 #1
Call Trace:
nf_osf_match_one+0xbed/0xd10 net/netfilter/nfnetlink_osf.c:88
nf_osf_find+0x186/0x2f0 net/netfilter/nfnetlink_osf.c:281
nft_osf_eval+0x37f/0x590 net/netfilter/nft_osf.c:47
expr_call_ops_eval net/netfilter/nf_tables_core.c:214
nft_do_chain+0x2b0/0x1490 net/netfilter/nf_tables_core.c:264
nft_do_chain_ipv4+0x17c/0x1f0 net/netfilter/nft_chain_filter.c:23
[..]
Also add validation to genre, subtype and version fields.
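The validation described above can be sketched in plain user-space C. This
is an illustration only, not the upstream patch: the struct layout, the
MAX_OPTS and MAX_NAME sizes, and the fingerprint_validate() helper are all
hypothetical names chosen for the example. The idea is to reject a
user-supplied count that exceeds the array it indexes, and to reject string
fields that are not NUL-terminated within their buffers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_OPTS 40	/* hypothetical capacity of opt[] */
#define MAX_NAME 32	/* hypothetical size of each string buffer */

/* Hypothetical stand-in for a fingerprint supplied by user space. */
struct fingerprint {
	unsigned short opt_num;		/* number of valid entries in opt[] */
	unsigned short opt[MAX_OPTS];
	char genre[MAX_NAME];
	char subtype[MAX_NAME];
	char version[MAX_NAME];
};

/* Return 0 if the fingerprint is safe to store, -EINVAL otherwise. */
static int fingerprint_validate(const struct fingerprint *f)
{
	/* A count larger than the array would make later loops read OOB. */
	if (f->opt_num > MAX_OPTS)
		return -EINVAL;

	/* Each string must contain a terminator inside its own buffer. */
	if (!memchr(f->genre, '\0', sizeof(f->genre)) ||
	    !memchr(f->subtype, '\0', sizeof(f->subtype)) ||
	    !memchr(f->version, '\0', sizeof(f->version)))
		return -EINVAL;

	return 0;
}
```

Doing the check once, when the entry is added, keeps the per-packet match
path free of extra bounds tests.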
Fixes: 11eeef41d5 ("netfilter: passive OS fingerprint xtables match")
Reported-by: Lucas Leong <wmliang@infosec.exchange>
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit e994764976 upstream.
sctp_mt_check doesn't validate the flag_count field. An attacker can
take advantage of that to trigger an OOB read and leak memory
information.
Add the field validation in the checkentry function.
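A minimal user-space sketch of why the count has to be checked at rule
setup time. The names here (FLAG_SLOTS, struct sctp_match_info,
sctp_info_check(), flags_match()) are hypothetical and simplified; they are
not the kernel's definitions. The match loop trusts flag_count, so the
setup-time check is the only thing that keeps it inside flag_info[].

```c
#include <assert.h>
#include <errno.h>

#define FLAG_SLOTS 4	/* hypothetical capacity of flag_info[] */

struct flag_entry {
	unsigned char chunktype;
	unsigned char flag;
	unsigned char flag_mask;
};

/* Hypothetical, simplified rule data as supplied by user space. */
struct sctp_match_info {
	int flag_count;
	struct flag_entry flag_info[FLAG_SLOTS];
};

/* Setup-time check: refuse counts that do not fit the array. */
static int sctp_info_check(const struct sctp_match_info *info)
{
	if (info->flag_count < 0 || info->flag_count > FLAG_SLOTS)
		return -EINVAL;
	return 0;
}

/* Match path: iterates flag_count entries without re-checking bounds. */
static int flags_match(const struct sctp_match_info *info,
		       unsigned char chunktype, unsigned char chunkflags)
{
	for (int i = 0; i < info->flag_count; i++) {
		const struct flag_entry *e = &info->flag_info[i];

		if (e->chunktype == chunktype)
			return (chunkflags & e->flag_mask) == e->flag;
	}
	return 1;	/* no entry for this chunk type: nothing to reject */
}
```

flags_match() should only ever be called on data that sctp_info_check()
has accepted.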
Fixes: 2e4e6a17af ("[NETFILTER] x_tables: Abstraction layer for {ip,ip6,arp}_tables")
Cc: stable@vger.kernel.org
Reported-by: Lucas Leong <wmliang@infosec.exchange>
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 69c5d284f6 upstream.
The xt_u32 module doesn't validate the fields in the xt_u32 structure.
An attacker may take advantage of this to trigger an OOB read by setting
the size fields to values beyond the array boundaries.
Add a checkentry function to validate the structure.
This was originally reported by the ZDI project (ZDI-CAN-18408).
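The kind of structure check described above can be sketched as follows.
This is a user-space illustration under stated assumptions: the struct
layout, MAX_ITEMS, ARRAY_LEN() and u32_rule_check() are hypothetical
stand-ins, not the kernel's definitions. The outer count is validated
first, because the loop over the tests depends on it, and then each test's
own counts are checked against the arrays they index.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))
#define MAX_ITEMS 10	/* hypothetical capacity, not taken from the source */

/* Hypothetical single test: counts describe how much of each array is used. */
struct u32_test {
	unsigned int location[MAX_ITEMS + 1];
	unsigned int value[MAX_ITEMS + 1];
	unsigned char nnums;
	unsigned char nvalues;
};

/* Hypothetical rule: an array of tests plus a user-supplied count. */
struct u32_rule {
	struct u32_test tests[MAX_ITEMS + 1];
	unsigned char ntests;
};

/* Return 0 if every count fits its array, -EINVAL otherwise. */
static int u32_rule_check(const struct u32_rule *rule)
{
	/* Validate the outer count before using it as a loop bound. */
	if (rule->ntests > ARRAY_LEN(rule->tests))
		return -EINVAL;

	for (size_t i = 0; i < rule->ntests; i++) {
		const struct u32_test *t = &rule->tests[i];

		if (t->nnums > ARRAY_LEN(t->location) ||
		    t->nvalues > ARRAY_LEN(t->value))
			return -EINVAL;
	}
	return 0;
}
```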
Fixes: 1b50b8a371 ("[NETFILTER]: Add u32 match")
Cc: stable@vger.kernel.org
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 050d91c03b upstream.
Without the IP_SET_HASH_WITH_NET0 macro, ip_set_hash_netportnet uses the
wrong `CIDR_POS(c)` when calculating array offsets, which can cause an
integer underflow and, in turn, a slab out-of-bounds access.
This patch adds back the IP_SET_HASH_WITH_NET0 macro to
ip_set_hash_netportnet to address the issue.
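The underflow can be modelled in a few lines of user-space C. This is a
simplified model and not the ipset macros: slot_index(), slot_in_bounds(),
HOST_BITS and the lowest_cidr parameter are hypothetical, with lowest_cidr
standing in for whether the variant that accepts CIDR 0 is compiled in. If
the offset calculation assumes prefixes start at 1 but a prefix of 0 is
accepted, the unsigned index wraps around instead of going negative.

```c
#include <assert.h>
#include <stddef.h>

#define HOST_BITS 32	/* prefix lengths run up to this value (IPv4) */

/*
 * Map a prefix length to an array slot. lowest_cidr is the smallest
 * prefix the offset calculation was built for (hypothetical parameter).
 * The result is narrowed to an unsigned 8-bit value, so 0 - 1 wraps
 * to 255.
 */
static size_t slot_index(unsigned char cidr, unsigned char lowest_cidr)
{
	return (size_t)(unsigned char)(cidr - lowest_cidr);
}

/* Check the slot against an array sized for lowest_cidr..HOST_BITS. */
static int slot_in_bounds(unsigned char cidr, unsigned char lowest_cidr)
{
	size_t nslots = (size_t)HOST_BITS - lowest_cidr + 1;

	return slot_index(cidr, lowest_cidr) < nslots;
}
```

With lowest_cidr set to 1, a prefix of 0 yields slot 255, far outside a
32-entry array; with lowest_cidr set to 0 the same prefix maps to slot 0.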
Fixes: 886503f34d ("netfilter: ipset: actually allow allowable CIDR 0 in hash:net,port,net")
Suggested-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Kyle Zeng <zengyhkyle@gmail.com>
Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>