twx-linux/include/uapi/linux
Daniel Borkmann f318903c0b bpf: Add netns cookie and enable it for bpf cgroup hooks
In Cilium we're mainly using BPF cgroup hooks today in order to implement
kube-proxy free Kubernetes service translation for ClusterIP, NodePort (*),
ExternalIP, and LoadBalancer as well as HostPort mapping [0] for all traffic
between Cilium managed nodes. While this works in its current shape and avoids
packet-level NAT for inter Cilium managed node traffic, there is one major
limitation we're facing today, that is, lack of netns awareness.

In Kubernetes, the concept of Pods (which hold one or multiple containers)
has been built around network namespaces, so while we can use the global scope
of attaching to root BPF cgroup hooks also to our advantage (e.g. for exposing
NodePort ports on loopback addresses), we also have the need to differentiate
between the initial network namespace and non-initial ones. For example, ExternalIP
services mandate that non-local service IPs are not to be translated from the
host (initial) network namespace. Right now, we have an ugly
work-around in place where non-local service IPs for ExternalIP services are
not translated in the BPF hooks for connect() and friends but instead via less
efficient packet-level NAT on the veth tc ingress hook for Pod traffic.

On top of determining whether we're in the initial or a non-initial network
namespace, we also need a socket-cookie-like mechanism scoped to network
namespaces. Socket cookies have the nice property that they can be combined as part
of the key structure e.g. for BPF LRU maps without having to worry that the
cookie could be recycled. We are planning to use this for our sessionAffinity
implementation for services. Therefore, add a new bpf_get_netns_cookie() helper
which would resolve both use cases at once: bpf_get_netns_cookie(NULL) would
provide the cookie for the initial network namespace while passing the context
instead of NULL would provide the cookie from the application's network namespace.
We're using a hole, so no size increase; the assignment happens only once.
This allows for a comparison against the initial namespace as well as regular
cookie usage, as we have today with socket cookies. We could later enable
this helper for other program types as the need arises.
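
The two use cases described above can be illustrated with a small userspace
model. This is only a sketch, not kernel or Cilium code: struct netns,
struct sock_ctx, model_get_netns_cookie(), should_translate_external_ip() and
struct affinity_key are hypothetical names made up for illustration, and the
single-threaded counter merely stands in for whatever the kernel does to hand
out cookies that are assigned once and never recycled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a network namespace; cookie 0 means "unassigned". */
struct netns {
	uint64_t cookie;
};

/* Hypothetical stand-in for the context a cgroup hook would receive. */
struct sock_ctx {
	struct netns *ns;
};

static uint64_t cookie_gen;      /* only ever incremented, so values never repeat */
static struct netns init_netns;  /* models the initial (host) namespace */

/* Assign the cookie on first use only; later calls return the same value. */
static uint64_t netns_cookie(struct netns *ns)
{
	assert(ns != NULL);
	if (ns->cookie == 0)
		ns->cookie = ++cookie_gen;
	return ns->cookie;
}

/* Models the described semantics: NULL selects the initial namespace,
 * a context selects the namespace of the calling application. */
static uint64_t model_get_netns_cookie(const struct sock_ctx *ctx)
{
	return netns_cookie(ctx ? ctx->ns : &init_netns);
}

/* Use case 1: compare against the initial namespace's cookie. */
static bool in_initial_netns(const struct sock_ctx *ctx)
{
	return model_get_netns_cookie(ctx) == model_get_netns_cookie(NULL);
}

/* Hypothetical ExternalIP policy: non-local service IPs are not translated
 * when the caller sits in the host (initial) namespace. */
static bool should_translate_external_ip(const struct sock_ctx *ctx,
					 bool service_ip_is_local)
{
	return service_ip_is_local || !in_initial_netns(ctx);
}

/* Use case 2: the cookie as part of a map key, e.g. for session affinity.
 * The explicit pad field keeps the key free of uninitialized padding. */
struct affinity_key {
	uint64_t netns_cookie;
	uint32_t service_ip;
	uint16_t service_port;
	uint16_t pad;
};

static struct affinity_key make_affinity_key(const struct sock_ctx *ctx,
					     uint32_t service_ip,
					     uint16_t service_port)
{
	struct affinity_key key = {
		.netns_cookie = model_get_netns_cookie(ctx),
		.service_ip   = service_ip,
		.service_port = service_port,
		.pad          = 0,
	};
	return key;
}
```

In an actual BPF program the comparison would presumably be written against
bpf_get_netns_cookie(ctx) and bpf_get_netns_cookie(NULL) directly; the model
only mirrors the semantics stated in this commit message.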

  (*) Both externalTrafficPolicy={Local|Cluster} types
  [0] https://github.com/cilium/cilium/blob/master/bpf/bpf_sock.c

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/c47d2346982693a9cf9da0e12690453aded4c788.1585323121.git.daniel@iogearbox.net
2020-03-27 19:40:38 -07:00
android
byteorder
caif
can can: don't use deprecated license identifiers 2019-11-05 12:44:34 +01:00
cifs
dvb
genwqe
hdlc wan/hdlc_x25: make lapb params configurable 2020-01-21 11:41:36 +01:00
hsi
iio
isdn isdn/capi: check message length in capi_write() 2019-09-07 17:44:25 +02:00
mmc
netfilter netfilter: conntrack: allow insertion of clashing entries 2020-02-17 10:55:14 +01:00
netfilter_arp net, uapi: fix -Wpointer-arith warnings 2019-10-04 14:25:17 -07:00
netfilter_bridge net, uapi: fix -Wpointer-arith warnings 2019-10-04 14:25:17 -07:00
netfilter_ipv4 net, uapi: fix -Wpointer-arith warnings 2019-10-04 14:25:17 -07:00
netfilter_ipv6 net, uapi: fix -Wpointer-arith warnings 2019-10-04 14:25:17 -07:00
nfsd nfsd: add support for upcall version 2 2019-09-10 09:26:33 -04:00
raid md: add feature flag MD_FEATURE_RAID0_LAYOUT 2019-09-13 13:10:06 -07:00
sched
spi
sunrpc
tc_act net: sched: add erspan option support to act_tunnel_key 2019-11-21 11:44:06 -08:00
tc_ematch
usb usb: charger: assign specific number for enum value 2020-02-10 11:08:30 -08:00
wimax
a.out.h
acct.h acct: stop using get_seconds() 2019-12-18 18:07:31 +01:00
adb.h
adfs_fs.h
affs_hardblocks.h
agpgart.h
aio_abi.h
am437x-vpfe.h
apm_bios.h
arcfb.h
arm_sdei.h
aspeed-lpc-ctrl.h
aspeed-p2a-ctrl.h
atalk.h
atm_eni.h
atm_he.h
atm_idt77105.h
atm_nicstar.h
atm_tcp.h
atm_zatm.h
atm.h
atmapi.h
atmarp.h
atmbr2684.h
atmclip.h
atmdev.h
atmioc.h
atmlec.h
atmmpc.h
atmppp.h
atmsap.h
atmsvc.h
audit.h bpf: Emit audit messages upon successful prog load and unload 2019-12-11 17:41:09 +01:00
auto_dev-ioctl.h
auto_fs4.h
auto_fs.h
auxvec.h
ax25.h
batadv_packet.h batman-adv: Update copyright years for 2020 2020-01-01 00:00:33 +01:00
batman_adv.h batman-adv: Update copyright years for 2020 2020-01-01 00:00:33 +01:00
baycom.h
bcache.h bcache: use read_cache_page_gfp to read the superblock 2020-01-23 11:40:01 -07:00
bcm933xx_hcs.h
bfs_fs.h
binfmts.h
blkpg.h
blktrace_api.h
blkzoned.h block: add zone open, close and finish ioctl support 2019-11-07 06:31:50 -07:00
bpf_common.h
bpf_perf_event.h
bpf.h bpf: Add netns cookie and enable it for bpf cgroup hooks 2020-03-27 19:40:38 -07:00
bpfilter.h
bpqether.h
bsg.h
bt-bmc.h
btf.h bpf: Introduce function-by-function verification 2020-01-10 17:20:07 +01:00
btrfs_tree.h btrfs: add support for 4-copy replication (raid1c4) 2019-11-18 17:51:49 +01:00
btrfs.h btrfs: add incompat for raid1 with 3, 4 copies 2019-11-18 17:51:49 +01:00
can.h can: don't use deprecated license identifiers 2019-11-05 12:44:34 +01:00
capability.h prctl: PR_{G,S}ET_IO_FLUSHER to support controlling memory reclaim 2020-01-28 10:09:51 +01:00
capi.h
cciss_defs.h
cciss_ioctl.h
cdrom.h
cec-funcs.h media: cec-funcs.h: use new CEC_OP_UI_CMD defines 2019-10-07 07:55:17 -03:00
cec.h media: cec: expose the new connector info API 2019-10-01 17:19:41 -03:00
cgroupstats.h
chio.h scsi: ch: add include guard to chio.h 2019-10-09 22:31:14 -04:00
cm4000_cs.h
cn_proc.h
coda.h
coff.h linux/coff.h: add include guard 2019-09-25 17:51:39 -07:00
connector.h
const.h
coresight-stm.h
cramfs_fs.h
cryptouser.h
cuda.h
cyclades.h y2038: uapi: change __kernel_time_t to __kernel_old_time_t 2019-11-15 14:38:29 +01:00
cycx_cfm.h
dcbnl.h net: Fix misspellings of "configure" and "configuration" 2019-10-28 13:41:01 -07:00
dccp.h
devlink.h devlink: Introduce devlink port flavour virtual 2020-03-03 15:40:40 -08:00
dlm_device.h
dlm_netlink.h
dlm_plock.h
dlm.h
dlmconstants.h
dm-ioctl.h dm: bump version of core and various targets 2020-03-03 11:10:21 -05:00
dm-log-userspace.h
dma-buf.h
dma-heap.h dma-buf: heaps: Use _IOCTL_ for userspace IOCTL identifier 2019-12-17 21:37:40 +05:30
dn.h
dns_resolver.h
dqblk_xfs.h
edd.h
efs_fs_sb.h
elf-em.h
elf-fdpic.h
elf.h
elfcore.h y2038: elfcore: Use __kernel_old_timeval for process times 2019-11-15 14:38:29 +01:00
errno.h
errqueue.h y2038: socket: remove timespec reference in timestamping 2019-11-15 14:38:29 +01:00
erspan.h
ethtool_netlink.h ethtool: add CHANNELS_NTF notification 2020-03-12 15:32:33 -07:00
ethtool.h ethtool: Add support for low latency RS FEC 2020-02-18 19:17:31 -08:00
eventpoll.h
fadvise.h
falloc.h
fanotify.h
fb.h
fcntl.h open: introduce openat2(2) syscall 2020-01-18 09:19:18 -05:00
fd.h
fdreg.h
fib_rules.h
fiemap.h
filter.h
firewire-cdev.h
firewire-constants.h
fou.h
fpga-dfl.h
fs.h f2fs-for-5.4-rc1 2019-09-21 14:26:33 -07:00
fscrypt.h fscrypt: include <linux/ioctl.h> in UAPI header 2019-12-31 10:33:51 -06:00
fsi.h
fsl_hypervisor.h
fsmap.h
fsverity.h fs-verity: add SHA-512 support 2019-08-12 19:33:50 -07:00
fuse.h fuse: Add changelog entries for protocols 7.1 - 7.8 2019-10-23 14:26:37 +02:00
futex.h
gameport.h
gen_stats.h net_sched: add TCA_STATS_PKT64 attribute 2019-11-05 18:20:55 -08:00
genetlink.h
gfs2_ondisk.h
gpio.h gpio: add new SET_CONFIG ioctl() to gpio chardev 2019-11-12 16:30:31 +01:00
gsmmux.h tty: n_gsm: add ioctl to map serial device to mux'ed tty 2019-09-04 12:43:54 +02:00
gtp.h
hash_info.h
hdlc.h
hdlcdrv.h
hdreg.h
hid.h
hiddev.h
hidraw.h HID: hidraw: add support uniq ioctl 2019-12-11 15:31:52 +01:00
hpet.h
hsr_netlink.h
hw_breakpoint.h
hyperv.h
i2c-dev.h
i2c.h
i2o-dev.h
i8k.h
icmp.h
icmpv6.h
idxd.h dmaengine: idxd: Init and probe for Intel data accelerators 2020-01-24 11:18:45 +05:30
if_addr.h
if_addrlabel.h
if_alg.h
if_arcnet.h arcnet: Replace zero-length array with flexible-array member 2020-02-29 21:52:20 -08:00
if_arp.h
if_bonding.h bonding: rename AD_STATE_* to LACP_STATE_* 2019-12-26 13:09:37 -08:00
if_bridge.h net: bridge: vlan: add per-vlan state 2020-01-24 12:58:14 +01:00
if_cablemodem.h
if_eql.h
if_ether.h
if_fc.h
if_fddi.h
if_frad.h
if_hippi.h
if_infiniband.h
if_link.h net: Special handling for IP & MPLS. 2020-02-24 13:31:42 -08:00
if_ltalk.h
if_macsec.h macsec: Netlink support of XPN cipher suites (IEEE 802.1AEbw) 2020-03-16 01:42:31 -07:00
if_packet.h
if_phonet.h
if_plip.h
if_ppp.h
if_pppol2tp.h
if_pppox.h
if_slip.h
if_team.h
if_tun.h
if_tunnel.h
if_vlan.h
if_x25.h
if_xdp.h xsk: add support to allow unaligned chunk placement 2019-08-31 01:08:26 +02:00
if.h wan/hdlc_x25: make lapb params configurable 2020-01-21 11:41:36 +01:00
ife.h
igmp.h
ila.h
in6.h
in_route.h
in.h seg6: fix SRv6 L2 tunnels to use IANA-assigned protocol number 2020-03-11 23:49:30 -07:00
inet_diag.h bpf: inet_diag: Dump bpf_sk_storages in inet_diag_dump() 2020-02-27 18:50:19 -08:00
inotify.h
input-event-codes.h Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2019-12-07 18:33:01 -08:00
input.h Input: input_event - fix struct padding on sparc64 2019-12-13 15:00:36 -08:00
io_uring.h io_uring: add support for epoll_ctl(2) 2020-01-29 15:46:09 -07:00
ioctl.h
iommu.h iommu: Introduce guest PASID bind function 2019-10-15 13:34:43 +02:00
ip6_tunnel.h
ip_vs.h
ip.h
ipc.h
ipmi_bmc.h
ipmi_msgdefs.h
ipmi.h
ipsec.h
ipv6_route.h
ipv6.h
ipx.h
irqnr.h
iso_fs.h
isst_if.h
ivtv.h
ivtvfb.h
jffs2.h jffs2: Remove C++ style comments from uapi header 2019-08-22 17:24:51 +02:00
joystick.h
kcm.h
kcmp.h
kcov.h kcov: fix struct layout for kcov_remote_arg 2020-01-04 13:55:09 -08:00
kd.h
kdev_t.h
kernel-page-flags.h
kernel.h
kernelcapi.h
kexec.h parisc: add kexec syscall support 2019-09-08 15:37:04 +02:00
keyboard.h
keyctl.h
kfd_ioctl.h
kvm_para.h
kvm.h KVM: s390: Add new reset vcpu API 2020-01-31 12:50:04 +01:00
l2tp.h
libc-compat.h
lightnvm.h
limits.h
lirc.h
llc.h
loop.h
lp.h
lwtunnel.h lwtunnel: add options setting and dumping for erspan 2019-11-06 21:14:22 -08:00
magic.h fs: New zonefs file system 2020-02-07 14:39:38 +09:00
major.h
map_to_7segment.h
matroxfb.h
max2175.h
mdio.h net: phy: add EEE-related constants 2019-08-19 13:04:45 -07:00
media-bus-format.h
media.h
mei.h
membarrier.h
memfd.h
mempolicy.h
meye.h
mic_common.h
mic_ioctl.h
mii.h mii: Add helpers for parsing SGMII auto-negotiation 2020-01-05 23:22:32 -08:00
minix_fs.h
mman.h
mmtimer.h
module.h
mount.h
mpls_iptunnel.h
mpls.h
mqueue.h
mroute6.h
mroute.h
msdos_fs.h
msg.h y2038: uapi: change __kernel_time_t to __kernel_old_time_t 2019-11-15 14:38:29 +01:00
mtio.h
n_r3964.h
nbd-netlink.h
nbd.h
ncsi.h
ndctl.h
neighbour.h
net_dropmon.h drop_monitor: Replace zero-length array with flexible-array member 2020-03-02 11:16:28 -08:00
net_namespace.h
net_tstamp.h net: Introduce peer to peer one step PTP time stamping. 2019-12-25 19:51:34 -08:00
net.h
netconf.h
netdevice.h
netfilter_arp.h
netfilter_bridge.h
netfilter_decnet.h
netfilter_ipv4.h
netfilter_ipv6.h
netfilter.h
netlink_diag.h
netlink.h
netrom.h
nexthop.h
nfc.h
nfs2.h
nfs3.h
nfs4_mount.h
nfs4.h
nfs_fs.h
nfs_idmap.h
nfs_mount.h
nfs.h
nfsacl.h
nilfs2_api.h
nilfs2_ondisk.h
nl80211.h nl80211: Add support to configure TID specific RTSCTS configuration 2020-02-24 13:56:57 +01:00
nsfs.h
nubus.h
nvme_ioctl.h nvme: change nvme_passthru_cmd64 to explicitly mark rsvd 2019-11-06 06:17:38 +09:00
nvram.h
omap3isp.h
omapfb.h
oom.h
openat2.h open: introduce openat2(2) syscall 2020-01-18 09:19:18 -05:00
openvswitch.h openvswitch: add TTL decrement action 2020-02-16 19:34:44 -08:00
packet_diag.h
param.h
parport.h
patchkey.h
pci_regs.h PCI: dwc: intel: PCIe RC controller driver 2020-01-09 11:57:18 +00:00
pci.h
pcitest.h
perf_event.h perf/aux: Allow using AUX data in perf samples 2019-11-13 11:06:14 +01:00
personality.h
pfkeyv2.h
pg.h block: pg: add header include guard 2019-10-02 20:32:27 -06:00
phantom.h
phonet.h
pkt_cls.h sched: act: allow user to specify type of HW stats for a filter 2020-03-08 21:07:48 -07:00
pkt_sched.h net: sched: RED: Introduce an ECN nodrop mode 2020-03-14 21:03:46 -07:00
pktcdvd.h
pmu.h
poll.h
posix_acl_xattr.h
posix_acl.h
posix_types.h
ppdev.h
ppp_defs.h y2038: syscall implementation cleanups 2019-12-01 14:00:59 -08:00
ppp-comp.h
ppp-ioctl.h compat_ioctl: handle PPPIOCGIDLE for 64-bit time_t 2019-10-23 17:23:47 +02:00
pps.h
pr.h
prctl.h prctl: PR_{G,S}ET_IO_FLUSHER to support controlling memory reclaim 2020-01-28 10:09:51 +01:00
psample.h
psci.h
psp-sev.h crypto: ccp - Retry SEV INIT command in case of integrity check failure. 2019-10-26 02:09:58 +11:00
ptp_clock.h ptp: Introduce strict checking of external time stamp options. 2019-11-15 12:48:32 -08:00
ptrace.h
qemu_fw_cfg.h
qnx4_fs.h
qnxtypes.h
qrtr.h
quota.h
radeonfb.h
random.h random: ignore GRND_RANDOM in getentropy(2) 2020-01-07 16:07:01 -05:00
raw.h
rds.h net: rds: add service level support in rds-info 2019-08-24 16:55:25 -07:00
reboot.h
reiserfs_fs.h
reiserfs_xattr.h
resource.h y2038: rusage: use __kernel_old_timeval 2019-11-15 14:38:29 +01:00
rfkill.h
rio_cm_cdev.h
rio_mport_cdev.h
romfs_fs.h
rose.h
route.h
rpmsg.h
rseq.h
rtc.h rtc: define RTC_VL_READ values 2019-12-18 10:37:18 +01:00
rtnetlink.h net: bridge: vlan: add rtnetlink group and notify support 2020-01-15 13:48:18 +01:00
rxrpc.h
scc.h linux/scc.h: make uapi linux/scc.h self-contained 2019-12-04 19:44:12 -08:00
sched.h ns: Introduce Time Namespace 2020-01-14 12:20:48 +01:00
scif_ioctl.h
screen_info.h
sctp.h sctp: add SCTP_PEER_ADDR_THLDS_V2 sockopt 2019-11-08 14:18:32 -08:00
sdla.h
seccomp.h seccomp: rework define for SECCOMP_USER_NOTIF_FLAG_CONTINUE 2019-10-28 12:29:46 -07:00
securebits.h
sed-opal.h block: sed-opal: Add support to read/write opal tables generically 2019-11-04 07:11:31 -07:00
seg6_genl.h
seg6_hmac.h
seg6_iptunnel.h
seg6_local.h
seg6.h
selinux_netlink.h
sem.h y2038: uapi: change __kernel_time_t to __kernel_old_time_t 2019-11-15 14:38:29 +01:00
serial_core.h serial: fsl_linflexuart: Be consistent with the name 2019-10-16 06:11:24 -07:00
serial_reg.h
serial.h
serio.h
shm.h y2038: uapi: change __kernel_time_t to __kernel_old_time_t 2019-11-15 14:38:29 +01:00
signal.h
signalfd.h
smc_diag.h
smc.h
smiapp.h
snmp.h tcp: export count for rehash attempts 2020-01-26 15:28:47 +01:00
sock_diag.h bpf: INET_DIAG support in bpf_sk_storage 2020-02-27 18:50:19 -08:00
socket.h
sockios.h
sonet.h
sonypi.h
sound.h
soundcard.h
stat.h statx: define STATX_ATTR_VERITY 2019-11-13 12:15:34 -08:00
stddef.h
stm.h
string.h
suspend_ioctls.h
swab.h include/uapi/linux/swab.h: fix userspace breakage, use __BITS_PER_LONG for swap 2020-02-21 11:22:15 -08:00
switchtec_ioctl.h PCI/switchtec: Add Gen4 flash information interface support 2020-01-15 11:00:39 -06:00
sync_file.h
synclink.h
sysctl.h mm: fix comments related to node reclaim 2020-01-31 10:30:39 -08:00
sysinfo.h
target_core_user.h
taskstats.h tsacct: add 64-bit btime field 2019-12-18 18:07:31 +01:00
tcp_metrics.h
tcp.h tcp: add bytes not sent to SCM_TIMESTAMPING_OPT_STATS 2020-03-09 17:56:33 -07:00
tee.h tee: add AMD-TEE driver 2020-01-04 13:49:51 +08:00
termios.h
thermal.h
time_types.h y2038: rename itimerval to __kernel_old_itimerval 2019-12-18 18:07:33 +01:00
time.h y2038: hide timeval/timespec/itimerval/itimerspec types 2020-02-21 11:22:15 -08:00
timerfd.h
times.h
timex.h y2038: sparc: remove use of struct timex 2019-12-18 18:07:33 +01:00
tiocl.h
tipc_config.h net, uapi: fix -Wpointer-arith warnings 2019-10-04 14:25:17 -07:00
tipc_netlink.h tipc: make legacy address flag readable over netlink 2019-12-20 21:18:42 -08:00
tipc_sockets_diag.h
tipc.h tipc: add new AEAD key structure for user API 2019-11-08 14:01:59 -08:00
tls.h net: tls: export protocol version, cipher, tx_conf/rx_conf to socket diag 2019-08-31 23:44:28 -07:00
toshiba.h
tty_flags.h
tty.h
types.h
udf_fs_i.h
udmabuf.h
udp.h xfrm: add espintcp (RFC 8229) 2019-12-09 09:59:07 +01:00
uhid.h
uinput.h
uio.h
uleds.h
ultrasound.h
un.h
unistd.h
unix_diag.h
usbdevice_fs.h USB: usbfs: Add a capability flag for runtime suspend 2019-08-14 16:52:13 +02:00
usbip.h
userfaultfd.h
userio.h
utime.h y2038: uapi: change __kernel_time_t to __kernel_old_time_t 2019-11-15 14:38:29 +01:00
utsname.h
uuid.h
uvcvideo.h
v4l2-common.h
v4l2-controls.h media: add V4L2_CID_UNIT_CELL_SIZE control 2019-10-10 11:37:26 -03:00
v4l2-dv-timings.h
v4l2-mediabus.h
v4l2-subdev.h
vbox_err.h
vbox_vmmdev_types.h
vboxguest.h
veth.h
vfio_ccw.h
vfio.h Merge branches 'v5.4/vfio/alexey-tce-memory-free-v1', 'v5.4/vfio/connie-re-arrange-v2', 'v5.4/vfio/hexin-pci-reset-v3', 'v5.4/vfio/parav-mtty-uuid-v2' and 'v5.4/vfio/shameer-iova-list-v8' into v5.4/vfio/next 2019-08-23 11:26:24 -06:00
vhost_types.h
vhost.h
videodev2.h media: v4l2-core: fix v4l2_buffer handling for time64 ABI 2020-01-03 15:50:21 +01:00
virtio_9p.h
virtio_balloon.h
virtio_blk.h
virtio_config.h
virtio_console.h
virtio_crypto.h
virtio_fs.h virtio-fs: add virtiofs filesystem 2019-09-18 20:17:50 +02:00
virtio_gpu.h
virtio_ids.h virtio-fs: add virtiofs filesystem 2019-09-18 20:17:50 +02:00
virtio_input.h
virtio_iommu.h
virtio_mmio.h
virtio_net.h
virtio_pci.h
virtio_pmem.h
virtio_ring.h net, uapi: fix -Wpointer-arith warnings 2019-10-04 14:25:17 -07:00
virtio_rng.h
virtio_scsi.h
virtio_types.h
virtio_vsock.h
vm_sockets_diag.h
vm_sockets.h vsock: add VMADDR_CID_LOCAL definition 2019-12-11 15:01:23 -08:00
vmcore.h
vsockmon.h
vt.h
vtpm_proxy.h
wait.h
watchdog.h
wimax.h
wireguard.h wireguard: global: fix spelling mistakes in comments 2019-12-16 19:22:22 -08:00
wireless.h wireless: Use offsetof instead of custom macro. 2019-12-13 10:45:35 +01:00
wmi.h
x25.h
xattr.h
xdp_diag.h
xfrm.h
xilinx-v4l2-controls.h
zorro_ids.h
zorro.h