When hop_num is more than three, it need to return -EINVAL. This patch
fixes it.
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When create loop qp fail, it will return the correct result when
modify_qp() fails.
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When init cmq fail in initial flow of RoCE, it should return the errno of
cmq_init function, not of the rest call.
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This patch adds 'nr_hugepages_mempolicy' parameter which is currently missing
in 'vm.txt' file. It also contains a short description of 'nr_hugepages_mempolicy'
parameter.
Signed-off-by: Prashant Dhamdhere <pdhamdhe@redhat.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Use designated function mlx5e_dma_get() to get
the mlx5e_sq_dma object to be pushed into fifo.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Pointers in DB are static, move them to read-only area so they
do not share a cacheline with fields modified in datapath.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
A loaded XDP program might write to the xdp_frame data area,
prefetchw() it to avoid a potential cache miss.
Performance tests:
ConnectX-5, XDP_TX packet rate, single ring.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Before: 13,172,976 pps
After: 13,456,248 pps
2% gain.
Fixes: 22f453988194 ("net/mlx5e: Support XDP over Striding RQ")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add implementation for the ndo_xdp_xmit callback.
Dedicate a new set of XDP-SQ instances to satisfy the XDP_REDIRECT
requests. These instances are totally separated from the existing
XDP-SQ objects that satisfy local XDP_TX actions.
Performance tests:
xdp_redirect_map from ConnectX-5 to ConnectX-5.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Packet-rate of 64B packets.
Single queue: 7 Mpps.
Multi queue: 55 Mpps.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In the downstream patch that adds support to XDP_REDIRECT-out,
the XDP xmit frame function doesn't share the same run context as
the NAPI that polls the XDP-SQ completion queue.
Hence, need to re-order the XDP-SQ fields to avoid cacheline
false-sharing.
Take redirect_flush and doorbell out of DB, into separated
cachelines.
Add a cacheline breaker within the stats struct.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Separate the XDP counters into two sets:
(1) One set reside in the RQ stats, and they monitor XDP stats
in the RQ side.
(2) Another set is per XDP-SQ, and they monitor XDP stats that
are related to XDP transmit flow.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Convert the XDP xmit functions to use the generic xdp_frame API
in XDP_TX flow.
Same functions will be used later in this series to transmit
the XDP redirect-out packets as well.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add per-ring and total stats for received packets that
goes into XDP redirection.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Take XDP code out of the general EN header and RX file into
new XDP files.
Currently, XDP-SQ resides only within an RQ and used from a
single flow (XDP_TX) triggered upon RX completions.
In a downstream patch, additional type of XDP-SQ instances will be
presented and used for the XDP_REDIRECT flow, totally unrelated to
the RX context.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add checks in control path upon an MTU change or an XDP program set,
to prevent reaching cases where large MTU and XDP are set simultaneously.
This is to make sure we allow XDP only with the linear RX memory scheme,
i.e. a received packet is not scattered to different pages.
Change mlx5e_rx_get_linear_frag_sz() accordingly, so that we make sure
the XDP configuration can really be set, instead of assuming that it is.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Dedicate a function to all checks done when setting an XDP program.
Take indications from priv instead of netdev features.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
No need to expose the MPWQE free function to control path.
The dealloc function already exposed, use it.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
This patch includes the kernel-hacking translation in Italian (both
hacking.rst and locking.rst).
It adds also the anchors for the english kernel-hacking documents.
Signed-off-by: Federico Vaga <federico.vaga@vaga.pv.it>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Two jump tags were misspelled, leading to non-working cross-reference
links.
Reported-by: Federico Vaga <federico.vaga@vaga.pv.it>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
When a CX5 device is configured in dual-port RoCE mode, after creating
many VFs against port 1, creating the same number of VFs against port 2
will flood kernel/syslog with something like
"mlx5_*:mlx5_ib_bind_slave_port:4266:(pid 5269): port 2 already
affiliated."
So basically, when traversing mlx5_ib_dev_list, mlx5_ib_add_slave_port()
repeatedly attempts to bind the new mpi structure to every device on the
list until it finds an unbound device.
Change the log level from warn to dbg to avoid log flooding as the warning
should be harmless.
Signed-off-by: Qing Huang <qing.huang@oracle.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When the underlying device is a 4 KiB logical block size device with a
protection interval exponent of 0, i.e. 4096 bytes data + 8 bytes PI, the
driver miscalculates the pi_bytes{out,in} by a factor of 8x (64 bytes).
This leads to errors on all reads and writes on 4 KiB logical block size
devices when CONFIG_BLK_DEV_INTEGRITY is enabled and the
VIRTIO_SCSI_F_T10_PI feature bit has been negotiated.
Fixes: e6dc783a38ec0 ("virtio-scsi: Enable DIF/DIX modes in SCSI host LLD")
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Edwards <gedwards@ddn.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This allows bio_integrity_bytes() to be called from drivers instead of
open coding it.
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Edwards <gedwards@ddn.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The name of the directory for per-cpu function statistics
is trace_stat, not trace_stats.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Jeff Kirsher says:
====================
10GbE Intel Wired LAN Driver Updates 2018-07-26
This series contains updates to ixgbe and igb.
Tony fixes ixgbe to add checks to ensure jumbo frames or LRO get enabled
after an XDP program is loaded.
Shannon Nelson adds the missing security configuration registers to the
ixgbe register dump, which will help in debugging.
Christian Grönke fixes an issue in igb that occurs on SGMII based SPF
mdoules, by reverting changes from 2 previous patches. The issue was
that initialization would fail on the fore mentioned modules because the
driver would try to reset the PHY before obtaining the PHY address of
the SGMII attached PHY.
Venkatesh Srinivas replaces wmb() with dma_wmb() for doorbell writes,
which avoids SFENCEs before the doorbell writes.
Alex cleans up and refactors ixgbe Tx/Rx shutdown to reduce time needed
to stop the device. The code refactor allows us to take the completion
time into account when disabling queues, so that on some platforms with
higher completion times, would not result in receive queues disabled
messages.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Once user manually deletes the chain using "chain del", the chain cannot
be marked as explicitly created anymore.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Fixes: 32a4f5ecd738 ("net: sched: introduce chain object to uapi")
Signed-off-by: David S. Miller <davem@davemloft.net>
The zerocopy path ultimately calls iov_iter_get_pages, which defines the
step function for ITER_KVECs as simply, return -EFAULT. Taking the
non-zerocopy path for ITER_KVECs avoids the unnecessary fallback.
See https://lore.kernel.org/lkml/20150401023311.GL29656@ZenIV.linux.org.uk/T/#u
for a discussion of why zerocopy for vmalloc data is not a good idea.
Discovered while testing NBD traffic encrypted with ktls.
Fixes: c46234ebb4d1 ("tls: RX path for ktls")
Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Code at line 1850 is unreachable. Fix this by removing the break
statement above it, so the code for case RTM_GETCHAIN can be
properly executed.
Addresses-Coverity-ID: 1472050 ("Structurally dead code")
Fixes: 32a4f5ecd738 ("net: sched: introduce chain object to uapi")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The tunnel reception hook is only used by l2tp_ppp for skipping PPP
framing bytes. This is a session specific operation, but once a PPP
session sets ->recv_payload_hook on its tunnel, all frames received by
the tunnel will enter pppol2tp_recv_payload_hook(), including those
targeted at Ethernet sessions (an L2TPv3 tunnel can multiplex PPP and
Ethernet sessions).
So this mechanism is wrong, and uselessly complex. Let's just move this
functionality to the pppol2tp rx handler and drop ->recv_payload_hook.
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
when tipc_own_id failed to obtain node identity,dev_put should
be call before return -EINVAL.
Fixes: 682cd3cf946b ("tipc: confgiure and apply UDP bearer MTU on running links")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removed checks against non-NULL before calling kfree_skb() and
crypto_free_aead(). These functions are safe to be called with NULL
as an argument.
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Acked-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix dev_change_tx_queue_len so it rolls back original value
upon a failure in dev_qdisc_change_tx_queue_len.
This is already done for notifirers' failures, share the code.
In case of failure in dev_qdisc_change_tx_queue_len, some tx queues
would still be of the new length, while they should be reverted.
Currently, the revert is not done, and is marked with a TODO label
in dev_qdisc_change_tx_queue_len, and should find some nice solution
to do it.
Yet it is still better to not apply the newly requested value.
Fixes: 48bfd55e7e41 ("net_sched: plug in qdisc ops change_tx_queue_len")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Reported-by: Ran Rozenstein <ranro@mellanox.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This will allow to install a child qdisc under cbs. The main use case
is to install ETF (Earliest TxTime First) qdisc under cbs, so there's
another level of control for time-sensitive traffic.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Return -ESTALE to force a lookup when the file has no more links
Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
execute_ok() will only check the mode bits if the object is not a
directory, so we don't need to revalidate the attributes in that case.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
When nfs_update_inode() sets NFS_INO_INVALID_ACCESS it is a sign that
we want to revalidate the access cache, not the inode attributes.
In fact we only want to revalidate here if we see that the mode bits
are invalid, so check for NFS_INO_INVALID_OTHER instead.
Reported-by: Olga Kornievskaia <aglo@umich.edu>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
If the writes are being rescheduled due to a pNFS error, then we really
want to immediately start a new flush. The O_DIRECT code already does
this, so we only need to worry about buffered writes.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
If the client is sending a layoutget, but the server issues a callback
to recall what it thinks may be an outstanding layout, then we may find
an uninitialised layout attached to the inode due to the layoutget.
In that case, it is appropriate to return NFS4ERR_NOMATCHING_LAYOUT
rather than NFS4ERR_DELAY, as the latter can end up deadlocking.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Even if the results of the permissions checks failed, we should parse
the results of the layout on open call so that we can return the
layout if required.
Note that we also want to ignore the sequence counter for whether or not
a layout recall occurred. If the recall pertained to our OPEN, then the
callback will know, and will attempt to wait for us to finih processing
anyway.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
If the old layout was recalled, and we returned NFS4ERR_NOMATCHINGLAYOUT
then we need to wait for all outstanding layoutget calls to complete
before we can send a new one.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
If a layout segment is carrying layoutstats or layout error information,
then we always want to return it rather than using a forgetful model.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
If a layout has been recalled, then we should fire off a layoutreturn as
soon as all the layout segments that match the recall have been retired.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
RFC5661 doesn't state directly that the client should update the layout
stateid if it returns NFS4ERR_NOMATCHING_LAYOUT in response to a recall,
however it does state that this error will "cleanly indicate completion"
on par with returning the layout. For this reason, we assume that the
client should update the layout stateid. The Linux pNFS server definitely
does expect this behaviour.
However, if the client replies NFS4ERR_DELAY, then it is stating that
the recall was not processed, so it would be very wrong to update the
layout stateid.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
If there are layout segments that are marked for return, then we need
to ensure that pnfs_mark_matching_lsegs_return() does not just
silently discard them, but it should tell the caller that there is a
layoutreturn scheduled.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
- add EINT support
- add gpio-ranges property to pinctrl
- add earlycon to rfb1 to find boot errros more easily
- fix uart clock
- add iommu and smi bindings
- mt6797:
- add support for the 96 board x20 development board
- fix cooling-cells of mt7622 and mt8173
-----BEGIN PGP SIGNATURE-----
iQJLBAABCAA1FiEEiUuSfQSYnG8EMsBltDliWyzx00MFAltU6hoXHG1hdHRoaWFz
LmJnZ0BnbWFpbC5jb20ACgkQtDliWyzx00MQTQ/+JI1N6YWXFKkl6NqGFlFd5Spj
oLdlnagqc45b368eEKj+wBtbidgryhLXdPB62dSCQGABt9Id0WA+5fbTvq7xkvw7
cU8pNV0SPn5+mw2NzQfCgRtHyFEn8u+pGCZkn9CwX7yjk9mSWWKpFANJYy3isqPD
hch8Pne5c1MtB1X9MVbrdo6IKuscdD8isXuGMEWMDyKq0VniOzC+uVa/uWDBdp++
C31OoJE6KQmJj3CJPTUM41DfiwXUfJro/Prfy0xHhoZ6qctJmkuHeCZUfbwBIGSQ
xGBaN6C0AmyPzxfc56C2lVOGrTEy5LU7CgMYnMIvDQjAhREKl0nQj/E/F5YDQP/z
BPDGDu+GhKi4xHNwf1ZchDLkiALOPPM7tKsorxcWNAxlKHV/0Gz737Twj7SaiWNn
GrRNDrfyG1L3Lk3aJsPQ/S7IzTSQWmhXCFPgQKX86qijl/FfVtAEg79Mv1ay0Ii1
8uV9KnDQqiaElZRFxfeOF7oCsK1I4ugcDA7TmsWzk6diApO6w+v9mX8tTcgYetpv
a80k7ymJEStfMSBr3p3mRs5b8dGC8940yik332YzPMLufJ6ocBjqpzgEUgx9S5kk
dkYPV80CrtEYE5VmvjJjcQSroR6RNxjuMvwnsBxj0sH/KuUMFPoh+9qyXzR68g5l
X/pJ6H5271ek+qPOIx0=
=+G4b
-----END PGP SIGNATURE-----
Merge tag 'v4.18-next-dts64' of https://git.kernel.org/pub/scm/linux/kernel/git/matthias.bgg/linux into next/dt
- mt7622:
- add EINT support
- add gpio-ranges property to pinctrl
- add earlycon to rfb1 to find boot errros more easily
- fix uart clock
- add iommu and smi bindings
- mt6797:
- add support for the 96 board x20 development board
- fix cooling-cells of mt7622 and mt8173
* tag 'v4.18-next-dts64' of https://git.kernel.org/pub/scm/linux/kernel/git/matthias.bgg/linux:
arm64: dts: Add Mediatek X20 Development Board support
dt-bindings: arm: mediatek: Document Mediatek X20 Development Board
dt-bindings: mediatek: Add binding for mt2712 IOMMU and SMI
arm64: dts: mt7622: update a clock property for UART0
arm64: dts: mt7622: add earlycon to mt7622-rfb1 board
arm64: dts: mt7622: use gpio-ranges to pinctrl device
arm64: dts: mediatek: Add missing cooling device properties for CPUs
arm64: dts: mt7622: add EINT support to pinctrl
Signed-off-by: Olof Johansson <olof@lixom.net>
Before this patch gfs2_rgrp_ondisk2lvb was called after every call
to gfs2_rgrp_out. This patch just calls it directly from within
gfs2_rgrp_out, and moves the function to be before it so we don't
need a function prototype.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>