History log of /drivers/infiniband/hw/cxgb3/iwch_cm.c
Revision Date Author Comments
e4514cbd972786af67dd6c442c072685387e22a2 26-May-2014 Dan Carpenter <dan.carpenter@oracle.com> RDMA/cxgb3: Fix information leak in send_abort()

The cpl_abort_req struct has several reserved members which need to be
cleared to avoid disclosing kernel information. I have added a memset()
so now it matches the cxgb4 version of this function.
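
The idea can be sketched in user-space C as follows. This is only an illustration: the struct layout and the names demo_abort_req and demo_fill_abort_req are hypothetical and do not mirror the real cpl_abort_req definition.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical message layout; the real cpl_abort_req has different fields. */
struct demo_abort_req {
    uint32_t opcode_tid;
    uint32_t rsvd0;      /* reserved, never written by the caller */
    uint8_t  rsvd1;      /* reserved */
    uint8_t  cmd;
    uint8_t  rsvd2[6];   /* reserved */
};

/* Zero the whole message first so reserved bytes never carry stale memory. */
static void demo_fill_abort_req(struct demo_abort_req *req,
                                uint32_t tid, uint8_t cmd)
{
    memset(req, 0, sizeof(*req));
    req->opcode_tid = tid;
    req->cmd = cmd;
}
```

Clearing the whole struct, rather than each reserved member, also covers any padding the compiler inserts.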

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
24d44a391f1b5d56e9c7a4fc1edd085687864ff9 04-Jul-2013 Steve Wise <swise@opengridcomputing.com> RDMA/cma: Add IPv6 support for iWARP

Modify the type of local_addr and remote_addr fields in struct
iw_cm_id from struct sockaddr_in to struct sockaddr_storage to hold
IPv6 and IPv4 addresses uniformly.

Change the references of local_addr and remote_addr in cxgb4, cxgb3,
nes and amso drivers to match this. However, to be able to actually run
traffic over IPv6, low-level drivers have to add code to support this.
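
A rough user-space sketch of what holding both families in one struct sockaddr_storage looks like. The helper names demo_set_v4 and demo_port are made up for illustration and are not part of the iw_cm API.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Store an IPv4 endpoint in the family-neutral container. */
static void demo_set_v4(struct sockaddr_storage *ss,
                        uint32_t addr_be, uint16_t port_be)
{
    struct sockaddr_in *sin = (struct sockaddr_in *)ss;

    memset(ss, 0, sizeof(*ss));
    sin->sin_family = AF_INET;
    sin->sin_addr.s_addr = addr_be;
    sin->sin_port = port_be;
}

/* Return the port in network byte order, or 0 for an unknown family. */
static uint16_t demo_port(const struct sockaddr_storage *ss)
{
    switch (ss->ss_family) {
    case AF_INET:
        return ((const struct sockaddr_in *)ss)->sin_port;
    case AF_INET6:
        return ((const struct sockaddr_in6 *)ss)->sin6_port;
    default:
        return 0;
    }
}
```

Code that previously accessed sockaddr_in fields directly has to check ss_family and cast, which is the kind of reference change described above.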

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>

[ Fix unused variable warnings when INFINIBAND_NES_DEBUG not set.
- Roland ]

Signed-off-by: Roland Dreier <roland@purestorage.com>
5107c2a3d117de8219a53e622c0f1f1bc3f7d1ae 03-Nov-2012 Julia Lawall <Julia.Lawall@lip6.fr> RDMA/cxgb3: use WARN

Use WARN rather than printk followed by WARN_ON(1), for conciseness.

A simplified version of the semantic patch that makes this transformation
is as follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression list es;
@@

-printk(
+WARN(1,
es);
-WARN_ON(1);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
142ad5db2b29a1c392e1b14934fae5d161d6c6e7 10-Aug-2012 Masanari Iida <standby24x7@gmail.com> IB: Fix typos in infiniband drivers

Correct spelling typos in comments in drivers/infiniband.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
534cb283efef9fdbd9f70f4615054d26aa444dd6 03-Jul-2012 David S. Miller <davem@davemloft.net> cxgb3: Convert t3_l2t_get() over to dst_neigh_lookup().

This means passing in a suitable destination address.

Signed-off-by: David S. Miller <davem@davemloft.net>
a4757123aeadf450b5b3c5f51f214660e20477f3 02-Dec-2011 David Miller <davem@davemloft.net> cxgb3: Rework t3_l2t_get to take a dst_entry instead of a neighbour.

This way we consolidate the RCU locking down into the place where it
actually matters, and also we can make the code handle
dst_get_neighbour_noref() returning NULL properly.

Signed-off-by: David S. Miller <davem@davemloft.net>
2721745501a26d0dc3b88c0d2f3aa11471891388 02-Dec-2011 David Miller <davem@davemloft.net> net: Rename dst_get_neighbour{, _raw} to dst_get_neighbour_noref{, _raw}.

To reflect the fact that a reference is not obtained to the
resulting neighbour entry.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Roland Dreier <roland@purestorage.com>
580da35a31f91a594f3090b7a2c39b85cb051a12 29-Nov-2011 Eric Dumazet <eric.dumazet@gmail.com> IB: Fix RCU lockdep splats

Commit f2c31e32b37 ("net: fix NULL dereferences in check_peer_redir()")
forgot to take care of infiniband uses of dst neighbours.

Many thanks to Marc Aurele who provided a nice bug report and feedback.

Reported-by: Marc Aurele La France <tsi@ualberta.ca>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
56da00fc92e6f227874bba36f127ffc8847ee1f8 25-Sep-2011 Kumar Sanghvi <kumaras@chelsio.com> RDMA/{amso1100,cxgb3}: Minimal MPAv2 support

As part of MPAv2 Enhanced RDMA Negotiation, pass max supported ird/ord
values upwards for the time being in iw_cxgb3 and amso1100.

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
e48f129c2f200dde8899f6ea5c6e7173674fc482 06-Sep-2011 Neil Horman <nhorman@tuxdriver.com> [SCSI] cxgb3i: convert cdev->l2opt to use rcu to prevent NULL dereference

This oops was reported recently:
d:mon> e
cpu 0xd: Vector: 300 (Data Access) at [c0000000fd4c7120]
pc: d00000000076f194: .t3_l2t_get+0x44/0x524 [cxgb3]
lr: d000000000b02108: .init_act_open+0x150/0x3d4 [cxgb3i]
sp: c0000000fd4c73a0
msr: 8000000000009032
dar: 0
dsisr: 40000000
current = 0xc0000000fd640d40
paca = 0xc00000000054ff80
pid = 5085, comm = iscsid
d:mon> t
[c0000000fd4c7450] d000000000b02108 .init_act_open+0x150/0x3d4 [cxgb3i]
[c0000000fd4c7500] d000000000e45378 .cxgbi_ep_connect+0x784/0x8e8 [libcxgbi]
[c0000000fd4c7650] d000000000db33f0 .iscsi_if_rx+0x71c/0xb18
[scsi_transport_iscsi2]
[c0000000fd4c7740] c000000000370c9c .netlink_data_ready+0x40/0xa4
[c0000000fd4c77c0] c00000000036f010 .netlink_sendskb+0x4c/0x9c
[c0000000fd4c7850] c000000000370c18 .netlink_sendmsg+0x358/0x39c
[c0000000fd4c7950] c00000000033be24 .sock_sendmsg+0x114/0x1b8
[c0000000fd4c7b50] c00000000033d208 .sys_sendmsg+0x218/0x2ac
[c0000000fd4c7d70] c00000000033f55c .sys_socketcall+0x228/0x27c
[c0000000fd4c7e30] c0000000000086a4 syscall_exit+0x0/0x40
--- Exception: c01 (System Call) at 00000080da560cfc

The root cause was an EEH error, which sent us down the offload_close path in
the cxgb3 driver, which in turn sets cdev->l2opt to NULL, without regard for
upper layer driver (like the cxgbi drivers) which might have execution contexts
in the middle of its use. The result is the oops above, when t3_l2t_get attempts
to dereference L2DATA(cdev)->nentries in arp_hash right after the EEH error handler sets it to NULL.

The fix is to prevent the setting of the NULL pointer until after there are no
further users of it. The t3cdev->l2opt pointer is now converted to be an rcu
pointer and the L2DATA macro is now called under the protection of the
rcu_read_lock(). When the EEH error path:
t3_adapter_error->offload_close->cxgb3_offload_deactivate
is executed, setting that l2opt pointer to NULL is now gated on an RCU
quiescence point, allowing L2DATA callers to safely check for a NULL
pointer without concern that the underlying data will be freed before the
pointer is dereferenced.

This has been tested by the reporter and shown to fix the reported oops.

[nhorman: fix up uninitialised variable reported by Dan Carpenter]
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: Karen Xie <kxie@chelsio.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
69cce1d1404968f78b177a0314f5822d5afdbbfb 18-Jul-2011 David S. Miller <davem@davemloft.net> net: Abstract dst->neighbour accesses behind helpers.

dst_{get,set}_neighbour()

Signed-off-by: David S. Miller <davem@davemloft.net>
807838686eb9e40d73b8a3f2384881358f51fff0 13-May-2011 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Don't post zero-byte read if endpoint is going away

tx_ack() wasn't checking the endpoint state and consequently would
attempt to post the p2p 0B read on an endpoint/QP that is closing or
aborting. This causes a NULL pointer dereference crash.
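
The missing check can be pictured with the small sketch below. The state names and the helper demo_may_post_zero_byte_read are hypothetical stand-ins, not the driver's actual enum or function.

```c
#include <stdbool.h>

/* Hypothetical subset of endpoint states for illustration only. */
enum demo_ep_state {
    DEMO_CONNECTED,
    DEMO_CLOSING,
    DEMO_ABORTING,
    DEMO_DEAD
};

/* Post the zero-byte read only while the endpoint is still fully up. */
static bool demo_may_post_zero_byte_read(enum demo_ep_state state)
{
    return state == DEMO_CONNECTED;
}
```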

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
31e4543db29fb85496a122b965d6482c8d1a2bfe 04-May-2011 David S. Miller <davem@davemloft.net> ipv4: Make caller provide on-stack flow key to ip_route_output_ports().

Signed-off-by: David S. Miller <davem@davemloft.net>
78fbfd8a653ca972afe479517a40661bfff6d8c3 12-Mar-2011 David S. Miller <davem@davemloft.net> ipv4: Create and use route lookup helpers.

The idea here is this minimizes the number of places one has to edit
in order to make changes to how flows are defined and used.

Signed-off-by: David S. Miller <davem@davemloft.net>
b23dd4fe42b455af5c6e20966b7d6959fa8352ea 02-Mar-2011 David S. Miller <davem@davemloft.net> ipv4: Make output route lookup return rtable directly.

Instead of on the stack.

Signed-off-by: David S. Miller <davem@davemloft.net>
273447b352e69c327efdecfd6e1d6fe3edbdcd14 01-Mar-2011 David S. Miller <davem@davemloft.net> ipv4: Kill can_sleep arg to ip_route_output_flow()

This boolean state is now available in the flow flags.

Signed-off-by: David S. Miller <davem@davemloft.net>
420d44daa7aa1cc847e9e527f0a27a9ce61768ca 01-Mar-2011 David S. Miller <davem@davemloft.net> ipv4: Make final arg to ip_route_output_flow to be boolean "can_sleep"

Since that is what the current vague "flags" argument means.

Signed-off-by: David S. Miller <davem@davemloft.net>
ca7cf94f8bf77bf0dfb35b615d82ac76a0ed77ff 26-Oct-2010 Joe Perches <joe@perches.com> RDMA/cxgb3: Remove unnecessary KERN_<level> use

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
bec658ff31453a5726b1c188674d587a5d40c482 19-Sep-2010 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Turn off RX coalescing for iWARP connections

The HW by default has RX coalescing on. For iWARP connections, this
causes a 100ms delay in connection establishment due to the ingress
MPA Start message being stalled in HW. So explicitly turn RX
coalescing off when setting up iWARP connections.

This was causing very bad performance for NP64 gather operations using
Open MPI, due to the way it sets up connections on larger jobs.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
18199f573ee03e9265b3f5c45389742dae17607a 20-Jul-2010 Or Gerlitz <ogerlitz@voltaire.com> RDMA/cxgb3: Make needlessly global iwch_l2t_send() static

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
d8d1f30b95a635dbd610dcc5eb641aca8f4768cf 11-Jun-2010 Changli Gao <xiaosuo@gmail.com> net-next: remove useless union keyword

remove useless union keyword in rtable, rt6_info and dn_route.

Since there is only one member in a union, the union keyword isn't useful.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
617c9a7e398878d036a3aa9a063ccba145854b45 28-Apr-2010 Roland Dreier <rolandd@cisco.com> RDMA/cxgb3: Shrink .text with compile-time init of handlers arrays

Using compile-time designated initializers for the handler arrays
instead of open-coding the initialization in iwch_cm_init() is (IMHO)
cleaner, and leads to substantially smaller code: on my x86-64 build,
bloat-o-meter shows:

add/remove: 0/1 grow/shrink: 4/3 up/down: 4/-1682 (-1678)
function old new delta
tx_ack 167 168 +1
state_set 55 56 +1
start_ep_timer 99 100 +1
pass_establish 177 178 +1
act_open_req_arp_failure 39 38 -1
sched 84 82 -2
iwch_cm_init 442 91 -351
work_handlers 1328 - -1328
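
For readers unfamiliar with the technique, a minimal sketch of a handler table built with designated initializers. The opcode names and handlers are invented for the example and are not the driver's.

```c
#include <stddef.h>

enum demo_opcode {
    DEMO_ACT_ESTABLISH,
    DEMO_PEER_CLOSE,
    DEMO_ABORT_RPL,
    DEMO_NUM_OPCODES
};

typedef int (*demo_handler)(int arg);

static int demo_act_establish(int arg) { return arg + 1; }
static int demo_peer_close(int arg)    { return arg + 2; }

/* Filled in at compile time; entries not named here default to NULL. */
static const demo_handler demo_handlers[DEMO_NUM_OPCODES] = {
    [DEMO_ACT_ESTABLISH] = demo_act_establish,
    [DEMO_PEER_CLOSE]    = demo_peer_close,
};

/* Dispatch by opcode; -1 means no handler is registered. */
static int demo_dispatch(enum demo_opcode op, int arg)
{
    if ((unsigned int)op >= DEMO_NUM_OPCODES || demo_handlers[op] == NULL)
        return -1;
    return demo_handlers[op](arg);
}
```

Because the table is constant data, no init function has to store each pointer at run time, which is where the .text savings come from.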

Signed-off-by: Roland Dreier <rolandd@cisco.com>
73a203d2014f50d874b9e40083ad481ca70408e8 05-Apr-2010 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Don't free skbs on NET_XMIT_* indications from LLD

The low level cxgb3 driver can return NET_XMIT_CN and friends.
The iw_cxgb3 driver should _not_ treat these as errors.
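
A sketch of the distinction, assuming that hard failures are negative errno-style values and that congestion indications are positive. The constants are local stand-ins rather than the kernel's definitions.

```c
#include <stdbool.h>

/* Local stand-ins; the kernel defines its own NET_XMIT_* values. */
#define DEMO_XMIT_SUCCESS 0
#define DEMO_XMIT_CN      2    /* positive: congestion notification */
#define DEMO_ENOMEM       12

/* Only negative return values are treated as hard failures. */
static bool demo_send_failed(int ret)
{
    return ret < 0;
}

/* Collapse positive indications to 0 so callers see success. */
static int demo_normalize(int ret)
{
    return ret < 0 ? ret : 0;
}
```

With this split, the skb is freed by the caller only when demo_send_failed() is true; on a positive indication the lower layer still owns it.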

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
5a0e3ad6af8660be21ca98a971cd00f331318c05 24-Mar-2010 Tejun Heo <tj@kernel.org> include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
eacc4d6a7dc447ec4fc219af129e0fe50d21d8f7 07-Jan-2010 H Hartley Sweeten <hsweeten@visionengravers.com> drivers/infiniband/hw/cxgb3/iwch_cm.c: use %pM to show MAC address

Use the %pM kernel extension to display the MAC address.

The only difference in the output is that the MAC address is
shown in the usual colon-separated hex notation.
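
For comparison, the colon-separated notation can be produced in user-space C as sketched below. %pM itself is a kernel printk extension and is not available in standard printf; demo_format_mac is a hypothetical helper.

```c
#include <stddef.h>
#include <stdio.h>

/* Format six bytes as colon-separated lowercase hex, e.g. 00:1a:2b:3c:4d:5e. */
static int demo_format_mac(char *buf, size_t len, const unsigned char mac[6])
{
    return snprintf(buf, len, "%02x:%02x:%02x:%02x:%02x:%02x",
                    mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}
```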

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
a52bf98d99e922363d1d600a79de6aaf00090d47 06-Sep-2009 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Wake up any waiters on peer close/abort

A close/abort while waiting for a wr_ack during connection migration
can cause a hung process in iwch_accept_cr/iwch_reject_cr.

The fix is to set rpl_error/rpl_done and wake up the waiters when we
get a close/abort while in MPA_REQ_RCVD state.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
6e47fe43502ba6dfe86d556661795d9bb0361309 06-Sep-2009 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Don't free endpoints early

- Keep ref on connection request endpoints until either accepted or
rejected so it doesn't get freed early.

- Endpoint flags now need to be set via atomic bitops because they can
be set on both the iw_cxgb3 workqueue thread and user disconnect
threads.

- Don't move out of CLOSING too early due to multiple calls to
iwch_ep_disconnect.
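
The second point can be illustrated with C11 atomics as a user-space analogue of the kernel's atomic bitops. The flag names and the struct demo_flags_ep are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bit numbers for illustration. */
enum { DEMO_FLAG_ABORT_IN_PROGRESS = 0, DEMO_FLAG_CLOSE_SENT = 1 };

struct demo_flags_ep {
    atomic_ulong flags;
};

/* Atomically set a bit; returns true if it was already set. */
static bool demo_test_and_set_flag(struct demo_flags_ep *ep, int bit)
{
    unsigned long mask = 1UL << bit;

    return (atomic_fetch_or(&ep->flags, mask) & mask) != 0;
}

/* Read a single bit without modifying the word. */
static bool demo_flag_is_set(struct demo_flags_ep *ep, int bit)
{
    return (atomic_load(&ep->flags) & (1UL << bit)) != 0;
}
```

A plain read-modify-write on the flags word from two threads could lose one of the updates; the atomic OR avoids that.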

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
96ac7e88922da6ab33efea87c6b560ba5ab11e75 20-Apr-2009 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Adjust ORD/IRD (if needed) for peer2peer connections

NFS/RDMA currently fails to set up connections if peer2peer is on.
This is due to the fact that the NFS/RDMA client sets its ORD to 0.

If peer2peer is set, make sure the active side ORD is >= 1 and the
passive side IRD is >=1.
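
The adjustment amounts to a clamp, sketched here with hypothetical names. The real driver stores these values in its endpoint struct rather than in a demo_conn_params.

```c
/* Hypothetical container for the negotiated read queue depths. */
struct demo_conn_params {
    unsigned int ord;   /* outgoing RDMA read queue depth */
    unsigned int ird;   /* incoming RDMA read queue depth */
};

/* Active side: the zero-byte read needs at least one outgoing slot. */
static void demo_adjust_active(struct demo_conn_params *p, int peer2peer)
{
    if (peer2peer && p->ord == 0)
        p->ord = 1;
}

/* Passive side: it must be able to accept that incoming read. */
static void demo_adjust_passive(struct demo_conn_params *p, int peer2peer)
{
    if (peer2peer && p->ird == 0)
        p->ird = 1;
}
```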

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
874d8df5ed6e36fed07b524c266f6a96dd6d10d9 30-Mar-2009 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Release dependent resources only when endpoint memory is freed.

The cxgb3 l2t entry, hwtid, and dst entry were being released before
all the iwch_ep references were released. This can cause a crash in
t3_l2t_send_slow() and other places where the l2t entry is used.

The fix is to defer releasing these resources until all endpoint
references are gone.

Details:

- move flags field to the iwch_ep_common struct.
- add a flag indicating resources are to be released.
- release resources at endpoint free time instead of close/abort time.
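
A single-threaded sketch of the deferred-release pattern described in the details above. The names are hypothetical, and the real driver uses a kref with proper locking rather than a bare counter.

```c
#include <stdbool.h>

struct demo_ref_ep {
    int  refcnt;
    bool release_resources;    /* set at close/abort time */
    bool resources_released;   /* stands in for freeing l2t, hwtid, dst */
};

/* Close/abort only records that the resources should go away. */
static void demo_mark_for_release(struct demo_ref_ep *ep)
{
    ep->release_resources = true;
}

/* The actual release happens when the last reference is dropped. */
static void demo_put(struct demo_ref_ep *ep)
{
    if (--ep->refcnt == 0 && ep->release_resources)
        ep->resources_released = true;
}
```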

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
04b5d028f50ff05a8f9ae049ee71f8fdfcf1f5de 30-Mar-2009 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Handle EEH events

- wrap calls into cxgb3 and fail them if we're in the middle
of a PCI EEH event.

- correctly unwind and release endpoint and other resources when
we are in an EEH event.

- dispatch IB_EVENT_DEVICE_FATAL event when cxgb3 notifies iw_cxgb3 of
a fatal error.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
42fb61f02f9bdc476c7a76d3cce0400d989f44c5 11-Feb-2009 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Connection termination fixes

The poll and flush code needs to handle all send opcodes: SEND,
SEND_WITH_SE, SEND_WITH_INV, and SEND_WITH_SE_INV.

Ignore TERM indications if the connection is already gone.

Ignore HW receive completions if the RQ is empty.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
dc35fac9e936c6cc6ad825fc7e4455468d10adc6 15-Oct-2008 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Remove cmid reference on tid allocation failures

The error path in iwch_connect() can fail to drop the cmid reference,
which will cause the process to hang when destroying the cmid.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
77a8d5741f3ee2c79554382179cca7b5893d6ae9 02-May-2008 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Bump up the MPA connection setup timeout.

Testing on large clusters shows it's way too short at 10 secs.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
c4d49776e8f5bf2d900d2b6d4855c1670a535ac5 02-May-2008 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Silently ignore close reply after abort.

Remove bad BUG_ON() that can trigger in correct operation from
close_con_rpl(). It is possible to get a close_rpl message on a dead
connection. The sequence is:

- host refs ep for close exchange
- host posts close_req
- hw posts PEER_ABORT from incoming RST
- host marks ep DEAD
- host posts ABORT_RPL and releases ep resources
- hw posts CLOSE_RPL
- host derefs ep and ep freed.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
f8b0dfd15277974b5c9f3ff17f9e3ab6fdbe45ee 29-Apr-2008 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Support peer-2-peer connection setup

Open MPI, Intel MPI and other applications don't respect the iWARP
requirement that the client (active) side of the connection send the
first RDMA message. This class of application connection setup is
called peer-to-peer. Typically once the connection is set up, _both_
sides want to send data.

This patch enables supporting peer-to-peer over the chelsio RNIC by
enforcing this iWARP requirement in the driver itself as part of RDMA
connection setup.

Connection setup is extended, when the peer2peer module option is 1,
such that the MPA initiator will send a 0B Read (the RTR) just after
connection setup. The MPA responder will suspend SQ processing until
the RTR message is received and replied to.

In the longer term, this will be handled in a standardized way by
enhancing the MPA negotiation so peers can indicate whether they
want/need the RTR and what type of RTR (0B read, 0B write, or 0B send)
should be sent. This will be done by standardizing a few bits of the
private data in order to negotiate all this. However this patch
enables peer-to-peer applications now and allows most of the required
firmware and driver changes to be done and tested now.

Design:

- Add a module option, peer2peer, to enable this mode.

- New firmware support for peer-to-peer mode:

- a new bit in the rdma_init WR to tell it to do peer-2-peer
and what form of RTR message to send or expect.

- process _all_ preposted recvs before moving the connection
into rdma mode.

- passive side: defer completing the rdma_init WR until all
pre-posted recvs are processed. Suspend SQ processing until
the RTR is received.

- active side: expect and process the 0B read WR on offload TX
queue. Defer completing the rdma_init WR until all
pre-posted recvs are processed. Suspend SQ processing until
the 0B read WR is processed from the offload TX queue.

- If peer2peer is set, driver posts 0B read request on offload TX
queue just after posting the rdma_init WR to the offload TX queue.

- Add CQ poll logic to ignore unsolicited read responses.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
989a1780698c65dfe093a6aa89ceeff84c31f528 29-Apr-2008 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Correctly serialize peer abort path

Open MPI and other stress testing exposed a few bad bugs in handling
aborts in the middle of a normal close. Fix these by:

- serializing abort reply and peer abort processing with disconnect
processing

- warning (and ignoring) if ep timer is stopped when it wasn't running

- cleaning up disconnect path to correctly deal with aborting and
dead endpoints

- in iwch_modify_qp(), taking a ref on the ep before releasing the qp
lock if iwch_ep_disconnect() will be called. The ref is dropped
after calling disconnect.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
3371836383d63b627b228875f5ac63023cbf11d2 17-Apr-2008 Harvey Harrison <harvey.harrison@gmail.com> IB: Replace remaining __FUNCTION__ occurrences with __func__

__FUNCTION__ is gcc-specific, use __func__ instead.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
1f71f50342c6fe4fbdebe63b0fd196972a70e281 28-Mar-2008 Roland Dreier <rdreier@cisco.com> RDMA/cxgb3: Program hardware IRD with correct value

Because of a typo in iwch_accept_cr(), the cxgb3 connection handling
code programs the hardware IRD (incoming RDMA read queue depth) with
the value that is passed in for the ORD (outgoing RDMA read queue
depth). In particular this means that if an application passes in IRD
> 0 and ORD = 0 (which is a completely sane and valid thing to do for
an app that expects only incoming RDMA read requests), then the
hardware will end up programmed with IRD = 0 and the app will fail in
a mysterious way.

Fix this by using "ep->ird" instead of "ep->ord" in the intended place.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8704e9a8790cc9e394198663c1c9150c899fb9a2 12-Feb-2008 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Fail loopback connections

The cxgb3 HW and driver don't support loopback RDMA connections. So
fail any connection attempt where the destination address is local.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
f1b050bf7a88910f9f00c9c8989c1bf5a67dd140 23-Jan-2008 Denis V. Lunev <den@openvz.org> [NETNS]: Add namespace parameter to ip_route_output_flow.

Needed to propagate it down to the __ip_route_output_key.

Signed_off_by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
8176d297c73a06e6076c9c31f6404047567f6324 24-Jan-2008 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Fix the T3A workaround checks

Correctly work around T3A issues by checking "hwtype != T3A" instead of
"hwtype == T3B". This will be needed for new hardware types.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
e54664c0958acf14ef3a65d1b78f4a54b437cdf7 29-Jul-2007 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Make the iw_cxgb3 module parameters writable

Allow changing parameter values without having to reload the module.
This is safe because these parameters are only looked at when a new
connection is established.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
699924b1e1ea3c9307eb582b9cc386e4af88aaae 29-Jul-2007 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Always call low level send function via cxgb3_ofld_send()

This avoids deadlocks.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
dd00cc486ab1c17049a535413d1751ef3482141c 19-Jul-2007 Yoann Padioleau <padator@wanadoo.fr> some kmalloc/memset ->kzalloc (tree wide)

Transform some calls to kmalloc/memset to a single kzalloc (or kcalloc).

Here is a short excerpt of the semantic patch performing
this transformation:

@@
type T2;
expression x;
identifier f,fld;
expression E;
expression E1,E2;
expression e1,e2,e3,y;
statement S;
@@

x =
- kmalloc
+ kzalloc
(E1,E2)
... when != \(x->fld=E;\|y=f(...,x,...);\|f(...,x,...);\|x=E;\|while(...) S\|for(e1;e2;e3) S\)
- memset((T2)x,0,E1);

@@
expression E1,E2,E3;
@@

- kzalloc(E1 * E2,E3)
+ kcalloc(E1,E2,E3)
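
A user-space analogue of the same transformation, using malloc/memset and calloc in place of kmalloc/memset and kzalloc/kcalloc. The function names are made up for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Before: allocate, then clear in a separate step. */
static int *demo_alloc_old(size_t n)
{
    int *p = malloc(n * sizeof(*p));

    if (p)
        memset(p, 0, n * sizeof(*p));
    return p;
}

/* After: one call that returns zeroed memory for n elements. */
static int *demo_alloc_new(size_t n)
{
    return calloc(n, sizeof(int));
}
```

Both return memory with identical contents; the second form is shorter and cannot forget the clearing step.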

[akpm@linux-foundation.org: get kcalloc args the right way around]
Signed-off-by: Yoann Padioleau <padator@wanadoo.fr>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Acked-by: Russell King <rmk@arm.linux.org.uk>
Cc: Bryan Wu <bryan.wu@analog.com>
Acked-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Dave Airlie <airlied@linux.ie>
Acked-by: Roland Dreier <rolandd@cisco.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Acked-by: Dmitry Torokhov <dtor@mail.ru>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Acked-by: Pierre Ossman <drzeus-list@drzeus.cx>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: "David S. Miller" <davem@davemloft.net>
Acked-by: Greg KH <greg@kroah.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1b07db7079103961de64f75761639435e9082504 11-Jul-2007 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Remove cm_id reference on listen failures

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ecc2f0060fa7ff2fc53864ee19e370e5ddd47d5e 25-Jun-2007 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Don't abort after failures sending the mpa reply

This bug results in an abort request being sent down _after_ the tid
has been released. If the tid happens to have been reused, then the
subsequent generation of the tid gets incorrectly aborted.

The thread running iwch_accept_cr() must not abort a connection if an
error is returned after being awakened. If any errors did occur while
iwch_accept_cr() is blocked, then the connection has already been
aborted on the thread processing the error.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
96d0e4931e264012f57a2ae8f7c4697bfa55386a 22-Jun-2007 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Don't post TID_RELEASE message

The LLD does this for us in cxgb3_remove_tid().

Also fixed active open failure cases where we also shouldn't be
releasing the TID.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
1580367e7b2068d075cd42d04c4b8c274815e6fc 19-Jun-2007 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Don't count neg_adv abort_req_rss messages as real aborts

Negative advice messages should _not_ count toward the 2 abort
requests needed to indicate an abort request.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
de3d353072f9342f04112ba0504c3e294220cb8f 14-May-2007 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Streaming -> RDMA mode transition fixes

Due to a HW issue, our current scheme to transition the connection from
streaming to rdma mode is broken on the passive side. The firmware
and driver now support a new transition scheme for the passive side:

- driver posts rdma_init_wr (now including the initial receive seqno)
- driver posts last streaming message via TX_DATA message (MPA start
response)
- uP atomically sends the last streaming message and transitions the
tcb to rdma mode.
- driver waits for wr_ack indicating the last streaming message was ACKed.

NOTE: This change also bumps the required firmware version to 4.3.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
aff9e39d97585486764572ab2f3bf5dfce18c660 26-Apr-2007 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Support for new abort logic

The HW now posts 2 ABORT_RPL and/or PEER_ABORT_REQ messages. We need
to handle them by silently dropping the 1st but marking that we're ready
for the final message. This plugs some close races between the uP and
HW. Also update the minimum required firmware version.
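
The drop-the-first, act-on-the-second logic might look roughly like this. The struct and function names are hypothetical and the sketch ignores locking.

```c
#include <stdbool.h>

struct demo_abort_ep {
    bool first_abort_seen;
};

/*
 * Returns true when the caller should act on the message.
 * The first message is dropped but remembered; the second is processed.
 */
static bool demo_should_process_abort(struct demo_abort_ep *ep)
{
    if (!ep->first_abort_seen) {
        ep->first_abort_seen = true;
        return false;
    }
    return true;
}
```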

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
60be4b5966e22040f97db9dada72841bf90479d1 26-Apr-2007 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Initialize cpu_idx field in cpl_close_listserv_req message

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
d626f62b11e00c16e81e4308ab93d3f13551812a 27-Mar-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF]: Introduce skb_copy_from_linear_data{_offset}

To clearly state the intent of copying from linear sk_buffs, _offset being an
overly long variant but interesting for the sake of saving some bytes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
4305b541357ddbd205aa145dc378926b7cb12283 20-Apr-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF]: Convert skb->end to sk_buff_data_t

Now to convert the last one, skb->data, that will allow many simplifications
and removal of some of the offset helpers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
badff6d01a8589a1c828b0bf118903ca38627f4e 13-Mar-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF]: Introduce skb_reset_transport_header(skb)

For the common, open coded 'skb->h.raw = skb->data' operation, so that we can
later turn skb->h.raw into an offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.

This one touches just the most simple cases:

skb->h.raw = skb->data;
skb->h.raw = {skb_push|[__]skb_pull}()

The next ones will handle the slightly more "complex" cases.
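
The idea of hiding the header position behind a helper, so that a pointer can later become an offset, can be sketched in user space. This is a simplified model, not the kernel implementation: `struct skb_model` and the `model_*` functions are hypothetical, and the sketch always stores an offset, whereas the message above notes the field may remain a pointer on 32bit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff; only the fields needed here. */
struct skb_model {
    unsigned char *head;      /* start of the buffer */
    unsigned char *data;      /* start of the current payload */
    size_t transport_header;  /* offset from head instead of a raw pointer */
};

/* Models the 'skb->h.raw = skb->data' operation behind a helper. */
static void model_reset_transport_header(struct skb_model *skb)
{
    assert(skb->data >= skb->head);
    skb->transport_header = (size_t)(skb->data - skb->head);
}

/* Callers recover the pointer through an accessor, not the field. */
static unsigned char *model_transport_header(const struct skb_model *skb)
{
    return skb->head + skb->transport_header;
}
```

Because callers go through the two helpers, the representation of the field can change without touching them.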

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1ca19770c5ba90d041ba4d06976c77048d330cc8 12-Apr-2007 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Add set_tcb_rpl_handler

As of commit 6cdbd77e ("cxgb3 - missing CPL hanler and register
setting."), the cxgb3 ethernet NIC driver no longer handles SET_TCB
replies, so we need to do it in the iWARP driver.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
1f6a849b7ce6c3007088cd437dfc2b9c7cb5d21e 06-Mar-2007 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Don't reuse skbs that are non-linear or cloned

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
adf376b3708f6111e87916fae083633c1be2f88f 06-Mar-2007 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Stop EP timer when MPA exchange is aborted by peer

Stop the endpoint timer when the MPA exchange is aborted by the peer.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
42e31753546d2186d4a675e7d00daa02ea7c8e85 06-Mar-2007 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Fixes for "normal close" failures

Fixes for "normal close" failures:

- Start normal close timer when moving to CLOSING state.
- Handle ABORTING state in close_con_rpl().
- Stop timer correctly on abort during a normal close.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
7d526e6b2c5d6bba70fdc1fc2943bdaf9cc6147d 06-Mar-2007 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Start ep timer on a MPA reject

If the consumer rejects the connection we end up under-referencing the
endpoint structure. The fix is to call iwch_ep_disconnect() instead
of the low level disconnect functions so that the endpoint close timer
is started correctly.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2f236735fd05259a07a28233dcd07a8a6dddee9b 21-Feb-2007 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Stop the EP Timer on BAD CLOSE

Stop the ep timer in ec_status() if the status indicates a
bad close.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2b540355cd2f46c5445030995e72c4b4fb2b775e 21-Feb-2007 Adrian Bunk <bunk@stusta.de> RDMA/cxgb3: cleanups

- don't mark static functions in C files as inline - gcc should know
  best whether inlining makes sense
- never compile the unused cxio_dbg.c
- make the following needlessly global functions static:
  - cxio_hal.c: cxio_hal_clear_qp_ctx()
  - iwch_provider.c: iwch_get_qp()
- remove the following unused global functions:
  - cxio_hal.c: cxio_allocate_stag()
  - cxio_resource.c: cxio_hal_get_rhdl()
  - cxio_resource.c: cxio_hal_put_rhdl()

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
c52daa29760818772ee4211be4ee8d1c78b888d5 15-Feb-2007 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Remove Open Grid Computing copyrights in iw_cxgb3 driver

Remove the Open Grid Computing copyright. It shouldn't be there.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
b038ced7b3705bf0ac9b30e118af0f56ab48b847 13-Feb-2007 Steve Wise <swise@opengridcomputing.com> RDMA/cxgb3: Add driver for Chelsio T3 RNIC

Add an RDMA/iWARP driver for the Chelsio T3 1GbE and 10GbE adapters.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>