History log of /drivers/infiniband/ulp/
Revision Date Author Comments (<<< Hide modified files) (Show modified files >>>)
8df54d622a120058ee8bec38743c9b8f091c8e58 10-Feb-2012 Linus Torvalds <torvalds@linux-foundation.org> Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Quoth David:

1) GRO MAC header comparisons were ethernet specific, breaking other
link types. This required a multi-faceted fix to cure the originally
noted case (Infiniband), because IPoIB was lying about it's actual
hard header length. Thanks to Eric Dumazet, Roland Dreier, and
others.

2) Fix build failure when INET_UDP_DIAG is built in and ipv6 is modular.
From Anisse Astier.

3) Off by ones and other bug fixes in netprio_cgroup from Neil Horman.

4) ipv4 TCP reset generation needs to respect any network interface
binding from the socket, otherwise route lookups might give a
different result than all the other segments received. From Shawn
Lu.

5) Fix unintended regression in ipv4 proxy ARP responses, from Thomas
Graf.

6) Fix SKB under-allocation bug in sh_eth, from Yoshihiro Shimoda.

7) Revert skge PCI mapping changes that are causing crashes for some
folks, from Stephen Hemminger.

8) IPV4 route lookups fill in the wildcarded fields of the given flow
lookup key passed in, which is fine most of the time as this is
exactly what the caller's want. However there are a few cases that
want to retain the original flow key values afterwards, so handle
those cases properly. Fix from Julian Anastasov.

9) IGB/IXGBE VF lookup bug fixes from Greg Rose.

10) Properly null terminate filename passed to ethtool flash device
method, from Ben Hutchings.

11) S3 resume fix in via-velocity from David Lv.

12) Fix double SKB free during xmit failure in CAIF, from Dmitry
Tarnyagin.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (72 commits)
net: Don't proxy arp respond if iif == rt->dst.dev if private VLAN is disabled
ipv4: Fix wrong order of ip_rt_get_source() and update iph->daddr.
netprio_cgroup: fix wrong memory access when NETPRIO_CGROUP=m
netprio_cgroup: don't allocate prio table when a device is registered
netprio_cgroup: fix an off-by-one bug
bna: fix error handling of bnad_get_flash_partition_by_offset()
isdn: type bug in isdn_net_header()
net: Make qdisc_skb_cb upper size bound explicit.
ixgbe: ethtool: stats user buffer overrun
ixgbe: dcb: up2tc mapping lost on disable/enable CEE DCB state
ixgbe: do not update real num queues when netdev is going away
ixgbe: Fix broken dependency on MAX_SKB_FRAGS being related to page size
ixgbe: Fix case of Tx Hang in PF with 32 VFs
ixgbe: fix vf lookup
igb: fix vf lookup
e1000: add dropped DMA receive enable back in for WoL
gro: more generic L2 header check
IPoIB: Stop lying about hard_header_len and use skb->cb to stash LL addresses
zd1211rw: firmware needs duration_id set to zero for non-pspoll frames
net: enable TC35815 for MIPS again
...
936d7de3d736e0737542641269436f4b5968e9ef 07-Feb-2012 Roland Dreier <roland@purestorage.com> IPoIB: Stop lying about hard_header_len and use skb->cb to stash LL addresses

Commit a0417fa3a18a ("net: Make qdisc_skb_cb upper size bound
explicit.") made it possible for a netdev driver to use skb->cb
between its header_ops.create method and its .ndo_start_xmit
method. Use this in ipoib_hard_header() to stash away the LL address
(GID + QPN), instead of the "ipoib_pseudoheader" hack. This allows
IPoIB to stop lying about its hard_header_len, which will let us fix
the L2 check for GRO.

Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib.h
poib/ipoib_main.c
poib/ipoib_multicast.c
715252d41912941efb791a7b7bad94d2614dc5c3 04-Feb-2012 Jesper Juhl <jj@chaosbits.net> IB/srpt: Don't return freed pointer from srpt_alloc_ioctx_ring()

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Roland Dreier <roland@purestorage.com>
rpt/ib_srpt.c
3af336376f77859da84bb1156ef29d5337b316a9 04-Nov-2011 Dan Carpenter <dan.carpenter@oracle.com> IB/srpt: Fix ERR_PTR() vs. NULL checking confusion

transport_init_session() and target_fabric_configfs_init() don't
return NULL pointers, they only return ERR_PTRs or valid pointers.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
rpt/ib_srpt.c
ebfded8c4b34caea450709ce467e67483fa4d8df 02-Feb-2012 Jesper Juhl <jj@chaosbits.net> IB/srpt: Remove unneeded <linux/version.h> include

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Roland Dreier <roland@purestorage.com>
rpt/ib_srpt.h
f225066b64eaffe3a51ee488fb750c82fbcd971c 02-Feb-2012 Roland Dreier <roland@purestorage.com> IB/srpt: Use ARRAY_SIZE() instead of open-coding

Signed-off-by: Roland Dreier <roland@purestorage.com>
rpt/ib_srpt.c
486d8b9f88cd0871a716e2f16873e811ee6c1ece 02-Feb-2012 Roland Dreier <roland@purestorage.com> IB/srpt: Use DEFINE_SPINLOCK()/LIST_HEAD()

Signed-off-by: Roland Dreier <roland@purestorage.com>
rpt/ib_srpt.c
f59e842fc0871cd5baa213dc32e0ce8e5aaf4758 19-Jan-2012 Linus Torvalds <torvalds@linux-foundation.org> Merge branch 'for-next-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending

* 'for-next-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
ib_srpt: Initial SRP Target merge for v3.3-rc1
972b2c719990f91eb3b2310d44ef8a2d38955a14 08-Jan-2012 Linus Torvalds <torvalds@linux-foundation.org> Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits)
reiserfs: Properly display mount options in /proc/mounts
vfs: prevent remount read-only if pending removes
vfs: count unlinked inodes
vfs: protect remounting superblock read-only
vfs: keep list of mounts for each superblock
vfs: switch ->show_options() to struct dentry *
vfs: switch ->show_path() to struct dentry *
vfs: switch ->show_devname() to struct dentry *
vfs: switch ->show_stats to struct dentry *
switch security_path_chmod() to struct path *
vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb
vfs: trim includes a bit
switch mnt_namespace ->root to struct mount
vfs: take /proc/*/mounts and friends to fs/proc_namespace.c
vfs: opencode mntget() mnt_set_mountpoint()
vfs: spread struct mount - remaining argument of next_mnt()
vfs: move fsnotify junk to struct mount
vfs: move mnt_devname
vfs: move mnt_list to struct mount
vfs: switch pnode.h macros to struct mount *
...
587a1f1659e8b330b8738ef4901832a2b63f0bed 24-Jul-2011 Al Viro <viro@zeniv.linux.org.uk> switch ->is_visible() to returning umode_t

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
ser/iscsi_iser.c
a42d985bd5b234da8b61347a78dc3057bf7bb94d 14-Oct-2011 Bart Van Assche <bvanassche@acm.org> ib_srpt: Initial SRP Target merge for v3.3-rc1

This patch adds the kernel module ib_srpt SCSI RDMA Protocol (SRP) target
implementation conforming to the SRP r16a specification for the mainline
drivers/target infrastructure.

This driver was originally developed by Vu Pham and has been optimized by
Bart Van Assche and merged into upstream LIO based on his srpt-lio-4.1
branch here:

https://github.com/bvanassche/srpt-lio/commits/srpt-lio-4.1/

This updated patch also contains the following two changes from
lio-core-2.6.git/master. One is to fix a bug with 1 >= task->task_sg[]
chained mappings in ib_srpt, and the other to convert the configfs control
plane to reference IB Port GUID and struct srpt_port directly following
mainline v4.x target_core_fabric_configfs.c convertion for ib_srpt
to work with rtslib/rtsadmin v2 code.

These seperate patches can be found here:

ib_srpt: Fix bug with chainged SGLs in srpt_map_sg_to_ib_sge
http://www.risingtidesystems.com/git/?p=lio-core-2.6.git;a=commitdiff;h=ea485147563b6555a97dbf811825fbb586519252

ib_srpt: Convert se_wwn endpoint reference to struct srpt_port->port_wwn
http://www.risingtidesystems.com/git/?p=lio-core-2.6.git;a=commitdiff;h=4e544a210acb227df1bb4ca5086e65bdf4e648ea

This also includes the following recent v1 -> v2 review changes:

ib_srpt: Fix potential out-of-bounds array access
ib_srpt: Avoid failed multipart RDMA transfers
ib_srpt: Fix srpt_alloc_fabric_acl failure case return value
ib_srpt: Update comments to reference $driver/$port layout
ib_srpt: Fix sport->port_guid formatting code
ib_srpt: Remove legacy use_port_guid_in_session_name module parameter
ib_srpt: Convert srp_max_rdma_size into per port configfs attribute
ib_srpt: Convert srp_max_rsp_size into per port configfs attribute
ib_srpt: Convert srpt_sq_size into per port configfs attribute

and v2 -> v3 review changes:

ib_srpt: Fix possible race with srp_sq_size in srpt_create_ch_ib
ib_srpt: Fix possible race with srp_max_rsp_size in srpt_release_channel_work
ib_srpt: Fix up MAX_SRPT_RDMA_SIZE define
ib_srpt: Make srpt_map_sg_to_ib_sge() failure case return -EAGAIN
ib_srpt: Convert port_guid to use subnet_prefix + interface_id formatting
ib_srpt: Make srpt_check_stop_free return kref_put status
ib_srpt: Make compilation with BUG=n proceed`
ib_srpt: Use new target_core_fabric.h include
ib_srpt: Check hex2bin() return code to silence build warning

Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Roland Dreier <roland@purestorage.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Vu Pham <vu@mellanox.com>
Cc: David Dillow <dillowda@ornl.gov>
Signed-off-by: Nicholas A. Bellinger <nab@risingtidesystems.com>
rpt/Kconfig
rpt/Makefile
rpt/ib_dm_mad.h
rpt/ib_srpt.c
rpt/ib_srpt.h
17e6abeec4cb8df1e33ea0e2b889586c731a68be 02-Dec-2011 David Miller <davem@davemloft.net> infiniband: ipoib: Sanitize neighbour handling in ipoib_main.c

Reduce the number of dst_get_neighbour_noref() calls within a single
call chain. Primarily by passing the neighbour pointer down to the
helper functions.

Handle dst_get_neighbour_noref() returning NULL in ipoib_start_xmit()
by incrementing the dropped counter and freeing the packet. We don't
want it to fall through into the ARP/RARP/multicast handling, since
that should only happen when skb_dst() is NULL.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Roland Dreier <roland@purestorage.com>
poib/ipoib_main.c
2721745501a26d0dc3b88c0d2f3aa11471891388 02-Dec-2011 David Miller <davem@davemloft.net> net: Rename dst_get_neighbour{, _raw} to dst_get_neighbour_noref{, _raw}.

To reflect the fact that a refrence is not obtained to the
resulting neighbour entry.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Roland Dreier <roland@purestorage.com>
poib/ipoib_main.c
poib/ipoib_multicast.c
b3613118eb30a589d971e4eccbbb2a1314f5dfd4 02-Dec-2011 David S. Miller <davem@davemloft.net> Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
596b9b68ef118f7409afbc78487263e08ef96261 25-Jul-2011 David Miller <davem@davemloft.net> neigh: Add infrastructure for allocating device neigh privates.

netdev->neigh_priv_len records the private area length.

This will trigger for neigh_table objects which set tbl->entry_size
to zero, and the first instances of this will be forthcoming.

Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_main.c
a493f1a24a496711d96b91c4dc0a1bd35eb6954b 30-Nov-2011 Roland Dreier <roland@purestorage.com> Merge branches 'cxgb4', 'ipoib', 'misc' and 'qib' into for-next
580da35a31f91a594f3090b7a2c39b85cb051a12 29-Nov-2011 Eric Dumazet <eric.dumazet@gmail.com> IB: Fix RCU lockdep splats

Commit f2c31e32b37 ("net: fix NULL dereferences in check_peer_redir()")
forgot to take care of infiniband uses of dst neighbours.

Many thanks to Marc Aurele who provided a nice bug report and feedback.

Reported-by: Marc Aurele La France <tsi@ualberta.ca>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
poib/ipoib_main.c
poib/ipoib_multicast.c
3874397c0bdec3c21ce071711cd105165179b8eb 21-Nov-2011 Mike Marciniszyn <mike.marciniszyn@qlogic.com> IB/ipoib: Prevent hung task or softlockup processing multicast response

This following can occur with ipoib when processing a multicast reponse:

BUG: soft lockup - CPU#0 stuck for 67s! [ib_mad1:982]
Modules linked in: ...
CPU 0:
Modules linked in: ...
Pid: 982, comm: ib_mad1 Not tainted 2.6.32-131.0.15.el6.x86_64 #1 ProLiant DL160 G5
RIP: 0010:[<ffffffff814ddb27>] [<ffffffff814ddb27>] _spin_unlock_irqrestore+0x17/0x20
RSP: 0018:ffff8802119ed860 EFLAGS: 00000246
0000000000000004 RBX: ffff8802119ed860 RCX: 000000000000a299
RDX: ffff88021086c700 RSI: 0000000000000246 RDI: 0000000000000246
RBP: ffffffff8100bc8e R08: ffff880210ac229c R09: 0000000000000000
R10: ffff88021278aab8 R11: 0000000000000000 R12: ffff8802119ed860
R13: ffffffff8100be6e R14: 0000000000000001 R15: 0000000000000003
FS: 0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000000006d4840 CR3: 0000000209aa5000 CR4: 00000000000406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
[<ffffffffa032c247>] ? ipoib_mcast_send+0x157/0x480 [ib_ipoib]
[<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20
[<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20
[<ffffffffa03283d4>] ? ipoib_path_lookup+0x124/0x2d0 [ib_ipoib]
[<ffffffffa03286fc>] ? ipoib_start_xmit+0x17c/0x430 [ib_ipoib]
[<ffffffff8141e758>] ? dev_hard_start_xmit+0x2c8/0x3f0
[<ffffffff81439d0a>] ? sch_direct_xmit+0x15a/0x1c0
[<ffffffff81423098>] ? dev_queue_xmit+0x388/0x4d0
[<ffffffffa032d6b7>] ? ipoib_mcast_join_finish+0x2c7/0x510 [ib_ipoib]
[<ffffffffa032dab8>] ? ipoib_mcast_sendonly_join_complete+0x1b8/0x1f0 [ib_ipoib]
[<ffffffffa02a0946>] ? mcast_work_handler+0x1a6/0x710 [ib_sa]
[<ffffffffa015f01e>] ? ib_send_mad+0xfe/0x3c0 [ib_mad]
[<ffffffffa00f6c93>] ? ib_get_cached_lmc+0xa3/0xb0 [ib_core]
[<ffffffffa02a0f9b>] ? join_handler+0xeb/0x200 [ib_sa]
[<ffffffffa029e4fc>] ? ib_sa_mcmember_rec_callback+0x5c/0xa0 [ib_sa]
[<ffffffffa029e79c>] ? recv_handler+0x3c/0x70 [ib_sa]
[<ffffffffa01603a4>] ? ib_mad_completion_handler+0x844/0x9d0 [ib_mad]
[<ffffffffa015fb60>] ? ib_mad_completion_handler+0x0/0x9d0 [ib_mad]
[<ffffffff81088830>] ? worker_thread+0x170/0x2a0
[<ffffffff8108e160>] ? autoremove_wake_function+0x0/0x40
[<ffffffff810886c0>] ? worker_thread+0x0/0x2a0
[<ffffffff8108ddf6>] ? kthread+0x96/0xa0
[<ffffffff8100c1ca>] ? child_rip+0xa/0x20

Coinciding with stack trace is the following message:

ib0: ib_address_create failed

The code below in ipoib_mcast_join_finish() will note the above
failure in the address handle but otherwise continue:

ah = ipoib_create_ah(dev, priv->pd, &av);
if (!ah) {
ipoib_warn(priv, "ib_address_create failed\n");
} else {

The while loop at the bottom of ipoib_mcast_join_finish() will attempt
to send queued multicast packets in mcast->pkt_queue and eventually
end up in ipoib_mcast_send():

if (!mcast->ah) {
if (skb_queue_len(&mcast->pkt_queue) < IPOIB_MAX_MCAST_QUEUE)
skb_queue_tail(&mcast->pkt_queue, skb);
else {
++dev->stats.tx_dropped;
dev_kfree_skb_any(skb);
}

My read is that the code will requeue the packet and return to the
ipoib_mcast_join_finish() while loop and the stage is set for the
"hung" task diagnostic as the while loop never sees a non-NULL ah, and
will do nothing to resolve.

There are GFP_ATOMIC allocates in the provider routines, so this is
possible and should be dealt with.

The test that induced the failure is associated with a host SM on the
same server during a shutdown.

This patch causes ipoib_mcast_join_finish() to exit with an error
which will flush the queued mcast packets. Nothing is done to unwind
the QP attached state so that subsequent sends from above will retry
the join.

Reviewed-by: Ram Vepa <ram.vepa@qlogic.com>
Reviewed-by: Gary Leshner <gary.leshner@qlogic.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
poib/ipoib_ib.c
poib/ipoib_main.c
poib/ipoib_multicast.c
9ca36f7db29a1e4bde58fa7cf98b542c032b7180 17-Nov-2011 David S. Miller <davem@davemloft.net> infiniband: Update net drivers for netdev_features_t changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_main.c
32aaeffbd4a7457bf2f7448b33b5946ff2a960eb 07-Nov-2011 Linus Torvalds <torvalds@linux-foundation.org> Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux

* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
Revert "tracing: Include module.h in define_trace.h"
irq: don't put module.h into irq.h for tracking irqgen modules.
bluetooth: macroize two small inlines to avoid module.h
ip_vs.h: fix implicit use of module_get/module_put from module.h
nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
include: replace linux/module.h with "struct module" wherever possible
include: convert various register fcns to macros to avoid include chaining
crypto.h: remove unused crypto_tfm_alg_modname() inline
uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
pm_runtime.h: explicitly requires notifier.h
linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
miscdevice.h: fix up implicit use of lists and types
stop_machine.h: fix implicit use of smp.h for smp_processor_id
of: fix implicit use of errno.h in include/linux/of.h
of_platform.h: delete needless include <linux/module.h>
acpi: remove module.h include from platform/aclinux.h
miscdevice.h: delete unnecessary inclusion of module.h
device_cgroup.h: delete needless include <linux/module.h>
net: sch_generic remove redundant use of <linux/module.h>
net: inet_timewait_sock doesnt need <linux/module.h>
...

Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in
- drivers/media/dvb/frontends/dibx000_common.c
- drivers/media/video/{mt9m111.c,ov6650.c}
- drivers/mfd/ab3550-core.c
- include/linux/dmaengine.h
52439540ea30396982b69662dd21aede6b336288 03-Nov-2011 Or Gerlitz <ogerlitz@mellanox.com> IB/iser: DMA unmap TX bufs used for iSCSI/iSER headers

The current driver never does DMA unmapping on these buffers. Fix that
by adding DMA unmapping to the task cleanup callback, and DMA mapping to
the task init function (drop the headers_initialized micro-optimization).

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
ser/iscsi_iser.c
ser/iscsi_iser.h
2c4ce609347f2a45792c8d9ebb5af11217766cb6 03-Nov-2011 Or Gerlitz <ogerlitz@mellanox.com> IB/iser: Use separate buffers for the login request/response

The driver counted on the transactional nature of iSCSI login/text
flows and used the same buffer for both the request and the response.
We also went further and did DMA mapping only once, with
DMA_FROM_DEVICE, which violates the DMA mapping API. Fix that by
using different buffers, one for requests and one for responses, and
use the correct DMA mapping direction for each.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
ser/iscsi_iser.h
ser/iser_initiator.c
ser/iser_verbs.c
f470f8d4e702593ee1d0852871ad80373bce707b 01-Nov-2011 Linus Torvalds <torvalds@linux-foundation.org> Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (62 commits)
mlx4_core: Deprecate log_num_vlan module param
IB/mlx4: Don't set VLAN in IBoE WQEs' control segment
IB/mlx4: Enable 4K mtu for IBoE
RDMA/cxgb4: Mark QP in error before disabling the queue in firmware
RDMA/cxgb4: Serialize calls to CQ's comp_handler
RDMA/cxgb3: Serialize calls to CQ's comp_handler
IB/qib: Fix issue with link states and QSFP cables
IB/mlx4: Configure extended active speeds
mlx4_core: Add extended port capabilities support
IB/qib: Hold links until tuning data is available
IB/qib: Clean up checkpatch issue
IB/qib: Remove s_lock around header validation
IB/qib: Precompute timeout jiffies to optimize latency
IB/qib: Use RCU for qpn lookup
IB/qib: Eliminate divide/mod in converting idx to egr buf pointer
IB/qib: Decode path MTU optimization
IB/qib: Optimize RC/UC code by IB operation
IPoIB: Use the right function to do DMA unmap pages
RDMA/cxgb4: Use correct QID in insert_recv_cqe()
RDMA/cxgb4: Make sure flush CQ entries are collected on connection close
...
504255f8d0480cf293962adf4bc3aecac645ae71 01-Nov-2011 Roland Dreier <roland@purestorage.com> Merge branches 'amso1100', 'cma', 'cxgb3', 'cxgb4', 'fdr', 'ipath', 'ipoib', 'misc', 'mlx4', 'misc', 'nes', 'qib' and 'xrc' into for-next
fec14d2fcebe824377ef0305babc365d039f6b39 30-Aug-2011 Paul Gortmaker <paul.gortmaker@windriver.com> infiniband: add moduleparam.h to drivers/infiniband as required

These files were getting the moduleparam infrastructure from the
implicit presence of module.h being everywhere, but that is going
away soon.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
poib/ipoib_cm.c
poib/ipoib_ib.c
poib/ipoib_multicast.c
b108d9764cff25262bf764542ed1998d3e568962 27-May-2011 Paul Gortmaker <paul.gortmaker@windriver.com> infiniband: add in export.h for files using EXPORT_SYMBOL/THIS_MODULE

These were getting it implicitly via device.h --> module.h but
we are going to stop that when we clean up the headers.

Fix these in advance so the tree remains biscect-clean.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
poib/ipoib_fs.c
e4dd23d753c3cb0d8533d353069e8b2e8a666360 27-May-2011 Paul Gortmaker <paul.gortmaker@windriver.com> infiniband: Fix up module files that need to include module.h

They had been getting it implicitly via device.h but we can't
rely on that for the future, due to a pending cleanup so fix
it now.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
ser/iscsi_iser.c
ec7ae517537ae5c7b0b2cd7f562dfa3e7a05b954 29-Oct-2011 Linus Torvalds <torvalds@linux-foundation.org> Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (204 commits)
[SCSI] qla4xxx: export address/port of connection (fix udev disk names)
[SCSI] ipr: Fix BUG on adapter dump timeout
[SCSI] megaraid_sas: Fix instance access in megasas_reset_timer
[SCSI] hpsa: change confusing message to be more clear
[SCSI] iscsi class: fix vlan configuration
[SCSI] qla4xxx: fix data alignment and use nl helpers
[SCSI] iscsi class: fix link local mispelling
[SCSI] iscsi class: Replace iscsi_get_next_target_id with IDA
[SCSI] aacraid: use lower snprintf() limit
[SCSI] lpfc 8.3.27: Change driver version to 8.3.27
[SCSI] lpfc 8.3.27: T10 additions for SLI4
[SCSI] lpfc 8.3.27: Fix queue allocation failure recovery
[SCSI] lpfc 8.3.27: Change algorithm for getting physical port name
[SCSI] lpfc 8.3.27: Changed worst case mailbox timeout
[SCSI] lpfc 8.3.27: Miscellanous logic and interface fixes
[SCSI] megaraid_sas: Changelog and version update
[SCSI] megaraid_sas: Add driver workaround for PERC5/1068 kdump kernel panic
[SCSI] megaraid_sas: Add multiple MSI-X vector/multiple reply queue support
[SCSI] megaraid_sas: Add support for MegaRAID 9360/9380 12GB/s controllers
[SCSI] megaraid_sas: Clear FUSION_IN_RESET before enabling interrupts
...
9e903e085262ffbf1fc44a17ac06058aca03524a 18-Oct-2011 Eric Dumazet <eric.dumazet@gmail.com> net: add skb frag size accessors

To ease skb->truesize sanitization, its better to be able to localize
all references to skb frags size.

Define accessors : skb_frag_size() to fetch frag size, and
skb_frag_size_{set|add|sub}() to manipulate it.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_cm.c
poib/ipoib_ib.c
787adb9d6ad9afb498a1580a7d8ad05f779c488a 18-Oct-2011 Dotan Barak <dotanb@dev.mellanox.co.il> IPoIB: Use the right function to do DMA unmap pages

Pages that were mapped using ib_dma_map_page() should be unmapped
using ib_dma_unmap_page().

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
poib/ipoib_cm.c
96104eda01695a26da2c8f7423ec0ba3509c8c97 24-May-2011 Sean Hefty <sean.hefty@intel.com> RDMA/core: Add SRQ type field

Currently, there is only a single ("basic") type of SRQ, but with XRC
support we will add a second. Prepare for this by defining an SRQ type
and setting all current users to IB_SRQT_BASIC.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
poib/ipoib_cm.c
e36fb88a9a0fb8ac4b87c8ac709214a408de6d97 04-Oct-2011 Marcel Apfelbaum <marcela@dev.mellanox.co.il> IPoIB: Handle extended rates in debugfs

Use new function ib_rate_to_mbps() to handle printing rate in debugfs,
so that we handle extended rates.

Signed-off-by: Marcel Apfelbaum <marcela@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
poib/ipoib_fs.c
8decf868790b48a727d7e7ca164f2bcd3c1389c0 22-Sep-2011 David S. Miller <davem@davemloft.net> Merge branch 'master' of github.com:davem330/net

Conflicts:
MAINTAINERS
drivers/net/Kconfig
drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
drivers/net/ethernet/broadcom/tg3.c
drivers/net/wireless/iwlwifi/iwl-pci.c
drivers/net/wireless/iwlwifi/iwl-trans-tx-pcie.c
drivers/net/wireless/rt2x00/rt2800usb.c
drivers/net/wireless/wl12xx/main.c
f27fb2ef7bd88c9c5f67befe4d85e2155aa0e1a8 25-Jul-2011 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iscsi class: sysfs group is_visible callout for iscsi host attrs

The iscsi class currently does not support writable sysfs
attrs for LLD sysfs settings. This patch converts the
iscsi class and driver's host attrs to use the attribute
container sysfs group and the sysfs group's is_visible callout
to be able to support readable or writable sysfs attrs.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
ser/iscsi_iser.c
1d063c17298d7cd26cfe350f1e93e1727b4aa53f 25-Jul-2011 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iscsi class: sysfs group is_visible callout for session attrs

The iscsi class currently does not support writable sysfs
attrs for LLD sysfs settings. This patch converts the
iscsi class and driver's session attrs to use the attribute
container sysfs group and the sysfs group's is_visible callout
to be able to support readable or writable sysfs attrs.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
ser/iscsi_iser.c
3128c6c73cdf3df92c3165bfb785ae50114d18bf 25-Jul-2011 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iscsi cls: sysfs group is_visible callout for conn attrs

The iscsi class currently does not support writable sysfs
attrs for LLD sysfs settings. This patch converts the
iscsi class and drivers to use the attribute container
sysfs group and the sysfs group's is_visible callout
to be able to support readable or writable sysfs attrs.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
ser/iscsi_iser.c
5581be3b48914b0f9126b14daa02d334928322b0 25-Aug-2011 Ian Campbell <Ian.Campbell@citrix.com> IPoIB: convert to SKB paged frag API.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: linux-rdma@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_cm.c
poib/ipoib_ib.c
afc4b13df143122f99a0eb10bfefb216c2806de0 16-Aug-2011 Jiri Pirko <jpirko@redhat.com> net: remove use of ndo_set_multicast_list in drivers

replace it by ndo_set_rx_mode

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_main.c
80b43de83781ed67810d54c7892ac9cb2a2601df 17-Aug-2011 Roland Dreier <roland@purestorage.com> Merge branches 'ipoib' and 'iser' into for-next
200ae1a08bec8f3fedfcfe94c892d9a024db4e46 01-Aug-2011 Or Gerlitz <ogerlitz@mellanox.com> IB/iser: Support iSCSI PDU padding

RFC3270 mandates that iSCSI PDUs are padded to the closest integer
number of four byte words. Fix the iser code to support that on both
the TX/RX flows.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
ser/iscsi_iser.c
ser/iser_initiator.c
0ace64b85ea7b90e3bffe408b9d7c3364692bfa4 01-Aug-2011 Or Gerlitz <ogerlitz@mellanox.com> IBiser: Fix wrong mask when sizeof (dma_addr_t) > sizeof (unsigned long)

The code that prepares the SG associated with SCSI command for FMR was
buggy for systems with DMA addresses that don't fit in unsigned long,
e.g under the 32-bit based XenServer dom0 sizeof(dma_addr_t) is 8.

Fix that by casting to unsigned long long a masking constant used by
the code. This resolves a crash in iser_sg_to_page_vec on this system.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
ser/iscsi_iser.h
22cfb0bf6721bb1f865f67bc21e3c36c272faf36 16-Aug-2011 Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> IPoIB: Fix possible NULL dereference in ipoib_start_xmit()

Fix a bug introduced in 69cce1d14049 ("net: Abstract dst->neighbour
accesses behind helpers.") where we might dereference skb_dst(skb)
even if it is NULL, which causes:

[ 240.944030] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
[ 240.948007] IP: [<ffffffffa0366ce9>] ipoib_start_xmit+0x39/0x280 [ib_ipoib]
[...]
[ 240.948007] Call Trace:
[ 240.948007] <IRQ>
[ 240.948007] [<ffffffff812cd5e0>] dev_hard_start_xmit+0x2a0/0x590
[ 240.948007] [<ffffffff8131f680>] ? arp_create+0x70/0x200
[ 240.948007] [<ffffffff812e8e1f>] sch_direct_xmit+0xef/0x1c0

Addresses: https://bugzilla.kernel.org/show_bug.cgi?id=41212
Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Signed-off-by: Roland Dreier <roland@purestorage.com>
poib/ipoib_main.c
91d41fdf31f74e6e2e5f3cb018eca4200e36e202 27-Jul-2011 Linus Torvalds <torvalds@linux-foundation.org> Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
target: Convert to DIV_ROUND_UP_SECTOR_T usage for sectors / dev_max_sectors
kernel.h: Add DIV_ROUND_UP_ULL and DIV_ROUND_UP_SECTOR_T macro usage
iscsi-target: Add iSCSI fabric support for target v4.1
iscsi: Add Serial Number Arithmetic LT and GT into iscsi_proto.h
iscsi: Use struct scsi_lun in iscsi structs instead of u8[8]
iscsi: Resolve iscsi_proto.h naming conflicts with drivers/target/iscsi
60063497a95e716c9a689af3be2687d261f115b4 27-Jul-2011 Arun Sharma <asharma@fb.com> atomic: use <linux/atomic.h>

This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
poib/ipoib.h
rp/ib_srp.c
123521830c0ea35055b900d2ff0b73bb129e08cb 27-May-2011 Nicholas Bellinger <nab@linux-iscsi.org> iscsi: Resolve iscsi_proto.h naming conflicts with drivers/target/iscsi

This patch renames the following iscsi_proto.h structures to avoid
namespace issues with drivers/target/iscsi/iscsi_target_core.h:

*) struct iscsi_cmd -> struct iscsi_scsi_req
*) struct iscsi_cmd_rsp -> struct iscsi_scsi_rsp
*) struct iscsi_login -> struct iscsi_login_req

This patch includes useful ISCSI_FLAG_LOGIN_[CURRENT,NEXT]_STAGE*,
and ISCSI_FLAG_SNACK_TYPE_* definitions used by iscsi_target_mod, and
fixes the incorrect definition of struct iscsi_snack to following
RFC-3720 Section 10.16. SNACK Request.

Also, this patch updates libiscsi, iSER, be2iscsi, and bn2xi to
use the updated structure definitions in a handful of locations.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
ser/iser_initiator.c
ece236ce2fad9c27a6fd2530f899289025194bce 22-Jul-2011 Linus Torvalds <torvalds@linux-foundation.org> Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (26 commits)
IB/qib: Defer HCA error events to tasklet
mlx4_core: Bump the driver version to 1.0
RDMA/cxgb4: Use printk_ratelimited() instead of printk_ratelimit()
IB/mlx4: Support PMA counters for IBoE
IB/mlx4: Use flow counters on IBoE ports
IB/pma: Add include file for IBA performance counters definitions
mlx4_core: Add network flow counters
mlx4_core: Fix location of counter index in QP context struct
mlx4_core: Read extended capabilities into the flags field
mlx4_core: Extend capability flags to 64 bits
IB/mlx4: Generate GID change events in IBoE code
IB/core: Add GID change event
RDMA/cma: Don't allow IPoIB port space for IBoE
RDMA: Allow for NULL .modify_device() and .modify_port() methods
IB/qib: Update active link width
IB/qib: Fix potential deadlock with link down interrupt
IB/qib: Add sysfs interface to read free contexts
IB/mthca: Remove unnecessary read of PCI_CAP_ID_EXP
IB/qib: Remove double define
IB/qib: Remove unnecessary read of PCI_CAP_ID_EXP
...
69cce1d1404968f78b177a0314f5822d5afdbbfb 18-Jul-2011 David S. Miller <davem@davemloft.net> net: Abstract dst->neighbour accesses behind helpers.

dst_{get,set}_neighbour()

Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_main.c
poib/ipoib_multicast.c
fd1b6c4a693c9cac59375ffb36ffe5d7c079037c 13-Jul-2011 Bart Van Assche <bvanassche@acm.org> IB/srp: Avoid duplicate devices from LUN scan

SCSI scanning of a channel:id:lun triplet in Linux works as follows
(function scsi_scan_target() in drivers/scsi/scsi_scan.c):

- If lun == SCAN_WILD_CARD, send a REPORT LUNS command to the target
and process the result.

- If lun != SCAN_WILD_CARD, send an INQUIRY command to the LUN
corresponding to the specified channel:id:lun triplet to verify
whether the LUN exists.

So a SCSI driver must either take the channel and target id values in
account in its quecommand() function or it should declare that it only
supports one channel and one target id.

Currently the ib_srp driver does neither. As a result scanning the
SCSI bus via e.g. rescan-scsi-bus.sh causes many duplicate SCSI
devices to be created. For each 0:0:L device, several duplicates are
created with the same LUN number and with (C:I) != (0:0). Fix this by
declaring that the ib_srp driver only supports one channel and one
target id.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: <stable@kernel.org>
Acked-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <roland@purestorage.com>
rp/ib_srp.c
a6b7a407865aab9f849dd99a71072b7cd1175116 06-Jun-2011 Alexey Dobriyan <adobriyan@gmail.com> net: remove interrupt.h inclusion from netdevice.h

* remove interrupt.g inclusion from netdevice.h -- not needed
* fixup fallout, add interrupt.h and hardirq.h back where needed.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ser/iscsi_iser.h
4c171acc20794af16a27da25e11ec4e9cad5d9fa 26-May-2011 Linus Torvalds <torvalds@linux-foundation.org> Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
RDMA/cma: Save PID of ID's owner
RDMA/cma: Add support for netlink statistics export
RDMA/cma: Pass QP type into rdma_create_id()
RDMA: Update exported headers list
RDMA/cma: Export enum cma_state in <rdma/rdma_cm.h>
RDMA/nes: Add a check for strict_strtoul()
RDMA/cxgb3: Don't post zero-byte read if endpoint is going away
RDMA/cxgb4: Use completion objects for event blocking
IB/srp: Fix integer -> pointer cast warnings
IB: Add devnode methods to cm_class and umad_class
IB/mad: Return EPROTONOSUPPORT when an RDMA device lacks the QP required
IB/uverbs: Add devnode method to set path/mode
RDMA/ucma: Add .nodename/.mode to tell userspace where to create device node
RDMA: Add netlink infrastructure
RDMA: Add error handling to ib_core_init()
8dc4abdf4c82d0e1c47f14b6615406d31975ea66 25-May-2011 Roland Dreier <roland@purestorage.com> Merge branches 'cma', 'cxgb3', 'cxgb4', 'misc', 'nes', 'netlink', 'srp' and 'uverbs' into for-next
b26f9b9949013fec31b23c426fc463164ae08891 01-Apr-2010 Sean Hefty <sean.hefty@intel.com> RDMA/cma: Pass QP type into rdma_create_id()

The RDMA CM currently infers the QP type from the port space selected
by the user. In the future (eg with RDMA_PS_IB or XRC), there may not
be a 1-1 correspondence between port space and QP type. For netlink
export of RDMA CM state, we want to export the QP type to userspace,
so it is cleaner to explicitly associate a QP type to an ID.

Modify rdma_create_id() to allow the user to specify the QP type, and
use it to make our selections of datagram versus connected mode.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
ser/iser_verbs.c
737b94eb41cb99250ccce9148ca411b55d4dc96a 23-May-2011 Roland Dreier <roland@purestorage.com> IB/srp: Fix integer -> pointer cast warnings

Fix

drivers/infiniband/ulp/srp/ib_srp.c: In function 'srp_handle_recv':
drivers/infiniband/ulp/srp/ib_srp.c:1150: warning: cast to pointer from integer of different size
drivers/infiniband/ulp/srp/ib_srp.c: In function 'srp_send_completion':
drivers/infiniband/ulp/srp/ib_srp.c:1234: warning: cast to pointer from integer of different size

by adding an intermediate cast to uintptr_t.

Signed-off-by: Roland Dreier <roland@purestorage.com>
Acked-by: David Dillow <dillowda@ornl.gov>
rp/ib_srp.c
3d96c74d8983b16bc7ecb196e61a2173fcc3f09f 19-Apr-2011 Michał Mirosław <mirq-linux@rere.qmqm.pl> net: infiniband/ulp/ipoib: convert to hw_features

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib.h
poib/ipoib_cm.c
poib/ipoib_ethtool.c
poib/ipoib_ib.c
poib/ipoib_main.c
25985edcedea6396277003854657b5f3cb31a628 31-Mar-2011 Lucas De Marchi <lucas.demarchi@profusion.mobi> Fix common misspellings

Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
ser/iscsi_iser.h
0625bef6060fab4aab0e484130b59af5e9ac81bc 24-Mar-2011 Linus Torvalds <torvalds@linux-foundation.org> Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB: Increase DMA max_segment_size on Mellanox hardware
IB/mad: Improve an error message so error code is included
RDMA/nes: Don't print success message at level KERN_ERR
RDMA/addr: Fix return of uninitialized ret value
IB/srp: try to use larger FMR sizes to cover our mappings
IB/srp: add support for indirect tables that don't fit in SRP_CMD
IB/srp: rework mapping engine to use multiple FMR entries
IB/srp: allow sg_tablesize to be set for each target
IB/srp: move IB CM setup completion into its own function
IB/srp: always avoid non-zero offsets into an FMR
be8b981453a4904399cb090c1660618e250092d8 19-Jan-2011 David Dillow <dillowda@ornl.gov> IB/srp: try to use larger FMR sizes to cover our mappings

Now that we can get larger SG lists, we can take advantage of HCAs that
allow us to use larger FMR sizes. In many cases, we can use up to 512
entries, so start there and work our way down.

Signed-off-by: David Dillow <dillowda@ornl.gov>
rp/ib_srp.c
rp/ib_srp.h
c07d424d6118d528ef71b22b7424bfc359c307a5 16-Jan-2011 David Dillow <dillowda@ornl.gov> IB/srp: add support for indirect tables that don't fit in SRP_CMD

This allows us to guarantee the ability to submit up to 8 MB requests
based on the current value of SCSI_MAX_SG_CHAIN_SEGMENTS. While FMR will
usually condense the requests into 8 SG entries, it is imperative that
the target support external tables in case the FMR mapping fails or is
not supported.

We add a safety valve to allow targets without the needed support to
reap the benefits of the large tables, but fail in a manner that lets
the user know that the data didn't make it to the device. The user must
add "allow_ext_sg=1" to the target parameters to indicate that the
target has the needed support.

If indirect_sg_entries is not specified in the modules options, then
the sg_tablesize for the target will default to cmd_sg_entries unless
overridden by the target options.

Signed-off-by: David Dillow <dillowda@ornl.gov>
rp/ib_srp.c
rp/ib_srp.h
8f26c9ff9cd0317ad867bce972f69e0c6c2cbe3c 15-Jan-2011 David Dillow <dillowda@ornl.gov> IB/srp: rework mapping engine to use multiple FMR entries

Instead of forcing all of the S/G entries to fit in one FMR, and falling
back to indirect descriptors if that fails, allow the use of as many
FMRs as needed to map the request. This lays the groundwork for allowing
indirect descriptor tables that are larger than can fit in the command
IU, but should marginally improve performance now by reducing the number
of indirect descriptors needed.

We increase the minimum page size for the FMR pool to 4K, as larger
pages help increase the coverage of each FMR, and it is rare that the
kernel would send down a request with scattered 512 byte fragments.

This patch also move some of the target initialization code afte the
parsing of options, to keep it together with the new code that needs to
allocate memory based on the options given.

Signed-off-by: David Dillow <dillowda@ornl.gov>
rp/ib_srp.c
rp/ib_srp.h
4924864404d0ce2c32a6d20b27b5b6fcb31e481d 15-Jan-2011 David Dillow <dillowda@ornl.gov> IB/srp: allow sg_tablesize to be set for each target

Different configurations of target software allow differing max sizes of
the command IU. Allowing this to be changed per-target allows all
targets on an initiator to get an optimal setting.

We deprecate srp_sg_tablesize and replace it with cmd_sg_entries in
preparation for allowing more indirect descriptors than can fit in the
IU.

Signed-off-by: David Dillow <dillowda@ornl.gov>
rp/ib_srp.c
rp/ib_srp.h
961e0be89a5120a1409ebc525cca6f603615a8a8 14-Jan-2011 David Dillow <dillowda@ornl.gov> IB/srp: move IB CM setup completion into its own function

This is to clean up prior to further changes.

Signed-off-by: David Dillow <dillowda@ornl.gov>
rp/ib_srp.c
8c4037b501acd2ec3abc7925e66af8af40a2da9d 14-Jan-2011 David Dillow <dillowda@ornl.gov> IB/srp: always avoid non-zero offsets into an FMR

It is unclear exactly how this code works around Mellanox SRP targets,
or if the problem is on the target side or in the HCA itself. In an
abundance of caution, we should always enable the workaround.

Signed-off-by: David Dillow <dillowda@ornl.gov>
rp/ib_srp.c
7c53c6f89d7a6487986c51cd73ae9a9be338a8f4 16-Feb-2011 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iser: export addr and port

This pactch has iser export the address and port
of the endpoint.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
ser/iscsi_iser.c
6a108a14fa356ef607be308b68337939e56ea94e 20-Jan-2011 David Rientjes <rientjes@google.com> kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT

The meaning of CONFIG_EMBEDDED has long since been obsoleted; the option
is used to configure any non-standard kernel with a much larger scope than
only small devices.

This patch renames the option to CONFIG_EXPERT in init/Kconfig and fixes
references to the option throughout the kernel. A new CONFIG_EMBEDDED
option is added that automatically selects CONFIG_EXPERT when enabled and
can be used in the future to isolate options that should only be
considered for embedded systems (RISC architectures, SLOB, etc).

Calling the option "EXPERT" more accurately represents its intention: only
expert users who understand the impact of the configuration changes they
are making should enable it.

Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: David Woodhouse <david.woodhouse@intel.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Greg KH <gregkh@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Robin Holt <holt@sgi.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
poib/Kconfig
4790f4dc5f4326dab5d81ed8fb8c9473e620bdbb 17-Jan-2011 Roland Dreier <rolandd@cisco.com> Merge branches 'misc', 'mlx4', 'mthca', 'nes' and 'srp' into for-next
f06267104dd9112f11586830d22501d0e26245ea 19-Oct-2010 Tejun Heo <tj@kernel.org> RDMA: Update workqueue usage

* ib_wq is added, which is used as the common workqueue for infiniband
instead of the system workqueue. All system workqueue usages
including flush_scheduled_work() callers are converted to use and
flush ib_wq.

* cancel_delayed_work() + flush_scheduled_work() converted to
cancel_delayed_work_sync().

* qib_wq is removed and ib_wq is used instead.

This is to prepare for deprecation of flush_scheduled_work().

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
695b83495e2fba9d3a883193cfc9d5eefa96a911 13-Jan-2011 Bart Van Assche <bvanassche@acm.org> IB/srp: Test only once whether iu allocation succeeded

Merge the two tests in srp_queuecommand() of whether information unit
allocation succeeded into one. An intended side effect of this change
is that we fix the warning:

drivers/infiniband/ulp/srp/ib_srp.c: In function 'srp_queuecommand':
drivers/infiniband/ulp/srp/ib_srp.c:1116: warning: 'req' may be used uninitialized in this function

(seen with CONFIG_CC_OPTIMIZE_FOR_SIZE=y at least with gcc 4.4.4)

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
948579cd8c6ea7c8c98c52b79f4470952e182ebd 05-Nov-2010 Joe Perches <joe@perches.com> RDMA: Use vzalloc() to replace vmalloc()+memset(0)

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
poib/ipoib_main.c
2b76c05794e66655e10633d2d78287854c991f63 11-Jan-2011 Roland Dreier <rolandd@cisco.com> Merge branches 'cxgb4', 'ipath', 'ipoib', 'mlx4', 'mthca', 'nes', 'qib' and 'srp' into for-next
8ae31e5b1fc73751d800d551fb30340caa53c7dd 11-Jan-2011 Or Gerlitz <ogerlitz@voltaire.com> IPoIB: Add GRO support

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
poib/ipoib_ib.c
poib/ipoib_main.c
19e364f6801e38972673278adedaab1abf6f854c 11-Jan-2011 Or Gerlitz <ogerlitz@voltaire.com> IPoIB: Remove LRO support

As a first step in moving from LRO to GRO, revert commit af40da894e9
("IPoIB: add LRO support"). Also eliminate the ethtool set_flags
callback which isn't needed anymore. Finally, we need to include
<linux/sched.h> directly to get the declaration of restart_syscall()
(which used to be included implicitly through <linux/inet_lro.h>).

Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Vladimir Sokolovsky <vlad@mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/Kconfig
poib/ipoib.h
poib/ipoib_ethtool.c
poib/ipoib_ib.c
poib/ipoib_main.c
9af762719e8f8fa282de02997dced593030eb238 26-Nov-2010 David Dillow <dillowda@ornl.gov> IB/srp: consolidate hot-path variables into cache lines

Put the variables accessed together in the hot-path into common
cachelines, and separate them by RW vs RO to avoid false dirtying.
We keep a local copy of the lkey and rkey in the target to avoid
traversing pointers (and associated cache lines) to find them.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David Dillow <dillowda@ornl.gov>
rp/ib_srp.c
rp/ib_srp.h
e9684678221441f886b4d7c74f8770bb0981737a 26-Nov-2010 Bart Van Assche <bvanassche@acm.org> IB/srp: stop sharing the host lock with SCSI

We don't need protection against the SCSI stack, so use our own lock to
allow parallel progress on separate CPUs.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
[ broken out and small cleanups by David Dillow ]
Signed-off-by: David Dillow <dillowda@ornl.gov>
rp/ib_srp.c
rp/ib_srp.h
94a9174c630c8465ed9e97ecd242993429930c05 26-Nov-2010 Bart Van Assche <bvanassche@acm.org> IB/srp: reduce lock coverage of command completion

We only need the lock to cover list and credit manipulations, so push
those into srp_remove_req() and update the call chains.

We reorder the request removal and command completion in
srp_process_rsp() to avoid the SCSI mid-layer sending another command
before we've released our request and added any credits returned by the
target. This prevents us from returning HOST_BUSY unneccesarily.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
[ broken out, small cleanups, and modified to avoid potential extraneous
HOST_BUSY returns by David Dillow ]
Signed-off-by: David Dillow <dillowda@ornl.gov>
rp/ib_srp.c
76c75b258f1fe6abac6af2356989ad4d6518886e 26-Nov-2010 Bart Van Assche <bvanassche@acm.org> IB/srp: reduce local coverage for command submission and EH

We only need locks to protect our lists and number of credits available.
By pre-consuming the credit for the request, we can reduce our lock
coverage to just those areas. If we don't actually send the request,
we'll need to put the credit back into the pool.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
[ broken out and small cleanups by David Dillow ]
Signed-off-by: David Dillow <dillowda@ornl.gov>
rp/ib_srp.c
rp/ib_srp.h
536ae14e7588e85203d4b4147c041309be5b3efb 26-Nov-2010 Bart Van Assche <bvanassche@acm.org> IB/srp: don't move active requests to their own list

We use req->scmnd != NULL to indicate an active request, so there's no
need to keep a separate list for them. We can afford the array iteration
during error handling, and dropping it gives us one less item that needs
lock protection.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
[ broken out and small cleanups by David Dillow ]
Signed-off-by: David Dillow <dillowda@ornl.gov>
rp/ib_srp.c
rp/ib_srp.h
dcb4cb85f4b7caac9769bce464fef16306a4758c 26-Nov-2010 Bart Van Assche <bvanassche@acm.org> IB/srp: allow lockless work posting

Only one CPU at a time will own an RX IU, so using the address of the IU
as the work request cookie allows us to avoid taking a lock. We can
similarly prepare the TX path for lockless posting by moving the free TX
IUs to a list. This also removes the requirement that the queue sizes be
a power of 2.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
[ broken out, small cleanups, and modified to avoid needing an extra field
in the IU by David Dillow]
Signed-off-by: David Dillow <dillowda@ornl.gov>
rp/ib_srp.c
rp/ib_srp.h
9709f0e05b827049733f439de82a4a1688b37b86 26-Nov-2010 Bart Van Assche <bvanassche@acm.org> IB/srp: consolidate state change code

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
[ broken out and small cleanups by David Dillow ]
Signed-off-by: David Dillow <dillowda@ornl.gov>
rp/ib_srp.c
f8b6e31e4e46bf514c27fce38783ed5615cca01d 26-Nov-2010 David Dillow <dillowda@ornl.gov> IB/srp: allow task management without a previous request

We can only have one task management comment outstanding, so move the
completion and status to the target port. This allows us to handle
resets of a LUN without a corresponding request having been sent.
Meanwhile, we don't need to play games with host_scribble, just use it
as the pointer it is.

This fixes a crash when we issue a bus reset using sg_reset.

Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=13893
Reported-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David Dillow <dillowda@ornl.gov>
rp/ib_srp.c
rp/ib_srp.h
f281233d3eba15fb225d21ae2e228fd4553d824a 16-Nov-2010 Jeff Garzik <jeff@garzik.org> SCSI host lock push-down

Move the mid-layer's ->queuecommand() invocation from being locked
with the host lock to being unlocked to facilitate speeding up the
critical path for drivers who don't need this lock taken anyway.

The patch below presents a simple SCSI host lock push-down as an
equivalent transformation. No locking or other behavior should change
with this patch. All existing bugs and locking orders are preserved.

Additionally, add one parameter to queuecommand,
struct Scsi_Host *
and remove one parameter from queuecommand,
void (*done)(struct scsi_cmnd *)

Scsi_Host* is a convenient pointer that most host drivers need anyway,
and 'done' is redundant to struct scsi_cmnd->scsi_done.

Minimal code disturbance was attempted with this change. Most drivers
needed only two one-line modifications for their host lock push-down.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Acked-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
rp/ib_srp.c
9e5fca251f44832cb996961048ea977f80faf6ea 27-Oct-2010 Linus Torvalds <torvalds@linux-foundation.org> Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (63 commits)
IB/qib: clean up properly if pci_set_consistent_dma_mask() fails
IB/qib: Allow driver to load if PCIe AER fails
IB/qib: Fix uninitialized pointer if CONFIG_PCI_MSI not set
IB/qib: Fix extra log level in qib_early_err()
RDMA/cxgb4: Remove unnecessary KERN_<level> use
RDMA/cxgb3: Remove unnecessary KERN_<level> use
IB/core: Add link layer type information to sysfs
IB/mlx4: Add VLAN support for IBoE
IB/core: Add VLAN support for IBoE
IB/mlx4: Add support for IBoE
mlx4_en: Change multicast promiscuous mode to support IBoE
mlx4_core: Update data structures and constants for IBoE
mlx4_core: Allow protocol drivers to find corresponding interfaces
IB/uverbs: Return link layer type to userspace for query port operation
IB/srp: Sync buffer before posting send
IB/srp: Use list_first_entry()
IB/srp: Reduce number of BUSY conditions
IB/srp: Eliminate two forward declarations
IB/mlx4: Signal node desc changes to SM by using FW to generate trap 144
IB: Replace EXTRA_CFLAGS with ccflags-y
...
732eacc0542d0aa48797f675888b85d6065af837 26-Oct-2010 Hagen Paul Pfeifer <hagen@jauu.net> replace nested max/min macros with {max,min}3 macro

Use the new {max,min}3 macros to save some cycles and bytes on the stack.
This patch substitutes trivial nested macros with their counterpart.

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: Joe Perches <joe@perches.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
poib/ipoib_main.c
116e9535fe5e00bafab7a637f306b110cf95cff5 27-Oct-2010 Roland Dreier <rolandd@cisco.com> Merge branches 'amso1100', 'cma', 'cxgb3', 'cxgb4', 'ehca', 'iboe', 'ipoib', 'misc', 'mlx4', 'nes', 'qib' and 'srp' into for-next
19081f31ce941a22bfc681d18ae2d31e31084df5 18-Oct-2010 David Dillow <dillowda@ornl.gov> IB/srp: Sync buffer before posting send

srp_send_tsk_mgmt() was missing the proper DMA sync calls before posting
the buffer to the device.

Signed-off-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
21c1a90769e680e7c1f49bae4c5804cf0c7bc814 30-Aug-2010 Bart Van Assche <bvanassche@acm.org> IB/srp: Use list_first_entry()

Use the list_first_entry() macro in ib_srp instead of open-coding the equivalent,
which makes the source code slightly more descriptive. The list_first_entry()
macro itself was introduced in kernel 2.6.22.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
7ade400aba9a675b610074d6609658661db07eeb 30-Aug-2010 Bart Van Assche <bvanassche@acm.org> IB/srp: Reduce number of BUSY conditions

As proposed by the SRP (draft) standard, ib_srp reserves one ring
element for SRP_TSK_MGMT requests. This patch makes sure that the SCSI
mid-layer never tries to queue more than (SRP request limit) - 1 SCSI
commands to ib_srp. This improves performance for targets whose request
limit is less than or equal to SRP_NORMAL_REQ_SQ_SIZE by reducing the
number of BUSY responses reported by ib_srp to the SCSI mid-layer.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
05a1d7504f836ee67e27f2488cb5b8126b51dbd4 08-Oct-2010 David Dillow <dillowda@ornl.gov> IB/srp: Eliminate two forward declarations

Signed-off-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
c3aa9b186b95025d4ba4e90d6140c9887dfaae0a 20-Sep-2010 Eli Cohen <eli@dev.mellanox.co.il> IPoIB: Set dev_id field of net_device

Use the net device's dev_id field to encode the port number of the pci
device. This can be used to to associate a net device with the pci
device's port. The encoding is: dev_id = port - 1.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
bb12588a38e6db85e01dceadff7bc161fc92e7d2 08-Oct-2010 David Dillow <dillowda@ornl.gov> IB/srp: Implement SRP_CRED_REQ and SRP_AER_REQ

This patch adds support for SRP_CRED_REQ to avoid a lockup by targets
that use that mechanism to return credits to the initiator. This
prevents a lockup observed in the field where we would never add the
credits from the SRP_CRED_REQ to our current count, and would therefore
never send another command to the target.

Minimal support for SRP_AER_REQ is also added, as these messages can
also be used to convey additional credits to the initiator.

Based upon extensive debugging and code by Bart Van Assche and a bug
report by Chris Worley.

Signed-off-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
rp/ib_srp.h
dd5e6e38b2b8bd8bf71cae800e2b613e85ef1522 30-Aug-2010 Bart Van Assche <bvanassche@acm.org> IB/srp: Preparation for transmit ring response allocation

The transmit ring in ib_srp (srp_target.tx_ring) is currently only used
for allocating requests sent by the initiator to the target. This patch
prepares using that ring for allocation of both requests and responses.
Also, this patch differentiates the uses of SRP_SQ_SIZE, increases the
size of the IB send completion queue by one element and reserves one
transmit ring slot for SRP_TSK_MGMT requests.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
rp/ib_srp.h
631dd1a885b6d7e9f6f51b4e5b311c2bb04c323c 18-Oct-2010 Justin P. Mattock <justinmattock@gmail.com> Update broken web addresses in the kernel.

The patch below updates broken web addresses in the kernel

Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Finn Thain <fthain@telegraphics.com.au>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Dimitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Mike Frysinger <vapier.adi@gmail.com>
Acked-by: Ben Pfaff <blp@cs.stanford.edu>
Acked-by: Hans J. Koch <hjk@linutronix.de>
Reviewed-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
ser/Kconfig
7b4c876961ad6ddcfacd69b25fe7e13ff41fe322 28-Sep-2010 Eli Cohen <eli@mellanox.co.il> IPoIB: Skip IBoE ports

IPoIB is IP-over-Infiniband link layer. In the case of IBoE, the link
layer is Ethernet and IP can work directly over Ethernet, so disable
IPoIB for non-IB_LINK_LAYER_INFINIBAND ports.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
fed1db33fe85573487a4732d628ac5afdb5dc776 27-Aug-2010 Christoph Lameter <cl@linux.com> IPoIB: Set pkt_type correctly for multicast packets (fix IGMP breakage)

IGMP processing is broken because the IPOIB does not set the
skb->pkt_type the right way for multicast traffic. All incoming
packets are set to PACKET_HOST which means that igmp_recv() will
ignore the IGMP broadcasts/multicasts.

This in turn means that the IGMP timers are firing and are sending
information about multicast subscriptions unnecessarily. In a large
private network this can cause traffic spikes.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_ib.c
3cc08fc35db75b059118626c30b60b0f56583802 08-Aug-2010 Linus Torvalds <torvalds@linux-foundation.org> Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (42 commits)
IB/qib: Add missing <linux/slab.h> include
IB/ehca: Drop unnecessary NULL test
RDMA/nes: Fix confusing if statement indentation
IB/ehca: Init irq tasklet before irq can happen
RDMA/nes: Fix misindented code
RDMA/nes: Fix showing wqm_quanta
RDMA/nes: Get rid of "set but not used" variables
RDMA/nes: Read firmware version from correct place
IB/srp: Export req_lim via sysfs
IB/srp: Make receive buffer handling more robust
IB/srp: Use print_hex_dump()
IB: Rename RAW_ETY to RAW_ETHERTYPE
RDMA/nes: Fix two sparse warnings
RDMA/cxgb3: Make needlessly global iwch_l2t_send() static
IB/iser: Make needlessly global iser_alloc_rx_descriptors() static
RDMA/cxgb4: Add timeouts when waiting for FW responses
IB/qib: Fix race between qib_error_qp() and receive packet processing
IB/qib: Limit the number of packets processed per interrupt
IB/qib: Allow writes to the diag_counters to be able to clear them
IB/qib: Set cfgctxts to number of CPUs by default
...
03b37ecdb3975f09832747600853d3818a50eda3 05-Aug-2010 Roland Dreier <rolandd@cisco.com> Merge branches 'cxgb3', 'cxgb4', 'ehca', 'ipath', 'misc', 'nes', 'qib' and 'srp' into for-next
89de74866b846cc48780fda3de7fd223296aaca9 03-Aug-2010 Bart Van Assche <bvanassche@acm.org> IB/srp: Export req_lim via sysfs

Export req_lim via sysfs for debugging.

Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com>
Acked-by: David Dillow <dave@thedillows.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
6ba74014c1ab0e37af7de6f64b4eccbbae3cb9e7 04-Aug-2010 Linus Torvalds <torvalds@linux-foundation.org> Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1443 commits)
phy/marvell: add 88ec048 support
igb: Program MDICNFG register prior to PHY init
e1000e: correct MAC-PHY interconnect register offset for 82579
hso: Add new product ID
can: Add driver for esd CAN-USB/2 device
l2tp: fix export of header file for userspace
can-raw: Fix skb_orphan_try handling
Revert "net: remove zap_completion_queue"
net: cleanup inclusion
phy/marvell: add 88e1121 interface mode support
u32: negative offset fix
net: Fix a typo from "dev" to "ndev"
igb: Use irq_synchronize per vector when using MSI-X
ixgbevf: fix null pointer dereference due to filter being set for VLAN 0
e1000e: Fix irq_synchronize in MSI-X case
e1000e: register pm_qos request on hardware activation
ip_fragment: fix subtracting PPPOE_SES_HLEN from mtu twice
net: Add getsockopt support for TCP thin-streams
cxgb4: update driver version
cxgb4: add new PCI IDs
...

Manually fix up conflicts in:
- drivers/net/e1000e/netdev.c: due to pm_qos registration
infrastructure changes
- drivers/net/phy/marvell.c: conflict between adding 88ec048 support
and cleaning up the IDs
- drivers/net/wireless/ipw2x00/ipw2100.c: trivial ipw2100_pm_qos_req
conflict (registration change vs marking it static)
c996bb47bb419b7c2f75499e11750142775e5da9 30-Jul-2010 Bart Van Assche <bvanassche@acm.org> IB/srp: Make receive buffer handling more robust

The current strategy in ib_srp for posting receive buffers is:

* Post one buffer after channel establishment.
* Post one buffer before sending an SRP_CMD or SRP_TSK_MGMT to the target.

As a result, only the first non-SRP_RSP information unit from the
target will be processed. If that first information unit is an
SRP_T_LOGOUT, it will be processed. On the other hand, if the
initiator receives an SRP_CRED_REQ or SRP_AER_REQ before it receives a
SRP_T_LOGOUT, the SRP_T_LOGOUT won't be processed.

We can fix this inconsistency by changing the strategy for posting
receive buffers to:

* Post all receive buffers after channel establishment.
* After a receive buffer has been consumed and processed, post it again.

A side effect is that the ib_post_recv() call is moved out of the SCSI
command processing path. Since __srp_post_recv() is not called
directly any more, get rid of it and move the code directly into
srp_post_recv(). Also, move srp_post_recv() up in the file to avoid a
forward declaration.

Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com>
Acked-by: David Dillow <dave@thedillows.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
7a7008110b94dfaa90db4b0cc5b0c3f964c80506 29-Jul-2010 Bart Van Assche <bvanassche@acm.org> IB/srp: Use print_hex_dump()

Replace an open-coded dump of the receive buffer with a call to
print_hex_dump().

Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
48d8fcebb7abf64843314672c1208b730be911bb 20-Jul-2010 Or Gerlitz <ogerlitz@voltaire.com> IB/iser: Make needlessly global iser_alloc_rx_descriptors() static

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iser_initiator.c
7a52b34b07122ff5f45258d47f260f8a525518f0 06-Jun-2010 Or Gerlitz <ogerlitz@voltaire.com> IPoIB: Fix world-writable child interface control sysfs attributes

Sumeet Lahorani <sumeet.lahorani@oracle.com> reported that the IPoIB
child entries are world-writable; however we don't want ordinary users
to be able to create and destroy child interfaces, so fix them to be
writable only by root.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
39827be26b36ef9cdbc661c92a269e0484cd9ef5 03-Jul-2010 Ben Hutchings <bhutchings@solarflare.com> IB/{nes, ipoib}: Pass supported flags to ethtool_op_set_flags()

Following commit 1437ce3983bcbc0447a0dedcd644c14fe833d266 "ethtool:
Change ethtool_op_set_flags to validate flags", ethtool_op_set_flags
takes a third parameter and cannot be used directly as an
implementation of ethtool_ops::set_flags.

Changes nes and ipoib driver to pass in the appropriate value.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_ethtool.c
f8965467f366fd18f01feafb5db10512d7b4422c 21-May-2010 Linus Torvalds <torvalds@linux-foundation.org> Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1674 commits)
qlcnic: adding co maintainer
ixgbe: add support for active DA cables
ixgbe: dcb, do not tag tc_prio_control frames
ixgbe: fix ixgbe_tx_is_paused logic
ixgbe: always enable vlan strip/insert when DCB is enabled
ixgbe: remove some redundant code in setting FCoE FIP filter
ixgbe: fix wrong offset to fc_frame_header in ixgbe_fcoe_ddp
ixgbe: fix header len when unsplit packet overflows to data buffer
ipv6: Never schedule DAD timer on dead address
ipv6: Use POSTDAD state
ipv6: Use state_lock to protect ifa state
ipv6: Replace inet6_ifaddr->dead with state
cxgb4: notify upper drivers if the device is already up when they load
cxgb4: keep interrupts available when the ports are brought down
cxgb4: fix initial addition of MAC address
cnic: Return SPQ credit to bnx2x after ring setup and shutdown.
cnic: Convert cnic_local_flags to atomic ops.
can: Fix SJA1000 command register writes on SMP systems
bridge: fix build for CONFIG_SYSFS disabled
ARCNET: Limit com20020 PCI ID matches for SOHARD cards
...

Fix up various conflicts with pcmcia tree drivers/net/
{pcmcia/3c589_cs.c, wireless/orinoco/orinoco_cs.c and
wireless/orinoco/spectrum_cs.c} and feature removal
(Documentation/feature-removal-schedule.txt).

Also fix a non-content conflict due to pm_qos_requirement getting
renamed in the PM tree (now pm_qos_request) in net/mac80211/scan.c
ffebedb7ab3f7964a70a1771547b26af38a189d2 16-May-2010 Roland Dreier <rolandd@cisco.com> Merge branches 'amso1100', 'bkl', 'cma', 'cxgb3', 'cxgb4', 'ipoib', 'iser', 'masked-atomics', 'misc', 'mthca' and 'nes' into for-next
9fda1ac5fa09c49e9148f85be14f55e2bb856c0f 06-May-2010 Dan Carpenter <error27@gmail.com> IB/iser: Fix error flow in iser_create_ib_conn_res()

We shouldn't free things here because we free them later.
The call tree looks like this:
iser_connect() ==> initiating the connection establishment
and later
iser_cma_handler() => iser_route_handler() => iser_create_ib_conn_res()
if we fail here, eventually iser_conn_release() is called, resulting
in a double free.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iser_verbs.c
39ff05dbbbdb082bbabf06206c56b3cd4ef73904 05-May-2010 Or Gerlitz <ogerlitz@voltaire.com> IB/iser: Enhance disconnection logic for multi-pathing

The iser connection teardown flow isn't over until the underlying
Connection Manager (e.g the IB CM) delivers a disconnected or timeout
event through the RDMA-CM. When the remote (target) side isn't
reachable, e.g when some HW e.g port/hca/switch isn't functioning or
taken down administratively, the CM timeout flow is used and the event
may be generated only after relatively long time -- on the order of
tens of seconds.

The current iser code exposes this possibly long delay to higher
layers, specifically to the iscsid daemon and iscsi kernel stack. As a
result, the iscsi stack doesn't respond well: this low-level CM delay
is added to the fail-over time under HA schemes such as the one
provided by DM multipath through the multipathd(8) service.

This patch enhances the reference counting scheme on iser's IB
connections so that the disconnect flow initiated by iscsid from user
space (ep_disconnect) doesn't wait for the CM to deliver the
disconnect/timeout event. (The connection teardown isn't done from
iser's view point until the event is delivered)

The iser ib (rdma) connection object is destroyed when its reference
count reaches zero. When this happens on the RDMA-CM callback
context, extra care is taken so that the RDMA-CM does the actual
destroying of the associated ID, since doing it in the callback is
prohibited.

The reference count of iser ib connection normally reaches three,
where the <ref, deref> relations are

1. conn <init, terminate>
2. conn <bind, stop/destroy>
3. cma id <create, disconnect/error/timeout callbacks>

With this patch, multipath fail-over time is about 30 seconds, while
without this patch, multipath fail-over time is about 130 seconds.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.c
ser/iscsi_iser.h
ser/iser_verbs.c
d265b9808272c9f25e1c36d3fb5ddb466efd90e9 05-May-2010 Or Gerlitz <ogerlitz@voltaire.com> IB/iser: Remove buggy back-pointer setting

The iscsi connection object life cycle includes binding and unbinding
(conn_stop) to/from the iscsi transport connection object. Since
iscsi connection objects are recycled, at the time the transport
connection (e.g iser's IB connection) is released, it is not valid to
touch the iscsi connection tied to the transport back-pointer since it
may already point to a different transport connection.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iser_verbs.c
2110f9bf37511df06220bb7e977f417baecf2950 05-May-2010 Or Gerlitz <ogerlitz@voltaire.com> IB/iser: Add asynchronous event handler

Add handler to handle events such as port up and down. This is useful
when testing high-availability schemes such as multi-pathing.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.h
ser/iser_verbs.c
d414371795d54fa916938f948105d08928abfbb9 04-Mar-2010 Or Gerlitz <ogerlitz@voltaire.com> IPoIB: Allow disabling/enabling TSO on the fly through ethtool

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_ethtool.c
871039f02f8ec4ab2e5e9010718caa8e085786f1 11-Apr-2010 David S. Miller <davem@davemloft.net> Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

Conflicts:
drivers/net/stmmac/stmmac_main.c
drivers/net/wireless/wl12xx/wl1271_cmd.c
drivers/net/wireless/wl12xx/wl1271_main.c
drivers/net/wireless/wl12xx/wl1271_spi.c
net/core/ethtool.c
net/mac80211/scan.c
4a35ecf8bf1c4b039503fa554100fe85c761de76 07-Apr-2010 David S. Miller <davem@davemloft.net> Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

Conflicts:
drivers/net/bonding/bond_main.c
drivers/net/via-velocity.c
drivers/net/wireless/iwlwifi/iwl-agn.c
22bedad3ce112d5ca1eaf043d4990fa2ed698c87 01-Apr-2010 Jiri Pirko <jpirko@redhat.com> net: convert multicast list to list_head

Converts the list and the core manipulating with it to be the same as uc_list.

+uses two functions for adding/removing mc address (normal and "global"
variant) instead of a function parameter.
+removes dev_mcast.c completely.
+exposes netdev_hw_addr_list_* macros along with __hw_addr_* functions for
manipulation with lists on a sandbox (used in bonding and 80211 drivers)

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_multicast.c
5a0e3ad6af8660be21ca98a971cd00f331318c05 24-Mar-2010 Tejun Heo <tj@kernel.org> include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
poib/ipoib_cm.c
poib/ipoib_fs.c
poib/ipoib_ib.c
poib/ipoib_multicast.c
poib/ipoib_verbs.c
poib/ipoib_vlan.c
ser/iscsi_iser.c
ser/iser_verbs.c
3e4aa12f8a81506c44f04b4f0eb7663981c5a282 22-Mar-2010 Jiri Pirko <jpirko@redhat.com> ipoib: remove addrlen check for mc addresses

Finally this bit can be removed. Currently, after the bonding driver is
changed/fixed (32a806c194ea112cfab00f558482dd97bee5e44e net-next-2.6),
that's not possible for an addr with different length than dev->addr_len
to be present in list. Removing this check as in new mc_list there will be
no addrlen in the record.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_multicast.c
961cde93dee2658000ead32abffb8ddf0727abe0 19-Mar-2010 Linus Torvalds <torvalds@linux-foundation.org> Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (69 commits)
[SCSI] scsi_transport_fc: Fix synchronization issue while deleting vport
[SCSI] bfa: Update the driver version to 2.1.2.1.
[SCSI] bfa: Remove unused header files and did some cleanup.
[SCSI] bfa: Handle SCSI IO underrun case.
[SCSI] bfa: FCS and include file changes.
[SCSI] bfa: Modified the portstats get/clear logic
[SCSI] bfa: Replace bfa_get_attr() with specific APIs
[SCSI] bfa: New portlog entries for events (FIP/FLOGI/FDISC/LOGO).
[SCSI] bfa: Rename pport to fcport in BFA FCS.
[SCSI] bfa: IOC fixes, check for IOC down condition.
[SCSI] bfa: In MSIX mode, ignore spurious RME interrupts when FCoE ports are in FW mismatch state.
[SCSI] bfa: Fix Command Queue (CPE) full condition check and ack CPE interrupt.
[SCSI] bfa: IOC recovery fix in fcmode.
[SCSI] bfa: AEN and byte alignment fixes.
[SCSI] bfa: Introduce a link notification state machine.
[SCSI] bfa: Added firmware save clear feature for BFA driver.
[SCSI] bfa: FCS authentication related changes.
[SCSI] bfa: PCI VPD, FIP and include file changes.
[SCSI] bfa: Fix to copy fpma MAC when requested by user space application.
[SCSI] bfa: RPORT state machine: direct attach mode fix.
...
a48f509b26cec53338f4b0abd52ecea35e3974b8 04-Mar-2010 Or Gerlitz <ogerlitz@voltaire.com> IPoIB: Include return code in trace message for ib_post_send() failures

Print the return code of ib_post_send() if it fails to make these
debugging messages more useful.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
poib/ipoib_ib.c
f0dc117abdfa9a0e96c3d013d836460ef3cd08c7 03-Mar-2010 Eli Cohen <eli@mellanox.co.il> IPoIB: Fix TX queue lockup with mixed UD/CM traffic

The IPoIB UD QP reports send completions to priv->send_cq, which is
usually left unarmed; it only gets armed when the number of
outstanding send requests reaches the size of the TX queue. This
arming is done only in the send path for the UD QP. However, when
sending CM packets, the net queue may be stopped for the same reasons
but no measures are taken to recover the UD path from a lockup.

Consider this scenario: a host sends high rate of both CM and UD
packets, with a TX queue length of N. If at some time the number of
outstanding UD packets is more than N/2 and the overall outstanding
packets is N-1, and CM sends a packet (making the number of
outstanding sends equal N), the TX queue will be stopped. When all
the CM packets complete, the number of outstanding packets will still
be higher than N/2 so the TX queue will not be restarted.

Fix this by calling ib_req_notify_cq() when the queue is stopped in
the CM path.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
3ff1562ea48cddaa5ac1adcb8892227389a4c96c 03-Mar-2010 Linus Torvalds <torvalds@linux-foundation.org> Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (48 commits)
IB/srp: Clean up error path in srp_create_target_ib()
IB/srp: Split send and recieve CQs to reduce number of interrupts
RDMA/nes: Add support for KR device id 0x0110
IB/uverbs: Use anon_inodes instead of private infinibandeventfs
IB/core: Fix and clean up ib_ud_header_init()
RDMA/cxgb3: Mark RDMA device with CXIO_ERROR_FATAL when removing
RDMA/cxgb3: Don't allocate the SW queue for user mode CQs
RDMA/cxgb3: Increase the max CQ depth
RDMA/cxgb3: Doorbell overflow avoidance and recovery
IB/core: Pack struct ib_device a little tighter
IB/ucm: Clean whitespace errors
IB/ucm: Increase maximum devices supported
IB/ucm: Use stack variable 'base' in ib_ucm_add_one
IB/ucm: Use stack variable 'devnum' in ib_ucm_add_one
IB/umad: Clean whitespace
IB/umad: Increase maximum devices supported
IB/umad: Use stack variable 'base' in ib_umad_init_port
IB/umad: Use stack variable 'devnum' in ib_umad_init_port
IB/umad: Remove port_table[]
IB/umad: Convert *cdev to cdev in struct ib_umad_port
...
309ce156aa27f29338438011d292a8d6496623d3 20-Feb-2010 Jayamohan Kallickal <jayamohank@serverengines.com> [SCSI] libiscsi: Make iscsi_eh_target_reset start with session reset

The iscsi_eh_target_reset has been modified to attempt
target reset only. If it fails, then iscsi_eh_session_reset
will be called.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Jayamohan Kallickal <jayamohank@serverengines.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
ser/iscsi_iser.c
f57721507ffeaf8785f7f17254651de4d8b45e09 02-Mar-2010 Roland Dreier <rolandd@cisco.com> Merge branch 'srp' into for-next
5c2187f0a184d6c5ec87aab403b79a8bb24a7988 02-Mar-2010 Roland Dreier <rolandd@cisco.com> Merge branch 'iser' into for-next
da9d2f07306fc29a2f10885c2b0a463f3863c365 25-Feb-2010 Roland Dreier <rolandd@cisco.com> IB/srp: Clean up error path in srp_create_target_ib()

Instead of repeating the error unwinding steps in each place an error
can be detected, use the common idiom of gotos into an error flow.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
9c03dc9f19351edf25c1107e3cfd3cc538c7ab9e 02-Feb-2010 Bart Van Assche <bart.vanassche@gmail.com> IB/srp: Split send and recieve CQs to reduce number of interrupts

We can reduce the number of IB interrupts from two interrupts per
srp_queuecommand() call to one by using separate CQs for send and
receive completions and processing send completions by polling every
time a TX IU is allocated.

Receive completion events still trigger an interrupt.

Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
rp/ib_srp.h
6c74651c3bce418d3b29edfdeb72664f9441509a 27-Feb-2010 Jiri Pirko <jpirko@redhat.com> ipoib: returned back addrlen check for mc addresses

Apparently bogus mc address can break IPOIB multicast processing. Therefore
returning the check for addrlen back until this is resolved in bonding (I don't
see any other point from where mc address with non-dev->addr_len length can came
from).

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_multicast.c
fbf219f1c89b15e90ec2db5a3e9636376dc623db 24-Feb-2010 Jiri Pirko <jpirko@redhat.com> infiniband: convert to use netdev_for_each_mc_addr

Due to the loop complexicity in nes_nic.c, I'm using char* to copy mc addresses
to it.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_multicast.c
88ec415772144f4fc4a50b123bb6200de686898d 08-Feb-2010 Or Gerlitz <ogerlitz@voltaire.com> IB/iser: Remove redundant locking from iser scsi command response flow

Currently the iSER receive completion flow takes the session lock
twice. Optimize it to avoid the first one by letting
iser_task_rdma_finalize() be called only from the cleanup_task
callback invoked by iscsi_free_task, thus reducing the contention on
the session lock between the scsi command submission to the scsi
command completion flows.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iser_initiator.c
962b4b528ba87c8d837bb04794a1918c7de631cd 08-Feb-2010 Or Gerlitz <ogerlitz@voltaire.com> IB/iser: Use libiscsi passthrough mode

libiscsi passthrough mode invokes the transport xmit calls directly
without first going through an internal queue, unlike the other mode,
which uses a queue and a xmitworker thread. Now that the "cant_sleep"
prerequisite of iscsi_host_alloc is met, move to use it. Handling
xmit errors is now done by the passthrough flow of libiscsi. Since
the queue/worker aren't used in this mode, the code that schedules the
xmitworker is removed.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.c
ser/iser_initiator.c
aae3c995ff74a183d15207436d383942485b2edd 08-Feb-2010 Or Gerlitz <ogerlitz@voltaire.com> IB/iser: Remove unnecessary connection checks

Remove unnecessary checks for the IB connection state and for QP
overflow, as conn state changes are reported by iSER to libiscsi and
handled there. QP overflow is theoretically possible only when
unsolicited data-outs are used; anyway it's being checked and handled
by HW drivers.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.h
ser/iser_initiator.c
ser/iser_verbs.c
528f4e8c8341706a354ff96daf615e678e9b296f 08-Feb-2010 Or Gerlitz <ogerlitz@voltaire.com> IB/iser: Use atomic allocations

Two minor flows in iSER's data path still use allocations; move them
to be atomic as a preperation step towards moving to use libiscsi
passthrough mode.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iser_initiator.c
ser/iser_memory.c
f19624aa92003969ba822cd3c552800965aa530b 08-Feb-2010 Or Gerlitz <ogerlitz@voltaire.com> IB/iser: Simplify send flow/descriptors

Simplify and shrink the logic/code used for the send descriptors.
Changes include removing struct iser_dto (an unnecessary abstraction),
using struct iser_regd_buf only for handling SCSI commands, using
dma_sync instead of dma_map/unmap, etc.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.c
ser/iscsi_iser.h
ser/iser_initiator.c
ser/iser_memory.c
ser/iser_verbs.c
78ad0a34dc138047529058c5f2265664cb70a052 08-Feb-2010 Or Gerlitz <ogerlitz@voltaire.com> IB/iser: Use different CQ for send completions

Use a different CQ for send completions, where send completions are
polled by the interrupt-driven receive completion handler. Therefore,
interrupts aren't used for the send CQ.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.h
ser/iser_verbs.c
704315f082d473b34047817f0a6a01924f38501e 08-Feb-2010 Or Gerlitz <ogerlitz@voltaire.com> IB/iser: Remove atomic counter for posted receive buffers

Now that both the posting and reaping of receive buffers is done in
the completion path, the counter of outstanding buffers not be atomic.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.h
ser/iser_initiator.c
ser/iser_verbs.c
bcc60c381d857ced653e912cbe6121294773e147 08-Feb-2010 Or Gerlitz <ogerlitz@voltaire.com> IB/iser: New receive buffer posting logic

Currently, the recv buffer posting logic is based on the transactional
nature of iSER which allows for posting a buffer before sending a PDU.
Change this to post only when the number of outstanding recv buffers
is below a water mark and in a batched manner, thus simplifying and
optimizing the data path. Use a pre-allocated ring of recv buffers
instead of allocating from kmem cache. A special treatment is given
to the login response buffer whose size must be 8K unlike the size of
buffers used for any other purpose which is 128 bytes.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.c
ser/iscsi_iser.h
ser/iser_initiator.c
ser/iser_verbs.c
1cef4659850eeb862c248c7670e404d7a1711ed1 08-Feb-2010 Or Gerlitz <ogerlitz@voltaire.com> IB/iser: Revert commit bba7ebb "avoid recv buffer exhaustion"

We will make a major change in the recv buffer posting logic, after
which the problem commit bba7ebb "avoid recv buffer exhaustion caused
by unexpected PDUs" comes to solve doesn't exist any more, so revert it.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.h
ser/iser_initiator.c
ser/iser_verbs.c
3ffe533c87281b68d469b279ff3a5056f9c75862 18-Feb-2010 Alexey Dobriyan <adobriyan@gmail.com> ipv6: drop unused "dev" arg of icmpv6_send()

Dunno, what was the idea, it wasn't used for a long time.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_cm.c
757bebb3f989f10acc6f105e89305b0d19aa7c55 12-Feb-2010 Or Gerlitz <ogerlitz@voltaire.com> IPoIB: Remove TX moderation settings from ethtool support

As of commit f56bcd8 ("IPoIB: Use separate CQ for UD send
completions"), there are no TX interrupts. Change the ethtool code
not to report TX moderation settings, so users will not be misled to
think they can control TX interrupt moderation. Pointed out by Alex
Vainman <alexv@voltaire.com>

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_ethtool.c
e69381b4175ba162229646f6753ff1d87c24d468 16-Dec-2009 Linus Torvalds <torvalds@linux-foundation.org> Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (45 commits)
RDMA/cxgb3: Fix error paths in post_send and post_recv
RDMA/nes: Fix stale ARP issue
RDMA/nes: FIN during MPA startup causes timeout
RDMA/nes: Free kmap() resources
RDMA/nes: Check for zero STag
RDMA/nes: Fix Xansation test crash on cm_node ref_count
RDMA/nes: Abnormal listener exit causes loopback node crash
RDMA/nes: Fix crash in nes_accept()
RDMA/nes: Resource not freed for REJECTed connections
RDMA/nes: MPA request/response error checking
RDMA/nes: Fix query of ORD values
RDMA/nes: Fix MAX_CM_BUFFER define
RDMA/nes: Pass correct size to ioremap_nocache()
RDMA/nes: Update copyright and branding string
RDMA/nes: Add max_cqe check to nes_create_cq()
RDMA/nes: Clean up struct nes_qp
RDMA/nes: Implement IB_SIGNAL_ALL_WR as an iWARP extension
RDMA/nes: Add additional SFP+ PHY uC status check and PHY reset
RDMA/nes: Correct fast memory registration implementation
IB/ehca: Fix error paths in post_send and post_recv
...
14f369d1d61e7ac6578c54ca9ce3caaf4072412c 16-Dec-2009 Roland Dreier <rolandd@cisco.com> Merge branches 'amso1100', 'cma', 'cxgb3', 'ehca', 'ipath', 'ipoib', 'iser', 'misc', 'mlx4' and 'nes' into for-next
4ef58d4e2ad1fa2a3e5bbf41af2284671fca8cf8 10-Dec-2009 Linus Torvalds <torvalds@linux-foundation.org> Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (42 commits)
tree-wide: fix misspelling of "definition" in comments
reiserfs: fix misspelling of "journaled"
doc: Fix a typo in slub.txt.
inotify: remove superfluous return code check
hdlc: spelling fix in find_pvc() comment
doc: fix regulator docs cut-and-pasteism
mtd: Fix comment in Kconfig
doc: Fix IRQ chip docs
tree-wide: fix assorted typos all over the place
drivers/ata/libata-sff.c: comment spelling fixes
fix typos/grammos in Documentation/edac.txt
sysctl: add missing comments
fs/debugfs/inode.c: fix comment typos
sgivwfb: Make use of ARRAY_SIZE.
sky2: fix sky2_link_down copy/paste comment error
tree-wide: fix typos "couter" -> "counter"
tree-wide: fix typos "offest" -> "offset"
fix kerneldoc for set_irq_msi()
spidev: fix double "of of" in comment
comment typo fix: sybsystem -> subsystem
...
0cd4d0fd9b0a4e10c091fc6316d1bf92885dcd9c 09-Dec-2009 David J. Wilder <dwilder@us.ibm.com> IPoIB: Clear ipoib_neigh.dgid in ipoib_neigh_alloc()

IPoIB can miss a change in destination GID under some conditions. The
problem is caused when ipoib_neigh->dgid contains a stale address.
The fix is to set ipoib_neigh->dgid to zero in ipoib_neigh_alloc().

This can happen when a system using bonding on its IPoIB interfaces
has switched its active interface from interface A to B and back to A.
The system that fails over will not correctly processes the 2nd
address change, as described below.

When an address has changed neighbor->ha is updated with the new
address. Each neighbor has an associated ipoib_neigh.
ipoib_neigh->dgid also holds a copy of the remote node's hardware
address. When an address changes neighbor->ha is updated by the
network layer (arp code) with the new address. IPoIB detects this
change in ipoib_start_xmit() by comparing neighbor->ha with
ipoib_neigh->dgid. The bug is that ipoib_neigh->dgid may already
contain the new address (A) thus the change from B to A is missed by
ipoib. Here is the sequence of events:

ipoib_neigh->dgid = A and neighbor->ha = A

The address is switched to B (the first switch)

neighbor->ha = B

The change is seen in ipoib_start_xmit() -- neighbor->ha !=
ipoib_neigh->dgid so ipoib_neigh is released, and a new one is
allocated.

The allocator may return the same chunk of memory that was just
released, therefore ipoib_neigh->dgid still contains A at this point.

ipoib_neigh->dgid should be updated in neigh_add_path(), but if the
following conditions are true dgid is not updated:

1) __path_find() returns a path
2) path->ah is NULL

The remote system now switches from address B to A, neighbor->ha is
updated to A.

Now we have again : ipoib_neigh->dgid = A and neighbor->ha = A

Since the addresses are the same ipoib won't process the change in
address. Fix this by zeroing out the dgid field when allocating a new
struct ipoib_neigh.

Signed-off-by: David Wilder <dwilder@us.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
b20d038dff877566694181578c49c31616d622cd 11-Nov-2009 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iser: set tgt and lu reset timeout

When iser enabled lu reset support it did not set the
bit to allow userspace to get/set the timeout. This
sets the tgt and lu reset timeout bits.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
ser/iscsi_iser.c
94e2bd688820aed72b4f8092f88c2ccf64e003de 16-Oct-2009 Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com> tree-wide: fix some typos and punctuation in comments

fix some typos and punctuation in comments

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
ser/iser_verbs.c
c1ccaf2478f84c2665cf57f981db143aa582d646 12-Nov-2009 Or Gerlitz <ogerlitz@voltaire.com> IB/iser: Rewrite SG handling for RDMA logic

After dma-mapping an SG list provided by the SCSI midlayer, iser has
to make sure the mapped SG is "aligned for RDMA" in the sense that its
possible to produce one mapping in the HCA IOMMU which represents the
whole SG. Next, the mapped SG is formatted for registration with the HCA.

This patch re-writes the logic that does the above, to make it clearer
and simpler. It also fixes a bug in the being aligned for RDMA checks,
where a "start" check wasn't done but rather only "end" check.

Signed-off-by: Alexander Nezhinsky <alexandern@voltaire.com>
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iser_memory.c
b8b9e1b8128d8854cf55740f9ceba3010143520d 22-Sep-2009 Jayamohan Kallickal <jayamohank@serverengines.com> [SCSI] libiscsi: iscsi_session_setup to allow for private space

This patch contains changes that allow iscsi_session_setup
to allocate private space for LLD's

Signed-off-by: Jayamohan Kallickal <jayamohank@serverengines.com>
Acked-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
ser/iscsi_iser.c
d7757be133cc05620608af46acd178686681b7ef 25-Sep-2009 Linus Torvalds <torvalds@linux-foundation.org> Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IPoIB: Don't turn on carrier for a non-active port
IB/mthca: Fix access to freed memory in catastrophic event handling
mlx4_core: Pass cache line size to device FW
RDMA/nes: Remove duplicate .ndo_set_mac_address field initialization
IB/mad: Fix lock-lock-timer deadlock in RMPP code
5ee95120841fd623c48d7d971182cf58e3b0c8de 24-Sep-2009 Moni Shoua <monis@Voltaire.COM> IPoIB: Don't turn on carrier for a non-active port

Multicast joins can succeed even if the IB port is down. This happens
when the SM runs on the same port with the requesting port. However,
IPoIB calls netif_carrier_on() when the join of the broadcast group
succeeds, without caring about the state of the IB port. The result
is an IPoIB interface in RUNNING state but without an active IB port
to support it.

If a bonding interface uses this IPoIB interface as a slave it might
not detect that this slave is almost useless and failover
functionality will be damaged. The fix checks the state of the IB
port in the carrier_task before calling netif_carrier_on().

Adresses: https://bugs.openfabrics.org/show_bug.cgi?id=1726
Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
d7e9660ad9d5e0845f52848bce31bcf5cdcdea6b 14-Sep-2009 Linus Torvalds <torvalds@linux-foundation.org> Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1623 commits)
netxen: update copyright
netxen: fix tx timeout recovery
netxen: fix file firmware leak
netxen: improve pci memory access
netxen: change firmware write size
tg3: Fix return ring size breakage
netxen: build fix for INET=n
cdc-phonet: autoconfigure Phonet address
Phonet: back-end for autoconfigured addresses
Phonet: fix netlink address dump error handling
ipv6: Add IFA_F_DADFAILED flag
net: Add DEVTYPE support for Ethernet based devices
mv643xx_eth.c: remove unused txq_set_wrr()
ucc_geth: Fix hangs after switching from full to half duplex
ucc_geth: Rearrange some code to avoid forward declarations
phy/marvell: Make non-aneg speed/duplex forcing work for 88E1111 PHYs
drivers/net/phy: introduce missing kfree
drivers/net/wan: introduce missing kfree
net: force bridge module(s) to be GPL
Subject: [PATCH] appletalk: Fix skb leak when ipddp interface is not loaded
...

Fixed up trivial conflicts:

- arch/x86/include/asm/socket.h

converted to <asm-generic/socket.h> in the x86 tree. The generic
header has the same new #define's, so that works out fine.

- drivers/net/tun.c

fix conflict between 89f56d1e9 ("tun: reuse struct sock fields") that
switched over to using 'tun->socket.sk' instead of the redundantly
available (and thus removed) 'tun->sk', and 2b980dbd ("lsm: Add hooks
to the TUN driver") which added a new 'tun->sk' use.

Noted in 'next' by Stephen Rothwell.
5e47596bee12597824a3b5b21e20f80b61e58a35 06-Sep-2009 Jason Gunthorpe <jgunthorpe@obsidianresearch.com> IPoIB: Check multicast address format

Check that the format of multicast link addresses is correct before
taking them from dev->mc_list to priv->multicast_list. This way we
never try to send a bogus address to the SA, which prevents badness
from erronous 'ip maddr addr add', broken bonding drivers, etc.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
721d67cdca5b7642b380ca0584de8dceecf6102f 06-Sep-2009 Roland Dreier <rolandd@cisco.com> IPoIB: Drop priv->lock before calling ipoib_send()

IPoIB currently must use irqsave locking for priv->lock, since it is
taken from interrupt context in one path. However, ipoib_send() does
skb_orphan(), and the network stack locking is not IRQ-safe.
Therefore we need to make sure we don't hold priv->lock when calling
ipoib_send() to avoid lockdep warnings (the code was almost certainly
safe in practice, since the only code path that takes priv->lock from
interrupt context would never call into the network stack).

Addresses: http://bugzilla.kernel.org/show_bug.cgi?id=13757
Reported-by: Bart Van Assche <bart.vanassche@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
poib/ipoib_multicast.c
cd0bcf4cb963a147baf0b79d94c25ba86220f708 06-Sep-2009 Roland Dreier <rolandd@cisco.com> IPoIB: Remove unused <rdma/ib_cache.h> includes

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
poib/ipoib_ib.c
451f14439847db302e5104c44458b2dbb4b1829d 31-Aug-2009 Eric Dumazet <eric.dumazet@gmail.com> drivers: Kill now superfluous ->last_rx stores

The generic packet receive code takes care of setting
netdev->last_rx when necessary, for the sake of the
bonding ARP monitor.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Neil Horman <nhorman@txudriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_cm.c
poib/ipoib_ib.c
9cbc1cb8cd46ce1f7645b9de249b2ce8460129bb 15-Jun-2009 David S. Miller <davem@davemloft.net> Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6

Conflicts:
Documentation/feature-removal-schedule.txt
drivers/scsi/fcoe/fcoe.c
net/core/drop_monitor.c
net/core/net-traces.c
adf30907d63893e4208dfe3f5c88ae12bc2f25d5 02-Jun-2009 Eric Dumazet <eric.dumazet@gmail.com> net: skb->dst accessors

Define three accessors to get/set dst attached to a skb

struct dst_entry *skb_dst(const struct sk_buff *skb)

void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

void skb_dst_drop(struct sk_buff *skb)
This one should replace occurrences of :
dst_release(skb->dst)
skb->dst = NULL;

Delete skb->dst field

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_cm.c
poib/ipoib_main.c
poib/ipoib_multicast.c
86d15cd83363a9787039895cb1a1b6be50f82ad3 31-May-2009 Eric Dumazet <eric.dumazet@gmail.com> net: unset IFF_XMIT_DST_RELEASE for qeth and ipoib

Last two drivers that need skb->dst in their start_xmit() function

Tell dev_hard_start_xmit() to no release it by unsetting IFF_XMIT_DST_RELEASE

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_main.c
b3cd5050bf8eb32ceecee129cac7c59e6f1668c4 14-May-2009 Mike Christie <michaelc@cs.wisc.edu> [SCSI] libiscsi: add task aborted state

If a task did not complete normally due to a TMF, libiscsi will
now complete the task with the state ISCSI_TASK_ABRT_TMF. Drivers
like bnx2i that need to free resources if a command did not complete normally
can then check the task state. If a driver does not need to send
a special command if we have dropped the session then they can check
for ISCSI_TASK_ABRT_SESS_RECOV.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
10eb0f013c63c71c82ede77945a5f390c10cfda6 14-May-2009 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iscsi: pass ep connect shost

When we create the tcp/ip connection by calling ep_connect, we currently
just go by the routing table info.

I think there are two problems with this.

1. Some drivers do not have access to a routing table. Some drivers like
qla4xxx do not even know about other ports.

2. If you have two initiator ports on the same subnet, the user may have
set things up so that session1 was supposed to be run through port1. and
session2 was supposed to be run through port2. It looks like we could
end with both sessions going through one of the ports.

Fixes for cxgb3i from Karen Xie.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
26574401fef6766f6c3ca25b5c13febe662d2a32 13-May-2009 Eric W. Biederman <ebiederm@xmission.com> net: Fix ipoib rtnl_lock sysfs deadlock.

Network device sysfs files that grab the rtnl_lock unconditionally
will deadlock if accessed when the network device is being
unregistered. So use trylock and syscall_restart to avoid this
deadlock.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_cm.c
poib/ipoib_vlan.c
61bd1e858db743af64f6e363c526f7e433d12e0c 03-May-2009 Linus Torvalds <torvalds@linux-foundation.org> Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: (53 commits)
[SCSI] libosd: OSD2r05: on-the-wire changes for latest OSD2 revision 5.
[SCSI] libosd: OSD2r05: OSD_CRYPTO_KEYID_SIZE will grow 20 => 32 bytes
[SCSI] libosd: OSD2r05: Prepare for rev5 attribute list changes
[SCSI] libosd: fix potential ERR_PTR dereference in osd_initiator.c
[SCSI] mpt2sas : bump driver version to 01.100.02.00
[SCSI] mpt2sas: fix hotplug event processing
[SCSI] mpt2sas : release diagnotic buffers prior host reset
[SCSI] mpt2sas : Broadcast Primative AEN bug fix
[SCSI] mpt2sas : Identify Dell series-7 adapters at driver load time
[SCSI] mpt2sas : driver name needs to be in the MPT2IOCINFO ioctl
[SCSI] mpt2sas : running out of message frames
[SCSI] mpt2sas : fix oops when firmware sends large sense buffer size
[SCSI] mpt2sas : the sanity check in base_interrupt needs to be on dword boundary
[SCSI] mpt2sas : unique ioctl magic number
[SCSI] fix sign extension with 1.5TB usb-storage LBD=y
[SCSI] ipr: Fix sleeping function called with interrupts disabled
[SCSI] fcoe: fip: add multicast filter to receive FIP advertisements.
[SCSI] libfc: Fix compilation warnings with allmodconfig
[SCSI] fcoe: fix spelling typos and bad comments
[SCSI] fcoe: don't export functions that are internal to fcoe
...
6b5d6c443a9b4fd71b633cef66b5db4de8a85787 21-Apr-2009 Mike Christie <michaelc@cs.wisc.edu> [SCSI] cxgb3i, iser, iscsi_tcp: set target can queue

Set target can queue limit to the number of preallocated
session tasks we have.

This along with the cxgb3i can_queue patch will fix a throughput
problem where it could only queue one LU worth of data at a time.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
e028cc55cc5c90a1c57eefe560a0cbb4df1fed14 20-Apr-2009 Yossi Etigin <yosefe@voltaire.com> IPoIB: Disable NAPI while CQ is being drained

If NAPI is enabled while IPoIB's CQ is being drained, it creates a
race on priv->ibwc between ipoib_poll() and ipoib_drain_cq(), leading
to memory corruption.

The solution is to enable/disable NAPI in ipoib_ib_dev_{open/stop}()
instead of in ipoib_{open/stop}(), and sync NAPI on the INITIALIZED
flag instead on the ADMIN_UP flag. This way NAPI will be disabled when
ipoib_drain_cq() is called.

This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1587>.

Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_ib.c
poib/ipoib_main.c
0534c8cb5c8a8a954751fa01eef7831a475a9ec5 10-Apr-2009 Linus Torvalds <torvalds@linux-foundation.org> Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
RDMA/nes: Add support for new SFP+ PHY
RDMA/nes: Add wide_ppm_offset parm for switch compatibility
RDMA/nes: Fix SFP+ PHY initialization
RDMA/nes: Fix nes_nic_cm_xmit() error handling
RDMA/nes: Fix error handling issues
RDMA/nes: Fix incorrect casts on 32-bit architectures
IPoIB: Document newish features
RDMA/cma: Create cm id even when IB port is down
RDMA/cma: Use rate from IPoIB broadcast when joining IPoIB multicast groups
IPoIB: Avoid free_netdev() BUG when destroying a child interface
mlx4_core: Don't leak mailbox for SET_PORT on Ethernet ports
RDMA/cxgb3: Release dependent resources only when endpoint memory is freed.
RDMA/cxgb3: Handle EEH events
IB/mlx4: Use pgprot_writecombine() for BlueFlame pages
edb5abb1e2a84fd8802a3577d95eac84fe1405ab 31-Mar-2009 Roland Dreier <rolandd@cisco.com> IPoIB: Avoid free_netdev() BUG when destroying a child interface

We have to release the RTNL before calling free_netdev() so that the
device state has a chance to become NETREG_UNREGISTERED. Otherwise
when removing a child interface, we hit the BUG() that tests the
device state in free_netdev().

Reported-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_vlan.c
d54b3538b0bfb31351d02d1669d4a978d2abfc5f 28-Mar-2009 Linus Torvalds <torvalds@linux-foundation.org> Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (119 commits)
[SCSI] scsi_dh_rdac: Retry for NOT_READY check condition
[SCSI] mpt2sas: make global symbols unique
[SCSI] sd: Make revalidate less chatty
[SCSI] sd: Try READ CAPACITY 16 first for SBC-2 devices
[SCSI] sd: Refactor sd_read_capacity()
[SCSI] mpt2sas v00.100.11.15
[SCSI] mpt2sas: add MPT2SAS_MINOR(221) to miscdevice.h
[SCSI] ch: Add scsi type modalias
[SCSI] 3w-9xxx: add power management support
[SCSI] bsg: add linux/types.h include to bsg.h
[SCSI] cxgb3i: fix function descriptions
[SCSI] libiscsi: fix possbile null ptr session command cleanup
[SCSI] iscsi class: remove host no argument from session creation callout
[SCSI] libiscsi: pass session failure a session struct
[SCSI] iscsi lib: remove qdepth param from iscsi host allocation
[SCSI] iscsi lib: have lib create work queue for transmitting IO
[SCSI] iscsi class: fix lock dep warning on logout
[SCSI] libiscsi: don't cap queue depth in iscsi modules
[SCSI] iscsi_tcp: replace scsi_debug/tcp_debug logging with iscsi conn logging
[SCSI] libiscsi_tcp: replace tcp_debug/scsi_debug logging with session/conn logging
...
13220a94d35708d5378114e96ffcc88d0a74fe99 26-Mar-2009 Linus Torvalds <torvalds@linux-foundation.org> Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1750 commits)
ixgbe: Allow Priority Flow Control settings to survive a device reset
net: core: remove unneeded include in net/core/utils.c.
e1000e: update version number
e1000e: fix close interrupt race
e1000e: fix loss of multicast packets
e1000e: commonize tx cleanup routine to match e1000 & igb
netfilter: fix nf_logger name in ebt_ulog.
netfilter: fix warning in ebt_ulog init function.
netfilter: fix warning about invalid const usage
e1000: fix close race with interrupt
e1000: cleanup clean_tx_irq routine so that it completely cleans ring
e1000: fix tx hang detect logic and address dma mapping issues
bridge: bad error handling when adding invalid ether address
bonding: select current active slave when enslaving device for mode tlb and alb
gianfar: reallocate skb when headroom is not enough for fcb
Bump release date to 25Mar2009 and version to 0.22
r6040: Fix second PHY address
qeth: fix wait_event_timeout handling
qeth: check for completion of a running recovery
qeth: unregister MAC addresses during recovery.
...

Manually fixed up conflicts in:
drivers/infiniband/hw/cxgb3/cxio_hal.h
drivers/infiniband/hw/nes/nes_nic.c
09f98bafea792644f2dea39eb080aa57d854f5b3 25-Mar-2009 Roland Dreier <rolandd@cisco.com> Merge branches 'cxgb3', 'endian', 'ipath', 'ipoib', 'iser', 'mad', 'misc', 'mlx4', 'mthca', 'nes' and 'sysfs' into for-next
fe8114e8e1d15ba07ddcaebc4741957a1546f307 20-Mar-2009 Stephen Hemminger <shemminger@vyatta.com> infiniband: convert ipoib to net_device_ops

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_main.c
5e7facb77ff4b6961d936773fb1f175f7abf76b7 05-Mar-2009 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iscsi class: remove host no argument from session creation callout

We do not need to have llds set the host no for the session's
parent, because we know the session's parent is going to be
the host. This removes it from the session creation callback
and converts the drivers.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
4d1083509a69a36cc1394f188b7b8956e5526a16 05-Mar-2009 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iscsi lib: remove qdepth param from iscsi host allocation

The qdepth setting was useful when we needed libiscsi to verify
the setting. Now we just need to make sure if older tools
passed in zero then we need to set some default.

So this patch just has us use the sht->cmd_per_lun or if
for LLD does a host per session then we can set it on per
host basis.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
32ae763e3fce4192cd008956a340353a2e5c3192 05-Mar-2009 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iscsi lib: have lib create work queue for transmitting IO

We were using the shost work queue which ended up being
a little akward since all iscsi hosts need a thread for
scanning, but only drivers hooked into libiscsi need
a workqueue for transmitting. So this patch moves the
xmit workqueue to the lib.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
ser/iser_initiator.c
e28f3d5b51ed07d822f135cd941b01e2d485270e 05-Mar-2009 Mike Christie <michaelc@cs.wisc.edu> [SCSI] libiscsi: don't cap queue depth in iscsi modules

There is no need to cap the queue depth in the modules. We set
this in userspace and can do that there. For performance testing
with ram based targets, this is helpful since we can have very
high queue depths.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
ser/iscsi_iser.h
48a237a26db0a31404c83a88e984b37a30ddcf5a 05-Mar-2009 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iser: have iser use its own logging

iser has its own logging inrfastrucutre. Convert it to use
it instead of libiscsi.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
1aedb7721f05461f777fdee25b50d8a168c425ed 27-Feb-2009 Or Gerlitz <ogerlitz@voltaire.com> IB/iser: Remove hard setting of path MTU

Remove hard setting of the IB MTU used by iSER's RC queue-pair to 1K,
as this was done due to inter-op issues with an old iser target which
is not used any more.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iser_verbs.c
71d98b4628ee869d62814f6d8607d76cab4b9ec5 17-Feb-2009 Jack Morgenstein <jackm@dev.mellanox.co.il> IPoIB: In unicast_arp_send(), only free newly-created paths

If path_rec_start() returns error, call path_free() only if the path
was newly-created. If we free an existing path whose valid flag was zero,
(but do not detach it from the list) we cause corruption of the
path list (of which it is a member), and get a kernel crash.

The simplest solution is to not free an existing path -- just leave it
in the list as-is (i.e., with its valid flag cleared).

Thanks to Yossi Etigin of Voltaire for identifying the problem flow
which caused the kernel crash.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Moni Shua <monis@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
288379f050284087578b77e04f040b57db3db3f8 20-Jan-2009 Ben Hutchings <bhutchings@solarflare.com> net: Remove redundant NAPI functions

Following the removal of the unused struct net_device * parameter from
the NAPI functions named *netif_rx_* in commit 908a7a1, they are
exactly equivalent to the corresponding *napi_* functions and are
therefore redundant.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_ib.c
3c20962086b0ceb5498ba840e5a91bf4a692aae9 16-Jan-2009 Yossi Etigin <yosefe@Voltaire.COM> IPoIB: Do not print error messages for multicast join retries

When IPoIB tries to join a multicast group, and the SA module's SM
address handle is NULL (because of an SM change, etc), the join
returns with -EAGAIN status. In that case, don't print an error
message unless multicast debugging is enabled.

Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
cbbe1efa4972350286b52cb48aefaa11e198c0fb 15-Jan-2009 Roland Dreier <rolandd@cisco.com> IPoIB: Fix deadlock between ipoib_open() and child interface create

Fix a deadlock between child interface creation/deletion and ipoib
start/stop. The former takes vlan_mutex, and then might take RTNL via
register_netdev()/unregister_netdev(). The latter is executed with
RTNL held, and tries to take vlan_mutex, which can lead to an AB-BA
deadlock.

Fix this by having the child interface creation/deletion code take the
RTNL first so vlan_mutex always nests inside RTNL. We can use
register_netdevice() for child interfaces because we form the
interface name from the parent interface and hence don't need the '%'
expansion of register_netdev().

Reported-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_vlan.c
b8a1b1ce14252b59b2d5c89de25b54f9bfd4cc5e 14-Jan-2009 Roland Dreier <rolandd@cisco.com> IPoIB: Fix hang in napi_disable() if P_Key is never found

After commit fe25c561 ("IPoIB: Don't enable NAPI when it's already
enabled"), if an interface is brought up but the corresponding P_Key
never appears, then ipoib_stop() will hang in napi_disable(), because
ipoib_open() returns before it does napi_enable().

Fix this by changing ipoib_open() to call napi_enable() even if the
P_Key isn't present.

Reported-by: Yossi Etigin <yosefe@Voltaire.COM>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
ccbf04f24c55ead791dac5df8ddeb1a640fbaad8 13-Jan-2009 Linus Torvalds <torvalds@linux-foundation.org> Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/iser: Add dependency on INFINIBAND_ADDR_TRANS
IPoIB: Do not join broadcast group if interface is brought down
RDMA/nes: Fix for NIPQUAD removal
IPoIB: Fix loss of connectivity after bonding failover on both sides
IB/mlx4: Don't register IB device for adapters with no IB ports
mlx4_core: Fix warning from min()
IB/ehca: spin_lock_irqsave() takes an unsigned long
8c9ea7fe96afb30660673da77853114827fac0ca 13-Jan-2009 Roland Dreier <rolandd@cisco.com> Merge branches 'ehca', 'ipoib', 'iser', 'mlx4' and 'nes' into for-next
f5eb3b76003cc36f3f66514eef05779e7559c6a3 13-Jan-2009 Randy Dunlap <randy.dunlap@oracle.com> IB/iser: Add dependency on INFINIBAND_ADDR_TRANS

Fix ib_iser build to depend on INFINIBAND_ADDR_TRANS; if INET=y but
IPV6=n, then the RDMA CM is not built but INFINIBAND_ISER can be
enabled, leading to:

ERROR: "rdma_destroy_id" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
ERROR: "rdma_connect" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
ERROR: "rdma_destroy_qp" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
ERROR: "rdma_create_id" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
ERROR: "rdma_create_qp" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
ERROR: "rdma_resolve_route" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
ERROR: "rdma_disconnect" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
ERROR: "rdma_resolve_addr" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
ser/Kconfig
50df48f59d656d58a1734df5cfe00cdc9a74e8b5 13-Jan-2009 Yossi Etigin <yosefe@Voltaire.COM> IPoIB: Do not join broadcast group if interface is brought down

Because the ipoib_workqueue is not flushed when ipoib interface is
brought down, ipoib_mcast_join() may trigger a join to the broadcast
group after priv->broadcast was set to NULL (during cleanup). This
will cause the system to be a member of the broadcast group when
interface is down. As a side effect, this breaks the optimization of
setting the Q_key only when joining the broadcast group.

Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
a50df398cddf6b757bdbf30f5f0875982ef5c660 09-Jan-2009 Yossi Etigin <yosefe@Voltaire.COM> IPoIB: Fix loss of connectivity after bonding failover on both sides

Fix bonding failover in the case both peers failover and the
gratuitous ARP is lost. In that case, the sender side will create an
ipoib_neigh and issue a path request with the old GID first. When
skb->dst->neighbour->ha changes due to ARP refresh, this ipoib_neigh
will not be added to the path->list of the path of the new GID,
because the ipoib_neigh already exists. It will not have an AH
either, because of sender-side failover. Therefore, it will not get
an AH when the path is resolved.

The solution here is to compare GIDs in ipoib_start_xmit() even if
neigh->ah is invalid. Comparing with an uninitialized value of
neigh->dgid should be fine, since a spurious match is harmless (and
astronomically unlikely too).

Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
d927e38c6c1859494792547beee249c17b43a17e 06-Jan-2009 Kay Sievers <kay.sievers@vrfy.org> infiniband: struct device - replace bus_id with dev_name(), dev_set_name()

Acked-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
rp/ib_srp.c
2ff79d52d56eebcffd83e9327b89d7daedf1e897 02-Dec-2008 Mike Christie <michaelc@cs.wisc.edu> [SCSI] libiscsi: pass opcode into alloc_pdu callout

We do not need to allocate a itt for data_out, so this
passes the opcode to the alloc_pdu callout.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
0f9c7449ce050759d10424048b96d1bd0d59dcc1 02-Dec-2008 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iser: convert iser to new alloc_pdu api

This just converts iser to new alloc_pdu api. It still
preallocates the pdu, so there is no difference.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
ser/iser_initiator.c
0191b625ca5a46206d2fb862bb08f36f2fcb3b31 28-Dec-2008 Linus Torvalds <torvalds@linux-foundation.org> Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1429 commits)
net: Allow dependancies of FDDI & Tokenring to be modular.
igb: Fix build warning when DCA is disabled.
net: Fix warning fallout from recent NAPI interface changes.
gro: Fix potential use after free
sfc: If AN is enabled, always read speed/duplex from the AN advertising bits
sfc: When disabling the NIC, close the device rather than unregistering it
sfc: SFT9001: Add cable diagnostics
sfc: Add support for multiple PHY self-tests
sfc: Merge top-level functions for self-tests
sfc: Clean up PHY mode management in loopback self-test
sfc: Fix unreliable link detection in some loopback modes
sfc: Generate unique names for per-NIC workqueues
802.3ad: use standard ethhdr instead of ad_header
802.3ad: generalize out mac address initializer
802.3ad: initialize ports LACPDU from const initializer
802.3ad: remove typedef around ad_system
802.3ad: turn ports is_individual into a bool
802.3ad: turn ports is_enabled into a bool
802.3ad: make ntt bool
ixgbe: Fix set_ringparam in ixgbe to use the same memory pools.
...

Fixed trivial IPv4/6 address printing conflicts in fs/cifs/connect.c due
to the conversion to %pI (in this networking merge) and the addition of
doing IPv6 addresses (from the earlier merge of CIFS).
908a7a16b852ffd618a9127be8d62432182d81b4 23-Dec-2008 Neil Horman <nhorman@tuxdriver.com> net: Remove unused netdev arg from some NAPI interfaces.

When the napi api was changed to separate its 1:1 binding to the net_device
struct, the netif_rx_[prep|schedule|complete] api failed to remove the now
vestigual net_device structure parameter. This patch cleans up that api by
properly removing it..

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_ib.c
bba7ebba3b17f4fe8c5907a32e16d9bd3fcf5192 21-Dec-2008 David Disseldorp <ddiss@sgi.com> IB/iser: Avoid recv buffer exhaustion caused by unexpected PDUs

iSCSI/iSER targets may send PDUs without a prior request from the
initiator. RFC 5046 refers to these PDUs as "unexpected". NOP-In PDUs
with itt=RESERVED and Asynchronous Message PDUs occupy this category.

The amount of active "unexpected" PDU's an iSER target may have at any
time is governed by the MaxOutstandingUnexpectedPDUs key, which is not
yet supported.

Currently when an iSER target sends an "unexpected" PDU, the
initiators recv buffer consumed by the PDU is not replaced. If over
initial_post_recv_bufs_num "unexpected" PDUs are received then the
receive queue will run out of receive work requests entirely.

This patch ensures recv buffers consumed by "unexpected" PDUs are
replaced in the next iser_post_receive_control() call.

Signed-off-by: David Disseldorp <ddiss@sgi.com>
Signed-off-by: Ken Sandars <ksandars@sgi.com>
Acked-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.h
ser/iser_initiator.c
ser/iser_verbs.c
198d6ba4d7f48c94f990f4604f0b3d73925e0ded 19-Nov-2008 David S. Miller <davem@davemloft.net> Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

Conflicts:

drivers/isdn/i4l/isdn_net.c
fs/cifs/connect.c
ff79ae80837cf45cb703b34824dd3862d2ddcb24 12-Nov-2008 Yossi Etigin <yosefe@Voltaire.COM> IPoIB: Fix crash in path_rec_completion()

Fix a crash in path_rec_completion() during an SM up/down loop. If
more than one path record request is issued, the first completion
releases path->done, allowing ipoib_flush_paths() to free the path,
and thus corrupting it for the second completion.

Commit ee1e2c82 ("IPoIB: Refresh paths instead of flushing them on SM
change events") added the field path->valid and changed the test "if
(!path)" to "if (!path || !path->valid)". This change made it
possible for a path with an outstanding query to pass the test and
issue another query on the same path. Having two queries on the same
path leads to a crash.

This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1325>.

Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
93a3ab939ba90e00e193f0bad98f43fbdfbd925d 12-Nov-2008 Yossi Etigin <yosefe@Voltaire.COM> IPoIB: Fix hang in ipoib_flush_paths()

ipoib_flush_paths() can hang during an SM up/down loop: if
path_rec_start() fails (for instance, because there is no sm_ah), the
path is still added to the path list by neigh_add_path(). Then,
ipoib_flush_paths() will wait for path->done, but it will never
complete because the request was not issued at all. Fix this by
completing path->done if issuing the query fails.

This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1329>.

Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
fe25c56190bbc0951d7c53b4ccd148e669d69938 12-Nov-2008 Yossi Etigin <yosefe@Voltaire.COM> IPoIB: Don't enable NAPI when it's already enabled

If a P_Key is not present when an interface is created, ipoib_open()
will return after doing napi_enable(). ipoib_open() will be called
again from ipoib_pkey_poll() when the P_Key appears, after NAPI has
already been enabled, and try to enable it again. This triggers a
BUG_ON() in napi_enable().

Fix this by moving the call to napi_enable() to after the test for
P_Key presence.

Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
63779436ab4ad0867bcea53bf853b0004d7b895d 31-Oct-2008 Harvey Harrison <harvey.harrison@gmail.com> drivers: replace NIPQUAD()

Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u
can be replaced with %pI4

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ser/iser_verbs.c
5b095d98928fdb9e3b75be20a54b7a6cbf6ca9ad 29-Oct-2008 Harvey Harrison <harvey.harrison@gmail.com> net: replace %p6 with %pI6

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_cm.c
poib/ipoib_main.c
poib/ipoib_multicast.c
rp/ib_srp.c
8c165a8383ef56e84b541fa638be5cf1440010e7 29-Oct-2008 Harvey Harrison <harvey.harrison@gmail.com> infiniband: remove IPOIB_GID_RAW_ARG, IPOIB_GID_ARG, IPOIB_GID_FMT

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib.h
fcace2fe7a86237c451b09aaf7e2e9d19e09887f 29-Oct-2008 Harvey Harrison <harvey.harrison@gmail.com> infiniband: ipoib replace IPOIB_GID_FMT with %p6

Replace all uses of IPOIB_GID_FMT, IPOIB_GID_RAW_ARG() and IPOIB_GID_ARG()

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_cm.c
poib/ipoib_main.c
poib/ipoib_multicast.c
8867cd7c8678ff2d9d0382dbbfbcc7a3e7e61cbc 29-Oct-2008 Harvey Harrison <harvey.harrison@gmail.com> infiniband: use %p6 for printing message ids

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rp/ib_srp.c
724bdd097e4d47b6ad963db5d92258ab5c485e05 23-Oct-2008 Linus Torvalds <torvalds@linux-foundation.org> Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/ehca: Reject dynamic memory add/remove when ehca adapter is present
IB/ehca: Fix reported max number of QPs and CQs in systems with >1 adapter
IPoIB: Set netdev offload features properly for child (VLAN) interfaces
IPoIB: Clean up ethtool support
mlx4_core: Add Ethernet PCI device IDs
mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC
mlx4_core: Multiple port type support
mlx4_core: Ethernet MAC/VLAN management
mlx4_core: Get ethernet MTU and default address from firmware
mlx4_core: Support multiple pre-reserved QP regions
Update NetEffect maintainer emails to Intel emails
RDMA/cxgb3: Remove cmid reference on tid allocation failures
IB/mad: Use krealloc() to resize snoop table
IPoIB: Always initialize poll_timer to avoid crash on unload
IB/ehca: Don't allow creating UC QP with SRQ
mlx4_core: Add QP range reservation support
RDMA/ucma: Test ucma_alloc_multicast() return against NULL, not with IS_ERR()
83bb63f62bda28be88b21216fbb59838a10f2348 23-Oct-2008 Or Gerlitz <ogerlitz@voltaire.com> IPoIB: Set netdev offload features properly for child (VLAN) interfaces

Child devices were created without any offload features set, fix this by
moving the code that computes the features into generic function which is
now called through non-child and child device creation.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>

-- v1 has a bug where the 'result' flag in ipoib_vlan_add may be used uninitialized
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_main.c
poib/ipoib_vlan.c
70c9c0db549245a49cabf42d5a74688077254d46 23-Oct-2008 Or Gerlitz <ogerlitz@voltaire.com> IPoIB: Clean up ethtool support

Add a get_rx_csum method. Remove the driver's own get_tso method, as
the ethtool kernel code uses the default one if nothing is provided.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_ethtool.c
ed09441dacc2a2d6c170aa3b1f79a041291a813f 17-Oct-2008 Linus Torvalds <torvalds@linux-foundation.org> Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (39 commits)
[SCSI] sd: fix compile failure with CONFIG_BLK_DEV_INTEGRITY=n
libiscsi: fix locking in iscsi_eh_device_reset
libiscsi: check reason why we are stopping iscsi session to determine error value
[SCSI] iscsi_tcp: return a descriptive error value during connection errors
[SCSI] libiscsi: rename host reset to target reset
[SCSI] iscsi class: fix endpoint id handling
[SCSI] libiscsi: Support drivers initiating session removal
[SCSI] libiscsi: fix data corruption when target has to resend data-in packets
[SCSI] sd: Switch kernel printing level for DIF messages
[SCSI] sd: Correctly handle all combinations of DIF and DIX
[SCSI] sd: Always print actual protection_type
[SCSI] sd: Issue correct protection operation
[SCSI] scsi_error: fix target reset handling
[SCSI] lpfc 8.2.8 v2 : Add statistical reporting control and additional fc vendor events
[SCSI] lpfc 8.2.8 v2 : Add sysfs control of target queue depth handling
[SCSI] lpfc 8.2.8 v2 : Revert target busy in favor of transport disrupted
[SCSI] scsi_dh_alua: remove REQ_NOMERGE
[SCSI] lpfc 8.2.8 : update driver version to 8.2.8
[SCSI] lpfc 8.2.8 : Add MSI-X support
[SCSI] lpfc 8.2.8 : Update driver to use new Host byte error code DID_TRANSPORT_DISRUPTED
...
a447c0932445f92ce6f4c1bd020f62c5097a7842 13-Oct-2008 Steven Whitehouse <swhiteho@redhat.com> vfs: Use const for kernel parser table

This is a much better version of a previous patch to make the parser
tables constant. Rather than changing the typedef, we put the "const" in
all the various places where its required, allowing the __initconst
exception for nfsroot which was the cause of the previous trouble.

This was posted for review some time ago and I believe its been in -mm
since then.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Alexander Viro <aviro@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
rp/ib_srp.c
8e12452549ba2dfa17db97bc495172fac221a7ab 24-Sep-2008 Mike Christie <michaelc@cs.wisc.edu> [SCSI] libiscsi: rename host reset to target reset

I had this in my patchset to add target reset support, but
it got dropped due to patching conflicts. This initial patch
just renames the function and users. We are actually just
dropping the session, and so this does not have anything to do
with the host exactly. It does for software iscsi because
we allocate a host per session, but for cxgb3i this makes no
sense.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
e5bd7b54e93ef7151469a12b8c28d863b9f8a088 24-Sep-2008 Mike Christie <michaelc@cs.wisc.edu> [SCSI] libiscsi: Support drivers initiating session removal

If the driver knows when hardware is removed like with cxgb3i,
bnx2i, qla4xxx and iser then we will want to remove the sessions/devices
that are bound to that device before removing the host.

cxgb3i and in the future bnx2i will remove the host and that will
remove all the sessions on the hba. iser can call iscsi_kill_session
when it gets an event that indicates that a hca is removed.
And when qla4xxx is hooked in to the lib (it is only hooked into
the class right now) it can call iscsi remove host like the
partial offload card drivers.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
2767840a5ca73fde62b25e0209aa9269ec4fa7c7 11-Oct-2008 Roland Dreier <rolandd@cisco.com> IPoIB: Always initialize poll_timer to avoid crash on unload

ipoib_ib_dev_stop() does del_timer_sync(&priv->poll_timer), but if a
P_key for an interface is not found, poll_timer is not initialized, so
this leads to a crash or hang. Fix this by moving where poll_timer is
initialized to ipoib_ib_dev_init(), which is always called.

This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1172>.

Debugged-by: Yosef Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_ib.c
943c246e9ba9078a61b6bcc5b4a8131ce8befb64 30-Sep-2008 Roland Dreier <rolandd@cisco.com> IPoIB: Use netif_tx_lock() and get rid of private tx_lock, LLTX

Currently, IPoIB is an LLTX driver that uses its own IRQ-disabling
tx_lock. Not only do we want to get rid of LLTX, this actually causes
problems because of the skb_orphan() done with this tx_lock held: some
skb destructors expect to be run with interrupts enabled.

The simplest fix for this is to get rid of the driver-private tx_lock
and stop using LLTX. We kill off priv->tx_lock and use
netif_tx_lock[_bh]() instead; the patch to do this is a tiny bit
tricky because we need to update places that take priv->lock inside
the tx_lock to disable IRQs, rather than relying on tx_lock having
already disabled IRQs.

Also, there are a couple of places where we need to disable BHs to
make sure we have a consistent context to call netif_tx_lock() (since
we no longer can use _irqsave() variants), and we also have to change
ipoib_send_comp_handler() to call drain_tx_cq() through a timer rather
than directly, because ipoib_send_comp_handler() runs in interrupt
context and drain_tx_cq() must run in BH context so it can call
netif_tx_lock().

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_cm.c
poib/ipoib_ib.c
poib/ipoib_main.c
poib/ipoib_multicast.c
c9da4bad5b80c3d9884e2c6ad8d2091252c32d5e 26-Sep-2008 Roland Dreier <rolandd@cisco.com> IPoIB: Fix crash when path record fails after path flush

Commit ee1e2c82 ("IPoIB: Refresh paths instead of flushing them on SM
change events") changed how paths are flushed on an SM event. This
change introduces a problem if the path record query triggered by
fails, causing path->ah to become NULL. A later successful path query
will then trigger WARN_ON() in path_rec_completion(), and crash
because path->ah has already been freed, so the ipoib_put_ah() inside
the lock in path_rec_completion() may actually drop the last reference
(contrary to the comment that claims this is safe).

Fix this by updating path->ah and freeing old_ah only when the path
record query is successful. This prevents the neighbour AH and that
path AH from getting out of sync.

This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1194>

Reported-by: Rabah Salem <ravah@mellanox.com>
Debugged-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
e8224e4b804b4fd26723191c1891101a5959bb8a 16-Sep-2008 Yossi Etigin <yossi.openib@gmail.com> IPoIB: Fix deadlock on RTNL between bcast join comp and ipoib_stop()

Taking rtnl_lock in ipoib_mcast_join_complete() causes a deadlock with
ipoib_stop(). We avoid it by scheduling the piece of code that takes
the lock on ipoib_workqueue instead of executing it directly. This
works because we only flush the ipoib_workqueue with the RTNL not held.

The deadlock happens because ipoib_stop() calls ipoib_ib_dev_down()
which calls ipoib_mcast_dev_flush(), which calls ipoib_mcast_free(),
which calls ipoib_mcast_leave(). The latter calls
ib_sa_free_multicast(), and this waits until the multicast completion
handler finishes. This handler is ipoib_mcast_join_complete(), which
waits for the rtnl_lock(), which was already taken by ipoib_stop().

This bug was introduced in commit a77a57a1 ("IPoIB: Fix deadlock on
RTNL in ipoib_stop()").

Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_main.c
poib/ipoib_multicast.c
7a8fc9b248e77a4eab0613acf30a6811799786b3 17-Aug-2008 Adrian Bunk <bunk@kernel.org> removed unused #include <linux/version.h>'s

This patch lets the files using linux/version.h match the files that
#include it.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ser/iser_verbs.c
a77a57a1a22afc31891d95879fe3cf2ab03838b0 20-Aug-2008 Roland Dreier <rolandd@cisco.com> IPoIB: Fix deadlock on RTNL in ipoib_stop()

Commit c8c2afe3 ("IPoIB: Use rtnl lock/unlock when changing device
flags") added a call to rtnl_lock() in ipoib_mcast_join_task(), which
is run from the ipoib_workqueue. However, ipoib_stop() (which is run
inside rtnl_lock()) flushes this workqueue, which leads to a deadlock
if the join task is pending.

Fix this by simply not flushing the workqueue from ipoib_stop(). It
turns out that we really don't care about workqueue tasks running
during or after ipoib_stop(), as long as we make sure to flush the
workqueue before unregistering a netdev.

This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1114>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
poib/ipoib_multicast.c
b1404069f64457c94de241738fdca142c2e5698f 09-Aug-2008 David J. Wilder <dwilder@us.ibm.com> IPoIB/cm: Use vmalloc() to allocate rx_rings

There are users that are running UDP applications that require a large
receive queue size in order to get good performance. To prevent
allocation failures for rx_rings when using non-SRQ mode and large
recv_queue_size (1K or larger), use vmalloc() instead of kcalloc() to
alocate rx_rings.

Signed-off-by: David Wilder <dwilder@us.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
e08198169ec5facb3d85bb455efa44a2f8327842 30-Jul-2008 Roland Dreier <rolandd@cisco.com> IPoIB/cm: Set correct SG list in ipoib_cm_init_rx_wr()

wr->sg_list should be set to the sge pointer passed in, not
priv->cm.rx_sge.

Reported-by: Hoang-Nam Nguyen <HNGUYEN@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
9905922446f6dc02fd4650c8f59114d6bdb5b777 25-Jul-2008 Roland Dreier <rolandd@cisco.com> IPoIB: Correct help text for INFINIBAND_IPOIB_DEBUG

The help text for INFINIBAND_IPOIB_DEBUG refers to "ipoib_debugfs,"
which no longer exists. Correct this to talk about the files under
debugfs that are really created.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/Kconfig
99c3a5a9e388e0ac166c617aaf02150e778d2779 25-Jul-2008 Roland Dreier <rolandd@cisco.com> IPoIB/cm: Connected mode is no longer EXPERIMENTAL

Connected mode is now tested and used by lots of people. No need to
hide it under CONFIG_EXPERIMENTAL.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/Kconfig
2cc177364e4746becdf421f926fb967c047ccc32 24-Jul-2008 Roland Dreier <rolandd@cisco.com> Merge branches 'bkl-removal', 'cma', 'ehca', 'for-2.6.27', 'mlx4', 'mthca' and 'nes' into for-linus
01b3fc8b15432f7931e40fe099839e1559fb0e09 22-Jul-2008 Or Gerlitz <ogerlitz@voltaire.com> IPoIB: Include err code in trace message for ib_sa_path_rec_get() failures

Print the return code of ib_sa_path_rec_get() if it fails to help
debug errors.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
2f5de1512884da8c74bec2c76e8f114b972ab4be 22-Jul-2008 Or Gerlitz <ogerlitz@voltaire.com> IB/iser: Add support for RDMA_CM_EVENT_ADDR_CHANGE event

Enhance iser to act upon notification on network stack changes that
make its RDMA connection unaligned with the link used by the stack for
the <src,dst> IPs used to establish the connection.

When RDMA_CM_EVENT_ADDR_CHANGE arrives, just disconnect the
connection, assuming that the user space iscsid daemon will reconnect,
and the new connection will be aligned with the IP stack.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iser_verbs.c
49997d75152b3d23c53b0fa730599f2f74c92c65 18-Jul-2008 David S. Miller <davem@davemloft.net> Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6

Conflicts:

Documentation/powerpc/booting-without-of.txt
drivers/atm/Makefile
drivers/net/fs_enet/fs_enet-main.c
drivers/pci/pci-acpi.c
net/8021q/vlan.c
net/iucv/iucv.c
89a93f2f4834f8c126e8d9dd6b368d0b9e21ec3d 16-Jul-2008 Linus Torvalds <torvalds@linux-foundation.org> Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (102 commits)
[SCSI] scsi_dh: fix kconfig related build errors
[SCSI] sym53c8xx: Fix bogus sym_que_entry re-implementation of container_of
[SCSI] scsi_cmnd.h: remove double inclusion of linux/blkdev.h
[SCSI] make struct scsi_{host,target}_type static
[SCSI] fix locking in host use of blk_plug_device()
[SCSI] zfcp: Cleanup external header file
[SCSI] zfcp: Cleanup code in zfcp_erp.c
[SCSI] zfcp: zfcp_fsf cleanup.
[SCSI] zfcp: consolidate sysfs things into one file.
[SCSI] zfcp: Cleanup of code in zfcp_aux.c
[SCSI] zfcp: Cleanup of code in zfcp_scsi.c
[SCSI] zfcp: Move status accessors from zfcp to SCSI include file.
[SCSI] zfcp: Small QDIO cleanups
[SCSI] zfcp: Adapter reopen for large number of unsolicited status
[SCSI] zfcp: Fix error checking for ELS ADISC requests
[SCSI] zfcp: wait until adapter is finished with ERP during auto-port
[SCSI] ibmvfc: IBM Power Virtual Fibre Channel Adapter Client Driver
[SCSI] sg: Add target reset support
[SCSI] lib: Add support for the T10 (SCSI) Data Integrity Field CRC
[SCSI] sd: Move scsi_disk() accessor function to sd.h
...
b9e40857682ecfc5bcd0356a23ff409883ffb982 15-Jul-2008 David S. Miller <davem@davemloft.net> netdev: Do not use TX lock to protect address lists.

Now that we have a specific lock to protect the network
device unicast and multicast lists, remove extraneous
grabs of the TX lock in cases where the code only needs
address list protection.

Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_multicast.c
e308a5d806c852f56590ffdd3834d0df0cbed8d7 15-Jul-2008 David S. Miller <davem@davemloft.net> netdev: Add netdev->addr_list_lock protection.

Add netif_addr_{lock,unlock}{,_bh}() helpers.

Use them to protect operations that operate on or read
the network device unicast and multicast address lists.

Also use them in cases where the code simply wants to
block calls into the driver's ->set_rx_mode() and
->set_multicast_list() methods.

Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_multicast.c
bc3a290b51aaefc6a6af2d6e6d52ed32387c416c 15-Jul-2008 Eli Cohen <eli@dev.mellanox.co.il> IPoIB: Double default RX/TX ring sizes

Increase IPoIB ring sizes to twice their original sizes (RX: 128->256,
TX: 64->128) to act as a shock absorber for high traffic peaks. With
the current settings, we have seen cases that there are many calls to
netif_stop_queue(), which causes degradation in throughput. Also,
larger receive buffer sizes help IPoIB in CM mode to avoid experiencing
RNR NAK conditions due to insufficient receive buffers at the SRQ.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
e112373fd6aa280bd2cbc0d5cc3809115325a1be 15-Jul-2008 Eli Cohen <eli@dev.mellanox.co.il> IPoIB/cm: Reduce connected mode TX object size

Since IPoIB connected mode does not NETIF_F_SG, we only have one DMA
mapping per send, so we don't need a mapping[] array. Define a new
struct with a single u64 mapping member and use it for the CM tx_ring.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_cm.c
bd3606715effbf37df986548c43bbed0842b49d5 15-Jul-2008 Eli Cohen <eli@mellanox.co.il> IPoIB: Use dev_set_mtu() to change mtu

When the driver sets the MTU of the net device outside of its
change_mtu method, it should make use of dev_set_mtu() instead of
directly setting the mtu field of struct netdevice. Otherwise
functions registered to be called upon MTU change will not get called
(this is done through call_netdevice_notifiers() in dev_set_mtu()).

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
poib/ipoib_multicast.c
c8c2afe360b7366f586f6bece1109a72ea334876 15-Jul-2008 Eli Cohen <eli@mellanox.co.il> IPoIB: Use rtnl lock/unlock when changing device flags

Use of this lock is required to synchronize changes to the netdvice's
data structs. Also move the call to ipoib_flush_paths() after the
modification of the netdevice flags in set_mode().

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
poib/ipoib_multicast.c
9eae554c171e086c89ab83da2a2d3c8bf958fcb5 15-Jul-2008 Roland Dreier <rolandd@cisco.com> IPoIB: Get rid of ipoib_mcast_detach() wrapper

ipoib_mcast_detach() does nothing except call ib_detach_mcast(), so just
use the core API in the one place that does a multicast group detach.

add/remove: 0/1 grow/shrink: 0/1 up/down: 0/-105 (-105)
function old new delta
ipoib_mcast_leave 357 319 -38
ipoib_mcast_detach 67 - -67

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_multicast.c
poib/ipoib_verbs.c
d0de13622d5ac658efe7c51521dbdbe0752aa3dd 15-Jul-2008 Eli Cohen <eli@mellanox.co.il> IPoIB: Only set Q_Key once: after joining broadcast group

The current code will set the Q_Key for any join of a non-sendonly
multicast group. The operation involves a modify QP operation, which
is fairly heavyweight, and is only really required after the join of
the broadcast group. Fix this by adding a parameter to ipoib_mcast_attach()
to control when the Q_Key is set.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_multicast.c
poib/ipoib_verbs.c
5892eff91ad60ba365ae7f75050ce464036c5396 15-Jul-2008 Eli Cohen <eli@mellanox.co.il> IPoIB: Remove priv->mcast_mutex

No need for a mutex around calls to ib_attach_mcast/ib_detach_mcast
since these operations are synchronized at the HW driver layer.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_main.c
poib/ipoib_verbs.c
c03d4731b5b6de45b95a10bf1d510dde423d6757 15-Jul-2008 Eli Cohen <eli@mellanox.co.il> IPoIB: Remove unused IPOIB_MCAST_STARTED code

The IPOIB_MCAST_STARTED flag is not used at all since commit b3e2749b
("IPoIB: Don't drop multicast sends when they can be queued"), so
remove it.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_multicast.c
ee1e2c82c245a5fb2864e9dbcdaab3390fde3fcc 15-Jul-2008 Moni Shoua <monis@Voltaire.COM> IPoIB: Refresh paths instead of flushing them on SM change events

The patch tries to solve the problem of device going down and paths being
flushed on an SM change event. The method is to mark the paths as candidates for
refresh (by setting the new valid flag to 0), and wait for an ARP
probe a new path record query.

The solution requires a different and less intrusive handling of SM
change event. For that, the second argument of the flush function
changes its meaning from a boolean flag to a level. In most cases, SM
failover doesn't cause LID change so traffic won't stop. In the rare
cases of LID change, the remote host (the one that hadn't changed its
LID) will lose connectivity until paths are refreshed. This is no
worse than the current state. In fact, preventing the device from
going down saves packets that otherwise would be lost.

Signed-off-by: Moni Levy <monil@voltaire.com>
Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_ib.c
poib/ipoib_main.c
poib/ipoib_verbs.c
af40da894e96d5c826d38be3ea53ee00d9de0367 15-Jul-2008 Vladimir Sokolovsky <vlad@mellanox.co.il> IPoIB: add LRO support

Add "ipoib_use_lro" module parameter to enable LRO and an
"ipoib_lro_max_aggr" module parameter to set the max number of packets
to be aggregated. Make LRO controllable and LRO statistics accessible
through ethtool.

Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il>
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/Kconfig
poib/ipoib.h
poib/ipoib_ethtool.c
poib/ipoib_ib.c
poib/ipoib_main.c
12406734051a26e9fe4c8568e931dfddbb72d431 15-Jul-2008 Ron Livne <ronli@voltaire.com> IPoIB: Use multicast loopback blocking if available

Set IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK for IPoIB's UD QPs if
supported by the underlying device. This creates an improvement of up
to 39% in bandwidth when sending multicast packets with IPoIB, and an
improvment of 12% in cpu usage.

Signed-off-by: Ron Livne <ronli@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_verbs.c
a7d834c4bc6be73e8f83eaa5072fac3c5549f7f2 15-Jul-2008 Roland Dreier <rolandd@cisco.com> IPoIB/cm: Fix racy use of receive WR/SGL in ipoib_cm_post_receive_nonsrq()

For devices that don't support SRQs, ipoib_cm_post_receive_nonsrq() is
called from both ipoib_cm_handle_rx_wc() and ipoib_cm_nonsrq_init_rx(),
and these two callers are not synchronized against each other.
However, ipoib_cm_post_receive_nonsrq() always reuses the same receive
work request and scatter list structures, so multiple callers can end
up stepping on each other, which leads to posting garbled work
requests.

Fix this by having the caller pass in the ib_recv_wr and ib_sge
structures to use, and allocating new local structures in
ipoib_cm_nonsrq_init_rx().

Based on a patch by Pradeep Satyanarayana <pradeep@us.ibm.com> and
David Wilder <dwilder@us.ibm.com>, with debugging help from Hoang-Nam
Nguyen <hnguyen@de.ibm.com>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
f89271da32bc1a636cf4eb078e615930886cd013 15-Jul-2008 Eli Cohen <eli@dev.mellanox.co.il> IPoIB: Copy small received SKBs in connected mode

The connected mode implementation in the IPoIB driver has a large
overhead in the way SKBs are handled in the receive flow. It usually
allocates an SKB with as big as was used in the currently received SKB
and moves unused fragments from the old SKB to the new one. This
involves a loop on all the remaining fragments and incurs overhead on
the CPU. This patch, for small SKBs, allocates an SKB just large
enough to contain the received data and copies to it the data from the
received SKB. The newly allocated SKB is passed to the stack and the
old SKB is reposted.

When running netperf, UDP small messages, without this pach I get:

UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
14.4.3.178 (14.4.3.178) port 0 AF_INET
Socket Message Elapsed Messages
Size Size Time Okay Errors Throughput
bytes bytes secs # # 10^6bits/sec

114688 128 10.00 5142034 0 526.31
114688 10.00 1130489 115.71

With this patch I get both send and receive at ~315 mbps.

The reason that send performance actually slows down is as follows:
When using this patch, the overhead of the CPU for handling RX packets
is dramatically reduced. As a result, we do not experience RNR NAK
messages from the receiver which cause the connection to be closed and
reopened again; when the patch is not used, the receiver cannot handle
the packets fast enough so there is less time to post new buffers and
hence the mentioned RNR NACKs. So what happens is that the
application *thinks* it posted a certain number of packets for
transmission but these packets are flushed and do not really get
transmitted. Since the connection gets opened and closed many times,
each time netperf gets the CPU time that otherwise would have been
given to IPoIB to actually transmit the packets. This can be verified
when looking at the port counters -- the output of ifconfig and the
oputput of netperf (this is for the case without the patch):

tx packets
==========
port counter: 1,543,996
ifconfig: 1,581,426
netperf: 5,142,034

rx packets
==========
netperf 1,1304,089

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
poib/ipoib.h
poib/ipoib_cm.c
poib/ipoib_main.c
f3781d2e89f12dd5afa046dc56032af6e39bd116 15-Jul-2008 Roland Dreier <rolandd@cisco.com> RDMA: Remove subversion $Id tags

They don't get updated by git and so they're worse than useless.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_cm.c
poib/ipoib_fs.c
poib/ipoib_ib.c
poib/ipoib_main.c
poib/ipoib_multicast.c
poib/ipoib_verbs.c
poib/ipoib_vlan.c
ser/iscsi_iser.c
ser/iscsi_iser.h
ser/iser_initiator.c
ser/iser_memory.c
ser/iser_verbs.c
rp/ib_srp.c
rp/ib_srp.h
969a60f9db3f879f95bd37026a3c3bf02cc2568f 15-Jul-2008 Roland Dreier <rolandd@cisco.com> IB/srp: Remove use of cached P_Key/GID queries

The SRP initiator is currently using ib_find_cached_pkey() and
ib_get_cached_gid() in situations where the uncached ib_find_pkey()
and ib_query_gid() functions serve just as well: sleeping is allowed
and performance is not an issue. Since we want to eliminate the
cached operations in the long term, convert SRP to use the uncached
variants.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
8e9a20cee4511be4560f9c858d9994eb6913731e 16-Jun-2008 Mike Christie <michaelc@cs.wisc.edu> [SCSI] libiscsi, iscsi_tcp, ib_iser: fix setting of can_queue with old tools.

This patch fixes two bugs that are related.

1. Old tools did not set can_queue/cmds_max. This patch modifies
libiscsi so that when we add the host we catch this and set it
to the default.

2. iscsi_tcp thought that the scsi command that was passed to
the eh functions needed a iscsi_cmd_task allocated for it. It
only needed a mgmt task, and now it does not matter since it
all comes from the same pool and libiscsi handles this for the
drivers. ib_iser had copied iscsi_tcp's code and set can_queue
to its max - 1 to handle this. So this patch removes the max -1,
and just sets it to the max.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
913e5bf435617aa529919a4f7567f849f9f35f9f 21-May-2008 Mike Christie <michaelc@cs.wisc.edu> [SCSI] libiscsi, iser, tcp: remove recv_lock

The recv lock was defined so the iscsi layer could block
the recv path from processing IO during recovery. It
turns out iser just set a lock to that pointer which was pointless.

We now disconnect the transport connection before doing recovery
so we do not need the recv lock. For iscsi_tcp we still stop
the recv path incase older tools are being used.

This patch also has iscsi_itt_to_ctask user grab the session lock
and has the caller access the task with the lock or get a ref
to it in case the target is broken and sends a tmf success response
then sends data or a response for the command that was supposed to
be affected bty the tmf.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
ser/iscsi_iser.h
ser/iser_initiator.c
88dfd340b9dece8fcaa1a2d4c782338926c017f7 21-May-2008 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iscsi class: Add session initiatorname and ifacename sysfs attrs.

This adds two new attrs used for creating initiator ports and
binding sessions to hardware.

The session level initiatorname:

Since bnx2i does a scsi_host per host device, we need to add the
iface initiator port settings on the session, so we can create
multiple initiator ports (each with different inames) per device/scsi_host.

The current iname reflects that qla4xxx can have one iname per hba, and we are
allocating a host per session for software. The iname on the host will
remain so we can export and set the hba level qla4xxx setting.

The ifacename attr:

To bind a session to a some peice of hardware in userspace we maintain
some mappings, but during boot or iscsid restart (iscsid contains the user
space part of the driver) we need to be able to figure out which of those
host mappings abstractions maps to certain sessions. This patch adds
a ifacename attr, which userspace can set to id the host side of the
endpoint across pivot_roots and iscsid restarts.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
412eeafa0a51a8d86545d0be637bf84e4374fccf 21-May-2008 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iser: Modify iser to take a iscsi_endpoint struct in ep callouts and session setup

This hooks iser into the iscsi endpoint code. Previously it handled the
lookup and allocation. This has been made generic so bnx2i and iser can
share it. It also allows us to pass iser the leading conn's ep, so we
know the ib_deivce being used and can set it as the scsi_host's parent.
And that allows scsi-ml to set the dma_mask based on those values.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
ser/iscsi_iser.h
ser/iser_verbs.c
7970634b81a6e3561954517bca42615542c4535b 21-May-2008 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iscsi class: user device_for_each_child instead of duplicating session list

Currently we duplicate the list of sessions, because we were using the
test for if a session was on the host list to indicate if the session
was bound or unbound. We can instead use the target_id and fix up
the class so that drivers like bnx2i do not have to manage the target id
space.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
2261ec3d686e35c1a6088ab7f00a1d02b528b994 21-May-2008 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iser: handle iscsi_cmd_task rename

This handles the iscsi_cmd_task rename and renames
the iser cmd task to iser task.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
ser/iscsi_iser.h
ser/iser_initiator.c
ser/iser_memory.c
2747fdb25726caa1a89229f43d99ca50af72576a 21-May-2008 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iser: convert ib_iser to support merged tasks

Convert ib_iser to support merged tasks.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
ser/iscsi_iser.h
ser/iser_initiator.c
0af967f5d4f2dd1e00618d34ac988037d37a6c3b 21-May-2008 Mike Christie <michaelc@cs.wisc.edu> [SCSI] libiscsi, iscsi_tcp, iser: add session cmds array accessor

Currently to get a ctask from the session cmd array, you have to
know to use the itt modifier. To make this easier on LLDs and
so in the future we can easilly kill the session array and use
the host shared map instead, this patch adds a nice wrapper
to strip the itt into a session->cmds index and return a ctask.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
ser/iser_initiator.c
b40977d95fb3a1898ace6a7d97e4ed1a33a440a4 21-May-2008 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iser: fix handling of scsi cmnds during recovery.

After the stop_conn callback has returned the LLD should not
touch the scsi cmds. iscsi_tcp and libiscsi use the
conn->recv_lock and suspend_rx field to halt recv path
processing, but iser does not have any protection.

This patch modifies iser so that userspace can just
call the ep_disconnect callback, which will halt
all recv IO, before calling the stop_conn callback so
we do not have to worry about the conn->recv_lock and
suspend rx field. iser just needs to stop the send side
from accessing the ib conn.

Fixup to handle when the ep poll fails and ep disconnect
is called from Erez.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
ser/iscsi_iser.h
ser/iser_verbs.c
5d91e209fb21fb9cc765729d4c6a85a9fb6c9187 21-May-2008 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iscsi: remove session/conn_data_size from iscsi_transport

This removes the session and conn data_size fields from the iscsi_transport.
Just pass in the value like with host allocation. This patch also makes
it so the LLD iscsi_conn data is allocated with the iscsi_cls_conn.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
a4804cd6eb19318ae8d08ea967cfeaaf5c5b68a6 21-May-2008 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iscsi: add iscsi host helpers

This finishes the host/session unbinding, by adding some helpers
to add and remove hosts and the session they manage.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
756135215ec743be6fdce2bdebe8cdb9f8a231f6 21-May-2008 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iscsi: remove session and host binding in libiscsi

bnx2i allocates a host per netdevice but will use libiscsi,
so this unbinds the session from the host in that code.

This will also be useful for the iser parent device dma settings
fixes.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
d3826721b198001c55353b1c54e10843068aae63 21-May-2008 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iscsi class, iscsi drivers: remove unused iscsi_transport attrs

max_cmd_len and max_conn are not really used. max_cmd_len is
always 16 and can be set by the LLD. max_conn is always one
since we do not support MCS.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
ser/iscsi_iser.h
40753caa364bfba60ebd5e2a8bdf366ef175d03c 21-May-2008 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iscsi class, iscsi_tcp/iser: add host arg to session creation

iscsi offload (bnx2i and qla4xx) allocate a scsi host per hba,
so the session creation path needs a shost/host_no argument.
Software iscsi/iser will follow the same behabior as before
where it allcoates a host per session, but in the future iser
will probably look more like bnx2i where the host's parent is
the hardware (rnic for iser and for bnx2i it is the nic), because
it does not use a socket layer like how iscsi_tcp does.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
e1d50dce5af77cb6d33555af70e2b8748dd84009 21-May-2008 Jack Morgenstein <jackm@dev.mellanox.co.il> IPoIB: Test for NULL broadcast object in ipiob_mcast_join_finish()

We saw a kernel oops in our regression testing when a multicast "join
finish" occurred just after the interface was -- this is
<https://bugs.openfabrics.org/show_bug.cgi?id=1040>. The test
randomly causes the HCA physical port to go down then up.

The cause of this is that ipoib_mcast_join_finish() processing happen
just after ipoib_mcast_dev_flush() was invoked (in which case the
broadcast pointer is NULL). This patch tests for and handles the case
where priv->broadcast is NULL.

Cc: <stable@kernel.org>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
57ce41d1d18279cc90223f3deadca70c7de1cfca 01-May-2008 Eli Cohen <eli@dev.mellanox.co.il> IB/ipoib: Fix transmit queue stalling forever

Commit f56bcd80 ("IPoIB: Use separate CQ for UD send completions")
introduced a bug where the transmit queue could get stopped and never
woken up. The problem is that send completions are only polled at the
end of the xmit function, so if the send queue fills up and the xmit
path stops the queue, then there is no way for send completions to
ever get polled, and so the transmit queue stays stopped forever.

Fix this by arming the send CQ just before posting the last send
request that fills the send queue. Then, when the completion event
handler is called, drain the send CQ. Since it is possible that not
enough send completions are in the CQ, verify that the the net queue
has been woken up after draining the send CQ, and if not arm a timer
and drain again at the timer function.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_ib.c
poib/ipoib_verbs.c
b4132efa1a47858d741ecb05b8735e6fcb603bc8 29-Apr-2008 Eli Cohen <eli@dev.mellanox.co.il> IPoIB: Copy child MTU from parent

When creating a child interface, copy the MTU information from the
parent. Otherwise when the child's multicast join completes, the MTU
will not be updated since the code does

dev->mtu = min(priv->mcast_mtu, priv->admin_mtu);

and priv->admin_mtu will be set to 0.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_vlan.c
f56bcd8013566d4ad4759ae5fc85a6660e4655c7 29-Apr-2008 Eli Cohen <eli@dev.mellanox.co.il> IPoIB: Use separate CQ for UD send completions

Use a dedicated CQ for UD send completions. Also, do not arm the UD
send CQ, which reduces the number of interrupts generated. This patch
farther reduces overhead by not calling poll CQ for every posted send
WR -- it does polls only when there 16 or more outstanding work requests.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_cm.c
poib/ipoib_ethtool.c
poib/ipoib_ib.c
poib/ipoib_main.c
poib/ipoib_verbs.c
87528227dfa8776d12779d073c217f0835fd6d20 29-Apr-2008 Eli Dorfman <dorfman.eli@gmail.com> IB/iser: Count FMR alignment violations per session

Count FMR alignment violations per session as part of the iscsi
statistics.

Signed-off-by: Eli Dorfman <elid@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.c
ser/iser_memory.c
6f735e36bad6fa4949271b3c3d0f331aad812313 29-Apr-2008 Eli Dorfman <dorfman.eli@gmail.com> IB/iser: Move high-volume debug output to higher debug level

Add another level for debug.

Signed-off-by: Eli Dorfman <elid@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.h
ser/iser_memory.c
bc7b3a36ba02e4053ca38653e6a753082d9add03 23-Apr-2008 Shirley Ma <mashirle@us.ibm.com> IPoIB: Handle 4K IB MTU for UD (datagram) mode

This patch enables IPoIB to use 4K UD messages (when the underlying
device and fabrics support a 4K MTU) by using two scatter buffers when
PAGE_SIZE is less than or equal to thhe HCA IB MTU size. The first
buffer is for IPoIB header + GRH header, and the second buffer is the
IPoIB payload, which is 4K-4.

Signed-off-by: Shirley Ma <xma@us.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_ib.c
poib/ipoib_main.c
poib/ipoib_multicast.c
poib/ipoib_verbs.c
poib/ipoib_vlan.c
ee959b00c335d7780136c5abda37809191fe52c3 22-Feb-2008 Tony Jones <tonyj@suse.de> SCSI: convert struct class_device to struct device

It's big, but there doesn't seem to be a way to split it up smaller...

Signed-off-by: Tony Jones <tonyj@suse.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
rp/ib_srp.c
rp/ib_srp.h
05321937469a8592d6a6d35f1d38ca882d243044 06-Mar-2008 Greg Kroah-Hartman <gregkh@suse.de> IB: rename "dev" to "srp_dev" in srp_host structure

This sets us up to be able to convert the srp_host to use a struct
device instead of a class_device.

Based on a original patch from Tony Jones, but split up into this piece
by Greg.

Signed-off-by: Tony Jones <tonyj@suse.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Reviewed-by: Roland Dreier <rolandd@cisco.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
rp/ib_srp.c
rp/ib_srp.h
0a22ab92f51478796d5f3997f4f5922409c98b10 17-Apr-2008 Erez Zilber <erezz@voltaire.com> IB/iser: Don't change itt endianness

The itt field in struct iscsi_data is not defined with any particular
endianness. open-iscsi should use it as-is without byte-swapping it.
This fixes sparse warnings coming from doing ntohl(hdr->itt).

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iser_initiator.c
9fdd5e5bf682130d1e1dd83d06e99eeafa645c0c 17-Apr-2008 Roland Dreier <rolandd@cisco.com> IPoIB: Handle case when P_Key is deleted and re-added at same index

If a P_Key is deleted and then re-added at the same index, then IPoIB
gets confused because __ipoib_ib_dev_flush() only checks whether the
index is the same without checking whether the P_Key was present, so
the interface is stopped when the P_Key is deleted, but the event when
the P_Key is re-added gets ignored and the interface never gets
restarted.

Also, switch to using ib_find_pkey() instead of ib_find_cached_pkey()
everywhere in IPoIB, since none of the places that look for P_Keys are
in a fast path or in non-sleeping context, and in general we want to
kill off the whole caching infrastructure eventually. This also fixes
consistency problems caused because some IPoIB queries were cached and
some were uncached during the window where the cache was not updated.

Thanks to Venkata Subramonyam <vsubramo@cisco.com> for debugging this
problem and testing this fix.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
poib/ipoib_ib.c
d97c51707d7d0716881be84ffd2100449852e44b 17-Apr-2008 Erez Zilber <erezz@voltaire.com> IB/iser: Release connection resources on RDMA_CM_EVENT_DEVICE_REMOVAL event

When a RDMA_CM_EVENT_DEVICE_REMOVAL event is raised, iSER should
release the connection resources.

This is necessary when the IB HCA module is unloaded while open-iscsi
is still running. Currently, iSER just BUG()s.

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iser_verbs.c
28d52b3cd8d48ef0ff77d4a8a7a21fc2816bb0a5 17-Apr-2008 Eli Cohen <eli@dev.mellanox.co.il> IPoIB: Support modifying IPoIB CQ event moderation

This can be used to tune at run time the parameters controlling the
event (interrupt) generation rate and thus reduce the overhead
incurred by handling interrupts resulting in better throughput. Since
IPoIB uses a single CQ for both RX and TX, RX is chosen to dictate
configuration for both RX and TX.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_ethtool.c
82c24c18afc5e1c2a955bdc2762b19721283365c 17-Apr-2008 Eli Cohen <eli@dev.mellanox.co.il> IPoIB: Add basic ethtool support

Just add the infrastructure so we can add functionality later.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/Makefile
poib/ipoib.h
poib/ipoib_ethtool.c
poib/ipoib_main.c
40ca1988e03c001747d0b4cc1b25cf38297c9f9e 17-Apr-2008 Eli Cohen <eli@dev.mellanox.co.il> IPoIB: Add LSO support

For HCAs that support TCP segmentation offload (IB_DEVICE_UD_TSO), set
NETIF_F_TSO and use HW LSO to offload TCP segmentation.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_cm.c
poib/ipoib_ib.c
poib/ipoib_main.c
poib/ipoib_verbs.c
157de229468b2a63bbc8f9a7d37c70a2c9731ac8 17-Apr-2008 Robert P. J. Day <rpjday@crashcourse.ca> IB: Use shorter list_splice_init() for brevity

Convert list_splice() + INIT_LIST_HEAD() to the equivalent list_splice_init()

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
1e89a1946cfd906d12eff437d2a76b3aa7f5e731 17-Apr-2008 David Dillow <dillowda@ornl.gov> IB/srp: Enforce protocol limit on srp_sg_tablesize

The current SRP initiator will allow unlimited s/g entries in the
indirect descriptors lists, but the entry count field in the SRP_CMD
request is 8 bits, so setting srp_sg_tablesize too large will open the
possibility of wrapping the count and generating invalid requests.

Clamp srp_sg_tablesize to the protocol limits to prevent surprises.

Reported by Martin W. Schlining III <mschlining@datadirectnet.com>.

Signed-off-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
6046136c742e32d5e6431cdcd8957638d1816821 17-Apr-2008 Eli Cohen <eli@dev.mellanox.co.il> IPoIB: Use checksum offload support if available

For HCAs that support checksum offload (ie that set IB_DEVICE_UD_IP_CSUM
in the device capabilities flags), have IPoIB set NETIF_F_IP_CSUM and
use the HCA to generate and verify IP checksums.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_cm.c
poib/ipoib_ib.c
poib/ipoib_main.c
10313cbb92206450b450e14f2b3f6ccde42d9a34 12-Mar-2008 Roland Dreier <rolandd@cisco.com> IPoIB: Allocate priv->tx_ring with vmalloc()

Commit 7143740d ("IPoIB: Add send gather support") made struct
ipoib_tx_buf significantly larger, since the mapping member changed
from a single u64 to an array with MAX_SKB_FRAGS + 1 entries. This
means that allocating tx_rings with kzalloc() may fail because there
is not enough contiguous memory for the new, much bigger size. Fix
this regression by allocating the rings with vmalloc() instead.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
poib/ipoib_main.c
4200406b8fbbf309f4fffb339bd16c4553ae0c30 12-Mar-2008 Roland Dreier <rolandd@cisco.com> IPoIB/cm: Set tx_wr.num_sge in connected mode post_send()

Commit 7143740d ("IPoIB: Add send gather support") made it possible
for tx_wr.num_sge to be != 1 -- this happens if send gather support is
enabled. However, the code in the connected mode post_send() function
assumes the old invariant, namely that tx_wr.num_sge is always 1. Fix
this by explicitly setting tx_wr.num_sge to 1 in the CM post_send().

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
b3e2749bf32f61e7beb259eb7cfb066d2ec6ad65 11-Mar-2008 Or Gerlitz <ogerlitz@voltaire.com> IPoIB: Don't drop multicast sends when they can be queued

When set_multicast_list() is called the multicast task is restarted
and the IPOIB_MCAST_STARTED bit is cleared. As a result for some
window of time, multicast packets are not transmitted nor queued but
rather dropped by ipoib_mcast_send(). These dropped packets are
painful in two cases:

- bonding fail-over which both calls set_multicast_list() on the new
active slave and sends Gratuitous ARP through that slave.

- IP_DROP_MEMBERSHIP code which both calls set_multicast_list() on the
device and issues IGMP leave.

In both these cases, depending on the scheduling of the IPoIB
multicast task, the packets would be dropped. As a result, in the
bonding case, the failover would not be detected by the peers until
their neighbour is renewed the neighbour (which takes a few tens of
seconds). In the IGMP case, the IP router doesn't get an IGMP leave
and would only learn on that from further probes on the group (also a
delay of at least a few tens of seconds).

Fix this by allowing transmission (or queuing) depending on the
IPOIB_FLAG_OPER_UP flag instead of the IPOIB_MCAST_STARTED flag.

Signed-off-by: Olga Shern <olgas@voltaire.com>
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
d33ed425c6cc14370d8c418b504328d2c3db58b4 04-Mar-2008 Arne Redlich <arne.redlich@xiranet.com> IB/iser: Handle iser_device allocation error gracefully

"iser_device" allocation failure is "handled" with a BUG_ON() right
before dereferencing the NULL-pointer - fix this!

Signed-off-by: Arne Redlich <arne.redlich@xiranet.com>
Signed-off-by: Erez Zilber <erezz@voltaire.com>
ser/iser_verbs.c
9a378270c085080b2f38dee6308de4d8413b5141 04-Mar-2008 Arne Redlich <arne.redlich@xiranet.com> IB/iser: Fix list iteration bug

The iteration through the list of "iser_device"s during device
lookup/creation is broken -- it might result in an infinite loop if
more than one HCA is used with iSER. Fix this by using
list_for_each_entry() instead of the open-coded flawed list iteration
code.

Signed-off-by: Arne Redlich <arne.redlich@xiranet.com>
Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iser_verbs.c
ec229e5e81b3cf757e5e8b6a8bd0b4f32fe52f8c 13-Feb-2008 Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com> IPoIB/cm: Fix ipoib_cm_dev_stop() cleanup when drain times out

Commit efcd9971 ("IPoIB/cm: Factor out ipoib_cm_free_rx_reap_list()")
introduced a bug in ipoib_cm_dev_stop() when the receive drain times
out. In that case, the function moves all the pending rx stuff into a
private list but then calls ipoib_cm_free_rx_reap_list(), which
handles a different list.

Fix this by moving everything to the rx_reap_list that will actually
get freed up.

This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=906>.

Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
a9d1884925c80b96a621939a4fef5d74de58debe 14-Feb-2008 Eli Cohen <eli@dev.mellanox.co.il> IPoIB: Remove unused struct ipoib_cm_tx.ibwc member

struct ipoib_cm_tx.ibwc is unused since commit 1b524963 ("IPoIB/cm:
Use common CQ for CM send completions"), so remove it.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
poib/ipoib.h
167c42655cca188657aa9bb4e06d1194af3c73a5 13-Feb-2008 Jack Morgenstein <jackm@dev.mellanox.co.il> IPoIB: On P_Key change event, reset state properly

In P_Key event handling, if the old P_Key is no longer available, the
driver must call ipoib_ib_dev_stop() -- just as it does when the P_Key
is still available (see procedure __ipoib_ib_dev_flush()).

When a P_Key becomes available, the driver will perform ipoib_open(),
which assumes that the QP is in RESET, the cm_id has been
destroyed/deleted, etc. If ipoib_ib_dev_stop() is not called as
described above, then these assumptions will be false, and the attempt
to bring the interface up will fail.

Found by Mellanox QA.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_ib.c
7143740d26098aca84ecc7376ccfe2c58fd0412e 30-Jan-2008 Eli Cohen <eli@mellanox.co.il> IPoIB: Add send gather support

This patch acts as a preparation for using checksum offload for IB
devices capable of inserting/verifying checksum in IP packets. The
patch does not actaully turn on NETIF_F_SG - we defer that to the
patches adding checksum offload capabilities.

We only add support for send gathers for datagram mode, since existing
HW does not support checksum offload on connected QPs.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_cm.c
poib/ipoib_ib.c
poib/ipoib_verbs.c
eb14032f9eb595621270f3269f40094adb3144e8 30-Jan-2008 Eli Cohen <eli@mellanox.co.il> IPoIB: Add high DMA feature flag

All current InfiniBand devices can handle all DMA addresses, and it's
hard to imagine anyone would be silly enough to build a new device
that couldn't. Therefore, enable the NETIF_F_HIGHDMA feature for IPoIB.

This has no effect for no, but is needed when we enable gather/scatter
support and checksum stateless offloads.

Signed-off-by: Eli Cohen <eli@mellnaox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
9fe4bcf45ece0b0081031edaaa41581c85ef7049 08-Jan-2008 David Dillow <dillowda@ornl.gov> IB/srp: Retry stale connections

When a host just goes away (crash, power loss, etc.) without tearing
down its IB connections, it can get stale connection errors when it
tries to reconnect to targets upon rebooting. Retrying the connection
a few times will prevent sysadmins from playing the "which disk(s)
went missing?" game.

This would have made things slightly quicker when tracking down some
of the recent bugs, but it also helps quite a bit when you've got a
large number of targets hanging off a wedged server.

Signed-off-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
rp/ib_srp.h
7bc531dd883b955e6198c8e202161f22d2e8c472 28-Jan-2008 Or Gerlitz <ogerlitz@voltaire.com> IPoIB: Remove a misleading debug print

Commit 732a2170 ("IB/ipoib: Bound the net device to the ipoib_neigh
structue") left a misleading debug print (n->dev would be a bond
device only if boding is used). Clean it up.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
bafff9741704959e99fb65a7327c017251019a19 17-Jan-2008 Or Gerlitz <ogerlitz@voltaire.com> IPoIB: Handle bonding failover race for connected neighbours too

Move up the code that checks for a situation where the remote GID
stored in the ipoib_neigh is different than the one present in the
neighbour (handle gratuitous ARP) or that a bonding fail over has
happened but the neighbour still has a pointer to an ipoib_neigh
created by a different device than the current slave. This will cause
the driver to apply the check also for connected mode neighbours.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
d3f46f39b7092594b498abc12f0c73b0b9913bde 15-Jan-2008 James Bottomley <James.Bottomley@HansenPartnership.com> [SCSI] remove use_sg_chaining

With the sg table code, every SCSI driver is now either chain capable
or broken (or has sg_tablesize set so chaining is never activated), so
there's no need to have a check in the host template.

Also tidy up the code by moving the scatterlist size defines into the
SCSI includes and permit the last entry of the scatterlist pools not
to be a power of two.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
rp/ib_srp.c
9b73e76f3cf63379dcf45fcd4f112f5812418d0a 26-Jan-2008 Linus Torvalds <torvalds@linux-foundation.org> Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (200 commits)
[SCSI] usbstorage: use last_sector_bug flag universally
[SCSI] libsas: abstract STP task status into a function
[SCSI] ultrastor: clean up inline asm warnings
[SCSI] aic7xxx: fix firmware build
[SCSI] aacraid: fib context lock for management ioctls
[SCSI] ch: remove forward declarations
[SCSI] ch: fix device minor number management bug
[SCSI] ch: handle class_device_create failure properly
[SCSI] NCR5380: fix section mismatch
[SCSI] sg: fix /proc/scsi/sg/devices when no SCSI devices
[SCSI] IB/iSER: add logical unit reset support
[SCSI] don't use __GFP_DMA for sense buffers if not required
[SCSI] use dynamically allocated sense buffer
[SCSI] scsi.h: add macro for enclosure bit of inquiry data
[SCSI] sd: add fix for devices with last sector access problems
[SCSI] fix pcmcia compile problem
[SCSI] aacraid: add Voodoo Lite class of cards.
[SCSI] aacraid: add new driver features flags
[SCSI] qla2xxx: Update version number to 8.02.00-k7.
[SCSI] qla2xxx: Issue correct MBC_INITIALIZE_FIRMWARE command.
...
1cf18d5aab5144866cb5221250905623b03a32a9 22-Jan-2008 Jan Engelhardt <jengelh@computergmbh.de> IPoIB: Constify seq_operations function pointer tables

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_fs.c
48fe5e594c979177b7f20affd027be56e8ea2767 15-Nov-2007 Krishna Kumar <krkumar2@in.ibm.com> IPoIB: Remove redundant check of netif_queue_stopped() in xmit handler

qdisc_run() now tests for queue_stopped() before calling
__qdisc_run(), and the same check is done in every iteration of
__qdisc_run(), so another check is not required in the driver xmit.
This means that ipoib_start_xmit() no longer needs to test
netif_queue_stopped(); the test was added to fix earlier kernels,
where the networking stack did not guarantee that the xmit method of
an LLTX driver would not be called after the queue was stopped, but
current kernels do provide this guarantee.

To validate, I put a debug in the TX_BUSY path which never hit with 64
threads running overnight exercising this code a few 100 million
times.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
6410627eb9804e541b83d220c8e914ce64475b31 17-Jan-2008 Erez Zilber <erezz@voltaire.com> IB/iser: Add change_queue_depth method

Add a .change_queue_depth handler to the scsi_host_template in the
iSER driver. iscsi_change_queue_depth was added to iscsi_tcp in order
to solve the problem of queue depth which was too high for some
targets. It is also applicable for iSER.

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.c
a4ef1451dfba92f51934e8331f634497b9ed3393 17-Jan-2008 Erez Zilber <erezz@voltaire.com> IB/iser: Print information about unhandled RDMA CM events

Some RDMA CM events are not supported or not handled in iSER.
This patch adds some info (printk) for the user about them.

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iser_verbs.c
7aa54bd730df1468c90ae84b56ade7f322b44de7 08-Jan-2008 David Dillow <dillowda@ornl.gov> IB/srp: Add identifying information to log messages

When you have multiple targets, it gets really confusing when you try
to track down who did a reset when there is no identifying information
in the log message, especially when the same extension ID is mapped
through two different local IB ports. So, add an identifier that can
be used to track back to which local IB port/remote target pair is the
one having problems.

Signed-off-by: David Dillow <dillowda@ornl.gov>
Acked-by: Pete Wyckoff <pw@osc.edu>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
586a693448676de5174e752426ced69ec79ab174 21-Dec-2007 Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com> IPoIB/CM: Enable SRQ support on HCAs that support fewer than 16 SG entries

Some HCAs (such as ehca2) support SRQ, but only support fewer than 16 SG
entries for SRQs. Currently IPoIB/CM implicitly assumes all HCAs will
support 16 SG entries for SRQs (to handle a 64K MTU with 4K pages). This
patch removes that restriction by limiting the maximum MTU in connected
mode to what the maximum number of SRQ SG entries allows.

This patch addresses <https://bugs.openfabrics.org/show_bug.cgi?id=728>

Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_cm.c
poib/ipoib_main.c
fff09a8e6e726f0752254e1f46f7224e3bebb302 19-Dec-2007 David Dillow <dillowda@ornl.gov> IB/srp: Enable SG list chaining

By default, the SCSI mid-layer seems to send down 512KB requests
(sg_tablesize = 256), with some requests occasionally combined. By
allowing the mid-layer to chain requests, we can easily grow to 1024KB
or larger -- I've tested 4096KB I/O requests with no problems.

I looked through the DMA paths on the hardware drivers to ensure they
could take advantage of the SG chaining, and it seems that every one
except ipath uses the system's DMA routines, which have been converted
to handle chaining. ipath looks like it should be OK, but I have no
way to test it.

Signed-off-by: David Dillow <dillowda@ornl.gov>

[ Tested on ipath. - Roland ]

Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
8cba2077325b361dedf058c7dfc6c33691422497 19-Dec-2007 David Dillow <dillowda@ornl.gov> IB/srp: Respect target credit limit

The current SRP initiator will send requests even if it has no credits
available. The results of sending extra requests are vendor specific,
but on some devices, overrunning credits will cost 85% of peak
performance -- e.g. 100 MB/s vs 720 MB/s. Other devices may just drop
the requests.

This patch will tell the SCSI midlayer to queue requests if there are
fewer than two credits remaining, and will not issue a task management
request if there are no credits remaining. The mid-layer will retry
the queued command once an outstanding command completes.

The patch also removes the unlikely() in __srp_get_tx_iu(), as it is
not at all unlikely to hit this limit under heavy load.

Signed-off-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
rp/ib_srp.h
a9e527e3f9f4510e9f3450ca3bc51bc3ef2854fd 10-Dec-2007 Rolf Manderscheid <rvm@obsidianresearch.com> IPoIB: improve IPv4/IPv6 to IB mcast mapping functions

An IPoIB subnet on an IB fabric that spans multiple IB subnets can't
use link-local scope in multicast GIDs. The existing routines that
map IP/IPv6 multicast addresses into IB link-level addresses hard-code
the scope to link-local, and they also leave the partition key field
uninitialised. This patch adds a parameter (the link-level broadcast
address) to the mapping routines, allowing them to initialise both the
scope and the P_Key appropriately, and fixes up the call sites.

The next step will be to add a way to configure the scope for an IPoIB
interface.

Signed-off-by: Rolf Manderscheid <rvm@obsidianresearch.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
38dc732f47948b9f91ae846806159a16aab1015f 25-Jan-2008 Oliver Pinter <oliver.pntr@gmail.com> IB/iser: Typo fix (s/destory/destroy/)

Signed-off-by: Oliver Pinter <oliver.pntr@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iser_verbs.c
bd5d7a8585bb196a03655b25d2ec6395a491cd01 25-Jan-2008 Erez Zilber <erezz@voltaire.com> IB/iser: update URLs of iSER docs

Signed-off-by: Erez Zilber <erezz@voltaire.com>
ser/Kconfig
908cf9a565348b5a6d765d120cb189a568ea4883 20-Nov-2007 Joe Perches <joe@perches.com> drivers/infiniband: Add missing "space"

Add missing spaces in the middle of format strings.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iser_initiator.c
68e995a295720439ad2bf8677114cdf9d262d905 25-Jan-2008 Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com> IPoIB/cm: Add connected mode support for devices without SRQs

Some IB adapters (notably IBM's eHCA) do not implement SRQs (shared
receive queues). The current IPoIB connected mode support only works
on devices that support SRQs.

Fix this by adding support for using the receive queue of each
connected mode receive QP. The disadvantage of this compared to using
an SRQ is that it means a full queue of receives must be posted for
each remote connected mode peer, which means that total memory usage
is potentially much higher than when using SRQs. To manage this, add
a new module parameter "max_nonsrq_conn_qp" that limits the number of
connections allowed per interface.

The rest of the changes are fairly straightforward: we use a table of
struct ipoib_cm_rx to hold all the active connections, and put the
table index of the connection in the high bits of receive WR IDs.
This is needed because we cannot rely on the struct ib_wc.qp field for
non-SRQ receive completions. Most of the rest of the changes just
test whether or not an SRQ is available, and post receives or find
received packets in the right place depending on the answer.

Cleaning up dead connections actually becomes simpler, because we do
not have to do the "last WQE reached" dance that is required to
destroy QPs attached to an SRQ. We just move the QP to the error
state and wait for all pending receives to be flushed.

Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>

[ Completely rewritten and split up, based on Pradeep's work. Several
bugs fixed and no doubt several bugs introduced. - Roland ]

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_cm.c
poib/ipoib_main.c
poib/ipoib_verbs.c
efcd99717f76c6d19dd81203c60fe198480de522 25-Jan-2008 Roland Dreier <rolandd@cisco.com> IPoIB/cm: Factor out ipoib_cm_free_rx_reap_list()

Factor out the code for going through the rx_reap list of struct
ipoib_cm_rx and freeing each one. This consolidates the code
duplicated between ipoib_cm_dev_stop() and ipoib_cm_rx_reap() and
reduces the risk of error when adding additional accounting.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
7b3687df66cab4ecd6efb42cfa0c7de60cc4e3b9 25-Jan-2008 Roland Dreier <rolandd@cisco.com> IPoIB/cm: Factor out ipoib_cm_create_srq()

Factor out the code to create an SRQ and allocate the receive ring in
ipoib_cm_dev_init() into a new function ipoib_cm_create_srq(). This
will make the code neater when support for devices that don't implement
SRQs is added.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
1efb61444ca3a598dfafb7a6c573c5d5d42d3432 25-Jan-2008 Roland Dreier <rolandd@cisco.com> IPoIB/cm: Factor out ipoib_cm_free_rx_ring()

Factor out the code to unmap/free skbs and free the receive ring in
ipoib_cm_dev_cleanup() into a new function ipoib_cm_free_rx_ring().
This function will be called from a couple of other places when
support for devices that don't implement SRQs is added.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
2337f80941ac22f747ce6fd2c7a79e91d911a3ce 24-Oct-2007 Roland Dreier <rolandd@cisco.com> IPoIB: Trivial formatting cleanups

Fix whitespace blunders, convert "foo* bar" to "foo *bar", etc.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_cm.c
poib/ipoib_ib.c
poib/ipoib_main.c
poib/ipoib_multicast.c
poib/ipoib_verbs.c
90c18f3c280f80e0bfbab7c1fc4b282842ccb853 21-Jan-2008 Erez Zilber <erezz@voltaire.com> [SCSI] IB/iSER: add logical unit reset support

eh_device_reset_handler was already added to scsi_host_template
in iscsi_tcp, and is now added also for iscsi_iser.

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
a8ac6311cc21d78fa284cd43f56df2063f536bf1 13-Dec-2007 Olaf Kirch <olaf.kirch@oracle.com> [SCSI] iscsi: convert xmit path to iscsi chunks

Convert xmit to iscsi chunks.

from michaelc@cs.wisc.edu:

Bug fixes, more digest integration, sg chaining conversion and other
sg wrapper changes, coding style sync up, and removal of io fields,
like pdu_sent, that are not needed.

Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
f6d5180c78780d63b0577edeb3ce41eeb3e93eea 13-Dec-2007 Mike Christie <michaelc@cs.wisc.edu> [SCSI] libiscsi: fix nop handling

During root boot and shutdown the target could send us nops.
At this time iscsid cannot be running, so the target will drop
the session and the boot or shutdown will hang.

To handle this and allow us to better control when to check the network
this patch moves the nop handling to the kernel.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
b3a7ea8d50f6028964b468d13a095dfb2508b2fb 13-Dec-2007 Mike Christie <michaelc@cs.wisc.edu> [SCSI] libiscsi: do not block session during logout

There is not need to block the session during logout. Since
we are going to fail the commands that were blocked just fail them
immediately instead.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iser_initiator.c
38ad03de3fd350e683213ddf898a7049534628a8 13-Dec-2007 Boaz Harrosh <boazharrosh@gmail.com> [SCSI] libiscsi,iser: patch for AHS support

- The default initialization of hdr_max is the minimum -
sizeof(struct iscsi_cmd) - Once this patch goes into iser the default
initialization at libiscsi can be removed.
- This is not yet full support for AHSs at iser end. But it should be easy.
Just allocate more space at iser_desc right after iscsi_hdr. Than
at transmission time use ctask->hdr_len to retrieve the total
size of all iscsi pdu headers. See previous patch at iscsi_tcp.[ch]

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
843c0a8a76078cf961b244b839683d0667313740 13-Dec-2007 Mike Christie <michaelc@cs.wisc.edu> [SCSI] libiscsi, iscsi_tcp: add device support

This patch adds logical unit reset support. This should work for ib_iser,
but I have not finished testing that driver so it is not hooked in yet.

This patch also temporarily reverts the iscsi_tcp r2t write out patch.
That code is completely rewritten in this patchset.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ser/iscsi_iser.c
ad696989b4a2fce8494964814376aef41da3ff55 04-Jan-2008 Dave Dillow <dillowda@ornl.gov> IB/srp: Release transport before removing host

The documented call sequence for removing a host is to call the
transport xxx_remove_host() prior to scsi_remove_host(). The SRP
transport used to crash when that order was followed, but as it is now
fixed, use the documented order.

Signed-off-by: David Dillow <dillowda@ornl.gov>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
b0e47c8b79154772a436f25bf7646733e1d6194c 03-Jan-2008 David Dillow <dillowda@ornl.gov> IB/srp: Fix list corruption/oops on module reload

Add a missing call to srp_remove_host() in srp_remove_one() so that we
don't leak SRP transport class list entries.

Tested-by: David Dillow <dillowda@ornl.gov>
Acked-by: FUJITA Tomonori <tomof@acm.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
1401b53acc0328d96bacb2a3393d2852699df96b 26-Nov-2007 Jack Morgenstein <jackm@dev.mellanox.co.il> IPoIB: Fix oops if xmit is called when priv->broadcast is NULL

If a port goes down, ipoib_ib_dev_down() is invoked -- which flushes
the mcasts (clearing priv->broadcast) and clearing the path record
cache. If ipoib_start_xmit() is then invoked (before the broadcast
group is rejoined), a kernel oops results from attempting to access
priv->broadcast, which is still unset.

Returning NULL from path_rec_create() if priv->broadcast is NULL is a
harmless way of bypassing the problem -- the offending packet is
simply discarded "without prejudice."

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
a316b79c3306c59176d7ae04e4aad12374dfed37 21-Nov-2007 Erez Zilber <erezz@voltaire.com> IB/iser: Add missing counter increment in iser_data_buf_aligned_len()

While adding sg chaining support to iSER, a "for" loop was replaced
with a "for_each_sg" loop. The "for" loop included the incrementation
of 2 variables. Only one of them is incremented in the current
"for_each_sg" loop. This caused iSER to think that all data is
unaligned, and all data was copied to aligned buffers.

This patch increments the missing counter inside the "for_each_sg"
loop whenever necessary.

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iser_memory.c
09f60f8f54c5e2391f0b7c38dccd7b00d83587ab 26-Oct-2007 Roland Dreier <rolandd@cisco.com> IPoIB/cm: Fix receive QP cleanup

Commit 1b524963 ("IPoIB/cm: Use common CQ for CM send completions")
changed how the high-order bits of work request IDs were used, which
had the effect that IPOIB_CM_RX_DRAIN_WRID was no longer handled as a
connected mode receive completion. This leads to the messages

ib1: cm send completion event with wrid 1073741823 (> 64)
ib1: RX drain timing out

when an interface with connected mode QPs is brought down. Fix this
by making sure that both IPOIB_OP_CM and IPOIB_OP_RECV are set in
IPOIB_CM_RX_DRAIN_WRID.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
0b776eb5426752d4e53354ac89e3710d857e09a7 23-Oct-2007 Linus Torvalds <torvalds@woody.linux-foundation.org> Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
mlx4_core: Increase command timeout for INIT_HCA to 10 seconds
IPoIB/cm: Use common CQ for CM send completions
IB/uverbs: Fix checking of userspace object ownership
IB/mlx4: Sanity check userspace send queue sizes
IPoIB: Rewrite "if (!likely(...))" as "if (unlikely(!(...)))"
IB/ehca: Enable large page MRs by default
IB/ehca: Change meaning of hca_cap_mr_pgsize
IB/ehca: Fix ehca_encode_hwpage_size() and alloc_fmr()
IB/ehca: Fix masking error in {,re}reg_phys_mr()
IB/ehca: Supply QP token for SRQ base QPs
IPoIB: Use round_jiffies() for ah_reap_task
RDMA/cma: Fix deadlock destroying listen requests
RDMA/cma: Add locking around QP accesses
IB/mthca: Avoid alignment traps when writing doorbells
mlx4_core: Kill mlx4_write64_raw()
45711f1af6eff1a6d010703b4862e0d2b9afd056 22-Oct-2007 Jens Axboe <jens.axboe@oracle.com> [SG] Update drivers to use sg helpers

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
ser/iser_memory.c
1b524963fd2d7fb20ea68df497151aa9d17fbca4 16-Aug-2007 Michael S. Tsirkin <mst@mellanox.co.il> IPoIB/cm: Use common CQ for CM send completions

Use the same CQ for CM send completions as for all other IPoIB
completions. This means all completions are processed via the same
NAPI polling routine. This should help reduce the number of
interrupts for bi-directional traffic (such as TCP) and fixes "driver
is hogging interrupts" errors reported for IPoIB send side, e.g.
<https://bugs.openfabrics.org/show_bug.cgi?id=508>

To do this, keep a per-interface counter of outstanding send WRs, and
stop the interface when this counter reaches the send queue size to
avoid CQ overruns.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_cm.c
poib/ipoib_ib.c
poib/ipoib_main.c
fd312561adcc90e924f35d3032d5493aeb4c3017 18-Oct-2007 Roland Dreier <rolandd@cisco.com> IPoIB: Rewrite "if (!likely(...))" as "if (unlikely(!(...)))"

It's too hard to figure out what "!likely(...)" really means, and who
knows how compilers interpret the hint.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
69fc507a1424ce31d2096be5b1e5b1750bdfe235 15-Oct-2007 Anton Blanchard <anton@samba.org> IPoIB: Use round_jiffies() for ah_reap_task

Use round_jiffies() to align the 1 second ah_reap_task with other work
and potentially save power by sleeping cores for longer.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_ib.c
53d412fce05e73dd0b25b0ebfa83c7ee94f16451 24-Jul-2007 Jens Axboe <jens.axboe@oracle.com> infiniband: sg chaining support

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
ser/iser_memory.c
200d1713b47200aa478f27e454e3d957264d49be 10-Oct-2007 Moni Shoua <monis@voltaire.com> IB/ipoib: Verify address handle validity on send

When the bonding device senses a carrier loss of its active slave it replaces
that slave with a new one. In between the times when the carrier of an IPoIB
device goes down and ipoib_neigh is destroyed, it is possible that the
bonding driver will send a packet on a new slave that uses an old ipoib_neigh.
This patch detects and prevents this from happenning.

Signed-off-by: Moni Shoua <monis at voltaire.com>
Signed-off-by: Or Gerlitz <ogerlitz at voltaire.com>
Acked-by: Roland Dreier <rdreier@cisco.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
poib/ipoib_main.c
732a2170f499ce7cf5f0bdd4f9e0b0c8337b67e1 10-Oct-2007 Moni Shoua <monis@voltaire.com> IB/ipoib: Bound the net device to the ipoib_neigh structue

IPoIB uses a two layer neighboring scheme, such that for each struct neighbour
whose device is an ipoib one, there is a struct ipoib_neigh buddy which is
created on demand at the tx flow by an ipoib_neigh_alloc(skb->dst->neighbour)
call.

When using the bonding driver, neighbours are created by the net stack on behalf
of the bonding (master) device. On the tx flow the bonding code gets an skb such
that skb->dev points to the master device, it changes this skb to point on the
slave device and calls the slave hard_start_xmit function.

Under this scheme, ipoib_neigh_destructor assumption that for each struct
neighbour it gets, n->dev is an ipoib device and hence netdev_priv(n->dev)
can be casted to struct ipoib_dev_priv is buggy.

To fix it, this patch adds a dev field to struct ipoib_neigh which is used
instead of the struct neighbour dev one, when n->dev->flags has the
IFF_MASTER bit set.

Signed-off-by: Moni Shoua <monis at voltaire.com>
Signed-off-by: Or Gerlitz <ogerlitz at voltaire.com>
Acked-by: Roland Dreier <rdreier@cisco.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
poib/ipoib.h
poib/ipoib_main.c
poib/ipoib_multicast.c
df3d80f5a5c74168be42788364d13cf6c83c7b9c 15-Oct-2007 Linus Torvalds <torvalds@woody.linux-foundation.org> Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (207 commits)
[SCSI] gdth: fix CONFIG_ISA build failure
[SCSI] esp_scsi: remove __dev{init,exit}
[SCSI] gdth: !use_sg cleanup and use of scsi accessors
[SCSI] gdth: Move members from SCp to gdth_cmndinfo, stage 2
[SCSI] gdth: Setup proper per-command private data
[SCSI] gdth: Remove gdth_ctr_tab[]
[SCSI] gdth: switch to modern scsi host registration
[SCSI] gdth: gdth_interrupt() gdth_get_status() & gdth_wait() fixes
[SCSI] gdth: clean up host private data
[SCSI] gdth: Remove virt hosts
[SCSI] gdth: Reorder scsi_host_template intitializers
[SCSI] gdth: kill gdth_{read,write}[bwl] wrappers
[SCSI] gdth: Remove 2.4.x support, in-kernel changelog
[SCSI] gdth: split out pci probing
[SCSI] gdth: split out eisa probing
[SCSI] gdth: split out isa probing
gdth: Make one abuse of scsi_cmnd less obvious
[SCSI] NCR5380: Use scsi_eh API for REQUEST_SENSE invocation
[SCSI] usb storage: use scsi_eh API in REQUEST_SENSE execution
[SCSI] scsi_error: Refactoring scsi_error to facilitate in synchronous REQUEST_SENSE
...
aebd5e476ecc8ceb53577b20f2a352ff4ceffd8d 11-Jul-2007 FUJITA Tomonori <tomof@acm.org> [SCSI] transport_srp: add rport roles attribute

This adds a 'roles' attribute to rport like transport_fc. The role can
be initiator or target. That is, the initiator driver creates target
remote ports and the target driver creates initiator remote ports.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
rp/ib_srp.c
3236822b1c9b67ad10745d965515b528818f1120 27-Jun-2007 FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> [SCSI] ib_srp: convert to use the srp transport class

This converts ib_srp to use the srp transport class.

I don't have ib hardware so I've not tested this patch.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Roland Dreier <rolandd@cisco.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
rp/Kconfig
rp/ib_srp.c
ce9d3c9a6a9aef61525be07fe6ba27d937236aa2 12-Oct-2007 Linus Torvalds <torvalds@woody.linux-foundation.org> Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (87 commits)
mlx4_core: Fix section mismatches
IPoIB: Allow setting policy to ignore multicast groups
IB/mthca: Mark error paths as unlikely() in post_srq_recv functions
IB/ipath: Minor fix to ordering of freeing and zeroing of tid pages.
IB/ipath: Remove redundant link state checks
IB/ipath: Fix IB_EVENT_PORT_ERR event
IB/ipath: Better handling of unexpected GPIO interrupts
IB/ipath: Maintain active time on all chips
IB/ipath: Fix QHT7040 serial number check
IB/ipath: Indicate a couple of chip bugs to userspace
IB/ipath: iba6110 rev4 no longer needs recv header overrun workaround
IB/ipath: Use counters in ipath_poll and cleanup interrupts in ipath_close
IB/ipath: Remove duplicate copy of LMC
IB/ipath: Add ability to set the LMC via the sysfs debugging interface
IB/ipath: Optimize completion queue entry insertion and polling
IB/ipath: Implement IB_EVENT_QP_LAST_WQE_REACHED
IB/ipath: Generate flush CQE when QP is in error state
IB/ipath: Remove redundant code
IB/ipath: Future proof eeprom checksum code (contents reading)
IB/ipath: UC RDMA WRITE with IMMEDIATE doesn't send the immediate
...
9153f66a5b8e63c61374df4e6a4cbd0056e45178 10-Oct-2007 Roland Dreier <rdreier@cisco.com> IPoIB: Fix unused variable warning

The conversion to use netdevice internal stats left an unused variable
in ipoib_neigh_free(), since there's no longer any reason to get
netdev_priv() in order to increment dropped packets. Delete the
unused priv variable.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
poib/ipoib_main.c
de90351219a1f1fd3cb45cf6fcc4e9d6407fd2c9 29-Sep-2007 Roland Dreier <rolandd@cisco.com> [IPoIB]: Convert to netdevice internal stats

Use the stats member of struct netdevice in IPoIB, so we can save
memory by deleting the stats member of struct ipoib_dev_priv, and save
code by deleting ipoib_get_stats().

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib.h
poib/ipoib_cm.c
poib/ipoib_ib.c
poib/ipoib_main.c
poib/ipoib_multicast.c
3b04ddde02cf1b6f14f2697da5c20eca5715017f 09-Oct-2007 Stephen Hemminger <shemminger@linux-foundation.org> [NET]: Move hardware header operations out of netdevice.

Since hardware header operations are part of the protocol class
not the device instance, make them into a separate object and
save memory.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_main.c
10d024c1b2fd58af8362670d7d6e5ae52fc33353 17-Sep-2007 Ralf Baechle <ralf@linux-mips.org> [NET]: Nuke SET_MODULE_OWNER macro.

It's been a useless no-op for long enough in 2.6 so I figured it's time to
remove it. The number of people that could object because they're
maintaining unified 2.4 and 2.6 drivers is probably rather small.

[ Handled drivers added by netdev tree and some missed IRDA cases... -DaveM ]

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_main.c
bea3348eef27e6044b6161fd04c3152215f96411 04-Oct-2007 Stephen Hemminger <shemminger@linux-foundation.org> [NET]: Make NAPI polling independent of struct net_device objects.

Several devices have multiple independant RX queues per net
device, and some have a single interrupt doorbell for several
queues.

In either case, it's easier to support layouts like that if the
structure representing the poll is independant from the net
device itself.

The signature of the ->poll() call back goes from:

int foo_poll(struct net_device *dev, int *budget)

to

int foo_poll(struct napi_struct *napi, int budget)

The caller is returned the number of RX packets processed (or
the number of "NAPI credits" consumed if you want to get
abstract). The callee no longer messes around bumping
dev->quota, *budget, etc. because that is all handled in the
caller upon return.

The napi_struct is to be embedded in the device driver private data
structures.

Furthermore, it is the driver's responsibility to disable all NAPI
instances in it's ->stop() device close handler. Since the
napi_struct is privatized into the driver's private data structures,
only the driver knows how to get at all of the napi_struct instances
it may have per-device.

With lots of help and suggestions from Rusty Russell, Roland Dreier,
Michael Chan, Jeff Garzik, and Jamal Hadi Salim.

Bug fixes from Thomas Graf, Roland Dreier, Peter Zijlstra,
Joseph Fannin, Scott Wood, Hans J. Koch, and Michael Chan.

[ Ported to current tree and all drivers converted. Integrated
Stephen's follow-on kerneldoc additions, and restored poll_list
handling to the old style to fix mutual exclusion issues. -DaveM ]

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib.h
poib/ipoib_ib.c
poib/ipoib_main.c
335a64a5a958002bc238c90de695e120c3c8c120 08-Oct-2007 Or Gerlitz <ogerlitz@voltaire.com> IPoIB: Allow setting policy to ignore multicast groups

The kernel IB stack allows (through the RDMA CM) userspace
applications to join and use multicast groups from the IPoIB MGID
range. This allows multicast traffic to be handled directly from
userspace QPs, without going through the kernel stack, which gives
better performance for some applications.

However, to fully interoperate with IP multicast, such userspace
applications need to participate in IGMP reports and queries, or else
routers may not forward the multicast traffic to the system where the
application is running. The simplest way to do this is to share the
kernel IGMP implementation by using the IP_ADD_MEMBERSHIP option to
join multicast groups that are being handled directly in userspace.

However, in such cases, the actual multicast traffic should not also
be handled by the IPoIB interface, because that would burn resources
handling multicast packets that will just be discarded in the kernel.

To handle this, this patch adds lookup on the database used for IB
multicast group reference counting when IPoIB is joining multicast
groups, and if a multicast group is already handled by user space,
then the IPoIB kernel driver ignores the group. This is controlled by
a per-interface policy flag. When the flag is set, IPoIB will not
join and attach its QP to a multicast group which already has an entry
in the database; when the flag is cleared, IPoIB will behave as before
this change.

For each IPoIB interface, the /sys/class/net/$intf/umcast attribute
controls the policy flag. The default value is off/0.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_main.c
poib/ipoib_multicast.c
poib/ipoib_vlan.c
ede6bc04f3a07a9c93f02c92cdc281d254398321 07-Oct-2007 Dotan Barak <dotanb@dev.mellanox.co.il> IPoIB/cm: Clean up initialization of QP attr in ipoib_cm_create_tx_qp()

Make the way QP is being created in ipoib_cm_create_tx_qp()
consistent with ipoib_cm_create_rx_qp().

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
ec2a1344ad348a789b1d9d9b32cccbef33161574 10-Oct-2007 Roland Dreier <rolandd@cisco.com> IB/iser: Remove unnecessary includes

<asm/scatterlist.h> is not needed because everyplace it appears,
<linux/scatterlist.h> also appears. <asm/io.h> is not needed because
nothing seems to be using device IO anyway.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iser_initiator.c
ser/iser_memory.c
ser/iser_verbs.c
247e020ee5e2a7bf46f2d7a3d4490a670a712a40 09-Aug-2007 Sean Hefty <sean.hefty@intel.com> IB/srp: Add QoS support through service ID

Provide the target service ID when performing a path record query to
support optional QoS capability. QoS requires support from the SA.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
81668838c4583b19276b16382e0c61e21ef5adf0 02-Aug-2007 Sean Hefty <sean.hefty@intel.com> IPoIB: Specify Traffic Class with path record queries for QoS support

To support QoS within and between subnets, modify IPoIB to request
specific Traffic Class values with path record queries, using
the value associated with the IPoIB broadcast group.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>

[ See some comments I made on this at v1 and v2 of the posts
<http://lists.openfabrics.org/pipermail/general/2007-August/039275.html>
<http://lists.openfabrics.org/pipermail/general/2007-September/040312.html> ]

Reviewed-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_main.c
poib/ipoib_multicast.c
ca6de177acef8f2c7c3901ea583a263364ca7bbb 21-Aug-2007 Eli Cohen <eli@mellanox.co.il> IPoIB: Fix error path memory leak

Clean up properly if ib_query_pkey() or ib_query_gid() fail.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
b3ac60fc243f2312d27ecded058ef96f52f25fe0 10-Oct-2007 Eli Cohen <eli@mellanox.co.il> IPoIB: Fix typo to end statement with ';' instead of ','

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_verbs.c
ce423ef50ee1b6b7db63c748034423aa0afce224 10-Oct-2007 Roland Dreier <rolandd@cisco.com> IPoIB: Make sure no receives are handled when stopping device

The current IPoIB code might process receive completions from
ipoib_drain_cq() when bringing down the interface. This could cause
packets to be passed up the stack without the device's poll method
being called. Avoid this by setting the status of any successful
completions to IB_WC_WR_FLUSH_ERR.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_ib.c
6958e827f187c9c5cd39af075567f74f02bf3dd1 06-Aug-2007 Jack Morgenstein <jackm@dev.mellanox.co.il> IPoIB: Fix leak in ipoib_transport_dev_init() error path

ipoib_transport_dev_init() calls ipoib_cm_dev_init(), so it needs to
call ipoib_cm_dev_cleanup() to unwind that on the error path.

Found by Dotan Barak of Mellanox.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_verbs.c
3d1ff48da760968793f3c36672961ffd23088787 03-Aug-2007 Raghava Kondapalli <rakondap@cisco.com> IB/srp: Add OUI for new Cisco targets

New Cisco IB SRP targets use the Cisco OUI 00-1b-0d but still need the
Topspin workarounds. Add this OUI to srp_target_is_topspin().

Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
5d7cbfd63136e4469a896acfadb33e19ed62f068 03-Aug-2007 Roland Dreier <rolandd@cisco.com> IB/srp: Wrap OUI checking for workarounds in helper functions

Wrap the checking for Mellanox and Topspin OUIs to decide whether to
use a workaround into helper functions. This will make it cleaner to
add a new OUI to check (as we need to do now that some targets with a
Cisco OUI still need the Topspin workarounds).

Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
7974392c0b0d4e7a2a17ca3597d51a29b9841aa5 26-Jul-2007 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iscsi_tcp, ib_iser Enable module refcounting for iscsi host template

This prevents the iscsi modules from being unloaded while
there are active mounts from an iscsi target.

Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
ser/iscsi_iser.c
20c2df83d25c6a95affe6157a4c9cac4cf5ffaac 20-Jul-2007 Paul Mundt <lethal@linux-sh.org> mm: Remove slab destructors from kmem_cache_create().

Slab destructors were no longer supported after Christoph's
c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been
BUGs for both slab and slub, and slob never supported them
either.

This rips out support for the dtor pointer from kmem_cache_create()
completely and fixes up every single callsite in the kernel (there were
about 224, not including the slab allocator definitions themselves,
or the documentation references).

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
ser/iscsi_iser.c
41179e2de6962b46d1d9f2b4437243ac740efdec 18-Jul-2007 Roland Dreier <rolandd@cisco.com> IB/iser: Make a couple of functions static

Make iser_conn_release() and iser_start_rdma_unaligned_sg() static,
since they are only used in the .c file where they are defined. In
addition to being a cleanup, this even shrinks the generated code by
allowing the single call of iser_start_rdma_unaligned_sg() to be
inlined into its callsite. On x86_64:

add/remove: 0/1 grow/shrink: 1/0 up/down: 466/-533 (-67)
function old new delta
iser_reg_rdma_mem 1518 1984 +466
iser_start_rdma_unaligned_sg 533 - -533

Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.h
ser/iser_memory.c
ser/iser_verbs.c
bc06cffdec85d487c77109dffcd2f285bdc502d3 16-Jul-2007 Linus Torvalds <torvalds@woody.linux-foundation.org> Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (166 commits)
[SCSI] ibmvscsi: convert to use the data buffer accessors
[SCSI] dc395x: convert to use the data buffer accessors
[SCSI] ncr53c8xx: convert to use the data buffer accessors
[SCSI] sym53c8xx: convert to use the data buffer accessors
[SCSI] ppa: coding police and printk levels
[SCSI] aic7xxx_old: remove redundant GFP_ATOMIC from kmalloc
[SCSI] i2o: remove redundant GFP_ATOMIC from kmalloc from device.c
[SCSI] remove the dead CYBERSTORMIII_SCSI option
[SCSI] don't build scsi_dma_{map,unmap} for !HAS_DMA
[SCSI] Clean up scsi_add_lun a bit
[SCSI] 53c700: Remove printk, which triggers because of low scsi clock on SNI RMs
[SCSI] sni_53c710: Cleanup
[SCSI] qla4xxx: Fix underrun/overrun conditions
[SCSI] megaraid_mbox: use mutex instead of semaphore
[SCSI] aacraid: add 51245, 51645 and 52245 adapters to documentation.
[SCSI] qla2xxx: update version to 8.02.00-k1.
[SCSI] qla2xxx: add support for NPIV
[SCSI] stex: use resid for xfer len information
[SCSI] Add Brownie 1200U3P to blacklist
[SCSI] scsi.c: convert to use the data buffer accessors
...
1d84612649427a85e1f311baa7215f9a6252d856 18-Jun-2007 Sean Hefty <sean.hefty@intel.com> IB/cm: Include HCA ACK delay in local ACK timeout

The IB CM should include the HCA ACK delay when calculating the local
ACK timeout value to use for RC QPs. If the HCA ACK delay is large
enough relative to the packet life time, then if it is not taken into
account, the calculated timeout value ends up being too small, which
can result in "retry exceeded" errors.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
1b844afe9e67d6cd441ae6df71051b4004f31dd2 10-Jul-2007 Roland Dreier <rolandd@cisco.com> IPoIB: Recycle loopback skbs instead of freeing and reallocating

InfiniBand HCAs replicate multicast packets back to the QP that sent
them if that QP is attached to the destination multicast group. This
means that IPoIB multicasts are often replicated back to the receive
queue of the interface that generated them. To avoid confusing the
network stack, we drop these duplicates within the IPoIB driver.

However, there's no reason to free the skb that received the duplicate
and then immediately allocate a new skb to post to the receive queue.
We can be more efficient and just repost the same skb.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_ib.c
20089ca55786a243c7b72becd1bf670f4e2c2028 10-Jul-2007 Roland Dreier <rolandd@cisco.com> IPoIB/cm: Fix warning if IPV6 is not enabled

Fix

drivers/infiniband/ulp/ipoib/ipoib_cm.c:1151: warning: unused variable 'dev'

by getting rid of the variable dev, which is only used if CONFIG_IPV6
is enabled, and replacing the one use of it with the value it is
assigned, namely priv->dev.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
06cc85086e6896939f8c68f8518224748f6b0b2f 23-May-2007 Jan Engelhardt <jengelh@linux01.gwdg.de> IB: Use menuconfig for InfiniBand menu

Change Kconfig objects from "menu, config" into "menuconfig" so
that the user can disable the whole feature without having to
enter the menu first.

Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/Kconfig
ser/Kconfig
rp/Kconfig
841adfca9c5fc0fec6b1f0b2e5eb7a3b239a7730 29-Jun-2007 Ralph Campbell <ralph.campbell@qlogic.com> IPoIB/cm: Partial error clean up unmaps wrong address

If a page can't be allocated for the frag list of a skb, the code to
unmap the partially allocated list is off by one. For exaple, if
'frags' equals one, i == 0, and the alloc_page() fails, then the old
loop would have unmapped mapping[1] which is uninitialized. The same
would happen if the call to ib_dma_map_page() failed.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Acked-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
13ef5f44c3931dff1d75443a875e97b588d4b8f0 21-Jun-2007 Roland Dreier <rolandd@cisco.com> IPoIB/cm: Remove dead definition of struct ipoib_cm_id

It's completely unused.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
82c3aca6ad9004169df8f2f8c0747686fe4003b3 20-Jun-2007 Michael S. Tsirkin <mst@dev.mellanox.co.il> IPoIB/cm: Fix interoperability when MTU doesn't match

IPoIB connected mode currently rejects a connection request unless the
supported MTU is >= the local netdevice MTU. This breaks
interoperability with implementations that might have tweaked
IPOIB_CM_MTU, and there's real no longer a reason to do so: this test
is just a leftover from when we did not tweak MTU per-connection. Fix
this by making the test as permissive as possible.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
3ec7393a6858a1716e74aa81be6af76fd180021d 19-Jun-2007 Michael S. Tsirkin <mst@dev.mellanox.co.il> IPoIB/cm: Initialize RX before moving QP to RTR

Fix a crasher bug in IPoIB CM: once a QP is in the RTR state, a
receive completion (or even an asynchronous error) might be observed
on this QP, so we have to initialize all of our receive data
structures before moving to the RTR state.

As an optimization (since modify_qp might take a long time), the
jiffies update done when moving RX to the passive_ids list is also
left in place to reduce the chance of the RX being misdetected as
stale.

This fixes bug <https://bugs.openfabrics.org/show_bug.cgi?id=662>.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
da9c0c770e775e655e3f77c96d91ee557b117adb 01-Jun-2007 FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> [SCSI] iscsi_iser: convert to use the data buffer accessors

iscsi_iser: convert to use the data buffer accessors

- remove the unnecessary map_single path.

- convert to use the new accessors for the sg lists and the
parameters.

TODO: use scsi_for_each_sg().

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
ser/iscsi_iser.c
ser/iser_initiator.c
bb350d1decd9c48ffaa7f7e263df3056df9f4f21 25-May-2007 FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> [SCSI] ib_srp: convert to use the data buffer accessors

- remove the unnecessary map_single path.

- convert to use the new accessors for the sg lists and the
parameters.

Jens Axboe <jens.axboe@oracle.com> did the for_each_sg cleanup.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Roland Dreier <rdreier@cisco.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
rp/ib_srp.c
rp/ib_srp.h
d8196ed2181b4595eaf464a5bcbddb6c28649a39 30-May-2007 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iscsi class, iscsi_tcp, iser, qla4xxx: add netdevname sysfs attr

iSCSI must support software iscsi (iscsi_tcp, iser), hardware iscsi (qla4xxx),
and partial offload (broadcom). To be able to allow each stack or driver
or port (virtual or physical) to be able to log into the same target portal
we use the initiator tuple [[HWADDRESS | NETDEVNAME], INITIATOR_NAME] and
the target tuple [TARGETNAME, CONN_ADDRESS, CONN_PORT] to id a session.
This patch adds the netdev name, which is used by software iscsi when
it binds a session to a netdevice using the SO_BINDTODEVICE sock opt.
It cannot use HWADDRESS because if someone did vlans then the same netdevice
will have the same mac and the initiator,target id will not be unique.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Roland Dreier <rdreier@cisco.com>
Cc: David C Somayajulu <david.somayajulu@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
ser/iscsi_iser.c
1548271ece9e9312fd5feb41fd58773b56a71d39 30-May-2007 Mike Christie <michaelc@cs.wisc.edu> [SCSI] libiscsi: make can_queue configurable

This patch allows us to set can_queue and cmds_per_lun from userspace
when we create the session/host. From there we can set it on a per
target basis. The patch fully converts iscsi_tcp, but only hooks
up ib_iser for cmd_per_lun since it currently has a lots of preallocations
based on can_queue.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Roland Dreier <rdreier@cisco.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
ser/iscsi_iser.c
ser/iscsi_iser.h
ser/iser_verbs.c
77a23c21aaa723f6b0ffc4a701be8c8e5a32346d 30-May-2007 Mike Christie <michaelc@cs.wisc.edu> [SCSI] libiscsi: fix iscsi cmdsn allocation

The cmdsn allocation and pdu transmit code can race, and we can end
up sending a pdu with cmdsn 10 before a pdu with 5. The target will
then fail the connection/session. This patch fixes the problem by
delaying the cmdsn allocation until we are about to send the pdu.

This also removes the xmitmutex. We were using the connection xmitmutex
during error handling to handle races with mtask and ctask cleanup and
completion. For ctasks we now have nice refcounting and for the mtask,
if we hit the case where the mtask timesout and it is floating
around somewhere in the driver, we end up dropping the session.
And to handle session level cleanup, we use the xmit suspend bit
along with scsi_flush_queue and the session lock to make sure
that the xmit thread is not possibly transmitting a task while
we are trying to kill it.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Roland Dreier <rdreier@cisco.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
ser/iscsi_iser.c
b2c6416736b847b91950bd43cc5153e11a1f83ee 30-May-2007 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iscsi class, iscsi_tcp, ib_iser: add sysfs chap file

The attached patches add sysfs files for the chap settings
to the iscsi transport class, iscsi_tcp and ib_iser. This is
needed for software iscsi because there are times when iscsid
can die and it will need to reread the values it was using.
And it is needed by qla4xxx for basic management opertaions.
This patch does not hook in qla4xxx yet, because I am not sure
the mbx command to use.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Roland Dreier <rdreier@cisco.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
ser/iscsi_iser.c
857ae0bdb72999936a28ce621e38e2e288c485da 30-May-2007 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iscsi: Some fixes in preparation for bidirectional support - total_length

- Remove shadow of request length from struct iscsi_cmd_task.
- change all users to use scsi_cmnd->request_bufflen directly

(With bidi we will use scsi-ml API to retrieve in/out length)

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Roland Dreier <rdreier@cisco.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
ser/iscsi_iser.c
8ad5781ae9702a8f95cfdf30967752e4297613ee 30-May-2007 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iscsi class, qla4xxx, iscsi_tcp, ib_iser: export/set initiator name

For iscsi root boot, software iscsi needs to know what the BIOS/OF
initiator used for the initiator name so this puts it in sysfs
for userspace to be able to pick up.

For hw iscsi, it is nice to see what the card is using.

This patch adds the new param, and hooks in qla4xxx, iscsi_tcp, and ib_iser.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Roland Dreier <rdreier@cisco.com>
Cc: David C Somayajulu <david.somayajulu@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
ser/iscsi_iser.c
0801c242a33426fddc005c2f559a3d2fa6fca7eb 30-May-2007 Mike Christie <michaelc@cs.wisc.edu> [SCSI] libiscsi, iscsi_tcp, ib_iser : add sw iscsi host get/set params helpers

iscsid and udev need to key off the hw address being
used so add some helpers for iser and iscsi tcp.

Also convert them

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Roland Dreier <rdreier@cisco.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
ser/iscsi_iser.c
ec56dc0b7f6c3fec20bbc2e98ff1a06edf2fc9b9 28-May-2007 Michael S. Tsirkin <mst@dev.mellanox.co.il> IPoIB/cm: Fix performance regression on Mellanox

commit 518b1646 ("IPoIB/cm: Fix SRQ WR leak") introduced a severe
performance regression on Mellanox cards, because keeping a QP in the
error state for extended periods of time moves hardware to the slow
path (until the QP is destroyed). For example, MPI latency goes from
~3 usecs to ~7 usecs.

Fix this by posting a send WR on one of the QPs that are being
flushed, instead of using a separate drain QP that is kept in the
error state.

This fixes bug <https://bugs.openfabrics.org/show_bug.cgi?id=636>,
reported and bisected by Scott Weitzenkamp at Cisco and debugged by
Sasha Mikheev at Voltaire.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_cm.c
2dfbfc37121d307e1f1d24c2979382cb17b19347 24-May-2007 Michael S. Tsirkin <mst@dev.mellanox.co.il> IPoIB/cm: Drain cq in ipoib_cm_dev_stop()

Since NAPI polling is disabled while ipoib_cm_dev_stop() is running,
ipoib_cm_dev_stop() must poll the CQ itself in order to see the
packets draining.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_cm.c
poib/ipoib_ib.c
8fd357a6e3375083f7d321413eb8f6739491f342 24-May-2007 Michael S. Tsirkin <mst@dev.mellanox.co.il> IPoIB/cm: Fix timeout check in ipoib_cm_dev_stop()

time_after() was used backwards, so the timeout occurred immediately.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
518b1646f8a31904ca637b8df0c1e31c34a7a3c2 21-May-2007 Michael S. Tsirkin <mst@dev.mellanox.co.il> IPoIB/cm: Fix SRQ WR leak

SRQ WR leakage has been observed with IPoIB/CM: e.g. flipping ports on
and off will, with time, leak out all WRs and then all connections
will start getting RNR NAKs. Fix this in the way suggested by spec:
move the QP being destroyed to the error state, wait for "Last WQE
Reached" event and then post WR on a "drain QP" connected to the same
CQ. Once we observe a completion on the drain QP, it's safe to call
ib_destroy_qp.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_cm.c
poib/ipoib_verbs.c
24bd1e4e32e88cd3d0675482d15bea498a922ca8 18-May-2007 Michael S. Tsirkin <mst@dev.mellanox.co.il> IB/ipoib: Fix typos in error messages

Trivial error message fixups.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_ib.c
poib/ipoib_multicast.c
26bbf13ce1ca21ec69175bcc4b995cb8ffdf8593 19-May-2007 Yosef Etigin <yosefe@voltaire.com> IPoIB: Handle P_Key table reordering

SM reconfiguration or failover possibly causes a shuffling of the values
in the P_Key table. Right now, IPoIB only queries for the P_Key index
once when it creates the device QP, and hence there are problems if the
index of a P_Key value changes. Fix this by using the PKEY_CHANGE event
to trigger a recheck of the P_Key index.

Signed-off-by: Yosef Etigin <yosefe@voltaire.com>
Acked-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_ib.c
poib/ipoib_main.c
poib/ipoib_verbs.c
7c5b9ef8577bfa7b74ea58fc9ff2934ffce13532 14-May-2007 Michael S. Tsirkin <mst@dev.mellanox.co.il> IPoIB/cm: Optimize stale connection detection

In the presence of some running RX connections, we repeat
queue_delayed_work calls each 4 RX WRs, which is a waste. It's enough
to start stale task when a first passive connection is added, and
rerun it every IPOIB_CM_RX_DELAY as long as there are outstanding
passive connections.

This removes some code from RX data path.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
e63340ae6b6205fef26b40a75673d1c9c0c8bb90 08-May-2007 Randy Dunlap <randy.dunlap@oracle.com> header cleaning: don't include smp_lock.h when not used

Remove includes of <linux/smp_lock.h> where it is not used/needed.
Suggested by Al Viro.

Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ser/iser_verbs.c
972d45fb43f0f0793fa275c4a22998106760cd61 07-May-2007 Linus Torvalds <torvalds@woody.linux-foundation.org> Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband:
IPoIB: Convert to NAPI
IB: Return "maybe missed event" hint from ib_req_notify_cq()
IB: Add CQ comp_vector support
IB/ipath: Fix a race condition when generating ACKs
IB/ipath: Fix two more spin lock problems
IB/fmr_pool: Add prefix to all printks
IB/srp: Set proc_name
IB/srp: Add orig_dgid sysfs attribute to scsi_host
IPoIB/cm: Don't crash if remote side uses one QP for both directions
RDMA/cxgb3: Support for new abort logic
RDMA/cxgb3: Initialize cpu_idx field in cpl_close_listserv_req message
RDMA/cxgb3: Fail qp creation if the requested max_inline is too large
RDMA/cxgb3: Fix TERM codes
IPoIB/cm: Fix error handling in ipoib_cm_dev_open()
IB/ipath: Don't corrupt pending mmap list when unmapped objects are freed
IB/mthca: Work around kernel QP starvation
IB/ipath: Don't put QP in timeout queue if waiting to send
IB/ipath: Don't call spin_lock_irq() from interrupt context
8d1cc86a6278687efbab7b8c294ab01efe4d4231 07-May-2007 Roland Dreier <rolandd@cisco.com> IPoIB: Convert to NAPI

Convert the IP-over-InfiniBand network device driver over to using
NAPI to handle completions for the main CQ. This covers all receives
as well as datagram mode sends; send completions for connected mode
connections are still handled from interrupt context.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_cm.c
poib/ipoib_ib.c
poib/ipoib_main.c
f4fd0b224d60044d2da5ca02f8f2b5150c1d8731 03-May-2007 Michael S. Tsirkin <mst@dev.mellanox.co.il> IB: Add CQ comp_vector support

Add a num_comp_vectors member to struct ib_device and extend
ib_create_cq() to pass in a comp_vector parameter -- this parallels
the userspace libibverbs API. Update all hardware drivers to set
num_comp_vectors to 1 and have all ULPs pass 0 for the comp_vector
value. Pass the value of num_comp_vectors to userspace rather than
hard-coding a value of 1.

We want multiple CQ event vector support (via MSI-X or similar for
adapters that can generate multiple interrupts), but it's not clear
how many vectors we want, or how we want to deal with policy issues
such as how to decide which vector to use or how to set up interrupt
affinity. This patch is useful for experimenting, since no core
changes will be necessary when updating a driver to support multiple
vectors, and we know that we want to make at least these changes
anyway.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
poib/ipoib_verbs.c
ser/iser_verbs.c
rp/ib_srp.c
b7f008fdc92e498af34671048556fd17ddfe9be9 07-May-2007 Roland Dreier <rolandd@cisco.com> IB/srp: Set proc_name

Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
3633b3d096286cf21bc07b16fa6265fb006d0844 07-May-2007 Ishai Rabinovitz <ishai@mellanox.co.il> IB/srp: Add orig_dgid sysfs attribute to scsi_host

Add an orig_dgid attribute in sysfs for SRP scsi_hosts, so that
userspace can tell what the original dgid value written to the
add_target file was, even if the connection is redirected to a
different port while connecting.

Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
rp/ib_srp.h
d6ef7d68f6f51c5b9de01c542dab8b90067a9c27 02-May-2007 Michael S. Tsirkin <mst@dev.mellanox.co.il> IPoIB/cm: Don't crash if remote side uses one QP for both directions

The IPoIB CM spec allows the use of a single connection in both
active->passive and passive->active directions. The current Linux
code uses one connection for both directions, but if another node only
uses one connection for both directions, we oops when we try to look
up the passive connection. Fix by checking that qp_context is
non-NULL before dereferencing it.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
poib/ipoib_cm.c
4f7a307dc6e4d8bfeb56f7cf7231b08cb845687c 05-May-2007 Linus Torvalds <torvalds@woody.linux-foundation.org> Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (87 commits)
[SCSI] fusion: fix domain validation loops
[SCSI] qla2xxx: fix regression on sparc64
[SCSI] modalias for scsi devices
[SCSI] sg: cap reserved_size values at max_sectors
[SCSI] BusLogic: stop using check_region
[SCSI] tgt: fix rdma transfer bugs
[SCSI] aacraid: fix aacraid not finding device
[SCSI] aacraid: Correct SMC products in aacraid.txt
[SCSI] scsi_error.c: Add EH Start Unit retry
[SCSI] aacraid: [Fastboot] Panics for AACRAID driver during 'insmod' for kexec test.
[SCSI] ipr: Driver version to 2.3.2
[SCSI] ipr: Faster sg list fetch
[SCSI] ipr: Return better qc_issue errors
[SCSI] ipr: Disrupt device error
[SCSI] ipr: Improve async error logging level control
[SCSI] ipr: PCI unblock config access fix
[SCSI] ipr: Fix for oops following SATA request sense
[SCSI] ipr: Log error for SAS dual path switch
[SCSI] ipr: Enable logging of debug error data for all devices
[SCSI] ipr: Add new PCI-E IDs to device table
...
6473d160b4aba8023bcf38519a5989694dfd51a7 06-Mar-2007 Jean Delvare <khali@linux-fr.org> PCI: Cleanup the includes of <linux/pci.h>

I noticed that many source files include <linux/pci.h> while they do
not appear to need it. Here is an attempt to clean it all up.

In order to find all possibly affected files, I searched for all
files including <linux/pci.h> but without any other occurence of "pci"
or "PCI". I removed the include statement from all of these, then I
compiled an allmodconfig kernel on both i386 and x86_64 and fixed the
false positives manually.

My tests covered 66% of the affected files, so there could be false
positives remaining. Untested files are:

arch/alpha/kernel/err_common.c
arch/alpha/kernel/err_ev6.c
arch/alpha/kernel/err_ev7.c
arch/ia64/sn/kernel/huberror.c
arch/ia64/sn/kernel/xpnet.c
arch/m68knommu/kernel/dma.c
arch/mips/lib/iomap.c
arch/powerpc/platforms/pseries/ras.c
arch/ppc/8260_io/enet.c
arch/ppc/8260_io/fcc_enet.c
arch/ppc/8xx_io/enet.c
arch/ppc/syslib/ppc4xx_sgdma.c
arch/sh64/mach-cayman/iomap.c
arch/xtensa/kernel/xtensa_ksyms.c
arch/xtensa/platform-iss/setup.c
drivers/i2c/busses/i2c-at91.c
drivers/i2c/busses/i2c-mpc.c
drivers/media/video/saa711x.c
drivers/misc/hdpuftrs/hdpu_cpustate.c
drivers/misc/hdpuftrs/hdpu_nexus.c
drivers/net/au1000_eth.c
drivers/net/fec_8xx/fec_main.c
drivers/net/fec_8xx/fec_mii.c
drivers/net/fs_enet/fs_enet-main.c
drivers/net/fs_enet/mac-fcc.c
drivers/net/fs_enet/mac-fec.c
drivers/net/fs_enet/mac-scc.c
drivers/net/fs_enet/mii-bitbang.c
drivers/net/fs_enet/mii-fec.c
drivers/net/ibm_emac/ibm_emac_core.c
drivers/net/lasi_82596.c
drivers/parisc/hppb.c
drivers/sbus/sbus.c
drivers/video/g364fb.c
drivers/video/platinumfb.c
drivers/video/stifb.c
drivers/video/valkyriefb.c
include/asm-arm/arch-ixp4xx/dma.h
sound/oss/au1550_ac97.c

I would welcome test reports for these files. I am fine with removing
the untested files from the patch if the general opinion is that these
changes aren't safe. The tested part would still be nice to have.

Note that this patch depends on another header fixup patch I submitted
to LKML yesterday:
[PATCH] scatterlist.h needs types.h
http://lkml.org/lkml/2007/3/01/141

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
poib/ipoib.h
347fcfbed261fdd11f46fa03d524e1bddddab3a6 01-May-2007 Michael S. Tsirkin <mst@dev.mellanox.co.il> IPoIB/cm: Fix error handling in ipoib_cm_dev_open()

If skb allocation fails when we start the device, we call
ipoib_cm_dev_stop() even though ipoib_cm_dev_open() did not run to
completion, so we pass an invalid pointer to ib_destroy_cm_id and get
an oops.

Fix by clearing cm.id on error, and testing it in cm_dev_stop().
This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=561>

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
afc2e82c0851317931a9bfdb98271253371825c6 27-Apr-2007 Linus Torvalds <torvalds@woody.linux-foundation.org> Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: (49 commits)
IB: Set class_dev->dev in core for nice device symlink
IB/ehca: Implement modify_port
IB/umad: Clarify documentation of transaction ID
IPoIB/cm: spin_lock_irqsave() -> spin_lock_irq() replacements
IB/mad: Change SMI to use enums rather than magic return codes
IB/umad: Implement GRH handling for sent/received MADs
IB/ipoib: Use ib_init_ah_from_path to initialize ah_attr
IB/sa: Set src_path_bits correctly in ib_init_ah_from_path()
IB/ucm: Simplify ib_ucm_event()
RDMA/ucma: Simplify ucma_get_event()
IB/mthca: Simplify CQ cleaning in mthca_free_qp()
IB/mthca: Fix mthca_write_mtt() on HCAs with hidden memory
IB/mthca: Update HCA firmware revisions
IB/ipath: Fix WC format drift between user and kernel space
IB/ipath: Check that a UD work request's address handle is valid
IB/ipath: Remove duplicate stuff from ipath_verbs.h
IB/ipath: Check reserved memory keys
IB/ipath: Fix unit selection when all CPU affinity bits set
IB/ipath: Don't allow QPs 0 and 1 to be opened multiple times
IB/ipath: Disable IB link earlier in shutdown sequence
...
459a98ed881802dee55897441bc7f77af614368e 19-Mar-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF]: Introduce skb_reset_mac_header(skb)

For the common, open coded 'skb->mac.raw = skb->data' operation, so that we can
later turn skb->mac.raw into a offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.

This one touches just the most simple case, next will handle the slightly more
"complex" cases.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_cm.c
poib/ipoib_ib.c
37aebbde7023d75bf09fbadb6796276d0a65a068 25-Apr-2007 Roland Dreier <rolandd@cisco.com> IPoIB/cm: spin_lock_irqsave() -> spin_lock_irq() replacements

There are quite a few places in ipoib_cm.c where we know IRQs are
enabled because we do something that sleeps in the same function, so
we can convert several occurrences of spin_lock_irqsave() to a plain
spin_lock_irq(). This cleans up the source a little and makes the
code smaller too:

add/remove: 0/0 grow/shrink: 1/5 up/down: 3/-51 (-48)
function old new delta
ipoib_cm_tx_reap 403 406 +3
ipoib_cm_stale_task 146 145 -1
ipoib_cm_dev_stop 173 172 -1
ipoib_cm_tx_handler 964 956 -8
ipoib_cm_rx_handler 956 937 -19
ipoib_cm_skb_reap 212 190 -22

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
46f1b3d7aff99ef4c1e729e023b9c8ee51de5973 05-Apr-2007 Sean Hefty <sean.hefty@intel.com> IB/ipoib: Use ib_init_ah_from_path to initialize ah_attr

To support destinations that are not on the local IB subnet, IPoIB
should include the GRH information when constructing an address
handle. Using the existing ib_init_ah_from_path() call will do this
for us.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
poib/ipoib_main.c
a89875fc7e23ec91561bc3742df3bd5d12b376b4 19-Apr-2007 Roland Dreier <rolandd@cisco.com> IPoIB: Remove pointless opcode field from debugging output

There's no point in printing the opcode field in the completion
handling debugging output, since the type of completion is already
printed at the beginning of the line. In fact the opcode field is not
even defined for completions with a status other than success.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
poib/ipoib_ib.c
6371ea3d48e17d4638a91a4a1e0364e56204e418 10-Apr-2007 Michael S. Tsirkin <mst@dev.mellanox.co.il> IPoIB/cm: Fix DMA direction typo

Receive buffers need to be mapped with DMA_FROM_DEVICE. Incorrectly
mapping with DMA_TO_DEVICE causes a hard lock on ppc64 machines with
an IOMMU.

This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=431>

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
1d426d6418d1914b592c9c307c02e488d9182fa8 01-Apr-2007 Erez Zilber <erezz@voltaire.com> IB/iser: Don't defer connection failure notification to workqueue

When a connection is terminated asynchronously from the iSCSI layer's
perspective, iSER needs to notify the iSCSI layer that the connection
has failed. This is done using a workqueue (switched to from the iSER
tasklet context). Meanwhile, the connection object (that holds the
work struct) is released. If the workqueue function wasn't called
yet, it will be called later with a NULL pointer, which will crash the
kernel.

The context switch (tasklet to workqueue) is not required, and
everything can be done from the iSER tasklet. This eliminates the NULL
work struct bug (and simplifies the code).

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.h
ser/iser_verbs.c
a26b5fce06b92b2f02967d4c845fa628024e5414 28-Mar-2007 Linus Torvalds <torvalds@woody.linux-foundation.org> Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband:
IB/iser: Handle aborting a command after it is sent
IB/mthca: Fix thinko in init_mr_table()
RDMA/cxgb3: Fix resource leak in cxio_hal_init_ctrl_qp()
3104a2175dc04b7a597acea90f19b033abcfc7d8 24-Mar-2007 Erez Zilber <erezz@voltaire.com> IB/iser: Handle aborting a command after it is sent

The SCSI midlayer may abort a command that was already sent. If the
initiator is still trying to send the command (or data-out PDUs for
that command), the QP may time out after the midlayer times
out. Therefore, when aborting the command, iSER may still have
references for the command's buffers. When sending these PDUs, the
sends will complete with an error and their resources will be released
then.

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iser_initiator.c
ecbb416939da77c0d107409976499724baddce7b 24-Mar-2007 Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> [NET]: Fix neighbour destructor handling.

->neigh_destructor() is killed (not used), replaced with
->neigh_cleanup(), which is called when neighbor entry goes to dead
state. At this point everything is still valid: neigh->dev,
neigh->parms etc.

The device should guarantee that dead neighbor entries (neigh->dead !=
0) do not get private part initialized, otherwise nobody will cleanup
it.

I think this is enough for ipoib which is the only user of this thing.
Initialization private part of neighbor entries happens in ipib
start_xmit routine, which is not reached when device is down. But it
would be better to add explicit test for neigh->dead in any case.

Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_main.c
77d8e1efea0e313edc710160c232a6fd2dc9f907 21-Mar-2007 Michael S. Tsirkin <mst@dev.mellanox.co.il> IB/ipoib: Fix thinko in packet length checks

The packet length checks in ipoib are broken: we add 4 bytes (IPoIB
encapsulation header) when sending a packet, not 20 bytes (hardware
address length) to each packet. Therefore, if connected mode is
enabled so that the interface MTU is larger than the multicast MTU,
IPoIB may end up trying to send too-long multicast packets. For
example, multicast is broken if a message of size 2048 bytes is sent
on an interface with UD MTU 2048, because 2048 is bigger than the real
limit of 2044 but the code tests against the wrong limit of 2060.

This patch fixes <https://bugs.openfabrics.org/show_bug.cgi?id=418>,
submitted by Scott Weitzenkamp <sweitzen@cisco.com>.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
poib/ipoib_ib.c
d04d01b113be5b88418eb30087753c3de0a39fd8 22-Mar-2007 Michael S. Tsirkin <mst@dev.mellanox.co.il> IPoIB: Fix use-after-free in path_rec_completion()

The connected mode code added the possibility that an neigh struct
gets freed in the list_for_each_entry() loop in path_rec_completion(),
which causes a use-after-free. Fix this by changing to the _safe
variant of the list walking macro.

This was spotted by the Coverity checker (CID 1567).

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
e07832b66285124038a96b25a2056e91a55d8b1e 19-Mar-2007 Sean Hefty <sean.hefty@intel.com> IPoIB: Fix race in detaching from mcast group before attaching

There's a race between ipoib_mcast_leave() and ipoib_mcast_join_finish()
where we can try to detach from a multicast group before we've
attached to it. Fix this by reordering the code in ipoib_mcast_leave
to free the multicast group first, which waits for the multicast
callback thread (which calls ipoib_mcast_join_finish()) to complete
before detaching from the group.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
60a596dab7c82bdfa5ee7abcee8e0ce385d4ef21 22-Mar-2007 Michael S. Tsirkin <mst@dev.mellanox.co.il> IPoIB/cm: Fix reaping of stale connections

The sense of the time_after_eq() test in ipoib_cm_stale_task() is
reversed so that only non-stale connections are reaped. Fix this by
changing to time_before_eq().

Noticed by Pradeep Satyanarayana <pradeep@us.ibm.com>.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
bf32ed33e97ac7905fa5a2bf49a634c2eaf62457 01-Mar-2007 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iscsi: rename DEFAULT_MAX_RECV_DATA_SEGMENT_LENGTH

This patch renames DEFAULT_MAX_RECV_DATA_SEGMENT_LENGTH to avoid
confusion with the drivers default values (DEFAULT_MAX_RECV_DATA_SEGMENT_LENGTH
is the iscsi RFC specific default).

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
ser/iser_initiator.c
55c9adde13dfc6b738e0f70c071a0622e52f35ed 08-Mar-2007 Shirley Ma <xma@us.ibm.com> IPoIB: Turn on interface's carrier after broadcast group is joined

Do netif_carrier_on() right after the IPv4 broadcast multicast group
is joined, rather than waiting for all of the initial set of multicast
group joins to finish. This allows at least IPv4 traffic to limp
along on broken fabrics where not all multicast groups can be joined.

Signed-off-by: Shirley Ma <xma@us.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
a27cbe878203076247c1b5287f5ab59ed143b560 27-Feb-2007 Roland Dreier <rolandd@cisco.com> IPoIB: Only handle async events for one port

An asynchronous event carries the port number that the event occurred
on, so there's no reason for an IPoIB interface to process an event
associated with a different local HCA port.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_verbs.c
843613b04744d5b65c2f37975c5310f366a0d070 26-Feb-2007 Roland Dreier <rolandd@cisco.com> IPoIB: Correct debugging output when path record lookup fails

If path_rec_completion() is passed a non-NULL path record pointer
along with an unsuccessful status value, the tracing code incorrectly
prints the (invalid) DLID from the path record rather than the more
interesting status code. The actual logic of the function correctly
uses the path record only if the status indicates a successful lookup.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
658bcef619f50d9eb6028452ff9e1ad4a96c2af9 22-Feb-2007 Roland Dreier <rolandd@cisco.com> IPoIB: Remove unused local_rate tracking

Now that low-level drivers handle the conversion from an absolute rate
to a relative rate, there's no need for the IPoIB driver to keep track
of the local port's data rate.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_multicast.c
1812063ba3365aeb5eb9117861a7fb05432f7ed0 20-Feb-2007 Michael S. Tsirkin <mst@mellanox.co.il> IPoIB/cm: Improve small message bandwidth

Avoid the overhead of freeing/reallocating and mapping/unmapping for
DMA pages that have not been written to by hardware.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
faec2f7b96b555055d0aa6cc6b83a537270bed52 16-Feb-2007 Sean Hefty <sean.hefty@intel.com> IB/sa: Track multicast join/leave requests

The IB SA tracks multicast join/leave requests on a per port basis and
does not do any reference counting: if two users of the same port join
the same group, and one leaves that group, then the SA will remove the
port from the group even though there is one user who wants to stay a
member left. Therefore, in order to support multiple users of the
same multicast group from the same port, we need to perform reference
counting locally.

To do this, add an multicast submodule to ib_sa to perform reference
counting of multicast join/leave operations. Modify ib_ipoib (the
only in-kernel user of multicast) to use the new interface.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
8a2e65f87c66ab1e720f49378750cdd800f9e9cf 15-Feb-2007 Michael S. Tsirkin <mst@mellanox.co.il> IPoIB: CM error handling thinko fix

ipoib_cm_alloc_rx_skb() might be called from IRQ context, so it must
use dev_kfree_skb_any(), not kfree_skb().

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
551fd6122d247d76124c4fdb6eb898cc8e3d74aa 16-Feb-2007 Roland Dreier <rolandd@cisco.com> IPoIB: Only allow root to change between datagram and connected mode

Change the permissions of the "mode" sysfs attribute to be S_IWUSR
instead of S_IWUGO.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_cm.c
93bbad8fe13a25dcf7f3bc628a71d1a7642ae61b 14-Feb-2007 Linus Torvalds <torvalds@woody.linux-foundation.org> Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband:
IB/mthca: Always fill MTTs from CPU
IB/mthca: Merge MR and FMR space on 64-bit systems
IB/mthca: Fix access to MTT and MPT tables on non-cache-coherent CPUs
IB/mthca: Give reserved MTTs a separate cache line
IB/mthca: Fix reserved MTTs calculation on mem-free HCAs
RDMA/cxgb3: Add driver for Chelsio T3 RNIC
IB: Remove redundant "_wq" from workqueue names
RDMA/cma: Increment port number after close to avoid re-use
IB/ehca: Fix memleak on module unloading
IB/mthca: Work around gcc bug on sparc64
IPoIB: Connected mode experimental support
IB/core: Use ARRAY_SIZE macro for mandatory_table
IB/mthca: Use correct structure size in call to memset()
2b8693c0617e972fc0b2fd1ebf8de97e15b656c3 12-Feb-2007 Arjan van de Ven <arjan@linux.intel.com> [PATCH] mark struct file_operations const 3

Many struct file_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data. In addition it'll catch accidental writes at compile time to
these shared resources.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
poib/ipoib_fs.c
839fcaba355abaffb7b44f0f4504093acb0b11cf 05-Feb-2007 Michael S. Tsirkin <mst@mellanox.co.il> IPoIB: Connected mode experimental support

The following patch adds experimental support for IPoIB connected
mode, as defined by the draft from the IETF ipoib working group. The
idea is to increase performance by increasing the MTU from the maximum
of 2K (theoretically 4K) supported by IPoIB on top of UD. With this
code, I'm able to get 800MByte/sec or more with netperf without
options on a Mellanox 4x back-to-back DDR system.

Some notes on code:
1. SRQ is used for scalability to large cluster sizes
2. Only RC connections are used (UC does not support SRQ now)
3. Retry count is set to 0 since spec draft warns against retries
4. Each connection is used for data transfers in only 1 direction, so
each connection is either active(TX) or passive (RX). 2 sides that
want to communicate create 2 connections.
5. Each active (TX) connection has a separate CQ for send completions -
this keeps the code simple without CQ resize and other tricks
6. To detect stale passive side connections (where the remote side is
down), we keep an LRU list of passive connections (updated once per
second per connection) and destroy a connection after it has been
unused for several seconds. The LRU rule makes it possible to avoid
scanning connections that have recently been active.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/Kconfig
poib/Makefile
poib/ipoib.h
poib/ipoib_cm.c
poib/ipoib_ib.c
poib/ipoib_main.c
poib/ipoib_multicast.c
poib/ipoib_verbs.c
poib/ipoib_vlan.c
b4377356450e2358f5f92d34f130d6cb6574bf76 09-Feb-2007 Al Viro <viro@ftp.linux.org.uk> [PATCH] iscsi endianness annotations

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ser/iser_initiator.c
43cb76d91ee85f579a69d42bc8efc08bac560278 09-Apr-2002 Greg Kroah-Hartman <gregkh@suse.de> Network: convert network devices to use struct device instead of class_device

This lets the network core have the ability to handle suspend/resume
issues, if it wants to.

Thanks to Frederik Deweerdt <frederik.deweerdt@gmail.com> for the arm
driver fixes.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
poib/ipoib_main.c
poib/ipoib_vlan.c
1033ff670d49760604f5d4c73a1b60741863a406 16-Jan-2007 Ishai Rabinovitz <ishai@mellanox.co.il> IB/srp: Don't wait for response when QP is in error state.

When there is a call to send_tsk_mgmt SRP posts a send and waits for 5
seconds to get a response.

When the QP is in the error state it is obvious that there will be no
response so it is quite useless to wait. In fact, the timeout causes
SRP to wait a long time to reconnect when a QP error occurs. (Each
abort and each reset_device calls send_tsk_mgmt, which waits for the
timeout). The following patch solves this problem by identifying the
failure and returning an immediate error code.

Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
rp/ib_srp.h
a20f3a6d7e67a8aee571fb04634a631ba59f6e92 16-Jan-2007 Ishai Rabinovitz <ishai@mellanox.co.il> IB/srp: Check match_strdup() return

Checks if the kmalloc in match_strdup() was successful, and bail out
on looking at the token if it failed.

Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
f0938401f2252bf39615c0815734650eab9053c8 06-Jan-2007 Erez Zilber <erezz@voltaire.com> IB/iser: Return error code when PDUs may not be sent

iSER limits the number of outstanding PDUs to send. When this threshold
is reached, it should return an error code (-ENOBUFS) instead of setting
the suspend_tx bit (which should be used only by libiscsi).

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.c
ser/iser_initiator.c
bf628dc22a09ed2022abb32c76011ae5f99ad6b0 15-Dec-2006 Roland Dreier <rolandd@cisco.com> IB/srp: Fix FMR mapping for 32-bit kernels and addresses above 4G

struct srp_device.fmr_page_mask was unsigned long, which means that
the top part of addresses above 4G was being chopped off on 32-bit
architectures. Of course nothing good happens when data from SRP
targets is DMAed to the wrong place.

Fix this by changing fmr_page_mask to u64, to match the addresses
actually used by IB devices.

Thanks to Brian Cain <Brian.Cain@ge.com> and David McMillen
<davem@systemfabricworks.com> for help diagnosing the bug and testing
the fix.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
rp/ib_srp.h
82b399133b6ebf667ee635fc69ef26b61eede4bc 12-Dec-2006 Roland Dreier <rolandd@cisco.com> IPoIB: Make sure struct ipoib_neigh.queue is always initialized

Move the initialization of ipoib_neigh's skb_queue into
ipoib_neigh_alloc(), since commit 2745b5b7 ("IPoIB: Fix skb leak when
freeing neighbour") will make iterate over the skb_queue to free any
packets left over when freeing the ipoib_neigh structure.

This fixes a crash when freeing ipoib_neigh structures allocated in
ipoib_mcast_send(), which otherwise don't have their skb_queue
initialized.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
5180311fe93842e9e16eb7297cfc4aded752ab33 12-Dec-2006 Ralph Campbell <ralph.campbell@qlogic.com> IB/iser: Use the new verbs DMA mapping functions

Convert iSER to use the new verbs DMA mapping functions for kernel
verbs consumers.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.h
ser/iser_memory.c
85507bcce0cd6ec859943da4e07227c124a18f3f 12-Dec-2006 Ralph Campbell <ralph.campbell@qlogic.com> IB/srp: Use new verbs IB DMA mapping functions

Convert SRP to use the new verbs DMA mapping functions for kernel
verbs consumers.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
rp/ib_srp.h
37ccf9df974f55e99bf21278133b065cbbcf3f79 12-Dec-2006 Ralph Campbell <ralph.campbell@qlogic.com> IPoIB: Use the new verbs DMA mapping functions

Convert IPoIB to use the new DMA mapping functions
for kernel verbs consumers.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_ib.c
dee234f48aa6b5ee6041d33c4fd59d2f1176e9a1 12-Dec-2006 Roland Dreier <rolandd@cisco.com> IB/iser: Remove unused "write-only" variables

Remove variables that are set but then never looked at in the iSER
initiator. These cleanups came from David Binderman's list of "set
but never used" warnings from icc.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iser_initiator.c
ser/iser_memory.c
f0d1b0b30d250a07627ad8b9fbbb5c7cc08422e8 08-Dec-2006 David Howells <dhowells@redhat.com> [PATCH] LOG2: Implement a general integer log2 facility in the kernel

This facility provides three entry points:

ilog2() Log base 2 of unsigned long
ilog2_u32() Log base 2 of u32
ilog2_u64() Log base 2 of u64

These facilities can either be used inside functions on dynamic data:

int do_something(long q)
{
...;
y = ilog2(x)
...;
}

Or can be used to statically initialise global variables with constant values:

unsigned n = ilog2(27);

When performing static initialisation, the compiler will report "error:
initializer element is not constant" if asked to take a log of zero or of
something not reducible to a constant. They treat negative numbers as
unsigned.

When not dealing with a constant, they fall back to using fls() which permits
them to use arch-specific log calculation instructions - such as BSR on
x86/x86_64 or SCAN on FRV - if available.

[akpm@osdl.org: MMC fix]
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David Howells <dhowells@redhat.com>
Cc: Wojtek Kaniewski <wojtekka@toxygen.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
ser/iser_memory.c
9db73724453a9350e1c22dbe732d427e2939a5c9 05-Dec-2006 David Howells <dhowells@redhat.com> Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6

Conflicts:

drivers/ata/libata-scsi.c
include/linux/libata.h

Futher merge of Linus's head and compilation fixups.

Signed-Off-By: David Howells <dhowells@redhat.com>
4c1ac1b49122b805adfa4efc620592f68dccf5db 05-Dec-2006 David Howells <dhowells@redhat.com> Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6

Conflicts:

drivers/infiniband/core/iwcm.c
drivers/net/chelsio/cxgb2.c
drivers/net/wireless/bcm43xx/bcm43xx_main.c
drivers/net/wireless/prism54/islpci_eth.c
drivers/usb/core/hub.h
drivers/usb/input/hid-core.c
net/core/netpoll.c

Fix up merge failures with Linus's head and fix new compilation failures.

Signed-Off-By: David Howells <dhowells@redhat.com>
a1f8e7f7fb9d7e2cbcb53170edca7c0ac4680697 19-Oct-2006 Al Viro <viro@zeniv.linux.org.uk> [PATCH] severing skbuff.h -> highmem.h

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
ser/iser_memory.c
2745b5b713bf3457d8977c62dc2b3aa61f4a14b0 16-Nov-2006 Michael S. Tsirkin <mst@mellanox.co.il> IPoIB: Fix skb leak when freeing neighbour

ipoib_neigh_free() is sometimes called while neighbour is still alive,
so it might still have queued skbs. Fix skb leak in this case.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_main.c
poib/ipoib_multicast.c
d2fcea7d68473b8bb3e0addb4926c87e2217ca83 21-Nov-2006 Vu Pham <vu@mellanox.com> IB/srp: Fix memory leak on reconnect

SRP reallocates the IU buffers for tx_ring and rx_ring without freeing
the old buffers when it reconnects to a target. Fix this by keeping
the old IU buffers around.

Signed-off-by: Vu Pham <vu@mellanox.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
e54f81889cd5228e7087637c377d76301c7c5663 30-Nov-2006 Roland Dreier <rolandd@cisco.com> IB: Convert kmem_cache_t -> struct kmem_cache

Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.h
3c8edf0eca2e47340c960b2f27c659c097118ffe 15-Nov-2006 Arne Redlich <arne.redlich@xiranet.com> IB/srp: Increase supported CDB size

Set the Scsi_Host's max_cmd_len from 12 (default) to 16 for
SRP. Otherwise scsi_dispatch_cmd() won't pass down certain commands
such as READ CAPACITY 16, required for supporting disks > 2TB.

Signed-off-by: Arne Redlich <arne.redlich@xiranet.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
c4028958b6ecad064b1a6303a6a5906d4fe48d73 22-Nov-2006 David Howells <dhowells@redhat.com> WorkStruct: make allyesconfig

Fix up for make allyesconfig.

Signed-Off-By: David Howells <dhowells@redhat.com>
poib/ipoib.h
poib/ipoib_ib.c
poib/ipoib_main.c
poib/ipoib_multicast.c
ser/iser_verbs.c
rp/ib_srp.c
073ae841d6a5098f7c6e17fc1f329350d950d1ce 16-Nov-2006 Michael S. Tsirkin <mst@mellanox.co.il> IPoIB: Clear high octet in QP number

IPoIB assumes that high (reserved) octet in the hardware address is 0,
and copies it into the QPN. This violates RFC 4391 (which requires
that the high 8 bits are ignored on receive), and will result in an
invalid QPN being used when interoperating with IPoIB connected mode.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
2e7a7426282bfa2d7dff6eddc5485af8c79a68f3 22-Oct-2006 Erez Zilber <erezz@voltaire.com> IB/iser: Start connection after enabling iSER

When a connection is started (a new connection or a recovered one),
iSER should prepare its resources for full-featured mode and only then
notify the iSCSI layer that it is ready to start queueing commands.

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.c
73fbe8be73512b8a3ffa3d20c9d7f531af99679c 10-Oct-2006 Roland Dreier <rolandd@cisco.com> IPoIB: Check for DMA mapping error for TX packets

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_ib.c
01cb9bcbd34b7ba768a7f05375faf43becdb8a60 04-Oct-2006 Ishai Rabinovitz <ishai@mellanox.co.il> IB/srp: Enable multiple connections to the same target

Enable multiple concurrent connections to the same SRP target:

1) Use port GUID instead of node GUID in the initiator port
identifier. This allows connections to be made from multiple HCA
ports at the same time.
2) Let the user specify the identifier extention when adding the
device. This allows userspace to make multiple connections even
from the same port, if it wants too.

Without this, only one connection can be made from any given HCA, even
if it has multiple ports, because we don't use multi-channel mode, so
targets will only allow one connection from a given initiator port ID.

Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
rp/ib_srp.h
9b0af401aae336975e620fccdd294bb763424f3f 10-Oct-2006 Ishai Rabinovitz <ishai@mellanox.co.il> IB/srp: Remove redundant memset()

scsi_host_alloc() already allocates with kzalloc(), so the struct Scsi_Host
is zeroed out, including the private data portion. Remove the redundant
memset that zeros this out again in the SRP initiator.

Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
cab00891c5489cb6d0cde0a55d39bd5f2871fa70 03-Oct-2006 Matt LaPlante <kernel1@cyberdogtech.com> Still more typo fixes

Signed-off-by: Adrian Bunk <bunk@stusta.de>
poib/Kconfig
fd6a79a786b84510d00ee6aa6449a468e6d75dee 27-Sep-2006 Erez Zilber <erezz@voltaire.com> IB/iser: Fix the description of iSER in Kconfig

Fix the description of iSER in Kconfig. It is not accurate.

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/Kconfig
74a2078061409e027ccb51a28cf6174c31ab8f99 27-Sep-2006 Erez Zilber <erezz@voltaire.com> IB/iser: DMA unmap unaligned for RDMA data before touching it

iSER uses the DMA mapping api to map the page holding the
SCSI command data to the HCA DMA address space. When the
command data is not aligned for RDMA, the data is copied
to/from an allocated buffer which in turn is used for
executing this command. The pages associated with the
command must be unmapped before being touched.

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.h
ser/iser_initiator.c
ser/iser_memory.c
87e8df7a273c7c1acb864c90b5253609c44375a6 27-Sep-2006 Erez Zilber <erezz@voltaire.com> IB/iser: Have iSER data transaction object point to iSER conn

iSER uses a data transaction object (struct iser_dto) as part
of its IB data descriptors (struct iser_desc) management.
It also uses a hierarchy of connection structures pointing to
each other. A DTO may exist even after the iscsi_iser connection
pointed by it is destroyed (eg one that is bound to a post
receive buffer which was flushed by the IB HW). Hence DTOs need
point to the lowest connection, which is struct iser_conn.

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.c
ser/iscsi_iser.h
ser/iser_initiator.c
ser/iser_verbs.c
8e18e2941c53416aa219708e7dcad21fb4bd6794 27-Sep-2006 Theodore Ts'o <tytso@mit.edu> [PATCH] inode_diet: Replace inode.u.generic_ip with inode.i_private

The following patches reduce the size of the VFS inode structure by 28 bytes
on a UP x86. (It would be more on an x86_64 system). This is a 10% reduction
in the inode size on a UP kernel that is configured in a production mode
(i.e., with no spinlock or other debugging functions enabled; if you want to
save memory taken up by in-core inodes, the first thing you should do is
disable the debugging options; they are responsible for a huge amount of bloat
in the VFS inode structure).

This patch:

The filesystem or device-specific pointer in the inode is inside a union,
which is pretty pointless given that all 30+ users of this field have been
using the void pointer. Get rid of the union and rename it to i_private, with
a comment to explain who is allowed to use the void pointer. This is just a
cleanup, but it allows us to reuse the union 'u' for something something where
the union will actually be used.

[judith@osdl.org: powerpc build fix]
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Judith Lebzelter <judith@osdl.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
poib/ipoib_fs.c
c9802cd9574a80444e689c7525627b40d7dc3a06 23-Sep-2006 James Bottomley <jejb@sparkweed.localdomain> Merge mulgrave-w:git/scsi-misc-2.6

Conflicts:

drivers/scsi/iscsi_tcp.c
drivers/scsi/iscsi_tcp.h

Pretty horrible merge between crypto hash consolidation
and crypto_digest_...->crypto_hash_... conversion

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
a8bfca024326560d86c6323b0504288ca55a75fc 23-Sep-2006 Eli Cohen <eli@dev.mellanox.co.il> IPoIB: Add some likely/unlikely annotations in hot path

Signed-off-by: Eli Cohen <eli@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_ib.c
poib/ipoib_main.c
507c33504686e733a14ef0b2dc9db0c20fae4653 21-Sep-2006 Dotan Barak <dotanb@dev.mellanox.co.il> IPoIB: Remove unused include of vmalloc.h

IPoIB doesn't use anything from <linux/vmalloc.h>, so don't include it.

Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
5ccd025553d73e523212ee0860b7f4a75e886bfa 23-Sep-2006 Eli Cohen <eli@mellanox.co.il> IPoIB: Rejoin all multicast groups after a port event

When ipoib_ib_dev_flush() is called because of a port event, the
driver needs to rejoin all multicast groups, since the flush will call
ipoib_mcast_dev_flush() (via ipoib_ib_dev_down()). Otherwise no
(non-broadcast) multicast groups will be rejoined until the networking
core calls ->set_multicast_list again, and so multicast reception will
be broken for potentially a long time.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_ib.c
d0df6d6d4539241179a1ef5394787825bf05bbce 23-Sep-2006 Roland Dreier <rolandd@cisco.com> IPoIB: Create MCGs with all attributes required by RFC

RFC 4391 ("Transmission of IP over InfiniBand (IPoIB)") says:

If the IB multicast group does not already exist, one must be
created first with the IPoIB link MTU. The MGID MUST use the same
P_Key, Q_Key, SL, MTU, and HopLimit as those used in the
broadcast-GID. The rest of attributes SHOULD follow the values used
in the broadcast-GID as well.

However, the current IPoIB driver is only setting the attributes
required by the InfiniBand spec to create a multicast group, so in
particular the MTU and HopLimit are not being set. Add these
attributes when creating MCGs, and also set the Rate attribute, since
IPoIB pays attention to that attribute as well.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
5755d6dad95808a24a65dd9e61e23c305f9b077c 23-Sep-2006 Roland Dreier <rolandd@cisco.com> IB/iser: INFINIBAND_ISER depends on INET

iSER won't build without CONFIG_INET enabled, so make Kconfig reflect that.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/Kconfig
c1a0b23bf477c2e1068905f4e2b5c3cee139e853 22-Aug-2006 Michael S. Tsirkin <mst@mellanox.co.il> IB/sa: Require SA registration

Require users to register with SA module, to prevent the sa_query
module text from going away while an SA query callback is still
running. Update all in-tree users for the new interface.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_main.c
poib/ipoib_multicast.c
rp/ib_srp.c
2439a6e65ff09729c3b4215f134dc5cd4e8a30c0 23-Sep-2006 Roland Dreier <rolandd@cisco.com> IPoIB: Refactor completion handling

Split up ipoib_ib_handle_wc() into ipoib_ib_handle_rx_wc() and
ipoib_ib_handle_tx_wc() to make the code easier to read. This will
also help implement NAPI in the future.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_ib.c
d81110285f7f6c07a0ce8f99a5ff158a647cd649 10-Sep-2006 Erez Zilber <erezz@voltaire.com> IB/iser: Do not use FMR for a single dma entry sg

Fast Memory Registration (fmr) is used to register for rdma an sg whose
elements are not linearly sequential after dma mapping.

The IB verbs layer provides an "all dma memory MR (memory region)" which
can be used for RDMA-ing a dma linearly sequential buffer.

Change the code to use the dma mr instead of doing fmr when dma mapping
produces a single dma entry sg.

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.h
ser/iser_memory.c
ser/iser_verbs.c
e981f1d4b8288072ba7cf6b7141cd4aefb404383 10-Sep-2006 Erez Zilber <erezz@voltaire.com> IB/iser: fix some debug prints

fix and add some debug prints related to iser
handling of memory for rdma.

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iser_memory.c
8dfa0876d3dde5f9c1818a4c35caaabc3ddba78b 10-Sep-2006 Erez Zilber <erezz@voltaire.com> IB/iser: make FMR "page size" be 4K and not PAGE_SIZE

As iser is able to use at most one rdma operation for the
execution of a scsi command, and registration of the sg
associated with scsi command has its restrictions, the code
checks if an sg is "aligned for rdma".

Alignment for rdma is measured in "fmr page" units whose
possible resolutions are different between HCAs and can be
smaller, equal or bigger to the system page size.

When the system page size is bigger than 4KB (eg the default
with ia64 kernels) there a bigger chance that an sg would be
aligned for rdma if the fmr page size is 4KB.

Change the code to create FMR whose pages are of size 4KB
and to take that into account when processing the sg.

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.h
ser/iser_memory.c
ser/iser_verbs.c
8072ec2f8f6790df91e85d833e672c9c30a7ab3c 10-Sep-2006 Erez Zilber <erezz@voltaire.com> IB/iser: Limit the max size of a scsi command

Currently, the data length of a command coming down from scsi-ml
is limited only by the size of its sg list (sg_tablesize). The
max data length may be different for different page size values.
By setting max_sectors, we limit the data length to
max_sectors*512 bytes.

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.c
777a71dd4d901f055967ddbd038d2a74ffce0eb8 10-Sep-2006 Erez Zilber <erezz@voltaire.com> IB/iser: fix a check of SG alignment for RDMA

dma mapping may include a "compaction" of the sg associated with scsi command.
Hence, the size of the maximal prefix of the SG which is aligned for rdma must be
compared against the length of the dma mapped sg (mem->dma_nents) and not against
the size of it before it was mapped (mem->size).

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iser_memory.c
07ebafbaaa72aa6a35472879008f5a1d1d469a0c 03-Aug-2006 Tom Tucker <tom@opengridcomputing.com> RDMA: iWARP Core Changes.

Modifications to the existing rdma header files, core files, drivers,
and ulp files to support iWARP, including:
- Hook iWARP CM into the build system and use it in rdma_cm.
- Convert enum ib_node_type to enum rdma_node_type, which includes
the possibility of RDMA_NODE_RNIC, and update everything for this.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
rp/ib_srp.c
3cd965646b7cb75ae84dd0daf6258adf20e4f169 23-Sep-2006 Roland Dreier <rolandd@cisco.com> IB: Whitespace fixes

Remove some trailing whitespace that has snuck in despite the best
efforts of whitespace=error-all. Also fix a few other whitespace
bogosities.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
rp/ib_srp.c
ded7f1a16d50527359be02f8b04f9ba56bc923e6 15-Aug-2006 Ishai Rabinovitz <ishai@mellanox.co.il> IB/srp: Add port/device attributes

Add local_ib_device and local_ib_port attributes to srp scsi_host.
These are needed when we want to connect to the same target through
multiple distinct ports.

Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
9217b27b12eb5ab910d14b3376c2b6cd13d87711 03-Aug-2006 Michael S. Tsirkin <mst@mellanox.co.il> IB/ipoib: Fix flush/start xmit race (from code review)

Prevent flush task from freeing the ipoib_neigh pointer, while
ipoib_start_xmit() is accessing the ipoib_neigh through the pointer it
has loaded from the skb's hardware address.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
c11bd42a7676b49d41c780f2ede3709204f8da83 14-Sep-2006 Eli Cohen <eli@mellanox.co.il> IPoIB: Retry failed send-only multicast group joins

When a send-only multicast group join fails, mcast->query must be set
to NULL. Otherwise, IPoIB will never retry the join and the multicast
group will never be reachable.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
add7afc756eddd5d02fd986d19e6300b3e1a5ae8 07-Sep-2006 Ishai Rabinovitz <ishai@mellanox.co.il> IB/srp: Don't schedule reconnect from srp

If there is a problem in the connection, the SCSI mid-layer will
eventually call srp_reset_host(), which will call srp_reconnect(), so
we do not need to schedule a call to srp_reconnect_work() from
srp_completion().

Removing this prevents srp_reset_host() from failing if a reconnect
scheduled from srp_completion() is already in progress, which in turn
was causing crashes as both SCSI midlayer and srp_reconnect() were
cancelling commands.

Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
ffd0436ed2e5a741c8d30062b489b989acf0a526 01-Sep-2006 Mike Christie <michaelc@cs.wisc.edu> [SCSI] libiscsi, iscsi_tcp, iscsi_iser: check that burst lengths are valid.

iSCSI RFC states that the first burst length must be smaller than the
max burst length. We currently assume targets will be good, but that may
not be the case, so this patch adds a check.

This patch also moves the unsol data out offset to the lib so the LLDs
do not have to track it.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
ser/iscsi_iser.c
ser/iscsi_iser.h
00dd7b7d26a3bf3780cfcebfdde2b86126b3a082 06-Aug-2006 James Bottomley <jejb@mulgrave.il.steeleye.com> Merge ../linux-2.6

Conflicts:

arch/ia64/hp/sim/simscsi.c

Stylistic differences in two separate fixes for buffer->request_buffer
problem.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
8ddc7c5326064434048ec1ecfe57659e08345cc1 13-Jul-2006 Or Gerlitz <ogerlitz@voltaire.com> IB/ipoib: Remove broken link from Kconfig and documentation

Remove references to the IPoIB IETF working group as it has been closed.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/Kconfig
559ce8f150d7d031c79c4d79173860f1bdfe3ce4 03-Aug-2006 Ishai Rabinovitz <ishai@mellanox.co.il> IB/srp: Work around data corruption bug on Mellanox targets

Data corruption has been seen with Mellanox SRP targets when FMRs
create a memory region with I/O virtual address != 0. Add a
workaround that disables FMR merging for Mellanox targets (OUI 0002c9).

Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
d916a8f1b43b358685b1672390ead11f2d3b74c6 25-Jul-2006 Ishai Rabinovitz <ishai@mellanox.co.il> IB/srp: Fix crash in srp_reconnect_target

Protect against srp_reset_device() clearing the req_queue while
srp_reconnect_target() is in progress (note that state change at
the top of srp_reconnect_target() is not sufficient for this since
srp_reset_device() ignores the state).

Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
1c83469d36a9dd30dbf1fb9fc5ca3be3a0e64ff4 24-Jul-2006 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iscsi bugfixes: fix oops when iser is flushing io

When we enter recovery and flush the running commands
we cannot freee the connection before flushing the commands.
Some commands may have a reference to the connection
that needs to be released before. iscsi_stop was forcing
the term and suspend too early and was causing a oops
in iser, so this patch removes those callbacks all together
and allows the LLD to handle that detail.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
ser/iscsi_iser.c
8a7f752125a930a83f4d8dfe37fa5a081ab19d31 19-Jul-2006 Michael S. Tsirkin <mst@mellanox.co.il> IB/ipoib: Fix packet loss after hardware address update

The neighbour ha field may get updated without destroying the
neighbour. In this case, the ha field gets out of sync with the
address handle stored in ipoib_neigh->ah, with the result that
the ah field would point to an incorrect path, resulting in all
packets being lost.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_main.c
624d01f899f6bbd75fd06890f231e1f46555d376 24-Jul-2006 Or Gerlitz <ogerlitz@voltaire.com> IB/ipoib: Fix oops with ipoib_debug_mcast set

Need to set mcast->ah before debug code dereferences it.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
adfaa888a292e7f38fb43668d8994f246e371f0f 14-Jul-2006 Michael S. Tsirkin <mst@mellanox.co.il> [PATCH] fmr pool: remove unnecessary pointer dereference

ib_fmr_pool_map_phys gets the virtual address by pointer but never writes
there, and users (e.g. srp) seem to assume this and ignore the value
returned. This patch cleans up the API to get the VA by value, and updates
all users.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Acked-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
ser/iser_verbs.c
rp/ib_srp.c
6583eb3dcc1f03ce969594dae5573dbefce015dc 14-Jul-2006 Vu Pham <vu@mellanox.com> [PATCH] srp: fix fmr error handling

srp_unmap_data assumes req->fmr is NULL if the request is not mapped, so we
must clean it out in case of an error.

Signed-off-by: Vu Pham <vu@mellanox.com>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Acked-by: Roland Dreier <rolandd@cisco.com>
Cc: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
rp/ib_srp.c
c4e00fac42f268ed0a547cdd1d12bb8399864040 03-Jul-2006 James Bottomley <jejb@mulgrave.il.steeleye.com> Merge ../scsi-misc-2.6

Conflicts:

drivers/scsi/nsp32.c
drivers/scsi/pcmcia/nsp_cs.c

Removal of randomness flag conflicts with SA_ -> IRQF_ global
replacement.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
22a3e233ca08a2ddc949ba1ae8f6e16ec7ef1a13 01-Jul-2006 Linus Torvalds <torvalds@g5.osdl.org> Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial

* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial:
Remove obsolete #include <linux/config.h>
remove obsolete swsusp_encrypt
arch/arm26/Kconfig typos
Documentation/IPMI typos
Kconfig: Typos in net/sched/Kconfig
v9fs: do not include linux/version.h
Documentation/DocBook/mtdnand.tmpl: typo fixes
typo fixes: specfic -> specific
typo fixes in Documentation/networking/pktgen.txt
typo fixes: occuring -> occurring
typo fixes: infomation -> information
typo fixes: disadvantadge -> disadvantage
typo fixes: aquire -> acquire
typo fixes: mecanism -> mechanism
typo fixes: bandwith -> bandwidth
fix a typo in the RTC_CLASS help text
smb is no longer maintained

Manually merged trivial conflict in arch/um/kernel/vmlinux.lds.S
cfa7b0d46964300c849243d1a38a138b870bdc13 30-Jun-2006 Andrew Morton <akpm@osdl.org> [PATCH] infiniband: devfs fix

Remove devfs leftovers.

Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
ser/iscsi_iser.c
6ab3d5624e172c553004ecc862bfeac16d9d68b7 30-Jun-2006 Jörn Engel <joern@wohnheim.fh-wedel.de> Remove obsolete #include <linux/config.h>

Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
poib/ipoib.h
358ff019b89aa530ab6c0dd139d8089c932b103f 28-Jun-2006 Mike Christie <michaelc@cs.wisc.edu> [SCSI] iscsi: convert iser to new set/get param fns

Convert iser to libiscsi get/set param functions.
Fix bugs in it returning old error return values and
have it expose exp_statsn.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
ser/iscsi_iser.c
179e09172ab663b8587ecc46bb18a56a770304a9 26-Jun-2006 Akinobu Mita <mita@miraclelinux.com> [PATCH] drivers: use list_move()

This patch converts the combination of list_del(A) and list_add(A, B) to
list_move(A, B) under drivers/.

Acked-by: Corey Minyard <minyard@mvista.com>
Cc: Ben Collins <bcollins@debian.org>
Acked-by: Roland Dreier <rolandd@cisco.com>
Cc: Alasdair Kergon <dm-devel@redhat.com>
Cc: Gerd Knorr <kraxel@bytesex.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frank Pavlic <fpavlic@de.ibm.com>
Acked-by: Matthew Wilcox <matthew@wil.cx>
Cc: Andrew Vasquez <linux-driver@qlogic.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
poib/ipoib_multicast.c
3f1244a2f8d3892f991b662cea49b2a0b4e0c115 11-May-2006 Or Gerlitz <ogerlitz@voltaire.com> IB/iser: iSER Kconfig and Makefile

Kconfig and Makefile for iSER.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/Kconfig
ser/Makefile
6461f64ab51e6929680df064b2682004a1548290 11-May-2006 Or Gerlitz <ogerlitz@voltaire.com> IB/iser: iSER handling of memory for RDMA

This file contains the processing carried over an SG list associated with
a SCSI command such that it can be registered with the IB verbs. The
registration produces a network virtual address (VA) and a remote access
key (RKEY or STAG in iSER spec notation) which are used by the target for
its RDMA operation.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iser_memory.c
1cfa0a75dbef1d5bf687aacafabb023288f6b36a 11-May-2006 Or Gerlitz <ogerlitz@voltaire.com> IB/iser: iSER RDMA CM (CMA) and IB verbs interaction

This file contains the low level interaction with the RDMA CM
and the IB verbs, where iSER is consumer of both.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iser_verbs.c
e85b24b5e7de9f507c6253183d089370f37618c5 11-May-2006 Or Gerlitz <ogerlitz@voltaire.com> IB/iser: iSER initiator iSCSI PDU and TX/RX

This file contains the iSER initiator processing of iSCSI PDUs - controls,
commands and data-outs along with processing of TX and RX completions.
It interacts with the lower level iser code doing the memory registration
and and the cma and verbs calls.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iser_initiator.c
65e7ae7bfc71219f13162b3bbad44e6471cd67f9 11-May-2006 Or Gerlitz <ogerlitz@voltaire.com> IB/iser: iSCSI iSER transport provider high level code

This file contains the code that registeres with the iscsi transport manager
and with the SCSI Mid Layer, where much of the provided functions to iSCSI and
SCSI are implemented in libiscsi.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.c
49cd5382f629bde2aee9f817cefb271106dc47ee 11-May-2006 Or Gerlitz <ogerlitz@voltaire.com> IB/iser: iSCSI iSER transport provider header file

iSER (iSCSI Extensions for RDMA) transport provider driver for the iSCSI
initiator, whose other parts (under drivers/scsi) are scsi_transport_iscsi
- the transport management module, iscsi_tcp - the TCP transport provider
module and libiscsi - a kernel library (module) implementing functionality
needed by both TCP and iSER transports. iSER is both a provider of the iSCSI
transport api and a SCSI low level driver.

This file contains internal data structures and non static service functions.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ser/iscsi_iser.h
4c84a39c8adba6bf2f829b217e78bfd61478191a 20-Jun-2006 Linus Torvalds <torvalds@g5.osdl.org> Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (46 commits)
IB/uverbs: Don't serialize with ib_uverbs_idr_mutex
IB/mthca: Make all device methods truly reentrant
IB/mthca: Fix memory leak on modify_qp error paths
IB/uverbs: Factor out common idr code
IB/uverbs: Don't decrement usecnt on error paths
IB/uverbs: Release lock on error path
IB/cm: Use address handle helpers
IB/sa: Add ib_init_ah_from_path()
IB: Add ib_init_ah_from_wc()
IB/ucm: Get rid of duplicate P_Key parameter
IB/srp: Factor out common request reset code
IB/srp: Support SRP rev. 10 targets
[SCSI] srp.h: Add I/O Class values
IB/fmr: Use device's max_map_map_per_fmr attribute in FMR pool.
IB/mthca: Fill in max_map_per_fmr device attribute
IB/ipath: Add client reregister event generation
IB/mthca: Add client reregister event generation
IB: Move struct port_info from ipath to <rdma/ib_smi.h>
IPoIB: Handle client reregister events
IB: Add client reregister event type
...
932ff279a43ab7257942cddff2595acd541cc49b 09-Jun-2006 Herbert Xu <herbert@gondor.apana.org.au> [NET]: Add netif_tx_lock

Various drivers use xmit_lock internally to synchronise with their
transmission routines. They do so without setting xmit_lock_owner.
This is fine as long as netpoll is not in use.

With netpoll it is possible for deadlocks to occur if xmit_lock_owner
isn't set. This is because if a printk occurs while xmit_lock is held
and xmit_lock_owner is not set can cause netpoll to attempt to take
xmit_lock recursively.

While it is possible to resolve this by getting netpoll to use
trylock, it is suboptimal because netpoll's sole objective is to
maximise the chance of getting the printk out on the wire. So
delaying or dropping the message is to be avoided as much as possible.

So the only alternative is to always set xmit_lock_owner. The
following patch does this by introducing the netif_tx_lock family of
functions that take care of setting/unsetting xmit_lock_owner.

I renamed xmit_lock to _xmit_lock to indicate that it should not be
used directly. I didn't provide irq versions of the netif_tx_lock
functions since xmit_lock is meant to be a BH-disabling lock.

This is pretty much a straight text substitution except for a small
bug fix in winbond. It currently uses
netif_stop_queue/spin_unlock_wait to stop transmission. This is
unsafe as an IRQ can potentially wake up the queue. So it is safer to
use netif_tx_disable.

The hamradio bits used spin_lock_irq but it is unnecessary as
xmit_lock must never be taken in an IRQ handler.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_multicast.c
526b4caa0a48382115fa9d8f7d8caf68dbcaa2bf 18-Jun-2006 Ishai Rabinovitz <ishai@mellanox.co.il> IB/srp: Factor out common request reset code

Misc cleanups in ib_srp:
1) I think that it is more efficient to move the req entries from req_list
to free_list in srp_reconnect_target (rather than rebuild the free_list).
(In any case this code is shorter).
2) This allows us to reuse code in srp_reset_device and srp_reconnect_target
and call a new function srp_reset_req.

Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
0c0450db31481aa01a04e7faecc93ee6841972d6 18-Jun-2006 Ramachandra K <rkuchimanchi@silverstorm.com> IB/srp: Support SRP rev. 10 targets

There has been a change in the format of port identifiers between
revision 10 of the SRP specification and the current revision 16A.

Revision 10 specifies port identifier format as

lower 8 bytes : GUID upper 8 bytes : Extension

Whereas revision 16A specifies it as

lower 8 bytes : Extension upper 8 bytes : GUID

There are older targets (e.g. SilverStorm Virtual Fibre Channel
Bridge) which conform to revision 10 of the SRP specification.

The I/O class of revision 10 is 0xFF00 and the I/O class of revision
16A is 0x0100.

For supporting older targets, this patch:

1) Adds a new optional target creation parameter "io_class". Default
value of io_class is 0x0100 (i.e. revision 16A)
2) Uses the correct port identifier format for targets with IO class
of 0xFF00 (i.e. conforming to revision 10)

Signed-off-by: Ramachandra K <rkuchimanchi@silverstorm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
rp/ib_srp.h
508e434123b136c96d1bf989e8d4ab0c8d7498b1 18-Jun-2006 Leonid Arsh <leonida@voltaire.com> IPoIB: Handle client reregister events

Handle client reregister events by treating them just like LID or
SM changes -- flush all cached paths and rejoin multicast groups.

Signed-off-by: Leonid Arsh <leonida@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_verbs.c
37c22a77212c13201497378cc8becc5c95d0f3f5 29-May-2006 Jack Morgenstein <jackm@mellanox.co.il> IPoIB: Fix kernel unaligned access on ia64

Fix misaligned access faults on ia64: never cast a misaligned
neighbour->ha + 4 pointer to union ib_gid type; pass a void * pointer
instead. The memcpy was being optimized to use full word accesses
because the compiler thought that union ib_gid is always aligned.

The cast in IPOIB_GID_ARG is safe, since it is fixed to access each
byte separately.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_main.c
poib/ipoib_multicast.c
31c02e215700c2b704d9441f629ae87bb9aeb561 18-Jun-2006 Roland Dreier <rolandd@cisco.com> IPoIB: Avoid using stale last_send counter when reaping AHs

The comparisons of priv->tx_tail to ah->last_send in ipoib_free_ah()
and ipoib_post_receive() are slightly unsafe, because priv->tx_lock is
not held and hence a stale value of ah->last_send might be used, which
would lead to freeing an AH before the driver was really done with it.
The simple way to fix this is to the optimization of early free from
ipoib_free_ah() and unconditionally queue AHs for reaping, and then
take priv->tx_lock in __ipoib_reap_ah().

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_ib.c
6bfa24fa3e189269e113197a80e12862c211b3d3 18-Jun-2006 Roland Dreier <rolandd@cisco.com> IB/srp: Get rid of "Target has req_lim 0" messages

It's perfectly valid for a connection to an SRP target to have a
request limit of 0, so get rid of the message about it, which can spam
kernel logs even with printk_ratelimit(). Keep a count of such events
in a "zero_req_lim" SCSI host attribute instead, so someone who cares
can look at the statistics.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
rp/ib_srp.h
b7ac4ab497e44cba75fb0e9e5afca06776518934 18-Jun-2006 Ishai Rabinovitz <ishai@mellanox.co.il> IB/srp: Handle DREQ events from CM

Handle IB_CM_DREQ_ERROR and IB_CM_DREQ_RECEIVED events from the CM,
instead of just printing "Unhandled CM event". In the case of
DREQ_ERROR, just ignore the event -- a TIMEWAIT_EXIT will be generated
also. For DREQ_RECEIVED, send a DREP in response to shut the
connection down cleanly.

Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
74b0a15b5e18e44206c98419745a472c3d28e561 18-Jun-2006 Vu Pham <vu@mellanox.com> IB/srp: Allow sg_tablesize to be adjusted

Make the sg_tablesize used by SRP adjustable at module load time via a
module parameter. Calculate the corresponding IU length required to
support this.

Signed-off-by: Vu Pham <vu@mellanox.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
rp/ib_srp.h
52fb2b50c4baa1430064c1e6c1c7df473d469df1 18-Jun-2006 Vu Pham <vu@mellanox.com> IB/srp: Allow cmd_per_lun to be set per target port

Allow userspace to throttle traffic on a given connection to a target
port by adding "max_cmd_per_lun=xyz" to lower the cmd_per_lun value
set for that scsi_host.

Signed-off-by: Vu Pham <vu@mellanox.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
0c5b395239cdea4db3d9c23a5738fdaf3b9ada4c 18-Jun-2006 Ishai Rabinovitz <ishai@mellanox.co.il> IB/srp: Clean up loop in srp_remove_one()

Interrupts will always be enabled in srp_remove_one(), so
spin_lock_irq() can be used instead of spin_lock_irqsave().
Also, the loop takes target->scsi_host->host_lock, so target->state
can just be set to SRP_TARGET_REMOVED witout testing the old value.

Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
b3589fd49067bab9fe0c60430860e6befbd5ba37 18-Jun-2006 Matthew Wilcox <matthew@wil.cx> IB/srp: Change target_mutex to a spinlock

The SRP driver never sleeps while holding target_mutex, and it's just
used to protect some simple list operations, so hold times will be
short. So just convert it to a spinlock, which is smaller and faster.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
rp/ib_srp.h
549c5fc2c8149498118f2494a1b6a4938ca05985 18-Jun-2006 Matthew Wilcox <matthew@wil.cx> IB/srp: Get rid of unneeded use of list_for_each_entry_safe()

list_for_each_entry_safe() is used in one place where the list isn't
modified. So just change it to list_for_each_entry().

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
1962a4a1e4b3716aa836ebeb5b80c804a7f7c5ba 18-Jun-2006 Matthew Wilcox <matthew@wil.cx> IB/srp: Use SCAN_WILD_CARD from SCSI headers

SCAN_WILD_CARD is indeed available from <scsi/scsi.h>, which is
already included. So get rid of private hack.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
f5358a172f79e3f995919224401b25637f4324f6 18-Jun-2006 Roland Dreier <rolandd@cisco.com> IB/srp: Use FMRs to map gather/scatter lists

Create an SRP FMR pool on HCAs that support FMRs, and use FMRs to map
gather/scatter lists that have more than one entry into a single
memory region that appears virtually contiguous to the SRP target
(which is the RDMA initiator).

This patch bails out on FMR mapping for SCSI commands where the
gather/scatter list cannot be mapped into a single FMR because there
are sub-page-sized entries in middle of the list. An unaligned
start or end of the list is OK.

Based on a patch by Vu Pham <vuhuong@mellanox.com>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
rp/ib_srp.h
959eb39297e8c82f61fbfc283ad4ff11c883bf1e 05-Jun-2006 Eli Cohen <eli@mellanox.co.il> IPoIB: Fix AH leak at interface down

When ipoib_stop() is called it first calls netif_stop_queue() to stop
the kernel from passing more packets to the network driver. However,
the completion handler may call netif_wake_queue() re-enabling packet
transfer.

This might result in leaks (we see AH leaks which we think can be
attributed to this bug) as new packets get posted while the interface
is going down.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Michael Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_ib.c
093beac189e4295d968f0d38787b46f76cb0eaaa 17-May-2006 Ishai Rabinovitz <ishai@mellanox.co.il> IB/srp: Complete correct SCSI commands on device reset

When flushing out queued commands after a successful device reset,
make sure that SRP completes the right commands, instead of calling
scsi_done on the command passed into the device reset handler over and
over.

Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
ec2d7208494fe599a5ff13b40a0a20c9881f2737 17-May-2006 Roland Dreier <rolandd@cisco.com> IB/srp: Get rid of extra scsi_host_put()s if reconnection fails

If a reconnection attempt fails, then SRP does two scsi_host_put()s.
This is a historical relic from an earlier version of the driver that
took a reference on the scsi_host before trying to reconnect, so get
rid of the extra scsi_host_put().

Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
e65810566f3e613d9baa5512b8724ebde42ace0f 17-May-2006 Roland Dreier <rolandd@cisco.com> IB/srp: Don't wait for disconnection if sending DREQ fails

Sending a DREQ may fail, for example because the remote target has
already broken the connection. If so, then SRP should not wait for
the disconnection to complete, because it never will.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
5941d079f2c3bdf0dffed1afb8941678fcd0bcb7 10-May-2006 Roland Dreier <rolandd@cisco.com> IPoIB: Free child interfaces properly

When deleting a child interface with a non-default P_Key via
/sys/class/net/ibX/delete_child, the interface must be freed with
free_netdev() (rather than kfree() on the private data).

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_vlan.c
d945e1df28ca07642b3e1a9b9d07074ba5f76be0 09-May-2006 Roland Dreier <rolandd@cisco.com> IB/srp: Fix tracking of pending requests during error handling

If a SCSI abort completes, or the command completes successfully, then
the driver must remove the command from its queue of pending
commands. Similarly, if a device reset succeeds, then all commands
queued for the given device must be removed from the queue.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
rp/ib_srp.h
f80887d0b9e1af481dc4a30fc145dfed24ddfd59 19-Apr-2006 Roland Dreier <rolandd@cisco.com> IB/srp: Remove request from list when SCSI abort succeeds

If a SCSI abort succeeds, then the aborted request should to be
removed from the list of pending requests. This fixes list corruption
after an abort occurs.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
f697f74a6b189702474b2fd457e1f9365fa213e3 10-Apr-2006 Roland Dreier <rolandd@cisco.com> IPoIB: Use spin_lock_irq() instead of spin_lock_irqsave()

We know ipoib_flush_paths() is called from plain process context with
interrupts enabled, since it does wait_for_completion(). So there's
no need to use spin_lock_irqsave() -- spin_lock_irq() is fine.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
a30bb96c6f5aca6513e4dbd94962da03d14b20a9 05-Apr-2006 Eli Cohen <eli@mellanox.co.il> IPoIB: Close race in ipoib_flush_paths()

ib_sa_cancel_query() must be called with priv->lock held since
a completion might arrive and set path->query to NULL.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
0f4852513fb07405ce88da40d8c497060561246e 10-Apr-2006 Shirley Ma <xma@us.ibm.com> IPoIB: Make send and receive queue sizes tunable

Make IPoIB's send and receive queue sizes tunable via module
parameters ("send_queue_size" and "recv_queue_size"). This allows the
queue sizes to be enlarged to fix disastrously bad performance on some
platforms and workloads, without bloating memory usage when large
queues aren't needed.

Signed-off-by: Shirley Ma <xma@us.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_ib.c
poib/ipoib_main.c
poib/ipoib_verbs.c
f2de3b06126ddb07d0e4617225d74dce0855add3 05-Apr-2006 Eli Cohen <eli@mellanox.co.il> IPoIB: Wait for join to finish before freeing mcast struct

ipoib_mcast_restart_task() might free an mcast object while a join
request is still outstanding, leading to an oops when the query
completes. Fix this by waiting for query to complete, similar to what
ipoib_stop_thread() is doing. The wait for mcast completion code is
consolidated in wait_for_mcast_join().

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
bf6a9e31cfa768ce0a8e18474b3ca808641d9243 10-Apr-2006 Jack Morgenstein <jackm@mellanox.co.il> IB: simplify static rate encoding

Push translation of static rate to HCA format into low-level drivers,
where it belongs. For static rate encoding, use encoding of rate
field from IB standard PathRecord, with addition of value 0, for
backwards compatibility with current usage. The changes are:

- Add enum ib_rate to midlayer includes.
- Get rid of static rate translation in IPoIB; just use static rate
directly from Path and MulticastGroup records.
- Update mthca driver to translate absolute static rate into the
format used by hardware. This also fixes mthca's static rate
handling for HCAs that are capable of 4X DDR.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_fs.c
poib/ipoib_main.c
poib/ipoib_multicast.c
d2e0655ede1d91c3a586455d03a4a2d57e659830 04-Apr-2006 Michael S. Tsirkin <mst@mellanox.co.il> IPoIB: Consolidate private neighbour data handling

Consolidate IPoIB's private neighbour data handling into
ipoib_neigh_alloc() and ipoib_neigh_free(). This will make it easier
to keep track of the neighbour structures that IPoIB is handling, and
is a nice cleanup of the code:

add/remove: 2/1 grow/shrink: 1/8 up/down: 100/-178 (-78)
function old new delta
ipoib_neigh_alloc - 61 +61
ipoib_neigh_free - 36 +36
ipoib_mcast_join_finish 1288 1291 +3
path_rec_completion 575 573 -2
ipoib_mcast_join_task 664 660 -4
ipoib_neigh_destructor 101 92 -9
ipoib_neigh_setup_dev 14 3 -11
ipoib_neigh_setup 17 - -17
path_free 238 215 -23
ipoib_mcast_free 329 306 -23
ipoib_mcast_send 718 684 -34
neigh_add_path 705 650 -55

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_main.c
poib/ipoib_multicast.c
ce1823f0323be9f38bbe0df229a5bba025404923 03-Apr-2006 Roland Dreier <rolandd@cisco.com> IB/srp: Fix memory leak in options parsing

Fix memory leak if parsing destination GID fails.

Coverity bug 1042

Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
f5545d24b8aa9fccd8071203e83bc9f4b26e17a6 02-Apr-2006 Roland Dreier <rolandd@cisco.com> IPoIB: Always build debugging code unless CONFIG_EMBEDDED=y

Don't allow CONFIG_INFINIBAND_IPOIB_DEBUG to be disabled unless
CONFIG_EMBEDDED is selected. We want users (and especially distros)
to have this turned on unless they really need to save space, because
by the time we want debugging output, it's usually too late to rebuild
a kernel. The debugging output can be controlled at runtime via the
debug_level module parameter in sysfs.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/Kconfig
ef12d4561900d4af37f46a8f2d97dec3c4d59bf9 29-Mar-2006 Roland Dreier <rolandd@cisco.com> IPoIB: Fix oops with raw sockets

ipoib_hard_header() needs to handle the case that daddr is NULL. This
can happen when packets are injected via a raw socket, and IPoIB
shouldn't oops in this case.

Reported by Anton Blanchard <anton@samba.org>

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
3f89f834497c0f37f16a3b6c32b1d60782facbca 29-Mar-2006 Roland Dreier <rolandd@cisco.com> IB/srp: Fix unmapping of fake scatterlist

The recently merged patch to create a fake scatterlist for non-SG SCSI
commands had a bug: the driver ended up doing dma_unmap_sg() on a
scatterlist scmnd->request_buffer rather than the fake scatter list it
created. Fix this so that the driver unmaps the same thing it maps.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
7a343d4c46bc59fe617f968e996ce2fd67c5d179 23-Mar-2006 Leonid Arsh <leonida@voltaire.com> IPoIB: P_Key change event handling

This patch causes the network interface to respond to P_Key change
events correctly. As a result, you'll see a child interface in the
"RUNNING" state (netif_carrier_on()) only when the corresponding P_Key
is configured by the SM. When SM removes a P_Key, the "RUNNING" state
will be disabled for the corresponding network interface. To
implement this, I added IB_EVENT_PKEY_CHANGE event handling. To
prevent flushing the device before the device is open by the "delay
open" mechanism, I added an additional device flag called
IPOIB_FLAG_INITIALIZED.

This also prevents the child network interface from trying to join to
multicast groups until the PKEY is configured. We used to get error
messages like:

ib0.f2f2: couldn't attach QP to multicast group ff12:401b:f2f2:0:0:0:ffff:ffff

in this case. To fix this, I just check IPOIB_FLAG_OPER_UP flag in
ipoib_set_mcast_list().

Signed-off-by: Leonid Arsh <leonida@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_ib.c
poib/ipoib_main.c
poib/ipoib_verbs.c
4e37b956161c3a3b160972c11c55f07b38b9830c 22-Mar-2006 Leonid Arsh <leonida@voltaire.com> IPoIB: Fix network interface "RUNNING" status

With the current IPoIB driver, the status of network interfaces stays
"RUNNING" even if the link goes down (for example because a cable is
unplugged). Fix this by flushing the IPoIB interface when the link
goes down.

Signed-off-by: Leonid Arsh <leonida@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_verbs.c
cf368713a3f3b2eb737a92d1b7186dedcc51167c 25-Mar-2006 Roland Dreier <rolandd@cisco.com> IB/srp: Use a fake scatterlist for non-SG SCSI commands

Since the SCSI midlayer is moving towards entirely getting rid of
commands with use_sg == 0, we should treat this case as an exception.
Therefore, change the IB SRP initiator to create a fake scatterlist
for these commands with sg_init_one(). This simplifies the flow of
DMA mapping and unmapping, since SRP can just use dma_map_sg() and
dma_unmap_sg() unconditionally, rather than having to choose between
the dma_{map,unmap}_sg() and dma_{map,unmap}_single() variants.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
rp/ib_srp.h
6f633c8d69415aabbccfcc494008e8e1300a98c1 25-Mar-2006 Leonid Arsh <leonida@voltaire.com> IPoIB: Pass correct pointer when flushing child interfaces

ipoib_ib_dev_flush() should get passed cpriv->dev, not &cpriv->dev.

Signed-off-by: Leonid Arsh <leonida@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_ib.c
3d1f337b3e7378923c89f37afb573a918ef40be5 21-Mar-2006 Linus Torvalds <torvalds@g5.osdl.org> Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (235 commits)
[NETFILTER]: Add H.323 conntrack/NAT helper
[TG3]: Don't mark tg3_test_registers() as returning const.
[IPV6]: Cleanups for net/ipv6/addrconf.c (kzalloc, early exit) v2
[IPV6]: Nearly complete kzalloc cleanup for net/ipv6
[IPV6]: Cleanup of net/ipv6/reassambly.c
[BRIDGE]: Remove duplicate const from is_link_local() argument type.
[DECNET]: net/decnet/dn_route.c: fix inconsequent NULL checking
[TG3]: make drivers/net/tg3.c:tg3_request_irq() static
[BRIDGE]: use LLC to send STP
[LLC]: llc_mac_hdr_init const arguments
[BRIDGE]: allow show/store of group multicast address
[BRIDGE]: use llc for receiving STP packets
[BRIDGE]: stp timer to jiffies cleanup
[BRIDGE]: forwarding remove unneeded preempt and bh diasables
[BRIDGE]: netfilter inline cleanup
[BRIDGE]: netfilter VLAN macro cleanup
[BRIDGE]: netfilter dont use __constant_htons
[BRIDGE]: netfilter whitespace
[BRIDGE]: optimize frame pass up
[BRIDGE]: use kzalloc
...
e35fc385655ac584902edd98dd07ac488e986aa1 21-Mar-2006 Arnaldo Carvalho de Melo <acme@mandriva.com> [INFINIBAND] ipoib: Remove leftover use of neigh_ops->destructor

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_multicast.c
c5ecd62c25400a3c6856e009f84257d5bd03f03b 21-Mar-2006 Michael S. Tsirkin <mst@mellanox.co.il> [NET]: Move destructor from neigh->ops to neigh_params

struct neigh_ops currently has a destructor field, which no in-kernel
drivers outside of infiniband use. The infiniband/ulp/ipoib in-tree
driver stashes some info in the neighbour structure (the results of
the second-stage lookup from ARP results to real link-level path), and
it uses neigh->ops->destructor to get a callback so it can clean up
this extra info when a neighbour is freed. We've run into problems
with this: since the destructor is in an ops field that is shared
between neighbours that may belong to different net devices, there's
no way to set/clear it safely.

The following patch moves this field to neigh_parms where it can be
safely set, together with its twin neigh_setup. Two additional
patches in the patch series update ipoib to use this new interface.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_main.c
bfef73fa78ca1e56175dcbd33aa11de4764f85a5 20-Mar-2006 Roland Dreier <rolandd@cisco.com> IPoIB: Get rid of useless test of queue length

In neigh_add_path(), the queue of delayed packets can never be full,
because the queue is always freshly created and cannot be found by any
other code path. In fact, the test of the queue length is worse than
useless: if somehow the test ever triggered and path_rec_start() also
failed, then dev_kfree_skb_any() will be called twice on the same skb.
Fix this by deleting the useless test. Pointed out by Michael
S. Tsirkin <mst@mellanox.co.il>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
bf17c1c7cc9250d7c3c01b0ae898aefa1853535a 20-Mar-2006 Roland Dreier <rolandd@cisco.com> IB/srp: Coverity fix to srp_parse_options()

Fix leak found by Coverity: in the SRP_OPT_DGID case,
srp_parse_options() didn't free the result of match_strdup().

Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
0b3ea0829cbcdaee6e018a83a2949ef458213f3b 20-Mar-2006 Jack Morgenstein <jackm@mellanox.co.il> IPoIB: Move ipoib_ib_dev_flush() to ipoib workqueue

Move ipoib_ib_dev_flush() to ipoib's workqueue. This keeps it ordered
with respect to other work scheduled by the ipoib driver. This fixes
problems with races, for example:
- ipoib_ib_dev_flush() has started running because of an IB event
- user does ifconfig ib0 down
- ipoib_mcast_stop_thread() gets called twice and waits for the same
completion twice

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_ib.c
poib/ipoib_main.c
poib/ipoib_verbs.c
8b9ab02b690e988f19c9d740ef642d7d833d23d5 07-Mar-2006 Roland Dreier <rdreier@cisco.com> IPoIB: Fix build now that neighbour destructor is in neigh_params

Fix the IPoIB build (which is broken in net-2.6.17 because of my
screw-up, which left out this chunk in ipoib_multicast.c).
The neighbour destructor is now in neigh_params, so we don't
need to clear it in the ops structure.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
6ecb0c849625e830ab96495d473bb704812c30e1 20-Mar-2006 Roland Dreier <rolandd@cisco.com> IB/srp: Add SCSI host attributes to show target port

Add SCSI host attributes in sysfs that show the ID extension, IOC
GUID, service ID, P_Key and destination GID for each target port that
the SRP initiator connects to.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
9acf6a8570dcfc9f55724b8b71099fc8768e8c26 02-Mar-2006 Michael S. Tsirkin <mst@mellanox.co.il> IPoIB: Fix multicast race between canceling and completing

ipoib_mcast_stop_thread currently tests mcast->query and if it is
NULL, does not perform wait_for_completion on the mcast and frees the
mcast object directly.

However, since both operations are done without locking, it is
possible that ipoib_mcast_join_complete is in progress on this mcast
object and has set mcast->query to NULL already.

Solve this by:
- taking priv->lock before we change mcast->query in ipoib_mcast_join_complete,
and keeping it until we no longer need the mcast object
- taking priv->lock around mcast->query test in ipoib_mcast_stop_thread

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
54d07e2a1ead2f093ce054cda2e0f5ec163c650c 02-Mar-2006 Eli Cohen <eli@mellanox.co.il> IPoIB: Clean up if posting receives fails

If posting receives in ipoib_ib_dev_open() fails, call
ipoib_ib_dev_stop() to move the device's QP back to the RESET state so
that we can try again later.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_ib.c
7343b231f22cec11f069bcdbb0c9a417df2750d3 28-Feb-2006 Eli Cohen <eli@mellanox.co.il> IPoIB: Close race in setting mcast->ah

ipoib_mcast_send() tests mcast->ah twice. If this value is changed
between these two points, we leak an skb. However,
ipoib_mcast_join_finish() sets mcast->ah with no locking, so it could
race against ipoib_mcast_send().

As a solution, take priv->lock around assignment to mcast->ah thus
making sure ipoib_mcast_send() (which also takes priv->lock) is not in
flight.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
44af79f9524c29d6850591cc972f2667a27234d4 21-Feb-2006 Michael S. Tsirkin <mst@mellanox.co.il> IPoIB: clarify to_ipoib_neigh()

Cosmetic change: make alignment explicit in to_ipoib_neigh.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
1285b3a0b0aa2391ac6f6939e6737203c8220f68 04-Mar-2006 Roland Dreier <rolandd@cisco.com> IB/srp: Don't send task management commands after target removal

Just fail abort and reset requests that come in after we've already
decided to remove a target.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
20b83382d1c5d4d1a73fc5671261db5239d1dbb3 11-Feb-2006 Roland Dreier <rolandd@cisco.com> IPoIB: Yet another fix for send-only joins

Even after the last fix, it's still possible for a send-only join to
start before the join for the broadcast group has finished. This
could cause us to create a multicast group using attributes from the
broadcast group that haven't been initialized yet, so we would use
garbage for the Q_Key, etc. Fix this by waiting until the broadcast
group's attached flag is set before starting send-only joins.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
7bcb974ef6a0ae903888272c92c66ea779388c01 08-Feb-2006 Michael S. Tsirkin <mst@mellanox.co.il> IPoIB: Fix another send-only join race

Further, there's an additional issue that I saw in testing:
ipoib_mcast_send may get called when priv->broadcast is NULL (e.g. if
the device was downed and then upped internally because of a port
event).

If this happends and the send-only join request gets completed before
priv->broadcast is set, we get an oops.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
479a079663bd4c5f3d2714643b1b8c406aaba3e0 08-Feb-2006 Michael S. Tsirkin <mst@mellanox.co.il> IPoIB: Don't start send-only joins while multicast thread is stopped

Fix the following race scenario:
- Device is up.
- Port event or set mcast list triggers ipoib_mcast_stop_thread,
this cancels the query and waits on mcast "done" completion.
- Completion is called and "done" is set.
- Meanwhile, ipoib_mcast_send arrives and starts a new query,
re-initializing "done".

Fix this by adding a "multicast started" bit and checking it before
starting a send-only join.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_multicast.c
8e9e5f4f5eb1d44ddabfd1ddea4ca4e4244a9ffb 31-Jan-2006 Ingo Molnar <mingo@elte.hu> IB/srp: Semaphore to mutex conversion

Convert srp_host->target_mutex from a semaphore to a mutex.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
rp/ib_srp.h
b36f170b617a7cd147b694dabf504e856a50ee9d 17-Jan-2006 Michael S. Tsirkin <mst@mellanox.co.il> IPoIB: Lock accesses to multicast packet queues

Avoid corrupting mcast->pkt_queue by serializing access with
priv->tx_lock. Also, update dropped packet statistics to count
multicast packets removed from pkt_queue as dropped.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
47f7a0714b67b904a3a36e2f2d85904e8064219b 17-Jan-2006 Michael S. Tsirkin <mst@mellanox.co.il> IPoIB: Make sure path is fully initialized before using it

The SA path record query completion can initialize path->pathrec.dlid
before IPoIB's callback runs and initializes path->ah, so we must test
ah rather than dlid.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
95ed644fd12f53c6fc778f3f246974e5fe3a9468 13-Jan-2006 Ingo Molnar <mingo@elte.hu> IB: convert from semaphores to mutexes

semaphore to mutex conversion by Ingo and Arjan's script.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
[ Sanity-checked on real IB hardware ]
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_ib.c
poib/ipoib_main.c
poib/ipoib_multicast.c
poib/ipoib_verbs.c
poib/ipoib_vlan.c
988bd50300ef2e2d5cb8563e2ac99453dd9acd86 12-Jan-2006 Eli Cohen <eli@mellanox.co.il> IPoIB: Fix memory leak of multicast group structures

The current handling of multicast groups in IPoIB ends up never
freeing send-only multicast groups. It turns out the logic was much
more complicated than it needed to be; we can fix this bug and
completely kill ipoib_mcast_dev_down() at the same time.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_ib.c
poib/ipoib_multicast.c
78bfe0b5b67fe126ed98608e42e42fb6ed9aabd4 11-Jan-2006 Michael S. Tsirkin <mst@mellanox.co.il> IPoIB: Take dev->xmit_lock around mc_list accesses

dev->mc_list accesses must be protected by dev->xmit_lock.
Found by Eli Cohen <eli@mellanox.co.il>.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
97460df37ea3335ca11562568932c9f9facfecdb 10-Jan-2006 Eli Cohen <eli@mellanox.co.il> IPoIB: Fix address handle refcounting for multicast groups

Multiple ipoib_neigh structures on mcast->neigh_list may point to the
same ah. This means that ipoib_mcast_free() can't just make a list of
ah structs to free, since this might end up trying to add the same ah
to the list more than once. Handle this in ipoib_multicast.c in the
same way as it is handled in ipoib_main.c for struct ipoib_path.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
70b4c8cdc168bb5d18e23fd205c4ede1b756a8b2 10-Jan-2006 Eli Cohen <eli@mellanox.co.il> IPoIB: Fix error path in ipoib_mcast_dev_flush()

Don't leak memory on allocation failure for broadcast mcast group.
Also, print a warning to match handling for other mcast groups.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
cf311cd49a78f1e431787068cc31d29d06a415e6 10-Jan-2006 Sean Hefty <sean.hefty@intel.com> IB: Add node_guid to struct ib_device

Add a node_guid field to struct ib_device. It is the responsibility
of the low-level driver to initialize this field before registering a
device with the midlayer. Convert everyone to looking at this field
instead of calling ib_query_device() when all they want is the node
GUID, and remove the node_guid field from struct ib_device_attr.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
de25968cc87cc5b76d09de8b4cbddc8f24fcf5f7 08-Jan-2006 Tim Schmielau <tim@physik3.uni-rostock.de> [PATCH] fix more missing includes

Include fixes for 2.6.14-git11. Should allow to remove sched.h from
module.h on i386, x86_64, arm, ia64, ppc, ppc64, and s390. Probably more
to come since I haven't yet checked the other archs.

Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
rp/ib_srp.c
14c850212ed8f8cbb5972ad6b8812e08a0bc901c 27-Dec-2005 Arnaldo Carvalho de Melo <acme@mandriva.com> [INET_SOCK]: Move struct inet_sock & helper functions to net/inet_sock.h

To help in reducing the number of include dependencies, several files were
touched as they were getting needed headers indirectly for stuff they use.

Thanks also to Alan Menegotto for pointing out that net/dccp/proto.c had
linux/dccp.h include twice.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
poib/ipoib_main.c
poib/ipoib_multicast.c
267ee88ed34c76dc527eeb3d95f9f9558ac99973 29-Nov-2005 Roland Dreier <rolandd@cisco.com> IPoIB: fix error handling in ipoib_open

If ipoib_ib_dev_up() fails after ipoib_ib_dev_open() is called, then
ipoib_ib_dev_stop() needs to be called to clean up.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
4f71055a45a503273c039d80db8ba9b13cb17549 29-Nov-2005 Michael S. Tsirkin <mst@mellanox.co.il> IPoIB: protect child list in ipoib_ib_dev_flush

race condition: ipoib_ib_dev_flush is accessing child list without locks.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_ib.c
2e86541ec878de9ec5771600a77f451a80bebfc4 29-Nov-2005 Roland Dreier <rolandd@cisco.com> IPoIB: don't zero members after we allocate with kzalloc

ipoib_mcast_alloc() uses kzalloc(), so there's no need to zero out
members of the mcast struct after it's allocated.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
de922487890936470660e89f9095aee980637989 29-Nov-2005 Michael S. Tsirkin <mst@mellanox.co.il> IPoIB: reinitialize mcast structs' completions for every query

Make sure mcast->done is initialized to uncompleted value before we
submit a new query, so that it's safe to wait on.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
5872a9fc28e6cd3a4e51479a50970d19a01573b3 29-Nov-2005 Roland Dreier <rolandd@cisco.com> IPoIB: always set path->query to NULL when query finishes

Always set path->query to NULL when the SA path record query
completes, rather than only when we don't have an address handle.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
65c7eddaba33995e013ef3c04718f6dc8fdf2335 29-Nov-2005 Roland Dreier <rolandd@cisco.com> IPoIB: reinitialize path struct's completion for every query

It's possible that IPoIB will issue multiple SA queries for the same
path struct. Therefore the struct's completion needs to be
initialized for each query rather than only once when the struct is
allocated, or else we might not wait long enough for later queries to
finish and free the path struct too soon.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
47f2bce9021b4974ed33b072ebb8348c8145c946 15-Nov-2005 Roland Dreier <rolandd@cisco.com> [IB] srp: don't post receive if no send buf available

Have __srp_get_tx_iu() fail if the target port's request limit will
not allow the initiator to post a send. This avoids continuing on and
posting a receive, and then failing to post a corresponding send. If
that happens, then the initiator will end up with an extra receive
posted, and if this happens to much, the receive queue will overflow.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
5f068992a1bccda5574b4f6d33458ef806686d7f 11-Nov-2005 Roland Dreier <rolandd@cisco.com> [IB] srp: increase max_luns

Increase SRP max_luns to 512 to match the kernel's default, since SRP
storage targets can have lots of LUNs and the SRP initiator itself
doesn't have any particular limit.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/ib_srp.c
rp/ib_srp.h
78b9c0f91cf908616b8f9f356e1d1220e727ea88 10-Nov-2005 Linus Torvalds <torvalds@g5.osdl.org> Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband
8c608a32e3cd7ff14498ad996ca32d1452245a97 07-Nov-2005 Roland Dreier <rolandd@cisco.com> [IPoIB] no need to set skb->dev right before freeing skb

For cut-and-paste reasons, the IPoIB driver was setting skb->dev right
before calling dev_kfree_skb_any(). Get rid of this.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
1732b0ef3b3a02e3df328086fb3018741c5476da 07-Nov-2005 Roland Dreier <rolandd@cisco.com> [IPoIB] add path record information in debugfs

Add ibX_path files to debugfs that contain information about the IPoIB
path cache. IPoIB ARP only gives GIDs, which the IPoIB driver must
resolve to real IB paths through the ib_sa module. For debugging,
when the ARP table looks OK but traffic isn't flowing, it's useful to
be able to see if the resolution from GID to path worked.

Also clean up the formatting of the existing _mcg debugfs files.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_fs.c
poib/ipoib_main.c
poib/ipoib_multicast.c
poib/ipoib_vlan.c
733482e445ca4450cf41381b1c95e2b8c7145114 09-Nov-2005 Olaf Hering <olh@suse.de> [PATCH] changing CONFIG_LOCALVERSION rebuilds too much, for no good reason

This patch removes almost all inclusions of linux/version.h. The 3
#defines are unused in most of the touched files.

A few drivers use the simple KERNEL_VERSION(a,b,c) macro, which is
unfortunatly in linux/version.h.

There are also lots of #ifdef for long obsolete kernels, this was not
touched. In a few places, the linux/version.h include was move to where
the LINUX_VERSION_CODE was used.

quilt vi `find * -type f -name "*.[ch]"|xargs grep -El '(UTS_RELEASE|LINUX_VERSION_CODE|KERNEL_VERSION|linux/version.h)'|grep -Ev '(/(boot|coda|drm)/|~$)'`

search pattern:
/UTS_RELEASE\|LINUX_VERSION_CODE\|KERNEL_VERSION\|linux\/\(utsname\|version\).h

Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
rp/ib_srp.c
127f2fa31ac624c744f3767363c4919209980956 05-Nov-2005 Linus Torvalds <torvalds@g5.osdl.org> Merge branch 'srp' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband
8ae5a8a24f7fe797027d481f88c1464b0e47eede 03-Nov-2005 Roland Dreier <rolandd@cisco.com> [IPoIB] don't compile debug code if debugging isn't enabled

Don't build ipoib_mcast_iter_ functions if CONFIG_INFINIBAND_IPOIB_DEBUG
is not enabled -- their only callers will not be built either.

Also move the prototype for ipoib_open() to ipoib.h to fix a sparse warning.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_ib.c
poib/ipoib_multicast.c
aef9ec39c47f0cece886ddd6b53c440321e0b2a6 02-Nov-2005 Roland Dreier <rolandd@cisco.com> IB: Add SCSI RDMA Protocol (SRP) initiator

Add an InfiniBand SCSI RDMA Protocol (SRP) initiator. This driver is
used to talk talk to InfiniBand SRP targets (storage devices).

Signed-off-by: Roland Dreier <rolandd@cisco.com>
rp/Kbuild
rp/Kconfig
rp/ib_srp.c
rp/ib_srp.h
21a384897d48c116b879924c3dd9e96f6f1e764b 02-Nov-2005 Roland Dreier <rolandd@cisco.com> [IPoIB] remove unneeded initializations to 0

Shrink our source and .text a little by removing a few assignments of
NULL and 0 to memory that is already cleared as part of the allocation.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
de6eb66b56d9df5ce6bd254994f05e065214e8cd 02-Nov-2005 Roland Dreier <rolandd@cisco.com> [IB] kzalloc() conversions

Replace kmalloc()+memset(,0,) with kzalloc(), for a net savings of 35
source lines and about 500 bytes of text.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
poib/ipoib_multicast.c
3bc12e75b23c0499cc2c0873a5f77494be173761 30-Oct-2005 Roland Dreier <rolandd@cisco.com> [IPoIB] cleanups: fix comment, remove useless variables

Minor cleanups: fix a misleading comment, and get rid of attr_mask
variables that are only used to hold constants (just use the constants
directly).

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_ib.c
poib/ipoib_verbs.c
a20583a7c2e35d80b1dfc1f60c9729498838725e 29-Oct-2005 Roland Dreier <rolandd@cisco.com> [IPoIB] use spin_trylock_irqsave()

Use spin_trylock_irqsave() in ipoib_start_xmit() instead of
reinventing it out of local_irq_save(), spin_trylock() and
local_irq_restore().

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
1993d683f39f77ddb46a662d7146247877d50b8f 29-Oct-2005 Roland Dreier <rolandd@cisco.com> [IPoIB] Drop RX packets when out of memory

Change the way IPoIB handles RX packets when it can't allocate a new
receive skbuff. If the allocation of a new receive skb fails, we now
drop the packet we just received and repost the original receive skb.
This means that the receive ring always stays full and we don't have
to monkey around with trying to schedule a refill task for later.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_ib.c
poib/ipoib_main.c
4b2d319b53810ab00ef3d8fdfc1c1ab0647ab0a7 18-Oct-2005 Roland Dreier <rolandd@cisco.com> [IPoIB] Improve ipoib_timeout() output

Use jiffies_to_msecs() so we print a human-readable time so
we don't have to worry about what HZ is configured to, and
print out a few values to make post-mortem analysis easier.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
5b6810e048435de508ef66aebd6b78db13d651b8 11-Oct-2005 Roland Dreier <rolandd@cisco.com> [IPoIB] Rename ipoib_create_qp() -> ipoib_init_qp() and fix error cleanup

ipoib_create_qp() no longer creates IPoIB's QP, so it shouldn't
destroy the QP on failure -- that unwinding happens elsewhere, so the
current code can cause a double free. While we're at it, the
function's name should match what it actually does, so rename it to
ipoib_init_qp().

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_ib.c
poib/ipoib_verbs.c
d70ed6075f15bdbb0548d162394bf10332769c88 29-Sep-2005 Roland Dreier <rolandd@cisco.com> [IPoIB] Rename IPoIB's path_lookup() to avoid name clashes

Rename IPoIB driver's path_lookup() to ipoib_path_lookup() to avoid a
clashes with the kernel global path_lookup(). We don't hit this with
the current kernel source, but some external patches seem to trigger
this, and it's cleaner to avoid clashing with global names anyway.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
refs/heads/for-linus
poib/ipoib_main.c
8d2cae0651502028bf64844508ab18528bbd65c2 20-Sep-2005 Roland Dreier <rolandd@cisco.com> [PATCH] IPoIB: Don't flush workqueue from within workqueue

ipoib_mcast_restart_task() is always called from within the
single-threaded IPoIB workqueue, so flushing the workqueue from within
the function can lead to a recursion overflow. But since we're
running in a single-threaded workqueue, we're already synchronized
against other items in the workqueue, so just get rid of the flush in
ipoib_mcast_restart_task().

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_ib.c
poib/ipoib_multicast.c
ce5b65cc9626feac0d4ffb96f798407e50c45575 18-Sep-2005 Hal Rosenstock <halr@voltaire.com> [PATCH] IPoIB: Fix SA client retransmission strategy

We got a little mixed up with what the backoff member holds in the
IPoIB multicast group structure: sometimes it was used as a number of
seconds, and sometimes it was used as a number of jiffies. Fix the
code so that backoff is always in seconds.

Signed-off-by: Hal Rosenstock <halr@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_multicast.c
51574e0398a2d93cbf7f26e36b673cd919062268 12-Sep-2005 Michael S. Tsirkin <mst@mellanox.co.il> [PATCH] IPoIB: fix module removal race

Since ipoib uses queue_delayed_work to run flush task on port state events,
it must flush scheduled work after unregistering the event handler.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
06c56e44f3e32a859420ecac97996cc6f12827bb 01-Sep-2005 Michael S. Tsirkin <mst@mellanox.co.il> [PATCH] IPoIB: fix memory leak

Fix IPoIB memory leak on device removal.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
a4d61e84804f3b14cc35c5e2af768a07c0f64ef6 25-Aug-2005 Roland Dreier <roland@eddore.topspincom.com> [PATCH] IB: move include files to include/rdma

Move the InfiniBand headers from drivers/infiniband/include to include/rdma.
This allows InfiniBand-using code to live elsewhere, and lets us remove the
ugly EXTRA_CFLAGS include path from the InfiniBand Makefiles.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/Makefile
poib/ipoib.h
poib/ipoib_ib.c
poib/ipoib_verbs.c
1ad62a19f177e61d4dde111ba35fb4badd0c2106 24-Aug-2005 Michael S. Tsirkin <mst@mellanox.co.il> [PATCH] IPoIB: Fix device removal race

Currently we may have work scheduled in default kernel workqueue when
the device is going down. The device could get freed before this
workqueue gets serviced. I am actually seeing this causing system
hangs.

The following patch fixes this by using ipoib_workqueue which gets
flushed when the device is going down.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
4ce059378c04b40c2e9f658b1c6a2e9078b85c7c 19-Aug-2005 Roland Dreier <roland@eddore.topspincom.com> [PATCH] IPoIB: Set full membership bit in P_Keys

Always make sure that the full membership bit is set in the P_Keys
that IPoIB uses. This makes sure that all hosts join the correct
multicast groups so that hosts that are partial partition members
can talk to the rest of the network.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
2aeba9a03b0d249fc710b9939fc089ce53d8cd30 15-Aug-2005 Olaf Hering <olh@suse.de> [PATCH] IB: Remove unnecessary includes of <linux/version.h>

changing CONFIG_LOCALVERSION rebuilds too much, for no appearent reason.
Remove unneeded includes of <linux/version.h>.

Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
poib/ipoib_vlan.c
97f52eb438be7caebe026421545619d8a0c1398a 14-Aug-2005 Sean Hefty <sean.hefty@intel.com> [PATCH] IB: sparse endianness cleanup

Fix sparse warnings. Use __be* where appropriate.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_fs.c
poib/ipoib_main.c
poib/ipoib_multicast.c
92a6b34bf4d0d11c54b2a6bdd6240f98cb326200 14-Aug-2005 Hal Rosenstock <halr@voltaire.com> [PATCH] IB: Eliminate redundant NULL checks

IPoIB: Eliminate NULL checks prior to calling kfree

Signed-off-by: Hal Rosenstock <halr@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
2a1d9b7f09aaaacf235656cb32a40ba2c79590b3 11-Aug-2005 Roland Dreier <roland@eddore.topspincom.com> [PATCH] IB: Add copyright notices

Make some lawyers happy and add copyright notices for people who
forgot to include them when they actually touched the code.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib.h
poib/ipoib_ib.c
poib/ipoib_main.c
poib/ipoib_multicast.c
poib/ipoib_verbs.c
0dca0f7bf82face7b700890318d5550fd542cabf 28-Jul-2005 Hal Rosenstock <halr@voltaire.com> [PATCH] [IPoIB] Handle sending of unicast RARP responses

RARP replies are another valid case where IPoIB may need to send a
unicast packet with no neighbour structure.

Signed-off-by: Hal Rosenstock <halr@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_main.c
2181858bb814b51de8ec25b3ddd37cd06c53b0c9 27-Jul-2005 Roland Dreier <roland@eddore.topspincom.com> [IB/ipoib]: Fix unsigned comparisons to handle wraparound

Fix handling of tx_head/tx_tail comparisons to handle wraparound.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
poib/ipoib_ib.c
9adec1a808603698bd7ff47f3883bd7cd1383f90 17-Apr-2005 Roland Dreier <roland@topspin.com> [PATCH] IPoIB: convert to debugfs

Convert IPoIB to use debugfs instead of its own custom debugging filesystem.

Signed-off-by: Roland Dreier <roland@topspin.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
poib/ipoib_fs.c
poib/ipoib_main.c
e6ded99cbbbfef2cef537d717ad61d2f77f4dfd6 17-Apr-2005 Roland Dreier <roland@topspin.com> [PATCH] IPoIB: fix static rate calculation

Correct and simplify calculation of static rate. We need to round up the
quotient of (local_rate - path_rate) / path_rate. To round up we add
(path_rate - 1) to the numerator, so the quotient simplifies to (local_rate -
1) / path_rate.

No idea how I came up with the old formula.

Signed-off-by: Roland Dreier <roland@topspin.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
poib/ipoib_main.c
poib/ipoib_multicast.c
62241eb497721be7640e5d9330e60f4a88a4db46 17-Apr-2005 Hal Rosenstock <halr@voltaire.com> [PATCH] IPoIB: set skb->mac.raw on receive

Set skb->mac.raw on receive. This fixes crashes when this is
dereferenced, for example by netfilter or when PF_PACKET is used.

Signed-off-by: Hal Rosenstock <halr@voltaire.com>
Signed-off-by: Roland Dreier <roland@topspin.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
poib/ipoib_ib.c
1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 17-Apr-2005 Linus Torvalds <torvalds@ppc970.osdl.org> Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!
poib/Kconfig
poib/Makefile
poib/ipoib.h
poib/ipoib_fs.c
poib/ipoib_ib.c
poib/ipoib_main.c
poib/ipoib_multicast.c
poib/ipoib_verbs.c
poib/ipoib_vlan.c