History log of /net/iucv/af_iucv.c
Revision Date Author Comments
1042cab8627a2d11491e8b0dd40c4dda3180285a 21-Jul-2014 Ursula Braun <ursula.braun@de.ibm.com> af_iucv: avoid path quiesce of severed path in shutdown()

An af_iucv stress test showed -EPIPE results for sendmsg()
calls. They are caused by quiescing a path even though it has
been already severed by peer. For IUCV transport shutdown()
consists of 2 steps:
(1) sending the shutdown message to peer
(2) quiescing the iucv path
If the iucv path between these 2 steps is severed due to peer
closing the path, the quiesce step is no longer needed.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Reported-by: Philipp Hachtmann <phacht@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
089b03227a503ba2c5a8174900e531764e54db0f 14-Jul-2014 Fabian Frederick <fabf@skynet.be> af_iucv: remove unnecessary break after goto

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
4d520f62e0f4fd310a2307d0244ef184ce9200ba 28-May-2014 Ursula Braun <ursula.braun@de.ibm.com> af_iucv: correct cleanup if listen backlog is full

In case of transport HIPER a sock struct is allocated for an incoming
connect request. If the backlog queue is full this socket is not
needed, but is left in the list of af_iucv sockets. Final socket
release posts console message "Attempt to release alive iucv socket".
This patch makes sure the new created socket is cleaned up correctly
if the backlog queue is full.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Reported-by: Philipp Hachtmann <phacht@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
53a4b4995edc1b9eddf6cee342479f86cbae4337 28-May-2014 Philipp Hachtmann <phacht@linux.vnet.ibm.com> af_iucv: Add automatic (source) iucv_name to bind

If a socket is bound to an address using before calling connect
it is usual to leave it to the network system to choose an appropriate
outgoing application name respective port address.
af_iucv on VM uses a counter and uses simple numbers as unique identifiers.
This behaviour was missing when af_iucv is used with HiperSockets.

This patch contains a simple approach to harmonize af_iucv's behaviour.

Signed-off-by: Philipp Hachtmann <phacht@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
f5738e2ef88070ef1372e6e718124d88e9abe4ac 13-May-2014 Ursula Braun <ursula.braun@de.ibm.com> af_iucv: wrong mapping of sent and confirmed skbs

When sending data through IUCV a MESSAGE COMPLETE interrupt
signals that sent data memory can be freed or reused again.
With commit f9c41a62bba3f3f7ef3541b2a025e3371bcbba97
"af_iucv: fix recvmsg by replacing skb_pull() function" the
MESSAGE COMPLETE callback iucv_callback_txdone() identifies
the wrong skb as being confirmed, which leads to data corruption.
This patch fixes the skb mapping logic in iucv_callback_txdone().

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
676d23690fb62b5d51ba5d659935e9f7d9da9f8e 11-Apr-2014 David S. Miller <davem@davemloft.net> net: Fix use after free by removing length arg from sk_data_ready callbacks.

Several spots in the kernel perform a sequence like:

skb_queue_tail(&sk->s_receive_queue, skb);
sk->sk_data_ready(sk, skb->len);

But at the moment we place the SKB onto the socket receive queue it
can be consumed and freed up. So this skb->len access is potentially
to freed up memory.

Furthermore, the skb->len can be modified by the consumer so it is
possible that the value isn't accurate.

And finally, no actual implementation of this callback actually uses
the length argument. And since nobody actually cared about it's
value, lots of call sites pass arbitrary values in such as '0' and
even '1'.

So just remove the length argument from the callback, that way there
is no confusion whatsoever and all of these use-after-free cases get
fixed as a side effect.

Based upon a patch by Eric Dumazet and his suggestion to audit this
issue tree-wide.

Signed-off-by: David S. Miller <davem@davemloft.net>
2f139a5d8225666faee9f1d3c5629c4e5ff947aa 19-Mar-2014 Ursula Braun <ursula.braun@de.ibm.com> af_iucv: recvmsg problem for SOCK_STREAM sockets

Commit f9c41a62bba3f3f7ef3541b2a025e3371bcbba97 introduced
a problem for SOCK_STREAM sockets, when only part of the
incoming iucv message is received by user space. In this
case the remaining data of the iucv message is lost.
This patch makes sure an incompletely received iucv message
is queued back to the receive queue.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Reported-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
f3d3342602f8bcbf37d7c46641cb9bca7618eb1c 21-Nov-2013 Hannes Frederic Sowa <hannes@stressinduktion.org> net: rework recvmsg handler msg_name and msg_namelen logic

This patch now always passes msg->msg_namelen as 0. recvmsg handlers must
set msg_namelen to the proper size <= sizeof(struct sockaddr_storage)
to return msg_name to the user.

This prevents numerous uninitialized memory leaks we had in the
recvmsg handlers and makes it harder for new code to accidentally leak
uninitialized memory.

Optimize for the case recvfrom is called with NULL as address. We don't
need to copy the address at all, so set it to NULL before invoking the
recvmsg handler. We can do so, because all the recvmsg handlers must
cope with the case a plain read() is called on them. read() also sets
msg_name to NULL.

Also document these changes in include/linux/net.h as suggested by David
Miller.

Changes since RFC:

Set msg->msg_name = NULL if user specified a NULL in msg_name but had a
non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
affect sendto as it would bail out earlier while trying to copy-in the
address. It also more naturally reflects the logic by the callers of
verify_iovec.

With this change in place I could remove "
if (!uaddr || msg_sys->msg_namelen == 0)
msg->msg_name = NULL
".

This change does not alter the user visible error logic as we ignore
msg_namelen as long as msg_name is NULL.

Also remove two unnecessary curly brackets in ___sys_recvmsg and change
comments to netdev style.

Cc: David Miller <davem@davemloft.net>
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
351638e7deeed2ec8ce451b53d33921b3da68f83 28-May-2013 Jiri Pirko <jiri@resnulli.us> net: pass info struct via netdevice notifier

So far, only net_device * could be passed along with netdevice notifier
event. This patch provides a possibility to pass custom structure
able to provide info that event listener needs to know.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>

v2->v3: fix typo on simeth
shortened dev_getter
shortened notifier_info struct name
v1->v2: fix notifier_call parameter in call_netdevice_notifier()
Signed-off-by: David S. Miller <davem@davemloft.net>
f9c41a62bba3f3f7ef3541b2a025e3371bcbba97 08-Apr-2013 Ursula Braun <ursula.braun@de.ibm.com> af_iucv: fix recvmsg by replacing skb_pull() function

When receiving data messages, the "BUG_ON(skb->len < skb->data_len)" in
the skb_pull() function triggers a kernel panic.

Replace the skb_pull logic by a per skb offset as advised by
Eric Dumazet.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
a5598bd9c087dc0efc250a5221e5d0e6f584ee88 07-Apr-2013 Mathias Krause <minipli@googlemail.com> iucv: Fix missing msg_namelen update in iucv_sock_recvmsg()

The current code does not fill the msg_name member in case it is set.
It also does not set the msg_namelen member to 0 and therefore makes
net/socket.c leak the local, uninitialized sockaddr_storage variable
to userland -- 128 bytes of kernel stack memory.

Fix that by simply setting msg_namelen to 0 as obviously nobody cared
about iucv_sock_recvmsg() not filling the msg_name in case it was set.

Cc: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8facd5fb73c6e960555e5913743dfbb6c3d984a5 02-Apr-2013 Jacob Keller <jacob.e.keller@intel.com> net: fix smatch warnings inside datagram_poll

Commit 7d4c04fc170087119727119074e72445f2bb192b ("net: add option to enable
error queue packets waking select") has an issue due to operator precedence
causing the bit-wise OR to bind to the sock_flags call instead of the result of
the terniary conditional. This fixes the *_poll functions to work properly. The
old code results in "mask |= POLLPRI" instead of what was intended, which is to
only include POLLPRI when the socket option is enabled.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7d4c04fc170087119727119074e72445f2bb192b 28-Mar-2013 Keller, Jacob E <jacob.e.keller@intel.com> net: add option to enable error queue packets waking select

Currently, when a socket receives something on the error queue it only wakes up
the socket on select if it is in the "read" list, that is the socket has
something to read. It is useful also to wake the socket if it is in the error
list, which would enable software to wait on error queue packets without waking
up for regular data on the socket. The main use case is for receiving
timestamped transmit packets which return the timestamp to the socket via the
error queue. This enables an application to select on the socket for the error
queue only instead of for the regular traffic.

-v2-
* Added the SO_SELECT_ERR_QUEUE socket option to every architechture specific file
* Modified every socket poll function that checks error queue

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jeffrey Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Matthew Vick <matthew.vick@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
b67bfe0d42cac56c512dd5da4b1b347a23f4b70a 28-Feb-2013 Sasha Levin <sasha.levin@oracle.com> hlist: drop the node parameter from iterators

I'm not sure why, but the hlist for each entry iterators were conceived

list_for_each_entry(pos, head, member)

The hlist ones were greedy and wanted an extra parameter:

hlist_for_each_entry(tpos, pos, head, member)

Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.

Besides the semantic patch, there was some manual work required:

- Fix up the actual hlist iterators in linux/list.h
- Fix up the declaration of other iterators based on the hlist ones.
- A very small amount of places were using the 'node' parameter, this
was modified to use 'obj->member' instead.
- Coccinelle didn't handle the hlist_for_each_entry_safe iterator
properly, so those had to be fixed up manually.

The semantic patch which is mostly the work of Peter Senna Tschudin is here:

@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

type T;
expression a,c,d,e;
identifier b;
statement S;
@@

-T b;
<+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
...+>

[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
62b1a8ab9b3660bb820d8dfe23148ed6cda38574 14-Jun-2012 Eric Dumazet <edumazet@google.com> net: remove skb_orphan_try()

Orphaning skb in dev_hard_start_xmit() makes bonding behavior
unfriendly for applications sending big UDP bursts : Once packets
pass the bonding device and come to real device, they might hit a full
qdisc and be dropped. Without orphaning, the sender is automatically
throttled because sk->sk_wmemalloc reaches sk->sk_sndbuf (assuming
sk_sndbuf is not too big)

We could try to defer the orphaning adding another test in
dev_hard_start_xmit(), but all this seems of little gain,
now that BQL tends to make packets more likely to be parked
in Qdisc queues instead of NIC TX ring, in cases where performance
matters.

Reverts commits :
fc6055a5ba31 net: Introduce skb_orphan_try()
87fd308cfc6b net: skb_tx_hash() fix relative to skb_orphan_try()
and removes SKBTX_DRV_NEEDS_SK_REF flag

Reported-and-bisected-by: Jean-Michel Hautbois <jhautbois@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
82492a355fac112908271faa74f473a38c1fb647 07-Mar-2012 Ursula Braun <ursula.braun@de.ibm.com> af_iucv: add shutdown for HS transport

AF_IUCV sockets offer a shutdown function. This patch makes sure
shutdown works for HS transport as well.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9fbd87d413921f36d2f55cee1d082323e6eb1d5f 07-Mar-2012 Ursula Braun <ursula.braun@de.ibm.com> af_iucv: handle netdev events

In case of transport through HiperSockets the underlying network
interface may switch to DOWN state or the underlying network device
may recover. In both cases the socket must change to IUCV_DISCONN
state. If the interface goes down, af_iucv has a chance to notify
its connection peer in addition.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
51363b8751a673a00ad48eea895266396d53fa52 08-Feb-2012 Ursula Braun <ursula.braun@de.ibm.com> af_iucv: allow retrieval of maximum message size

For HS transport the maximum message size depends on the MTU-size
of the HS-device bound to the AF_IUCV socket. This patch adds a
getsockopt option MSGSIZE returning the maximum message size that
can be handled for this AF_IUCV socket.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
800c5eb7b5eba6cb2a32738d763fd59f0fbcdde4 08-Feb-2012 Ursula Braun <ursula.braun@de.ibm.com> af_iucv: change net_device handling for HS transport

This patch saves the net_device in the iucv_sock structure during
bind in order to fasten skb sending.
In addition some other small improvements are made for HS transport:
- error checking when sending skbs
- locking changes in afiucv_hs_callback_txnotify
- skb freeing in afiucv_hs_callback_txnotify
And finally it contains code cleanup to get rid of iucv_skb_queue_purge.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7f1b0ea42a800713a3d56e1e8ca1a845e0461ca2 08-Feb-2012 Ursula Braun <ursula.braun@de.ibm.com> af_iucv: block writing if msg limit is exceeded

When polling on an AF_IUCV socket, writing should be blocked if the
number of pending messages exceeds a defined limit.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7d316b9453523498246e9e19a659c423d4c5081e 08-Feb-2012 Ursula Braun <ursula.braun@de.ibm.com> af_iucv: remove IUCV-pathes completely

A SEVER is missing in the callback of a receiving SEVERED. This may
inhibit z/VM to remove the corresponding IUCV-path completely.
This patch adds a SEVER in iucv_callback_connrej (together with
additional locking.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
aac6399c6a08334282653a86ce760cff3e1755b7 19-Dec-2011 Ursula Braun <ursula.braun@de.ibm.com> af_iucv: get rid of state IUCV_SEVERED

af_iucv differs unnecessarily between state IUCV_SEVERED and
IUCV_DISCONN. This patch removes state IUCV_SEVERED.
While simplifying af_iucv, this patch removes the 2nd invocation of
cpcmd as well.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9e8ba5f3ec35cba4fd8a8bebda548c4db2651e40 19-Dec-2011 Ursula Braun <ursula.braun@de.ibm.com> af_iucv: remove unused timer infrastructure

af_iucv contains timer infrastructure which is not exploited.
This patch removes the timer related code parts.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
816abbadf981e64b2342e1a875592623619560a4 19-Dec-2011 Ursula Braun <ursula.braun@de.ibm.com> af_iucv: release reference to HS device

For HiperSockets transport skbs sent are bound to one of the
available HiperSockets devices. Add missing release of reference to
a HiperSockets device before freeing an skb.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
42bd48e0145567acf7b3d2ae48bea765315bdd89 19-Dec-2011 Ursula Braun <ursula.braun@de.ibm.com> af_iucv: accelerate close for HS transport

Closing an af_iucv socket may wait for confirmation of outstanding
send requests. This patch adds confirmation code for the new
HiperSockets transport.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
c64d3f8f59367e89e83582b50bf072474ba2abff 19-Dec-2011 Ursula Braun <ursula.braun@de.ibm.com> af_iucv: support ancillary data with HS transport

The AF_IUCV address family offers support for ancillary data.
This patch enables usage of ancillary data with the new
HiperSockets transport.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
87fb4b7b533073eeeaed0b6bf7c2328995f6c075 13-Oct-2011 Eric Dumazet <eric.dumazet@gmail.com> net: more accurate skb truesize

skb truesize currently accounts for sk_buff struct and part of skb head.
kmalloc() roundings are also ignored.

Considering that skb_shared_info is larger than sk_buff, its time to
take it into account for better memory accounting.

This patch introduces SKB_TRUESIZE(X) macro to centralize various
assumptions into a single place.

At skb alloc phase, we put skb_shared_info struct at the exact end of
skb head, to allow a better use of memory (lowering number of
reallocations), since kmalloc() gives us power-of-two memory blocks.

Unless SLUB/SLUB debug is active, both skb->head and skb_shared_info are
aligned to cache lines, as before.

Note: This patch might trigger performance regressions because of
misconfigured protocol stacks, hitting per socket or global memory
limits that were previously not reached. But its a necessary step for a
more accurate memory accounting.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Andi Kleen <ak@linux.intel.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3881ac441f642d56503818123446f7298442236b 08-Aug-2011 Ursula Braun <ursula.braun@de.ibm.com> af_iucv: add HiperSockets transport

The current transport mechanism for af_iucv is the z/VM offered
communications facility IUCV. To provide equivalent support when
running Linux in an LPAR, HiperSockets transport is added to the
AF_IUCV address family. It requires explicit binding of an AF_IUCV
socket to a HiperSockets device. A new packet_type ETH_P_AF_IUCV
is announced. An af_iucv specific transport header is defined
preceding the skb data. A small protocol is implemented for
connecting and for flow control/congestion management.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
493d3971a65c921fad5c3369c7582214c91c965a 08-Aug-2011 Ursula Braun <ursula.braun@de.ibm.com> af_iucv: cleanup - use iucv_sk(sk) early

Code cleanup making make use of local variable for struct iucv_sock.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6fcd61f7bf5d56a83cbf26c14915138d1a64ca4e 08-Aug-2011 Frank Blaschka <frank.blaschka@de.ibm.com> af_iucv: use loadable iucv interface

For future af_iucv extensions the module should be able to run in LPAR
mode too. For this we use the new dynamic loading iucv interface.

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9f6298a6ca38e251aa72a6035a8a36a52cf94536 12-May-2011 Ursula Braun <ursula.braun@de.ibm.com> af_iucv: get rid of compile warning

-Wunused-but-set-variable generates compile warnings. The affected
variables are removed.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
25985edcedea6396277003854657b5f3cb31a628 31-Mar-2011 Lucas De Marchi <lucas.demarchi@profusion.mobi> Fix common misspellings

Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
a56635a56f2afb3d22d9ce07e8f8d69537416b2d 26-May-2010 Julia Lawall <julia@diku.dk> net/iucv: Add missing spin_unlock

Add a spin_unlock missing on the error path. There seems like no reason
why the lock should continue to be held if the kzalloc fail.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression E1;
@@

* spin_lock(E1,...);
<+... when != E1
if (...) {
... when != E1
* return ...;
}
...+>
* spin_unlock(E1,...);
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3fa21e07e6acefa31f974d57fba2b6920a7ebd1a 18-May-2010 Joe Perches <joe@perches.com> net: Remove unnecessary returns from void function()s

This patch removes from net/ (but not any netfilter files)
all the unnecessary return; statements that precede the
last closing brace of void functions.

It does not remove the returns that are immediately
preceded by a label as gcc doesn't like that.

Done via:
$ grep -rP --include=*.[ch] -l "return;\n}" net/ | \
xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }'

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
43815482370c510c569fd18edb57afcb0fa8cab6 29-Apr-2010 Eric Dumazet <eric.dumazet@gmail.com> net: sock_def_readable() and friends RCU conversion

sk_callback_lock rwlock actually protects sk->sk_sleep pointer, so we
need two atomic operations (and associated dirtying) per incoming
packet.

RCU conversion is pretty much needed :

1) Add a new structure, called "struct socket_wq" to hold all fields
that will need rcu_read_lock() protection (currently: a
wait_queue_head_t and a struct fasync_struct pointer).

[Future patch will add a list anchor for wakeup coalescing]

2) Attach one of such structure to each "struct socket" created in
sock_alloc_inode().

3) Respect RCU grace period when freeing a "struct socket_wq"

4) Change sk_sleep pointer in "struct sock" by sk_wq, pointer to "struct
socket_wq"

5) Change sk_sleep() function to use new sk->sk_wq instead of
sk->sk_sleep

6) Change sk_has_sleeper() to wq_has_sleeper() that must be used inside
a rcu_read_lock() section.

7) Change all sk_has_sleeper() callers to :
- Use rcu_read_lock() instead of read_lock(&sk->sk_callback_lock)
- Use wq_has_sleeper() to eventually wakeup tasks.
- Use rcu_read_unlock() instead of read_unlock(&sk->sk_callback_lock)

8) sock_wake_async() is modified to use rcu protection as well.

9) Exceptions :
macvtap, drivers/net/tun.c, af_unix use integrated "struct socket_wq"
instead of dynamically allocated ones. They dont need rcu freeing.

Some cleanups or followups are probably needed, (possible
sk_callback_lock conversion to a spinlock for example...).

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
aa395145165cb06a0d0885221bbe0ce4a564391d 20-Apr-2010 Eric Dumazet <eric.dumazet@gmail.com> net: sk_sleep() helper

Define a new function to return the waitqueue of a "struct sock".

static inline wait_queue_head_t *sk_sleep(struct sock *sk)
{
return sk->sk_sleep;
}

Change all read occurrences of sk_sleep by a call to this function.

Needed for a future RCU conversion. sk_sleep wont be a field directly
available.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
471452104b8520337ae2fb48c4e61cd4896e025d 15-Dec-2009 Alexey Dobriyan <adobriyan@gmail.com> const: constify remaining dev_pm_ops

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3f378b684453f2a028eda463ce383370545d9cc9 06-Nov-2009 Eric Paris <eparis@redhat.com> net: pass kern to net_proto_family create function

The generic __sock_create function has a kern argument which allows the
security system to make decisions based on if a socket is being created by
the kernel or by userspace. This patch passes that flag to the
net_proto_family specific create function, so it can do the same thing.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9a4ff8d417e4ef2eeecb4a4433e3dbd8251aae5e 15-Oct-2009 Ursula Braun <ursula.braun@de.ibm.com> af_iucv: remove duplicate sock_set_flag

Remove duplicate sock_set_flag(sk, SOCK_ZAPPED) in iucv_sock_close,
which has been overlooked in September-commit
7514bab04e567c9408fe0facbde4277f09d5eb74.

Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
49f5eba7575e6dfb146c5e24623d50200ce23ff1 15-Oct-2009 Hendrik Brueckner <brueckner@linux.vnet.ibm.com> af_iucv: use sk functions to modify sk->sk_ack_backlog

Instead of modifying sk->sk_ack_backlog directly, use respective
socket functions.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ec1b4cf74c81bfd0fbe5bf62bafc86c45917e72f 05-Oct-2009 Stephen Hemminger <shemminger@vyatta.com> net: mark net_proto_ops as const

All usages of structure net_proto_ops should be declared const.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
b7058842c940ad2c08dd829b21e5c92ebe3b8758 01-Oct-2009 David S. Miller <davem@davemloft.net> net: Make setsockopt() optlen be unsigned.

This provides safety against negative optlen at the type
level instead of depending upon (sometimes non-trivial)
checks against this sprinkled all over the the place, in
each and every implementation.

Based upon work done by Arjan van de Ven and feedback
from Linus Torvalds.

Signed-off-by: David S. Miller <davem@davemloft.net>
bf95d20fdbd602d72c28a009a55d90d5109b8a86 16-Sep-2009 Hendrik Brueckner <brueckner@linux.vnet.ibm.com> af_iucv: fix race when queueing skbs on the backlog queue

iucv_sock_recvmsg() and iucv_process_message()/iucv_fragment_skb race
for dequeuing an skb from the backlog queue.

If iucv_sock_recvmsg() dequeues first, iucv_process_message() calls
sock_queue_rcv_skb() with an skb that is NULL.

This results in the following kernel panic:

<1>Unable to handle kernel pointer dereference at virtual kernel address (null)
<4>Oops: 0004 [#1] PREEMPT SMP DEBUG_PAGEALLOC
<4>Modules linked in: af_iucv sunrpc qeth_l3 dm_multipath dm_mod vmur qeth ccwgroup
<4>CPU: 0 Not tainted 2.6.30 #4
<4>Process client-iucv (pid: 4787, task: 0000000034e75940, ksp: 00000000353e3710)
<4>Krnl PSW : 0704000180000000 000000000043ebca (sock_queue_rcv_skb+0x7a/0x138)
<4> R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3
<4>Krnl GPRS: 0052900000000000 000003e0016e0fe8 0000000000000000 0000000000000000
<4> 000000000043eba8 0000000000000002 0000000000000001 00000000341aa7f0
<4> 0000000000000000 0000000000007800 0000000000000000 0000000000000000
<4> 00000000341aa7f0 0000000000594650 000000000043eba8 000000003fc2fb28
<4>Krnl Code: 000000000043ebbe: a7840006 brc 8,43ebca
<4> 000000000043ebc2: 5930c23c c %r3,572(%r12)
<4> 000000000043ebc6: a724004c brc 2,43ec5e
<4> >000000000043ebca: e3c0b0100024 stg %r12,16(%r11)
<4> 000000000043ebd0: a7190000 lghi %r1,0
<4> 000000000043ebd4: e310b0200024 stg %r1,32(%r11)
<4> 000000000043ebda: c010ffffdce9 larl %r1,43a5ac
<4> 000000000043ebe0: e310b0800024 stg %r1,128(%r11)
<4>Call Trace:
<4>([<000000000043eba8>] sock_queue_rcv_skb+0x58/0x138)
<4> [<000003e0016bcf2a>] iucv_process_message+0x112/0x3cc [af_iucv]
<4> [<000003e0016bd3d4>] iucv_callback_rx+0x1f0/0x274 [af_iucv]
<4> [<000000000053a21a>] iucv_message_pending+0xa2/0x120
<4> [<000000000053b5a6>] iucv_tasklet_fn+0x176/0x1b8
<4> [<000000000014fa82>] tasklet_action+0xfe/0x1f4
<4> [<0000000000150a56>] __do_softirq+0x116/0x284
<4> [<0000000000111058>] do_softirq+0xe4/0xe8
<4> [<00000000001504ba>] irq_exit+0xba/0xd8
<4> [<000000000010e0b2>] do_extint+0x146/0x190
<4> [<00000000001184b6>] ext_no_vtime+0x1e/0x22
<4> [<00000000001fbf4e>] kfree+0x202/0x28c
<4>([<00000000001fbf44>] kfree+0x1f8/0x28c)
<4> [<000000000044205a>] __kfree_skb+0x32/0x124
<4> [<000003e0016bd8b2>] iucv_sock_recvmsg+0x236/0x41c [af_iucv]
<4> [<0000000000437042>] sock_aio_read+0x136/0x160
<4> [<0000000000205e50>] do_sync_read+0xe4/0x13c
<4> [<0000000000206dce>] vfs_read+0x152/0x15c
<4> [<0000000000206ed0>] SyS_read+0x54/0xac
<4> [<0000000000117c8e>] sysc_noemu+0x10/0x16
<4> [<00000042ff8def3c>] 0x42ff8def3c

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7514bab04e567c9408fe0facbde4277f09d5eb74 16-Sep-2009 Hendrik Brueckner <brueckner@linux.vnet.ibm.com> af_iucv: do not call iucv_sock_kill() twice

For non-accepted sockets on the accept queue, iucv_sock_kill()
is called twice (in iucv_sock_close() and iucv_sock_cleanup_listen()).
This typically results in a kernel oops as shown below.

Remove the duplicate call to iucv_sock_kill() and set the SOCK_ZAPPED
flag in iucv_sock_close() only.

The iucv_sock_kill() function frees a socket only if the socket is zapped
and orphaned (sk->sk_socket == NULL):
- Non-accepted sockets are always orphaned and, thus, iucv_sock_kill()
frees the socket twice.
- For accepted sockets or sockets created with iucv_sock_create(),
sk->sk_socket is initialized. This caused the first call to
iucv_sock_kill() to return immediately. To free these sockets,
iucv_sock_release() uses sock_orphan() before calling iucv_sock_kill().

<1>Unable to handle kernel pointer dereference at virtual kernel address 000000003edd3000
<4>Oops: 0011 [#1] PREEMPT SMP DEBUG_PAGEALLOC
<4>Modules linked in: af_iucv sunrpc qeth_l3 dm_multipath dm_mod qeth vmur ccwgroup
<4>CPU: 0 Not tainted 2.6.30 #4
<4>Process iucv_sock_close (pid: 2486, task: 000000003aea4340, ksp: 000000003b75bc68)
<4>Krnl PSW : 0704200180000000 000003e00168e23a (iucv_sock_kill+0x2e/0xcc [af_iucv])
<4> R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:2 PM:0 EA:3
<4>Krnl GPRS: 0000000000000000 000000003b75c000 000000003edd37f0 0000000000000001
<4> 000003e00168ec62 000000003988d960 0000000000000000 000003e0016b0608
<4> 000000003fe81b20 000000003839bb58 00000000399977f0 000000003edd37f0
<4> 000003e00168b000 000003e00168f138 000000003b75bcd0 000000003b75bc98
<4>Krnl Code: 000003e00168e22a: c0c0ffffe6eb larl %r12,3e00168b000
<4> 000003e00168e230: b90400b2 lgr %r11,%r2
<4> 000003e00168e234: e3e0f0980024 stg %r14,152(%r15)
<4> >000003e00168e23a: e310225e0090 llgc %r1,606(%r2)
<4> 000003e00168e240: a7110001 tmll %r1,1
<4> 000003e00168e244: a7840007 brc 8,3e00168e252
<4> 000003e00168e248: d507d00023c8 clc 0(8,%r13),968(%r2)
<4> 000003e00168e24e: a7840009 brc 8,3e00168e260
<4>Call Trace:
<4>([<000003e0016b0608>] afiucv_dbf+0x0/0xfffffffffffdea20 [af_iucv])
<4> [<000003e00168ec6c>] iucv_sock_close+0x130/0x368 [af_iucv]
<4> [<000003e00168ef02>] iucv_sock_release+0x5e/0xe4 [af_iucv]
<4> [<0000000000438e6c>] sock_release+0x44/0x104
<4> [<0000000000438f5e>] sock_close+0x32/0x50
<4> [<0000000000207898>] __fput+0xf4/0x250
<4> [<00000000002038aa>] filp_close+0x7a/0xa8
<4> [<00000000002039ba>] SyS_close+0xe2/0x148
<4> [<0000000000117c8e>] sysc_noemu+0x10/0x16
<4> [<00000042ff8deeac>] 0x42ff8deeac

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
56a73de3889383b70ed1fef06aaab0677731b0ea 16-Sep-2009 Hendrik Brueckner <brueckner@linux.vnet.ibm.com> af_iucv: handle non-accepted sockets after resuming from suspend

After resuming from suspend, all af_iucv sockets are disconnected.
Ensure that iucv_accept_dequeue() can handle disconnected sockets
which are not yet accepted.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
d9973179aef2af88b6fe4cc1df7ced6fe7cec7d0 16-Sep-2009 Hendrik Brueckner <brueckner@linux.vnet.ibm.com> af_iucv: fix race in __iucv_sock_wait()

Moving prepare_to_wait before the condition to avoid a race between
schedule_timeout and wake up.
The race can appear during iucv_sock_connect() and iucv_callback_connack().

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5708e868dc512f055f0ea4a14d01f8252c3ca8a1 14-Sep-2009 Alexey Dobriyan <adobriyan@gmail.com> net: constify remaining proto_ops

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
a57de0b4336e48db2811a2030bb68dba8dd09d88 08-Jul-2009 Jiri Olsa <jolsa@redhat.com> net: adding memory barrier to the poll and receive callbacks

Adding memory barrier after the poll_wait function, paired with
receive callbacks. Adding fuctions sock_poll_wait and sk_has_sleeper
to wrap the memory barrier.

Without the memory barrier, following race can happen.
The race fires, when following code paths meet, and the tp->rcv_nxt
and __add_wait_queue updates stay in CPU caches.

CPU1 CPU2

sys_select receive packet
... ...
__add_wait_queue update tp->rcv_nxt
... ...
tp->rcv_nxt check sock_def_readable
... {
schedule ...
if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
wake_up_interruptible(sk->sk_sleep)
...
}

If there was no cache the code would work ok, since the wait_queue and
rcv_nxt are opposit to each other.

Meaning that once tp->rcv_nxt is updated by CPU2, the CPU1 either already
passed the tp->rcv_nxt check and sleeps, or will get the new value for
tp->rcv_nxt and will return with new data mask.
In both cases the process (CPU1) is being added to the wait queue, so the
waitqueue_active (CPU2) call cannot miss and will wake up CPU1.

The bad case is when the __add_wait_queue changes done by CPU1 stay in its
cache, and so does the tp->rcv_nxt update on CPU2 side. The CPU1 will then
endup calling schedule and sleep forever if there are no more data on the
socket.

Calls to poll_wait in following modules were ommited:
net/bluetooth/af_bluetooth.c
net/irda/af_irda.c
net/irda/irnet/irnet_ppp.c
net/mac80211/rc80211_pid_debugfs.c
net/phonet/socket.c
net/rds/af_rds.c
net/rfkill/core.c
net/sunrpc/cache.c
net/sunrpc/rpc_pipe.c
net/tipc/socket.c

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
0ea920d211e0a870871965418923b08da2025b4a 17-Jun-2009 Hendrik Brueckner <brueckner@linux.vnet.ibm.com> af_iucv: Return -EAGAIN if iucv msg limit is exceeded

If the iucv message limit for a communication path is exceeded,
sendmsg() returns -EAGAIN instead of -EPIPE.
The calling application can then handle this error situtation,
e.g. to try again after waiting some time.

For blocking sockets, sendmsg() waits up to the socket timeout
before returning -EAGAIN. For the new wait condition, a macro
has been introduced and the iucv_sock_wait_state() has been
refactored to this macro.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bb664f49f8be17d7b8bf9821144e8a53d7fcfe8a 17-Jun-2009 Hendrik Brueckner <brueckner@linux.vnet.ibm.com> af_iucv: Change if condition in sendmsg() for more readability

Change the if condition to exit sendmsg() if the socket in not connected.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
c23cad923bfebd295ec49dc9265569993903488d 16-Jun-2009 Ursula Braun <ursula.braun@de.ibm.com> [S390] PM: af_iucv power management callbacks.

Patch establishes a dummy afiucv-device to make sure af_iucv is
notified as iucv-bus device about suspend/resume.

The PM freeze callback severs all iucv pathes of connected af_iucv sockets.
The PM thaw/restore callback switches the state of all previously connected
sockets to IUCV_DISCONN.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
d93fe1a144c1a4312972bedbefc2213aa8b88612 23-Apr-2009 Ursula Braun <ubraun@linux.vnet.ibm.com> af_iucv: Fix merge.

From: Ursula Braun <ubraun@linux.vnet.ibm.com>

net/iucv/af_iucv.c in net-next-2.6 is almost correct. 4 lines should
still be deleted. These are the remaining changes:

Signed-off-by: David S. Miller <davem@davemloft.net>
09488e2e0fab14ebe41135f0d066cfe2c56ba9e5 22-Apr-2009 Hendrik Brueckner <brueckner@linux.vnet.ibm.com> af_iucv: New socket option for setting IUCV MSGLIMITs

The SO_MSGLIMIT socket option modifies the message limit for new
IUCV communication paths.

The message limit specifies the maximum number of outstanding messages
that are allowed for connections. This setting can be lowered by z/VM
when an IUCV connection is established.

Expects an integer value in the range of 1 to 65535.
The default value is 65535.

The message limit must be set before calling connect() or listen()
for sockets.

If sockets are already connected or in state listen, changing the message
limit is not supported.
For reading the message limit value, unconnected sockets return the limit
that has been set or the default limit. For connected sockets, the actual
message limit is returned. The actual message limit is assigned by z/VM
for each connection and it depends on IUCV MSGLIMIT authorizations
specified for the z/VM guest virtual machine.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
802788bf90f78e7f248e78d4d0510bb00e976db8 22-Apr-2009 Hendrik Brueckner <brueckner@linux.vnet.ibm.com> af_iucv: cleanup and refactor recvmsg() EFAULT handling

If the skb cannot be copied to user iovec, always return -EFAULT.
The skb is enqueued again, except MSG_PEEK flag is set, to allow user space
applications to correct its iovec pointer.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
aa8e71f58ab8e01d63c33df40ff1bcb997c9df92 22-Apr-2009 Hendrik Brueckner <brueckner@linux.vnet.ibm.com> af_iucv: Provide new socket type SOCK_SEQPACKET

This patch provides the socket type SOCK_SEQPACKET in addition to
SOCK_STREAM.

AF_IUCV sockets of type SOCK_SEQPACKET supports an 1:1 mapping of
socket read or write operations to complete IUCV messages.
Socket data or IUCV message data is not fragmented as this is the
case for SOCK_STREAM sockets.

The intention is to help application developers who write
applications or device drivers using native IUCV interfaces
(Linux kernel or z/VM IUCV interfaces).

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
44b1e6b5f9a93cc2ba024e09cf137d5f1b5f8426 22-Apr-2009 Hendrik Brueckner <brueckner@linux.vnet.ibm.com> af_iucv: Modify iucv msg target class using control msghdr

Allow 'classification' of socket data that is sent or received over
an af_iucv socket. For classification of data, the target class of an
(native) iucv message is used.

This patch provides the cmsg interface for iucv_sock_recvmsg() and
iucv_sock_sendmsg(). Applications can use the msg_control field of
struct msghdr to set or get the target class as a
"socket control message" (SCM/CMSG).

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
b8942e3b6c4b35dda5e8ca75aec5e2f027fe39a9 22-Apr-2009 Hendrik Brueckner <brueckner@linux.vnet.ibm.com> af_iucv: Support data in IUCV msg parameter lists (IPRMDATA)

The patch allows to send and receive data in the parameter list of an
iucv message.
The parameter list is an arry of 8 bytes that are used by af_iucv as
follows:
0..6 7 bytes for socket data and
7 1 byte to store the data length.

Instead of storing the data length directly, the difference
between 0xFF and the data length is used.
This convention does not interfere with the existing use of PRM
messages for shutting down the send direction of an AF_IUCV socket
(shutdown() operation). Data lenghts greater than 7 (or PRM message
byte 8 is less than 0xF8) denotes to special messages.
Currently, the special SEND_SHUTDOWN message is supported only.

To use IPRM messages, both communicators must set the IUCV_IPRMDATA
flag during path negotiation, i.e. in iucv_connect() and
path_pending().

To be compatible to older af_iucv implementations, sending PRM
messages is controlled by the socket option SO_IPRMDATA_MSG.
Receiving PRM messages does not depend on the socket option (but
requires the IUCV_IPRMDATA path flag to be set).

Sending/Receiving data in the parameter list improves performance for
small amounts of data by reducing message_completion() interrupts and
memory copy operations.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9d5c5d8f4105dc56ec10864b195dd1714f282c22 22-Apr-2009 Hendrik Brueckner <brueckner@linux.vnet.ibm.com> af_iucv: add sockopt() to enable/disable use of IPRM_DATA msgs

Provide the socket operations getsocktopt() and setsockopt() to enable/disable
sending of data in the parameter list of IUCV messages.
The patch sets respective flag only.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
af88b52def76679c8c5bcdbed199fbe62b6a16d4 22-Apr-2009 Hendrik Brueckner <brueckner@linux.vnet.ibm.com> af_iucv: sync sk shutdown flag if iucv path is quiesced

If the af_iucv communication partner quiesces the path to shutdown its
receive direction, provide a quiesce callback implementation to shutdown
the (local) send direction. This ensures that both sides are synchronized.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3fa6b5adbe46b3d665267dee0f879858ab464f44 21-Apr-2009 Hendrik Brueckner <brueckner@linux.vnet.ibm.com> af_iucv: Fix race when queuing incoming iucv messages

AF_IUCV runs into a race when queuing incoming iucv messages
and receiving the resulting backlog.

If the Linux system is under pressure (high load or steal time),
the message queue grows up, but messages are not received and queued
onto the backlog queue. In that case, applications do not
receive any data with recvmsg() even if AF_IUCV puts incoming
messages onto the message queue.

The race can be avoided if the message queue spinlock in the
message_pending callback is spreaded across the entire callback
function.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
e14ad5fa8705fb354e72312479abbe420ebc3f8e 21-Apr-2009 Hendrik Brueckner <brueckner@linux.vnet.ibm.com> af_iucv: Test additional sk states in iucv_sock_shutdown

Add few more sk states in iucv_sock_shutdown().

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
fe86e54ef9465c97a16337d2a41a4cf486b937ae 21-Apr-2009 Hendrik Brueckner <brueckner@linux.vnet.ibm.com> af_iucv: Reject incoming msgs if RECV_SHUTDOWN is set

Reject incoming iucv messages if the receive direction has been shut down.
It avoids that the queue of outstanding messages increases and exceeds the
message limit of the iucv communication path.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
60d3705fcbfe7deca8e94bc7ddecd6f9f1a4647e 21-Apr-2009 Hendrik Brueckner <brueckner@linux.vnet.ibm.com> af_iucv: fix oops in iucv_sock_recvmsg() for MSG_PEEK flag

If iucv_sock_recvmsg() is called with MSG_PEEK flag set, the skb is enqueued
twice. If the socket is then closed, the pointer to the skb is freed twice.

Remove the skb_queue_head() call for MSG_PEEK, because the skb_recv_datagram()
function already handles MSG_PEEK (does not dequeue the skb).

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bbe188c8f16effd902d1ad391e06e41ce649b22e 21-Apr-2009 Ursula Braun <ursula.braun@de.ibm.com> af_iucv: consider state IUCV_CLOSING when closing a socket

Make sure a second invocation of iucv_sock_close() guarantees proper
freeing of an iucv path.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
47a30b26e58ab7e56e5654766fd678a4b90010e3 25-Feb-2009 Wei Yongjun <yjwei@cn.fujitsu.com> iucv: remove some pointless conditionals before kfree_skb()

Remove some pointless conditionals before kfree_skb().

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
65dbd7c2778f1921ef1ee2a73e47a2a126fed30f 06-Jan-2009 Hendrik Brueckner <brueckner@linux.vnet.ibm.com> af_iucv: Free iucv path/socket in path_pending callback

Free iucv path after iucv_path_sever() calls in iucv_callback_connreq()
(path_pending() iucv callback).
If iucv_path_accept() fails, free path and free/kill newly created socket.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18becbc5479f88d5adc218374ca62b8b93ec2545 06-Jan-2009 Ursula Braun <ursula.braun@de.ibm.com> af_iucv: avoid left over IUCV connections from failing connects

For certain types of AFIUCV socket connect failures IUCV connections
are left over. Add some cleanup-statements to avoid cluttered IUCV
connections.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
55cdea9ed9cf2d76993e40ed7a1fc649a14db07c 06-Jan-2009 Hendrik Brueckner <brueckner@linux.vnet.ibm.com> af_iucv: New error return codes for connect()

If the iucv_path_connect() call fails then return an error code that
corresponds to the iucv_path_connect() failure condition; instead of
returning -ECONNREFUSED for any failure.

This helps to improve error handling for user space applications
(e.g. inform the user that the z/VM guest is not authorized to
connect to other guest virtual machines).

The error return codes are based on those described in connect(2).

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8f7c502c267c0e5e2dbbbdea9f3e7e85bbc95694 25-Dec-2008 Ursula Braun <braunu@de.ibm.com> [S390] convert iucv printks to dev_xxx and pr_xxx macros.

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
c2b4afd2f99a187ec3bbd6e2def186fbfb755929 14-Jul-2008 Ursula Braun <braunu@de.ibm.com> [S390] Cleanup iucv printk messages.

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
469689a4dd476c1be6750deea5f59528a17b8b4a 10-Jun-2008 Ursula Braun <braunu@de.ibm.com> af_iucv: exploit target message class support of IUCV

The first 4 bytes of data to be sent are stored additionally into
the message class field of the send request. A receiving target
program (not an af_iucv socket program) can make use of this
information to pre-screen incoming messages.

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3db8ce35c37b62d27e7656fe5e7d2d2865002045 10-Apr-2008 Robert P. J. Day <rpjday@crashcourse.ca> af_iucv: Use non-deprecated __RW_LOCK_UNLOCKED macro.

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
f2a77991a918218be4a3ac78250e7eba2282be59 08-Feb-2008 Ursula Braun <braunu@de.ibm.com> [AF_IUCV]: defensive programming of iucv_callback_txdone

The loop in iucv_callback_txdone presumes existence of an entry
with msg->tag in the send_skb_q list. In error cases this
assumption might be wrong and might cause an endless loop.
Loop is rewritten to guarantee loop end in case of missing
msg->tag entry in send_skb_q.

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
d44447229e35115675d166b51a52e512c281475c 08-Feb-2008 Ursula Braun <braunu@de.ibm.com> [AF_IUCV]: broken send_skb_q results in endless loop

A race has been detected in iucv_callback_txdone().
skb_unlink has to be done inside the locked area.

In addition checkings for successful allocations are inserted.

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
b24b8a247ff65c01b252025926fe564209fae4fc 24-Jan-2008 Pavel Emelyanov <xemul@openvz.org> [NET]: Convert init_timer into setup_timer

Many-many code in the kernel initialized the timer->function
and timer->data together with calling init_timer(timer). There
is already a helper for this. Use it for networking code.

The patch is HUGE, but makes the code 130 lines shorter
(98 insertions(+), 228 deletions(-)).

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6257ff2177ff02d7f260a7a501876aa41cb9a9f6 01-Nov-2007 Pavel Emelyanov <xemul@openvz.org> [NET]: Forget the zero_it argument of sk_alloc()

Finally, the zero_it argument can be completely removed from
the callers and from the function prototype.

Besides, fix the checkpatch.pl warnings about using the
assignments inside if-s.

This patch is rather big, and it is a part of the previous one.
I splitted it wishing to make the patches more readable. Hope
this particular split helped.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
f0703c80e5156406ad947cb67fe277725b48080f 08-Oct-2007 Ursula Braun <braunu@de.ibm.com> [AF_IUCV]: postpone receival of iucv-packets

AF_IUCV socket programs may waste Linux storage, because af_iucv
allocates an skb whenever posted by the receive callback routine and
receives the message immediately.
Message receival is now postponed if data from previous callbacks has
not yet been transferred to the receiving socket program. Instead a
message handle is saved in a message queue as a reminder. Once
messages could be given to the receiving socket program, there is
an additional checking for entries in the message queue, followed
by skb allocation and message receival if applicable.

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
57f20448032158ad00b1e74f479515c689998be9 08-Oct-2007 Heiko Carstens <heiko.carstens@de.ibm.com> [AF_IUCV]: remove static declarations from header file.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1b8d7ae42d02e483ad94035cca851e4f7fbecb40 09-Oct-2007 Eric W. Biederman <ebiederm@xmission.com> [NET]: Make socket creation namespace safe.

This patch passes in the namespace a new socket should be created in
and has the socket code do the appropriate reference counting. By
virtue of this all socket create methods are touched. In addition
the socket create methods are modified so that they will fail if
you attempt to create a socket in a non-default network namespace.

Failing if we attempt to create a socket outside of the default
network namespace ensures that as we incrementally make the network stack
network namespace aware we will not export functionality that someone
has not audited and made certain is network namespace safe.
Allowing us to partially enable network namespaces before all of the
exotic protocols are supported.

Any protocol layers I have missed will fail to compile because I now
pass an extra parameter into the socket creation code.

[ Integrated AF_IUCV build fixes from Andrew Morton... -DaveM ]

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
febca281f677a775c61cd0572c2f35e4ead9e7d5 15-Jul-2007 Ursula Braun <braunu@de.ibm.com> [AF_IUCV]: Add lock when updating accept_q

The accept_queue of an af_iucv socket will be corrupted, if
adding and deleting of entries in this queue occurs at the
same time (connect request from one client, while accept call
is processed for another client).
Solution: add locking when updating accept_q

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Acked-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
af7cd373b01ccb8191dc16c77fff4cf2b11def50 05-May-2007 Heiko Carstens <heiko.carstens@de.ibm.com> [AF_IUCV]: Compile fix - adopt to skbuff changes.

From: Heiko Carstens <heiko.carstens@de.ibm.com>

CC [M] net/iucv/af_iucv.o
net/iucv/af_iucv.c: In function `iucv_fragment_skb':
net/iucv/af_iucv.c:984: error: structure has no member named `h'
net/iucv/af_iucv.c:985: error: structure has no member named `nh'
net/iucv/af_iucv.c:988: error: incompatible type for argument 1 of
`skb_queue_tail'

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
da99f0565477899f08b76ffcb32afbf6fa95d64a 04-May-2007 Heiko Carstens <heiko.carstens@de.ibm.com> [AF_IUCV/IUCV] : Add missing section annotations

Add missing section annotations and found and fixed some
Coding Style issues.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Frank Pavlic <fpavlic@de.ibm.com>
561e036006dc4078446815613781c6c33441dd3b 04-May-2007 Jennifer Hunt <jenhunt@us.ibm.com> [AF_IUCV]: Implementation of a skb backlog queue

With the inital implementation we missed to implement a skb backlog
queue . The result is that socket receive processing tossed packets.
Since AF_IUCV connections are working synchronously it leads to
connection hangs. Problems with read, close and select also occured.

Using a skb backlog queue is fixing all of these problems .

Signed-off-by: Jennifer Hunt <jenhunt@us.ibm.com>
Signed-off-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3ff50b7997fe06cd5d276b229967bb52d6b3b6c1 21-Apr-2007 Stephen Hemminger <shemminger@linux-foundation.org> [NET]: cleanup extra semicolons

Spring cleaning time...

There seems to be a lot of places in the network code that have
extra bogus semicolons after conditionals. Most commonly is a
bogus semicolon after: switch() { }

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
badff6d01a8589a1c828b0bf118903ca38627f4e 13-Mar-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF]: Introduce skb_reset_transport_header(skb)

For the common, open coded 'skb->h.raw = skb->data' operation, so that we can
later turn skb->h.raw into a offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.

This one touches just the most simple cases:

skb->h.raw = skb->data;
skb->h.raw = {skb_push|[__]skb_pull}()

The next ones will handle the slightly more "complex" cases.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
c1d2bbe1cd6c7bbdc6d532cefebb66c7efb789ce 11-Apr-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF]: Introduce skb_reset_network_header(skb)

For the common, open coded 'skb->nh.raw = skb->data' operation, so that we can
later turn skb->nh.raw into a offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.

This one touches just the most simple case, next will handle the slightly more
"complex" cases.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
eac3731bd04c7131478722a3c148b78774553116 08-Feb-2007 Jennifer Hunt <jenhunt@us.ibm.com> [S390]: Add AF_IUCV socket support

From: Jennifer Hunt <jenhunt@us.ibm.com>

This patch adds AF_IUCV socket support.

Signed-off-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>