History log of /net/sunrpc/svc_xprt.c
Revision Date Author Comments
ae89254da6879cffa6a17327e5f3f60217b718cf 20-Aug-2014 J. Bruce Fields <bfields@redhat.com> SUNRPC: Fix compile on non-x86

current_task appears to be x86-only, oops.

Let's just delete this check entirely:

Any developer that adds a new user without setting rq_task will get a
crash the first time they test it. I also don't think there are
normally any important locks held here, and I can't see any other reason
why killing a server thread would bring the whole box down.

So the effort to fail gracefully here looks like overkill.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 983c684466e0 "SUNRPC: get rid of the request wait queue"
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
0c0746d03eac70e12bcb39e7f1c7f0a1dd31123c 03-Aug-2014 Trond Myklebust <trond.myklebust@primarydata.com> SUNRPC: More optimisations of svc_xprt_enqueue()

Just move the transport locking out of the spin lock protected area
altogether.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
a4aa8054a60c545f100826271ac9f04c34bf828d 03-Aug-2014 Trond Myklebust <trond.myklebust@primarydata.com> SUNRPC: Fix broken kthread_should_stop test in svc_get_next_xprt

We should definitely not be exiting svc_get_next_xprt() with the
thread enqueued. Fix this by ensuring that we fall through to
the dequeue.
Also move the test itself outside the spin lock protected section.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
983c684466e02b21f83c025ea539deee6c0aeac0 03-Aug-2014 Trond Myklebust <trond.myklebust@primarydata.com> SUNRPC: get rid of the request wait queue

We're always _only_ waking up tasks from within the sp_threads list, so
we know that they are enqueued and alive. The rq_wait waitqueue is just
a distraction with extra atomic semantics.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
106f359cf4d613ebf54cb9f29721bb956fc3460e 03-Aug-2014 Trond Myklebust <trond.myklebust@primarydata.com> SUNRPC: Do not grab pool->sp_lock unnecessarily in svc_get_next_xprt

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
9e5b208dc9b2460f83f218ef6a6a1b1309fcd6b0 03-Aug-2014 Trond Myklebust <trond.myklebust@primarydata.com> SUNRPC: Do not override wspace tests in svc_handle_xprt

We already determined that there was enough wspace when we
called svc_xprt_enqueue.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
518776800c094a518ae6d303660b57f1400eb1eb 25-Jul-2014 Trond Myklebust <trond.myklebust@primarydata.com> SUNRPC: Allow svc_reserve() to notify TCP socket that space has been freed

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
0971374e2818eef6ebdbd7a37acf6ab7e98ac06c 25-Jul-2014 Trond Myklebust <trond.myklebust@primarydata.com> SUNRPC: Reduce contention in svc_xprt_enqueue()

Ensure that all calls to svc_xprt_enqueue() except svc_xprt_received()
check the value of XPT_BUSY, before attempting to grab spinlocks etc.
This is to avoid situations such as the following "perf" trace,
which shows heavy contention on the pool spinlock:

54.15% nfsd [kernel.kallsyms] [k] _raw_spin_lock_bh
|
--- _raw_spin_lock_bh
|
|--71.43%-- svc_xprt_enqueue
| |
| |--50.31%-- svc_reserve
| |
| |--31.35%-- svc_xprt_received
| |
| |--18.34%-- svc_tcp_data_ready
...

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2825a7f90753012babe7ee292f4a1eadd3706f92 26-Aug-2013 J. Bruce Fields <bfields@redhat.com> nfsd4: allow encoding across page boundaries

After this we can handle for example getattr of very large ACLs.

Read, readdir, readlink are still special cases with their own limits.

Also we can't handle a new operation starting close to the end of a
page.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
c789102c20bbbdda6831a273e046715be9d6af79 18-May-2014 Trond Myklebust <trond.myklebust@primarydata.com> SUNRPC: Fix a module reference leak in svc_handle_xprt

If the accept() call fails, we need to put the module reference.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
16e4d93f6de7063800f3f5e68f064b0ff8fae9b7 19-May-2014 Chuck Lever <chuck.lever@oracle.com> NFSD: Ignore client's source port on RDMA transports

An NFS/RDMA client's source port is meaningless for RDMA transports.
The transport layer typically sets the source port value on the
connection to a random ephemeral port.

Currently, NFS server administrators must specify the "insecure"
export option to enable clients to access exports via RDMA.

But this means NFS clients can access such an export via IP using an
ephemeral port, which may not be desirable.

This patch eliminates the need to specify the "insecure" export
option to allow NFS/RDMA clients access to an export.

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=250
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
45481201009e33a5e61c78fbe14ccc02b066742b 08-Feb-2014 Rashika Kheria <rashika.kheria@gmail.com> net: Mark functions as static in net/sunrpc/svc_xprt.c

Mark functions as static in net/sunrpc/svc_xprt.c because they are not
used outside this file.

This eliminates the following warning in net/sunrpc/svc_xprt.c:
net/sunrpc/svc_xprt.c:574:5: warning: no previous prototype for ‘svc_alloc_arg’ [-Wmissing-prototypes]
net/sunrpc/svc_xprt.c:615:18: warning: no previous prototype for ‘svc_get_next_xprt’ [-Wmissing-prototypes]
net/sunrpc/svc_xprt.c:694:6: warning: no previous prototype for ‘svc_add_new_temp_xprt’ [-Wmissing-prototypes]

Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
e1d83ee673eff5407255ba9e884312219f6832d7 09-Feb-2014 Rashika Kheria <rashika.kheria@gmail.com> net: Mark functions as static in net/sunrpc/svc_xprt.c

Mark functions as static in net/sunrpc/svc_xprt.c because they are not
used outside this file.

This eliminates the following warning in net/sunrpc/svc_xprt.c:
net/sunrpc/svc_xprt.c:574:5: warning: no previous prototype for ‘svc_alloc_arg’ [-Wmissing-prototypes]
net/sunrpc/svc_xprt.c:615:18: warning: no previous prototype for ‘svc_get_next_xprt’ [-Wmissing-prototypes]
net/sunrpc/svc_xprt.c:694:6: warning: no previous prototype for ‘svc_add_new_temp_xprt’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
cc630d9f476445927fca599f81182c7f06f79058 10-Feb-2013 J. Bruce Fields <bfields@redhat.com> svcrpc: fix rpc server shutdown races

Rewrite server shutdown to remove the assumption that there are no
longer any threads running (no longer true, for example, when shutting
down the service in one network namespace while it's still running in
others).

Do that by doing what we'd do in normal circumstances: just CLOSE each
socket, then enqueue it.

Since there may not be threads to handle the resulting queued xprts,
also run a simplified version of the svc_recv() loop run by a server to
clean up any closed xprts afterwards.

Cc: stable@kernel.org
Tested-by: Jason Tibbitts <tibbs@math.uh.edu>
Tested-by: Paweł Sikora <pawel.sikora@agmk.net>
Acked-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
e75bafbff2270993926abcc31358361db74a9bc2 10-Feb-2013 J. Bruce Fields <bfields@redhat.com> svcrpc: make svc_age_temp_xprts enqueue under sv_lock

svc_age_temp_xprts expires xprts in a two-step process: first it takes
the sv_lock and moves the xprts to expire off their server-wide list
(sv_tempsocks or sv_permsocks) to a local list. Then it drops the
sv_lock and enqueues and puts each one.

I see no reason for this: svc_xprt_enqueue() will take sp_lock, but the
sv_lock and sp_lock are not otherwise nested anywhere (and documentation
at the top of this file claims it's correct to nest these with sp_lock
inside.)

Cc: stable@kernel.org
Tested-by: Jason Tibbitts <tibbs@math.uh.edu>
Tested-by: Paweł Sikora <pawel.sikora@agmk.net>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
35525b79786b2ba58ef13822198ce22c497bc7a2 06-Jan-2013 Andriy Skulysh <andriy_skulysh@xyratex.com> sunrpc: Fix lockd sleeping until timeout

There is a race in enqueueing thread to a pool and
waking up a thread.
lockd doesn't wake up on reception of lock granted callback
if svc_wake_up() is called before lockd's thread is added
to a pool.

Signed-off-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
010472980724ed7bbabf818b4232bf3b182b94c5 23-Oct-2012 Weston Andros Adamson <dros@netapp.com> SUNRPC: remove BUG_ON in svc_delete_xprt

Replace BUG_ON() with WARN_ON_ONCE().

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
b25cd058f25ea2054351bbe501956002cd8ed4c5 23-Oct-2012 Weston Andros Adamson <dros@netapp.com> SUNRPC: remove BUG_ONs checking RPCSVC_MAXPAGES

Replace two bounds checking BUG_ON() calls with WARN_ON_ONCE() and resetting
the requested size to RPCSVC_MAXPAGES.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
ff1fdb9b805fc03fb51c7b061604360af92d0c9e 23-Oct-2012 Weston Andros Adamson <dros@netapp.com> SUNRPC: remove BUG_ON in svc_xprt_received

Replace BUG_ON() with a WARN_ON_ONCE() and early return.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
65b2e6656bda2ad983727fcc725ac66b6d5035a7 18-Aug-2012 J. Bruce Fields <bfields@redhat.com> svcrpc: split up svc_handle_xprt

Move initialization of newly accepted socket into a helper.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
6797fa5a018ff916a071c6265fbf043644abcd29 18-Aug-2012 J. Bruce Fields <bfields@redhat.com> svcrpc: break up svc_recv

Matter of taste, I suppose, but svc_recv breaks up naturally into:

allocate pages and setup arg
dequeue (wait for, if necessary) next socket
do something with that socket

And I find it easier to read when it doesn't go on for pages and pages.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
6741019c829ecfa6f7a504fae1305dcf5d5cf057 18-Aug-2012 J. Bruce Fields <bfields@redhat.com> svcrpc: make svc_xprt_received static

Note this isn't used outside svc_xprt.c.

May as well move it so we don't need a declaration while we're here.

Also remove an outdated comment.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
9f9d2ebe693a98d517257e1a39f61120b4473b96 18-Aug-2012 J. Bruce Fields <bfields@redhat.com> svcrpc: make xpo_recvfrom return only >=0

The only errors returned from xpo_recvfrom have been -EAGAIN and
-EAFNOSUPPORT. The latter was removed by a previous patch. That leaves
only -EAGAIN, which is treated just like 0 by the caller (svc_recv).

So, just ditch -EAGAIN and return 0 instead.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
39b553013719fe6495cf5e496b827b2d712e4265 14-Aug-2012 J. Bruce Fields <bfields@redhat.com> svcrpc: share some setup of listening sockets

There's some duplicate code here.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
c3341966943284ab3618a1814cefd693ad9aa736 14-Aug-2012 J. Bruce Fields <bfields@redhat.com> svcrpc: make svc_create_xprt enqueue on clearing XPT_BUSY

Whenever we clear XPT_BUSY we should call svc_xprt_enqueue(). Without
that we may fail to notice any events (such as new connections) that
arrived while XPT_BUSY was set.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
719f8bcc883e7992615f4d5625922e24995e2d98 13-Aug-2012 J. Bruce Fields <bfields@redhat.com> svcrpc: fix xpt_list traversal locking on shutdown

Server threads are not running at this point, but svc_age_temp_xprts
still may be, so we need this locking.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
d10f27a750312ed5638c876e4bd6aa83664cccd8 17-Aug-2012 J. Bruce Fields <bfields@redhat.com> svcrpc: fix svc_xprt_enqueue/svc_recv busy-looping

The rpc server tries to ensure that there will be room to send a reply
before it receives a request.

It does this by tracking, in xpt_reserved, an upper bound on the total
size of the replies that is has already committed to for the socket.

Currently it is adding in the estimate for a new reply *before* it
checks whether there is space available. If it finds that there is not
space, it then subtracts the estimate back out.

This may lead the subsequent svc_xprt_enqueue to decide that there is
space after all.

The results is a svc_recv() that will repeatedly return -EAGAIN, causing
server threads to loop without doing any actual work.

Cc: stable@vger.kernel.org
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Tested-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
f06f00a24d76e168ecb38d352126fd203937b601 20-Aug-2012 J. Bruce Fields <bfields@redhat.com> svcrpc: sends on closed socket should stop immediately

svc_tcp_sendto sets XPT_CLOSE if we fail to transmit the entire reply.
However, the XPT_CLOSE won't be acted on immediately. Meanwhile other
threads could send further replies before the socket is really shut
down. This can manifest as data corruption: for example, if a truncated
read reply is followed by another rpc reply, that second reply will look
to the client like further read data.

Symptoms were data corruption preceded by svc_tcp_sendto logging
something like

kernel: rpc-srv/tcp: nfsd: sent only 963696 when sending 1048708 bytes - shutting down socket

Cc: stable@vger.kernel.org
Reported-by: Malahal Naineni <malahal@us.ibm.com>
Tested-by: Malahal Naineni <malahal@us.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
3ddbe8794ff1bcba5af09f2e6949755d6251958f 16-May-2012 J. Bruce Fields <bfields@redhat.com> svcrpc: fix a comment typo

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
91c427ac3a61ccabae0fdef53563edf40394b6c9 04-May-2012 Jeff Layton <jlayton@redhat.com> sunrpc: do array overrun check in svc_recv before allocating pages

There's little point in waiting until after we allocate all of the pages
to see if we're going to overrun the array. In the event that this
calculation is really off we could end up scribbling over a bunch of
memory and make it tougher to debug.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
e87cc4728f0e2fb663e592a1141742b1d6c63256 13-May-2012 Joe Perches <joe@perches.com> net: Convert net_ratelimit uses to net_<level>_ratelimited

Standardize the net core ratelimited logging functions.

Coalesce formats, align arguments.
Change a printk then vprintk sequence to use printf extension %pV.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7b147f1ff267d12e0d189ca3d4156ed5a76b8d99 31-Jan-2012 Stanislav Kinsbursky <skinsbursky@parallels.com> SUNRPC: service destruction in network namespace context

v2: Added comment to BUG_ON's in svc_destroy() to make code looks clearer.

This patch introduces network namespace filter for service destruction
function.
Nothing special here - just do exactly the same operations, but only for
tranports in passed networks namespace context.
BTW, BUG_ON() checks for empty service transports lists were returned into
svc_destroy() function. This is because of swithing generic svc_close_all() to
networks namespace dependable svc_close_net().

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
3a22bf506c9df47e93e8dc8a68d86cd8ae384d98 31-Jan-2012 Stanislav Kinsbursky <skinsbursky@parallels.com> SUNRPC: clear svc transports lists helper introduced

This patch moves service transports deletion from service sockets lists to
separated function.
This is a precursor patch, which would be usefull with service shutdown in
network namespace context, introduced later in the series.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
6f5133652eaab6fbd88bdb1b1fd2236fd82583cb 31-Jan-2012 Stanislav Kinsbursky <skinsbursky@parallels.com> SUNRPC: clear svc pools lists helper introduced

This patch moves removing of service transport from it's pools ready lists to
separated function. Also this clear is now done with list_for_each_entry_safe()
helper.
This is a precursor patch, which would be usefull with service shutdown in
network namespace context, introduced later in the series.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
4cb54ca2069903121e4c03ec427147c47bed5755 20-Jan-2012 Stanislav Kinsbursky <skinsbursky@parallels.com> SUNRPC: search for service transports in network namespace context

Service transports are parametrized by network namespace. And thus lookup of
transport instance have to take network namespace into account.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>
dfd56b8b38fff3586f36232db58e1e9f7885a605 10-Dec-2011 Eric Dumazet <eric.dumazet@gmail.com> net: use IS_ENABLED(CONFIG_IPV6)

Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bd4620ddf6d6eb3d9e7d073ad601fa4299d46ba9 06-Dec-2011 Stanislav Kinsbursky <skinsbursky@parallels.com> SUNRPC: create svc_xprt in proper network namespace

This patch makes svc_xprt inherit network namespace link from its socket.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
b4f36f88b3ee7cf26bf0be84e6c7fc15f84dcb71 29-Nov-2011 J. Bruce Fields <bfields@redhat.com> svcrpc: avoid memory-corruption on pool shutdown

Socket callbacks use svc_xprt_enqueue() to add an xprt to a
pool->sp_sockets list. In normal operation a server thread will later
come along and take the xprt off that list. On shutdown, after all the
threads have exited, we instead manually walk the sv_tempsocks and
sv_permsocks lists to find all the xprt's and delete them.

So the sp_sockets lists don't really matter any more. As a result,
we've mostly just ignored them and hoped they would go away.

Which has gotten us into trouble; witness for example ebc63e531cc6
"svcrpc: fix list-corrupting race on nfsd shutdown", the result of Ben
Greear noticing that a still-running svc_xprt_enqueue() could re-add an
xprt to an sp_sockets list just before it was deleted. The fix was to
remove it from the list at the end of svc_delete_xprt(). But that only
made corruption less likely--I can see nothing that prevents a
svc_xprt_enqueue() from adding another xprt to the list at the same
moment that we're removing this xprt from the list. In fact, despite
the earlier xpo_detach(), I don't even see what guarantees that
svc_xprt_enqueue() couldn't still be running on this xprt.

So, instead, note that svc_xprt_enqueue() essentially does:
lock sp_lock
if XPT_BUSY unset
add to sp_sockets
unlock sp_lock

So, if we do:

set XPT_BUSY on every xprt.
Empty every sp_sockets list, under the sp_socks locks.

Then we're left knowing that the sp_sockets lists are all empty and will
stay that way, since any svc_xprt_enqueue() will check XPT_BUSY under
the sp_lock and see it set.

And *then* we can continue deleting the xprt's.

(Thanks to Jeff Layton for being correctly suspicious of this code....)

Cc: Ben Greear <greearb@candelatech.com>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2fefb8a09e7ed251ae8996e0c69066e74c5aa560 29-Nov-2011 J. Bruce Fields <bfields@redhat.com> svcrpc: destroy server sockets all at once

There's no reason I can see that we need to call sv_shutdown between
closing the two lists of sockets.

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
7710ec36b6f516e026f9e29e50e67d2547c2a79b 26-Nov-2011 J. Bruce Fields <bfields@redhat.com> svcrpc: make svc_delete_xprt static

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
3a9a231d977222eea36eae091df2c358e03ac839 27-May-2011 Paul Gortmaker <paul.gortmaker@windriver.com> net: Fix files explicitly needing to include module.h

With calls to modular infrastructure, these files really
needs the full module.h header. Call it out so some of the
cleanups of implicit and unrequired includes elsewhere can be
cleaned up.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
849a1cf13d4394d398d91752166e92e9ecd64f8d 30-Aug-2011 Mi Jinlong <mijinlong@cn.fujitsu.com> SUNRPC: Replace svc_addr_u by sockaddr_storage

For IPv6 local address, lockd can not callback to client for
missing scope id when binding address at inet6_bind:

324 if (addr_type & IPV6_ADDR_LINKLOCAL) {
325 if (addr_len >= sizeof(struct sockaddr_in6) &&
326 addr->sin6_scope_id) {
327 /* Override any existing binding, if another one
328 * is supplied by user.
329 */
330 sk->sk_bound_dev_if = addr->sin6_scope_id;
331 }
332
333 /* Binding to link-local address requires an interface */
334 if (!sk->sk_bound_dev_if) {
335 err = -EINVAL;
336 goto out_unlock;
337 }

Replacing svc_addr_u by sockaddr_storage, let rqstp->rq_daddr contains more info
besides address.

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
ebc63e531cc6a457595dd110b07ac530eae788c3 29-Jun-2011 J. Bruce Fields <bfields@redhat.com> svcrpc: fix list-corrupting race on nfsd shutdown

After commit 3262c816a3d7fb1eaabce633caa317887ed549ae "[PATCH] knfsd:
split svc_serv into pools", svc_delete_xprt (then svc_delete_socket) no
longer removed its xpt_ready (then sk_ready) field from whatever list it
was on, noting that there was no point since the whole list was about to
be destroyed anyway.

That was mostly true, but forgot that a few svc_xprt_enqueue()'s might
still be hanging around playing with the about-to-be-destroyed list, and
could get themselves into trouble writing to freed memory if we left
this xprt on the list after freeing it.

(This is actually functionally identical to a patch made first by Ben
Greear, but with more comments.)

Cc: stable@kernel.org
Cc: gnb@fmeh.org
Reported-by: Ben Greear <greearb@candelatech.com>
Tested-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
99de8ea962bbc11a51ad4c52e3dc93bee5f6ba70 08-Dec-2010 J. Bruce Fields <bfields@redhat.com> rpc: keep backchannel xprt as long as server connection

Multiple backchannels can share the same tcp connection; from rfc 5661 section
2.10.3.1:

A connection's association with a session is not exclusive. A
connection associated with the channel(s) of one session may be
simultaneously associated with the channel(s) of other sessions
including sessions associated with other client IDs.

However, multiple backchannels share a connection, they must all share
the same xid stream (hence the same rpc_xprt); the only way we have to
match replies with calls at the rpc layer is using the xid.

So, keep the rpc_xprt around as long as the connection lasts, in case
we're asked to use the connection as a backchannel again.

Requests to create new backchannel clients over a given server
connection should results in creating new clients that reuse the
existing rpc_xprt.

But to start, just reject attempts to associate multiple rpc_xprt's with
the same underlying bc_xprt.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
9e701c610923aaeac8b38b9202a686d1cc9ee35d 03-Jan-2011 J. Bruce Fields <bfields@redhat.com> svcrpc: simpler request dropping

Currently we use -EAGAIN returns to determine when to drop a deferred
request. On its own, that is error-prone, as it makes us treat -EAGAIN
returns from other functions specially to prevent inadvertent dropping.

So, use a flag on the request instead.

Returning an error on request deferral is still required, to prevent
further processing, but we no longer need worry that an error return on
its own could result in a drop.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
7c96aef75949a56ec427fc6a2522dace2af33605 15-Nov-2010 NeilBrown <neilb@suse.de> sunrpc: remove xpt_pool

The xpt_pool field is only used for reporting BUGs.
And it isn't used correctly.

In particular, when it is cleared in svc_xprt_received before
XPT_BUSY is cleared, there is no guarantee that either the
compiler or the CPU might not re-order to two assignments, just
setting xpt_pool to NULL after XPT_BUSY is cleared.

If a different cpu were running svc_xprt_enqueue at this moment,
it might see XPT_BUSY clear and then xpt_pool non-NULL, and
so BUG.

This could be fixed by calling
smp_mb__before_clear_bit()
before the clear_bit. However as xpt_pool isn't really used,
it seems safest to simply remove xpt_pool.

Another alternate would be to change the clear_bit to
clear_bit_unlock, and the test_and_set_bit to test_and_set_bit_lock.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
ed2849d3ecfa339435818eeff28f6c3424300cec 16-Nov-2010 NeilBrown <neilb@suse.de> sunrpc: prevent use-after-free on clearing XPT_BUSY

When an xprt is created, it has a refcount of 1, and XPT_BUSY is set.
The refcount is *not* owned by the thread that created the xprt
(as is clear from the fact that creators never put the reference).
Rather, it is owned by the absence of XPT_DEAD. Once XPT_DEAD is set,
(And XPT_BUSY is clear) that initial reference is dropped and the xprt
can be freed.

So when a creator clears XPT_BUSY it is dropping its only reference and
so must not touch the xprt again.

However svc_recv, after calling ->xpo_accept (and so getting an XPT_BUSY
reference on a new xprt), calls svc_xprt_recieved. This clears
XPT_BUSY and then svc_xprt_enqueue - this last without owning a reference.
This is dangerous and has been seen to leave svc_xprt_enqueue working
with an xprt containing garbage.

So we need to hold an extra counted reference over that call to
svc_xprt_received.

For safety, any time we clear XPT_BUSY and then use the xprt again, we
first get a reference, and the put it again afterwards.

Note that svc_close_all does not need this extra protection as there are
no threads running, and the final free can only be called asynchronously
from such a thread.

Signed-off-by: NeilBrown <neilb@suse.de>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
9c335c0b8daf56b9f73479d00b1dd726e1fcca09 26-Oct-2010 J. Bruce Fields <bfields@redhat.com> svcrpc: fix wspace-checking race

We call svc_xprt_enqueue() after something happens which we think may
require handling from a server thread. To avoid such events being lost,
svc_xprt_enqueue() must guarantee that there will be a svc_serv() call
from a server thread following any such event. It does that by either
waking up a server thread itself, or checking that XPT_BUSY is set (in
which case somebody else is doing it).

But the check of XPT_BUSY could occur just as someone finishes
processing some other event, and just before they clear XPT_BUSY.

Therefore it's important not to clear XPT_BUSY without subsequently
doing another svc_export_enqueue() to check whether the xprt should be
requeued.

The xpo_wspace() check in svc_xprt_enqueue() breaks this rule, allowing
an event to be missed in situations like:

data arrives
call svc_tcp_data_ready():
call svc_xprt_enqueue():
set BUSY
find no write space
svc_reserve():
free up write space
call svc_enqueue():
test BUSY
clear BUSY

So, instead, check wspace in the same places that the state flags are
checked: before taking BUSY, and in svc_receive().

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
b176331627fccc726d28f4fc4a357d1f3c19dbf0 26-Oct-2010 J. Bruce Fields <bfields@redhat.com> svcrpc: svc_close_xprt comment

Neil Brown had to explain to me why we do this here; record the answer
for posterity.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
f8c0d226fef05226ff1a85055c8ed663022f40c1 26-Oct-2010 J. Bruce Fields <bfields@redhat.com> svcrpc: simplify svc_close_all

There's no need to be fooling with XPT_BUSY now that all the threads
are gone.

The list_del_init() here could execute at the same time as the
svc_xprt_enqueue()'s list_add_tail(), with undefined results. We don't
really care at this point, but it might result in a spurious
list-corruption warning or something.

And svc_close() isn't adding any value; just call svc_delete_xprt()
directly.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
ca7896cd83456082b1e78816cdf7e41658ef7bcd 25-Oct-2010 J. Bruce Fields <bfields@redhat.com> nfsd4: centralize more calls to svc_xprt_received

Follow up on b48fa6b99100dc7772af3cd276035fcec9719ceb by moving all the
svc_xprt_received() calls for the main xprt to one place. The clearing
of XPT_BUSY here is critical to the correctness of the server, so I'd
prefer it to be obvious where we do it.

The only substantive result is moving svc_xprt_received() after
svc_receive_deferred(). Other than a (likely insignificant) delay
waking up the next thread, that should be harmless.

Also reshuffle the exit code a little to skip a few other steps that we
don't care about the in the svc_delete_xprt() case.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
62bac4af3d778f6d06d351c0442008967c512588 25-Oct-2010 J. Bruce Fields <bfields@redhat.com> svcrpc: don't set then immediately clear XPT_DEFERRED

There's no harm to doing this, since the only caller will immediately
call svc_enqueue() afterwards, ensuring we don't miss the remaining
deferred requests just because XPT_DEFERRED was briefly cleared.

But why not just do this the simple way?

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
451a3c24b0135bce54542009b5fde43846c7cf67 17-Nov-2010 Arnd Bergmann <arnd@arndb.de> BKL: remove extraneous #include <smp_lock.h>

The big kernel lock has been removed from all these files at some point,
leaving only the #include.

Remove this too as a cleanup.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
01dba075d571f5a8b7dcb153fdfd14e981c4cee3 23-Oct-2010 J. Bruce Fields <bfields@redhat.com> svcrpc: no need for XPT_DEAD check in svc_xprt_enqueue

If any xprt marked DEAD is also left BUSY for the rest of its life, then
the XPT_DEAD check here is superfluous--we'll get the same result from
the XPT_BUSY check just after.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
ac9303eb74471bc2567960b47497a8bfbe1e5a03 23-Oct-2010 J. Bruce Fields <bfields@redhat.com> svcrpc: assume svc_delete_xprt() called only once

As long as DEAD exports are left BUSY, and svc_delete_xprt is called
only with BUSY held, then svc_delete_xprt() will never be called on an
xprt that is already DEAD.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
7e4fdd0744fcb9f08854c37643bf529c5945cc36 23-Oct-2010 J. Bruce Fields <bfields@redhat.com> svcrpc: never clear XPT_BUSY on dead xprt

Once an xprt has been deleted, there's no reason to allow it to be
enqueued--at worst, that might cause the xprt to be re-added to some
global list, resulting in later corruption.

Also, note this leaves us with no need for the reference-count
manipulation here.

Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
8f3a6de313391b6910aa7db185eb9f3e930a51cf 05-Oct-2010 Pavel Emelyanov <xemul@parallels.com> sunrpc: Turn list_for_each-s into the ..._entry-s

Saves some lines of code and some branticks when reading one.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
edc7a894034acb4c7ff8305716ca5df8aaf8e642 22-Mar-2010 J. Bruce Fields <bfields@citi.umich.edu> nfsd: provide callbacks on svc_xprt deletion

NFSv4.1 needs warning when a client tcp connection goes down, if that
connection is being used as a backchannel, so that it can warn the
client that it has lost the backchannel connection.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
62832c039eab9d03cd28a66427ce8276988f28b0 29-Sep-2010 Pavel Emelyanov <xemul@parallels.com> sunrpc: Pull net argument downto svc_create_socket

After this the socket creation in it knows the context.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
fc5d00b04a3a58cac8620403dfe9f43f72578ec1 29-Sep-2010 Pavel Emelyanov <xemul@parallels.com> sunrpc: Add net argument to svc_create_xprt

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
4fb8518bdac8e85f6580ea3f586adf396cd472bc 27-Sep-2010 Pavel Emelyanov <xemul@parallels.com> sunrpc: Tag svc_xprt with net

The transport representation should be per-net of course.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
e3bfca01c1ad378deaee598292bcc7ee19024563 27-Sep-2010 Pavel Emelyanov <xemul@parallels.com> sunrpc: Make xprt auth cache release work with the xprt

This is done in order to facilitate getting the ip_map_cache from
which to put the ip_map.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
6610f720e9e8103c22d1f1ccf8fbb695550a571f 26-Aug-2010 J. Bruce Fields <bfields@redhat.com> svcrpc: minor cache cleanup

Pull out some code into helper functions, fix a typo.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
f16b6e8d838b2e2bb4561201311c66ac02ad67df 12-Aug-2010 NeilBrown <neilb@suse.de> sunrpc/cache: allow threads to block while waiting for cache update.

The current practice of waiting for cache updates by queueing the
whole request to be retried has (at least) two problems.

1/ With NFSv4, requests can be quite complex and re-trying a whole
request when a later part fails should only be a last-resort, not a
normal practice.

2/ Large requests, and in particular any 'write' request, will not be
queued by the current code and doing so would be undesirable.

In many cases only a very sort wait is needed before the cache gets
valid data.

So, providing the underlying transport permits it by setting
->thread_wait,
arrange to wait briefly for an upcall to be completed (as reflected in
the clearing of CACHE_PENDING).
If the short wait was not long enough and CACHE_PENDING is still set,
fall back on the old approach.

The 'thread_wait' value is set to 5 seconds when there are spare
threads, and 1 second when there are no spare threads.

These values are probably much higher than needed, but will ensure
some forward progress.

Note that as we only request an update for a non-valid item, and as
non-valid items are updated in place it is extremely unlikely that
cache_check will return -ETIMEDOUT. Normally cache_defer_req will
sleep for a short while and then find that the item is_valid.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
b48fa6b99100dc7772af3cd276035fcec9719ceb 01-Mar-2010 Neil Brown <neilb@suse.de> sunrpc: centralise most calls to svc_xprt_received

svc_xprt_received must be called when ->xpo_recvfrom has finished
receiving a message, so that the XPT_BUSY flag will be cleared and
if necessary, requeued for further work.

This call is currently made in each ->xpo_recvfrom function, often
from multiple different points. In each case it is the earliest point
on a particular path where it is known that the protection provided by
XPT_BUSY is no longer needed.

However there are (still) some error paths which do not call
svc_xprt_received, and requiring each ->xpo_recvfrom to make the call
does not encourage robustness.

So: move the svc_xprt_received call to be made just after the
call to ->xpo_recvfrom(), and move it of the various ->xpo_recvfrom
methods.

This means that it may not be called at the earliest possible instant,
but this is unlikely to be a measurable performance issue.

Note that there are still other calls to svc_xprt_received as it is
also needed when an xprt is newly created.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
5a0e3ad6af8660be21ca98a971cd00f331318c05 24-Mar-2010 Tejun Heo <tj@kernel.org> include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
788e69e548cc8d127b90f0de1f7b7e983d1d587a 30-Mar-2010 J. Bruce Fields <bfields@citi.umich.edu> svcrpc: don't hold sv_lock over svc_xprt_put()

svc_xprt_put() can call tcp_close(), which can sleep, so we shouldn't be
holding this lock.

In fact, only the xpt_list removal and the sv_tmpcnt decrement should
need the sv_lock here.

Reported-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
1b644b6e6f6160ae35ce4b52c2ca89ed3e356e18 28-Feb-2010 J. Bruce Fields <bfields@citi.umich.edu> Revert "sunrpc: move the close processing after do recvfrom method"

This reverts commit b0401d725334a94d57335790b8ac2404144748ee, which
moved svc_delete_xprt() outside of XPT_BUSY, and allowed it to be called
after svc_xpt_recived(), removing its last reference and destroying it
after it had already been queued for future processing.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
f5822754ea006563e1bf0a1f43faaad49c0d8bb2 28-Feb-2010 J. Bruce Fields <bfields@citi.umich.edu> Revert "sunrpc: fix peername failed on closed listener"

This reverts commit b292cf9ce70d221c3f04ff62db5ab13d9a249ca8. The
commit that it attempted to patch up,
b0401d725334a94d57335790b8ac2404144748ee, was fundamentally wrong, and
will also be reverted.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
ab1b18f70a007ea6caeb007d269abb75b131a410 26-Feb-2010 Neil Brown <neilb@suse.de> sunrpc: remove unnecessary svc_xprt_put

The 'struct svc_deferred_req's on the xpt_deferred queue do not
own a reference to the owning xprt. This is seen in svc_revisit
which is where things are added to this queue. dr->xprt is set to
NULL and the reference to the xprt it put.

So when this list is cleaned up in svc_delete_xprt, we mustn't
put the reference.

Also, replace the 'for' with a 'while' which is arguably
simpler and more likely to compile efficiently.

Cc: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
68717908155a9dcd4161f4d730fea478712d9794 26-Jan-2010 Chuck Lever <chuck.lever@oracle.com> SUNRPC: NFS kernel APIs shouldn't return ENOENT for "transport not found"

write_ports() converts svc_create_xprt()'s ENOENT error return to
EPROTONOSUPPORT so that rpc.nfsd (in user space) can report an error
message that makes sense.

It turns out that several of the other kernel APIs rpc.nfsd use can
also return ENOENT from svc_create_xprt(), by way of lockd_up().

On the client side, an NFSv2 or NFSv3 mount request can also return
the result of lockd_up(). This error may also be returned during an
NFSv4 mount request, since the NFSv4 callback service uses
svc_create_xprt() to create the callback listener. An ENOENT error
return results in a confusing error message from the mount command.

Let's have svc_create_xprt() return EPROTONOSUPPORT instead of ENOENT.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
d6783b2b6c4050df0ba0a84c6842cf5bc2212ef9 26-Jan-2010 Chuck Lever <chuck.lever@oracle.com> SUNRPC: Bury "#ifdef IPV6" in svc_create_xprt()

Clean up: Bruce observed we have more or less common logic in each of
svc_create_xprt()'s callers: the check to create an IPv6 RPC listener
socket only if CONFIG_IPV6 is set. I'm about to add another case
that does just the same.

If we move the ifdefs into __svc_xpo_create(), then svc_create_xprt()
call sites can get rid of the "#ifdef" ugliness, and can use the same
logic with or without IPv6 support available in the kernel.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
b292cf9ce70d221c3f04ff62db5ab13d9a249ca8 31-Dec-2009 Xiaotian Feng <dfeng@redhat.com> sunrpc: fix peername failed on closed listener

There're some warnings of "nfsd: peername failed (err 107)!"
socket error -107 means Transport endpoint is not connected.
This warning message was outputed by svc_tcp_accept() [net/sunrpc/svcsock.c],
when kernel_getpeername returns -107. This means socket might be CLOSED.

And svc_tcp_accept was called by svc_recv() [net/sunrpc/svc_xprt.c]

if (test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
<snip>
newxpt = xprt->xpt_ops->xpo_accept(xprt);
<snip>

So this might happen when xprt->xpt_flags has both XPT_LISTENER and XPT_CLOSE.

Let's take a look at commit b0401d72, this commit has moved the close
processing after do recvfrom method, but this commit also introduces this
warnings, if the xpt_flags has both XPT_LISTENER and XPT_CLOSED, we should
close it, not accpet then close.

Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
f64f9e719261a87818dd192a3a2352e5b20fbd0f 30-Nov-2009 Joe Perches <joe@perches.com> net: Move && and || to end of previous line

Not including net/atm/

Compiled tested x86 allyesconfig only
Added a > 80 column line or two, which I ignored.
Existing checkpatch plaints willfully, cheerfully ignored.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
78c210efdefe07131f91ed512a3308b15bb14e2f 06-Aug-2009 J. Bruce Fields <bfields@citi.umich.edu> Revert "knfsd: avoid overloading the CPU scheduler with enormous load averages"

This reverts commit 59a252ff8c0f2fa32c896f69d56ae33e641ce7ad.

This helps in an entirely cached workload but not necessarily in
workloads that require waiting on disk.

Conflicts:

include/linux/sunrpc/svc.h
net/sunrpc/svc_xprt.c

Reported-by: Simon Kirby <sim@hostway.ca>
Tested-by: Jesper Krogh <jesper@krogh.cc>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
4cfc7e6019caa3e97d2a81c48c8d575d7b38d751 10-Sep-2009 Rahul Iyer <iyer@netapp.com> nfsd41: sunrpc: Added rpc server-side backchannel handling

When the call direction is a reply, copy the xid and call direction into the
req->rq_private_buf.head[0].iov_base otherwise rpc_verify_header returns
rpc_garbage.

Signed-off-by: Rahul Iyer <iyer@netapp.com>
Signed-off-by: Mike Sager <sager@netapp.com>
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[get rid of CONFIG_NFSD_V4_1]
[sunrpc: refactoring of svc_tcp_recvfrom]
[nfsd41: sunrpc: create common send routine for the fore and the back channels]
[nfsd41: sunrpc: Use free_page() to free server backchannel pages]
[nfsd41: sunrpc: Document server backchannel locking]
[nfsd41: sunrpc: remove bc_connect_worker()]
[nfsd41: sunrpc: Define xprt_server_backchannel()[
[nfsd41: sunrpc: remove bc_close and bc_init_auto_disconnect dummy functions]
[nfsd41: sunrpc: eliminate unneeded switch statement in xs_setup_tcp()]
[nfsd41: sunrpc: Don't auto close the server backchannel connection]
[nfsd41: sunrpc: Remove unused functions]
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: change bc_sock to bc_xprt]
[nfsd41: sunrpc: move struct rpc_buffer def into a common header file]
[nfsd41: sunrpc: use rpc_sleep in bc_send_request so not to block on mutex]
[removed cosmetic changes]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[sunrpc: add new xprt class for nfsv4.1 backchannel]
[sunrpc: v2.1 change handling of auto_close and init_auto_disconnect operations for the nfsv4.1 backchannel]
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
[reverted more cosmetic leftovers]
[got rid of xprt_server_backchannel]
[separated "nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel"]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Cc: Trond Myklebust <trond.myklebust@netapp.com>
[sunrpc: change idle timeout value for the backchannel]
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Acked-by: Trond Myklebust <trond.myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
b0401d725334a94d57335790b8ac2404144748ee 27-Aug-2009 Wei Yongjun <yjwei@cn.fujitsu.com> sunrpc: move the close processing after do recvfrom method

sunrpc: "Move close processing to a single place"
(d7979ae4a050a45b78af51832475001b68263d2a) moved the
close processing before the recvfrom method. This may
cause the close processing never to execute. So this
patch moves it to the right place.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
ed2d8aed52212610d4cb79be3cbf535b04be38dc 15-Aug-2009 Ryusei Yamaguchi <mandel59@gmail.com> knfsd: Replace lock_kernel with a mutex in nfsd pool stats.

lock_kernel() in knfsd was replaced with a mutex. The later
commit 03cf6c9f49a8fea953d38648d016e3f46e814991 ("knfsd:
add file to export stats about nfsd pools") did not follow
that change. This patch fixes the issue.

Also move the get and put of nfsd_serv to the open and close methods
(instead of start and stop methods) to allow atomic check and increment
of reference count in the open method (where we can still return an
error).

Signed-off-by: Ryusei Yamaguchi <mandel59@gmail.com>
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: Greg Banks <gnb@fmeh.org>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
405f55712dfe464b3240d7816cc4fe4174831be2 11-Jul-2009 Alexey Dobriyan <adobriyan@gmail.com> headers: smp_lock.h redux

* Remove smp_lock.h from files which don't need it (including some headers!)
* Add smp_lock.h to files which do need it
* Make smp_lock.h include conditional in hardirq.h
It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT

This will make hardirq.h inclusion cheaper for every PREEMPT=n config
(which includes allmodconfig/allyesconfig, BTW)

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
335c54bdc4d3bacdbd619ec95cd0b352435bd37f 24-Apr-2009 Chuck Lever <chuck.lever@oracle.com> NFSD: Prevent a buffer overflow in svc_xprt_names()

The svc_xprt_names() function can overflow its buffer if it's so near
the end of the passed in buffer that the "name too long" string still
doesn't fit. Of course, it could never tell if it was near the end
of the passed in buffer, since its only caller passes in zero as the
buffer length.

Let's make this API a little safer.

Change svc_xprt_names() so it *always* checks for a buffer overflow,
and change its only caller to pass in the correct buffer length.

If svc_xprt_names() does overflow its buffer, it now fails with an
ENAMETOOLONG errno, instead of trying to write a message at the end
of the buffer. I don't like this much, but I can't figure out a clean
way that's always safe to return some of the names, *and* an
indication that the buffer was not long enough.

The displayed error when doing a 'cat /proc/fs/nfsd/portlist' is
"File name too long".

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
dcf1a3573eae69937fb14462369c4d3e6f4a37f1 23-Apr-2009 H Hartley Sweeten <hartleys@visionengravers.com> net/sunrpc/svc_xprt.c: fix sparse warnings

Fix the following sparse warnings in net/sunrpc/svc_xprt.c.

warning: symbol 'svc_recv' was not declared. Should it be static?
warning: symbol 'svc_drop' was not declared. Should it be static?
warning: symbol 'svc_send' was not declared. Should it be static?
warning: symbol 'svc_close_all' was not declared. Should it be static?

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2f425878b6a71571341dcd3f9e9d1a6f6355da9c 03-Apr-2009 Andy Adamson <andros@netapp.com> nfsd: don't use the deferral service, return NFS4ERR_DELAY

On an NFSv4.1 server cache miss that causes an upcall, NFS4ERR_DELAY will be
returned. It is up to the NFSv4.1 client to resend only the operations that
have not been processed.

Initialize rq_usedeferral to 1 in svc_process(). It sill be turned off in
nfsd4_proc_compound() only when NFSv4.1 Sessions are used.

Note: this isn't an adequate solution on its own. It's acceptable as a way
to get some minimal 4.1 up and working, but we're going to have to find a
way to avoid returning DELAY in all common cases before 4.1 can really be
considered ready.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: reverse rq_nodeferral negative logic]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[sunrpc: initialize rq_usedeferral]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
9652ada3fb5914a67d8422114e8a76388330fa79 19-Mar-2009 Chuck Lever <chuck.lever@oracle.com> SUNRPC: Change svc_create_xprt() to take a @family argument

The sv_family field is going away. Pass a protocol family argument to
svc_create_xprt() instead of extracting the family from the passed-in
svc_serv struct.

Again, as this is a listener socket and not an address, we make this
new argument an "int" protocol family, instead of an "sa_family_t."

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
156e62094a74cf43f02f56ef96b6cda567501357 19-Mar-2009 Chuck Lever <chuck.lever@oracle.com> SUNRPC: Clean up svc_find_xprt() calling sequence

Clean up: add documentating comment and use appropriate data types for
svc_find_xprt()'s arguments.

This also eliminates a mixed sign comparison: @port was an int, while
the return value of svc_xprt_local_port() is an unsigned short.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
03cf6c9f49a8fea953d38648d016e3f46e814991 13-Jan-2009 Greg Banks <gnb@sgi.com> knfsd: add file to export stats about nfsd pools

Add /proc/fs/nfsd/pool_stats to export to userspace various
statistics about the operation of rpc server thread pools.

This patch is based on a forward-ported version of
knfsd-add-pool-thread-stats which has been shipping in the SGI
"Enhanced NFS" product since 2006 and which was previously
posted:

http://article.gmane.org/gmane.linux.nfs/10375

It has also been updated thus:

* moved EXPORT_SYMBOL() to near the function it exports
* made the new struct struct seq_operations const
* used SEQ_START_TOKEN instead of ((void *)1)
* merged fix from SGI PV 990526 "sunrpc: use dprintk instead of
printk in svc_pool_stats_*()" by Harshula Jayasuriya.
* merged fix from SGI PV 964001 "Crash reading pool_stats before
nfsds are started".

Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: Harshula Jayasuriya <harshula@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
59a252ff8c0f2fa32c896f69d56ae33e641ce7ad 13-Jan-2009 Greg Banks <gnb@sgi.com> knfsd: avoid overloading the CPU scheduler with enormous load averages

Avoid overloading the CPU scheduler with enormous load averages
when handling high call-rate NFS loads. When the knfsd bottom half
is made aware of an incoming call by the socket layer, it tries to
choose an nfsd thread and wake it up. As long as there are idle
threads, one will be woken up.

If there are lot of nfsd threads (a sensible configuration when
the server is disk-bound or is running an HSM), there will be many
more nfsd threads than CPUs to run them. Under a high call-rate
low service-time workload, the result is that almost every nfsd is
runnable, but only a handful are actually able to run. This situation
causes two significant problems:

1. The CPU scheduler takes over 10% of each CPU, which is robbing
the nfsd threads of valuable CPU time.

2. At a high enough load, the nfsd threads starve userspace threads
of CPU time, to the point where daemons like portmap and rpc.mountd
do not schedule for tens of seconds at a time. Clients attempting
to mount an NFS filesystem timeout at the very first step (opening
a TCP connection to portmap) because portmap cannot wake up from
select() and call accept() in time.

Disclaimer: these effects were observed on a SLES9 kernel, modern
kernels' schedulers may behave more gracefully.

The solution is simple: keep in each svc_pool a counter of the number
of threads which have been woken but have not yet run, and do not wake
any more if that count reaches an arbitrary small threshold.

Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients, each with 16
synthetic client threads simulating an rsync (i.e. recursive directory
listing) workload reading from an i386 RH9 install image (161480
regular files in 10841 directories) on the server. That tree is small
enough to fill in the server's RAM so no disk traffic was involved.
This setup gives a sustained call rate in excess of 60000 calls/sec
before being CPU-bound on the server. The server was running 128 nfsds.

Profiling showed schedule() taking 6.7% of every CPU, and __wake_up()
taking 5.2%. This patch drops those contributions to 3.0% and 2.2%.
Load average was over 120 before the patch, and 20.9 after.

This patch is a forward-ported version of knfsd-avoid-nfsd-overload
which has been shipping in the SGI "Enhanced NFS" product since 2006.
It has been posted before:

http://article.gmane.org/gmane.linux.nfs/10374

Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
24c3767e41a6a59d32bb45abe899eb194e6bf1b8 23-Dec-2008 Trond Myklebust <Trond.Myklebust@netapp.com> SUNRPC: The sunrpc server code should not be used by out-of-tree modules

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
22945e4a1c7454c97f5d8aee1ef526c83fef3223 05-Jan-2009 Tom Tucker <tom@opengridcomputing.com> svc: Clean up deferred requests on transport destruction

A race between svc_revisit and svc_delete_xprt can result in
deferred requests holding references on a transport that can never be
recovered because dead transports are not enqueued for subsequent
processing.

Check for XPT_DEAD in revisit to clean up completing deferrals on a dead
transport and sweep a transport's deferred queue to do the same for queued
but unprocessed deferrals.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2779e3ae39645515cb6c1126634f47c28c9e7190 05-Jan-2009 Tom Tucker <tom@opengridcomputing.com> svc: Move kfree of deferral record to common code

The rqstp structure has a pointer to a svc_deferred_req record
that is allocated when requests are deferred. This record is common
to all transports and can be freed in common code.

Move the kfree of the rq_deferred to the common svc_xprt_release
function.

This also fixes a memory leak in the RDMA transport which does not
kfree the dr structure in it's version of the xpo_release_rqst callback.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
c9233eb7b0b11ef176d4bf68da2ce85464b6ec39 20-Oct-2008 Jeff Layton <jlayton@redhat.com> sunrpc: add sv_maxconn field to svc_serv (try #3)

svc_check_conn_limits() attempts to prevent denial of service attacks
by having the service close old connections once it reaches a
threshold. This threshold is based on the number of threads in the
service:

(serv->sv_nrthreads + 3) * 20

Once we reach this, we drop the oldest connections and a printk pops
to warn the admin that they should increase the number of threads.

Increasing the number of threads isn't an option however for services
like lockd. We don't want to eliminate this check entirely for such
services but we need some way to increase this limit.

This patch adds a sv_maxconn field to the svc_serv struct. When it's
set to 0, we use the current method to calculate the max number of
connections. RPC services can then set this on an as-needed basis.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
5dd248f6f1ffe1f691fd66749e2a3dc8f8eb7b5e 01-Jul-2008 Chuck Lever <chuck.lever@oracle.com> SUNRPC: Use proper INADDR_ANY when setting up RPC services on IPv6

Teach svc_create_xprt() to use the correct ANY address for AF_INET6 based
RPC services.

No caller uses AF_INET6 yet.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
aa3314c8d6da673b3454549eed45547a79f7cbe1 25-Apr-2008 Tom Tucker <tom@opengridcomputing.com> svc: Remove unused header files from svc_xprt.c

This cosmetic patch removes unused header files that svc_xprt.c
inherited from svcsock.c

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
fc63a050861a53ba99a6222229cda555796d669e 25-Apr-2008 Tom Tucker <tom@opengridcomputing.com> svc: Remove extra check for XPT_DEAD bit in svc_xprt_enqueue

Remove a redundant check for the XPT_DEAD bit in the svc_xprt_enqueue
function. This same bit is checked below while holding the pool lock
and prints a debug message if found to be dead.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
7b54fe61ffd5bfa4e50d371a2415225aa0cbb38e 12-Feb-2008 Jeff Layton <jlayton@redhat.com> SUNRPC: allow svc_recv to break out of 500ms sleep when alloc_page fails

svc_recv() calls alloc_page(), and if it fails it does a 500ms
uninterruptible sleep and then reattempts. There doesn't seem to be any
real reason for this to be uninterruptible, so change it to an
interruptible sleep. Also check for kthread_stop() and signalled() after
setting the task state to avoid races that might lead to sleeping after
kthread_stop() wakes up the task.

I've done some very basic smoke testing with this, but obviously it's
hard to test the actual changes since this all depends on an
alloc_page() call failing.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
7086721f9c8b59331e164e534f588e075cfd9d3f 07-Feb-2008 Jeff Layton <jlayton@redhat.com> SUNRPC: have svc_recv() check kthread_should_stop()

When using kthreads that call into svc_recv, we want to make sure that
they do not block there for a long time when we're trying to take down
the kthread.

This patch changes svc_recv() to check kthread_should_stop() at the same
places that it checks to see if it's signalled(). Also check just before
svc_recv() tries to schedule(). By making sure that we check it just
after setting the task state we can avoid having to use any locking or
signalling to ensure it doesn't block for a long time.

There's still a chance of a 500ms sleep if alloc_page() fails, but
that should be a rare occurrence and isn't a terribly long time in
the context of a kthread being taken down.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
e6f1cebf71c4e7aae7dfa43414ce2631291def9f 18-Mar-2008 Al Viro <viro@zeniv.linux.org.uk> [NET] endianness noise: INADDR_ANY

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
d2f7e79e3bad31b3d52c405085b9e01e5f6c01e0 14-Jul-2007 Trond Myklebust <Trond.Myklebust@netapp.com> SUNRPC: Move exported symbol definitions after function declaration part 2

Do it for the server code...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
9571af18fa1e4a431dc6f6023ddbd87d1112fd5d 31-Dec-2007 Tom Tucker <tom@opengridcomputing.com> svc: Add svc_xprt_names service to replace svc_sock_names

Create a transport independent version of the svc_sock_names function.

The toclose capability of the svc_sock_names service can be implemented
using the svc_xprt_find and svc_xprt_close services.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
a217813f9067b785241cb7f31956e51d2071703a 31-Dec-2007 Tom Tucker <tom@opengridcomputing.com> knfsd: Support adding transports by writing portlist file

Update the write handler for the portlist file to allow creating new
listening endpoints on a transport. The general form of the string is:

<transport_name><space><port number>

For example:

echo "tcp 2049" > /proc/fs/nfsd/portlist

This is intended to support the creation of a listening endpoint for
RDMA transports without adding #ifdef code to the nfssvc.c file.

Transports can also be removed as follows:

'-'<transport_name><space><port number>

For example:

echo "-tcp 2049" > /proc/fs/nfsd/portlist

Attempting to add a listener with an invalid transport string results
in EPROTONOSUPPORT and a perror string of "Protocol not supported".

Attempting to remove an non-existent listener (.e.g. bad proto or port)
results in ENOTCONN and a perror string of
"Transport endpoint is not connected"

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
7fcb98d58cb4d18af6386f71025fc5192f25fbca 31-Dec-2007 Tom Tucker <tom@opengridcomputing.com> svc: Add svc API that queries for a transport instance

Add a new svc function that allows a service to query whether a
transport instance has already been created. This is used in lockd
to determine whether or not a transport needs to be created when
a lockd instance is brought up.

Specifying 0 for the address family or port is effectively a wild-card,
and will result in matching the first transport in the service's list
that has a matching class name.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
dc9a16e49dbba3dd042e6aec5d9a7929e099a89b 31-Dec-2007 Tom Tucker <tom@opengridcomputing.com> svc: Add /proc/sys/sunrpc/transport files

Add a file that when read lists the set of registered svc
transports.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
260c1d1298f6703d38fdccd3dd5a310766327340 31-Dec-2007 Tom Tucker <tom@opengridcomputing.com> svc: Add transport hdr size for defer/revisit

Some transports have a header in front of the RPC header. The current
defer/revisit processing considers only the iov_len and arg_len to
determine how much to back up when saving the original request
to revisit. Add a field to the rqstp structure to save the size
of the transport header so svc_defer can correctly compute
the start of a request.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
0f0257eaa5d29b80f6ab2c40ed21aa65bb4527f6 31-Dec-2007 Tom Tucker <tom@opengridcomputing.com> svc: Move the xprt independent code to the svc_xprt.c file

This functionally trivial patch moves all of the transport independent
functions from the svcsock.c file to the transport independent svc_xprt.c
file.

In addition the following formatting changes were made:
- White space cleanup
- Function signatures on single line
- The inline directive was removed
- Lines over 80 columns were reformatted
- The term 'socket' was changed to 'transport' in comments
- The SMP comment was moved and updated.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
4e5caaa5f24b3df1fe01097e1e7576461e70d491 31-Dec-2007 Tom Tucker <tom@opengridcomputing.com> svc: Move create logic to common code

Move the svc transport list logic into common transport creation code.
Refactor this code path to make the flow of control easier to read.

Move the setting and clearing of the BUSY_BIT during transport creation
to common code.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
9dbc240f199c16c3c0859c255ad52a663d8ee51d 31-Dec-2007 Tom Tucker <tom@opengridcomputing.com> svc: Move the sockaddr information to svc_xprt

This patch moves the transport sockaddr to the svc_xprt
structure. Convenience functions are added to set and
get the local and remote addresses of a transport from
the transport provider as well as determine the length
of a sockaddr.

A transport is responsible for setting the xpt_local
and xpt_remote addresses in the svc_xprt structure as
part of transport creation and xpo_accept processing. This
cannot be done in a generic way and in fact varies
between TCP, UDP and RDMA. A set of xpo_ functions
(e.g. getlocalname, getremotename) could have been
added but this would have resulted in additional
caching and copying of the addresses around. Note that
the xpt_local address should also be set on listening
endpoints; for TCP/RDMA this is done as part of
endpoint creation.

For connected transports like TCP and RDMA, the addresses
never change and can be set once and copied into the
rqstp structure for each request. For UDP, however, the
local and remote addresses may change for each request. In
this case, the address information is obtained from the
UDP recvmsg info and copied into the rqstp structure from
there.

A svc_xprt_local_port function was also added that returns
the local port given a transport. This is used by
svc_create_xprt when returning the port associated with
a newly created transport, and later when creating a
generic find transport service to check if a service is
already listening on a given port.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
8c7b0172a1db8120d25ecb4eff69664c52ee7639 31-Dec-2007 Tom Tucker <tom@opengridcomputing.com> svc: Make deferral processing xprt independent

This patch moves the transport independent sk_deferred list to the svc_xprt
structure and updates the svc_deferred_req structure to keep pointers to
svc_xprt's directly. The deferral processing code is also moved out of the
transport dependent recvfrom functions and into the generic svc_recv path.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
def13d7401e9b95bbd34c20057ebeb2972708b1b 31-Dec-2007 Tom Tucker <tom@opengridcomputing.com> svc: Move the authinfo cache to svc_xprt.

Move the authinfo cache to svc_xprt. This allows both the TCP and RDMA
transports to share this logic. A flag bit is used to determine if
auth information is to be cached or not. Previously, this code looked
at the transport protocol.

I've also changed the spin_lock/unlock logic so that a lock is not taken for
transports that are not caching auth info.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
a50fea26b9d2aa7b66fdd6d9579de10827ec086a 31-Dec-2007 Tom Tucker <tom@opengridcomputing.com> svc: Make svc_send transport neutral

Move the sk_mutex field to the transport independent svc_xprt structure.
Now all the fields that svc_send touches are transport neutral. Change the
svc_send function to use the transport independent svc_xprt directly instead
of the transport dependent svc_sock structure.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
7a18208383ab3f3ce4a1f4e0536acc9372523d81 31-Dec-2007 Tom Tucker <tom@opengridcomputing.com> svc: Make close transport independent

Move sk_list and sk_ready to svc_xprt. This involves close because these
lists are walked by svcs when closing all their transports. So I combined
the moving of these lists to svc_xprt with making close transport independent.

The svc_force_sock_close has been changed to svc_close_all and takes a list
as an argument. This removes some svc internals knowledge from the svcs.

This code races with module removal and transport addition.

Thanks to Simon Holm Thøgersen for a compile fix.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Simon Holm Thøgersen <odie@cs.aau.dk>
bb5cf160b282644c4491afbf76fbc66f5dc35030 31-Dec-2007 Tom Tucker <tom@opengridcomputing.com> svc: Move sk_server and sk_pool to svc_xprt

This is another incremental change that moves transport independent
fields from svc_sock to the svc_xprt structure. The changes
should be functionally null.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
e1b3157f9710622bad6c7747d3b08ed3d2394cf6 31-Dec-2007 Tom Tucker <tom@opengridcomputing.com> svc: Change sk_inuse to a kref

Change the atomic_t reference count to a kref and move it to the
transport indepenent svc_xprt structure. Change the reference count
wrapper names to be generic.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
b700cbb11fced2a0e953fdd19eac07ffaad86598 31-Dec-2007 Tom Tucker <tom@opengridcomputing.com> svc: Add a generic transport svc_create_xprt function

The svc_create_xprt function is a transport independent version
of the svc_makesock function.

Since transport instance creation contains transport dependent and
independent components, add an xpo_create transport function. The
transport implementation of this function allocates the memory for the
endpoint, implements the transport dependent initialization logic, and
calls svc_xprt_init to initialize the transport independent field (svc_xprt)
in it's data structure.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
1d8206b97a09e7ff2fbef17d8d1ea008d764eeaa 31-Dec-2007 Tom Tucker <tom@opengridcomputing.com> svc: Add an svc transport class

The transport class (svc_xprt_class) represents a type of transport, e.g.
udp, tcp, rdma. A transport class has a unique name and a set of transport
operations kept in the svc_xprt_ops structure.

A transport class can be dynamically registered and unregisterd. The
svc_xprt_class represents the module that implements the transport
type and keeps reference counts on the module to avoid unloading while
there are active users.

The endpoint (svc_xprt) is a generic, transport independent endpoint that can
be used to send and receive data for an RPC service. It inherits it's
operations from the transport class.

A transport driver module registers and unregisters itself with svc sunrpc
by calling svc_reg_xprt_class, and svc_unreg_xprt_class respectively.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>