History log of /fs/nfsd/nfs4xdr.c
Revision Date Author Comments
15b23ef5d348ea51c5e7573e2ef4116fbc7cb099 24-Sep-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: fix corruption of NFSv4 read data

The calculation of page_ptr here is wrong in the case the read doesn't
start at an offset that is a multiple of a page.

The result is that nfs4svc_encode_compoundres sets rq_next_page to a
value one too small, and then the loop in svc_free_res_pages may
incorrectly fail to clear a page pointer in rq_respages[].

Pages left in rq_respages[] are available for the next rpc request to
use, so xdr data may be written to that page, which may hold data still
waiting to be transmitted to the client or data in the page cache.

The observed result was silent data corruption seen on an NFSv4 client.
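
As a rough illustration of the off-by-one described above (a userspace
sketch, not the kernel code; the 4096-byte page size and all sketch_*/
SKETCH_* names are assumptions made for the example), the number of pages
a read touches depends on where in the first page the data begins:

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size, for illustration only */

/* Pages touched if the data is (wrongly) assumed to start on a page boundary. */
static inline size_t sketch_pages_ignoring_offset(size_t len)
{
	return (len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Pages touched when the data starts `base` bytes into the first page. */
static inline size_t sketch_pages_with_offset(size_t base, size_t len)
{
	assert(base < SKETCH_PAGE_SIZE);
	return (base + len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

With these helpers, a 4096-byte read starting 100 bytes into a page spans
two pages, while the offset-blind calculation reports one, which is the
"one too small" value mentioned above.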

We tag this as "fixing" 05638dc73af2 because that commit exposed this
bug, though the incorrect calculation predates it.

Particular thanks to Andrea Arcangeli and David Gilbert for analysis and
testing.

Fixes: 05638dc73af2 "nfsd4: simplify server xdr->next_page use"
Cc: stable@vger.kernel.org
Reported-by: Andrea Arcangeli <aarcange@redhat.com>
Tested-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
24bab491220faa446d945624086d838af41d616c 26-Sep-2014 Anna Schumaker <Anna.Schumaker@netapp.com> NFSD: Implement SEEK

This patch adds server support for the NFS v4.2 operation SEEK, which
returns the position of the next hole or data segment in a file.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
87a15a8090c0e5284c0e53528d9defa5d9237866 26-Sep-2014 Anna Schumaker <Anna.Schumaker@netapp.com> NFSD: Add generic v4.2 infrastructure

It's cleaner to introduce everything at once and have the server reply
with "not supported" than it would be to introduce extra operations when
implementing a specific one in the middle of the list.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
aee3776441461c14ba6d8ed9e2149933e65abb6e 20-Aug-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: fix rd_dircount enforcement

Commit 3b299709091b "nfsd4: enforce rd_dircount" totally misunderstood
rd_dircount; it refers to total non-attribute bytes returned, not number
of directory entries returned.

Bring the code into agreement with RFC 3530 section 14.2.24.
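
To make the bytes-versus-entries distinction concrete, here is a small
userspace sketch. The per-entry cost used (an 8-byte cookie plus the name
rounded up to 4 bytes) is an assumption for illustration and may not match
the kernel's exact accounting; all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed per-entry cost: an 8-byte cookie plus the name padded to a
 * 4-byte boundary. Attribute bytes are deliberately not counted. */
static inline size_t sketch_entry_cost(size_t namelen)
{
	return 8 + ((namelen + 3) & ~(size_t)3);
}

/* How many of the `n` names fit within a byte budget of `dircount`. */
static inline size_t sketch_entries_within(const size_t *namelens, size_t n,
					   size_t dircount)
{
	size_t used = 0, i;

	assert(namelens != NULL || n == 0);
	for (i = 0; i < n; i++) {
		size_t cost = sketch_entry_cost(namelens[i]);

		if (cost > dircount - used)   /* used <= dircount always holds */
			break;
		used += cost;
	}
	return i;
}
```

Under this model a dircount of 3 admits no entries at all, whereas reading
it as an entry count would have admitted three.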

Cc: stable@vger.kernel.org
Fixes: 3b299709091b "nfsd4: enforce rd_dircount"
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
f7b43d0c992c3ec3e8d9285c3fb5e1e0eb0d031a 12-Aug-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: reserve adequate space for LOCK op

As of 8c7424cff6 "nfsd4: don't try to encode conflicting owner if low
on space", we permit the server to process a LOCK operation even if
there might not be space to return the conflicting lockowner, because
we've made returning the conflicting lockowner optional.

However, the rpc server still wants to know the most we might possibly
return, so we need to take into account the possible conflicting
lockowner in the svc_reserve_space() call here.

Symptoms were log messages like "RPC request reserved 88 but used 108".

Fixes: 8c7424cff6 "nfsd4: don't try to encode conflicting owner if low on space"
Reported-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
1383bf37ce2554d7632f21ee03f3ea815edaf933 11-Aug-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: remove obsolete comment

We do what Neil suggests now.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
58fb12e6a42f30adf209f8f41385a3bbb2c82420 30-Jul-2014 Jeff Layton <jlayton@primarydata.com> nfsd: Add a mutex to protect the NFSv4.0 open owner replay cache

We don't want to rely on the client_mutex for protection in the case of
NFSv4 open owners. Instead, we add a mutex that will only be taken for
NFSv4.0 state mutating operations, and that will be released once the
entire compound is done.

Also, ensure that nfsd4_cstate_assign_replay/nfsd4_cstate_clear_replay
take a reference to the stateowner when they are using it for NFSv4.0
open and lock replay caching.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
f98bac5a30b60a2fca854dd5ee7256221d8ccf0a 07-Jul-2014 Kinglong Mee <kinglongmee@gmail.com> NFSD: Fix crash encoding lock reply on 32-bit

Commit 8c7424cff6 "nfsd4: don't try to encode conflicting owner if low
on space" forgot to free conf->data in nfsd4_encode_lockt and before
setting conf->data to NULL in nfsd4_encode_lock_denied, causing a leak.

Worse, kfree() can be called on an uninitialized pointer in the case of
a successful lock (or one that fails for a reason other than a conflict).

(Note that lock->lk_denied.ld_owner.data appears it should be zero here,
until you notice that it's one arm of a union the other arm of which is
written to in the successful case by the

memcpy(&lock->lk_resp_stateid, &lock_stp->st_stid.sc_stateid,
sizeof(stateid_t));

in nfsd4_lock(). In the 32-bit case this overwrites ld_owner.data.)
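
The union overlap can be reproduced in a userspace sketch. The struct
layouts below are hypothetical stand-ins chosen so that a 4-byte pointer
lands inside the first 16 bytes; they are not copied from the kernel
headers, and the pointer is modelled as a uint32_t so the 32-bit layout is
the same on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 16-byte stateid. */
struct sketch_stateid {
	uint8_t bytes[16];
};

/* Hypothetical "denied" arm as it might be laid out on a 32-bit build. */
struct sketch_denied32 {
	uint64_t clientid;
	uint32_t owner_len;
	uint32_t owner_data;   /* stands in for a 32-bit `u8 *data` */
};

union sketch_lock_result {
	struct sketch_stateid ok;       /* written on success */
	struct sketch_denied32 denied;  /* written on conflict */
};

/* Returns nonzero if writing the success arm clobbers owner_data. */
static inline int sketch_success_clobbers_owner(void)
{
	union sketch_lock_result r;
	struct sketch_stateid sid;

	memset(&r, 0, sizeof(r));
	memset(&sid, 0xAB, sizeof(sid));
	memcpy(&r.ok, &sid, sizeof(sid));   /* the "successful lock" path */
	return r.denied.owner_data != 0;
}
```

Because owner_data sits at offset 12 in this layout, the 16-byte copy
leaves it nonzero, so a later free of that "pointer" would act on garbage.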

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Fixes: 8c7424cff6 "nfsd4: don't try to encode conflicting owner if low on space"
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
5d6031ca742f9f07b9c9d9322538619f3bd155ac 17-Jul-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: zero op arguments beyond the 8th compound op

The first 8 ops of the compound are zeroed since they're a part of the
argument that's zeroed by the

memset(rqstp->rq_argp, 0, procp->pc_argsize);

in svc_process_common(). But we handle larger compounds by allocating
the memory on the fly in nfsd4_decode_compound(). Other than code
recently fixed by 01529e3f8179 "NFSD: Fix memory leak in encoding denied
lock", I don't know of any examples of code depending on this
initialization. But it definitely seems possible, and I'd rather be
safe.
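
A userspace sketch of the idea (the kernel patch itself is not reproduced
here; the struct and function names are hypothetical): the inline array is
covered by the memset of the enclosing argument struct, so the array
allocated on the fly has to be zeroed explicitly, here by using calloc.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_INLINE_OPS 8

struct sketch_op {
	int opnum;
	void *scratch;
};

struct sketch_args {
	struct sketch_op inline_ops[SKETCH_INLINE_OPS];
	struct sketch_op *ops;   /* points at inline_ops or a heap array */
	unsigned int opcnt;
};

/* Returns 0 on success, -1 on allocation failure. */
static inline int sketch_setup_ops(struct sketch_args *args, unsigned int opcnt)
{
	assert(args != NULL);
	memset(args, 0, sizeof(*args));   /* zeroes the inline ops only */
	args->opcnt = opcnt;
	args->ops = args->inline_ops;
	if (opcnt > SKETCH_INLINE_OPS) {
		/* calloc, unlike malloc, also zeroes the larger array */
		args->ops = calloc(opcnt, sizeof(*args->ops));
		if (!args->ops)
			return -1;
	}
	return 0;
}

static inline void sketch_release_ops(struct sketch_args *args)
{
	if (args->ops != args->inline_ops)
		free(args->ops);
	args->ops = NULL;
}
```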

Compounds this long are unusual so I'm much more worried about failure
in these poorly tested cases than about an insignificant performance hit.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
d5d5c304b13bc3cade13b8a1b5833c8b3a0975f1 09-Jul-2014 Kinglong Mee <kinglongmee@gmail.com> NFSD: Fix bad checking of space for padding in splice read

Note that the caller has already reserved space for count and eof, so
xdr->p has already moved past them, only the padding remains.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Fixes: dc97618ddd "nfsd4: separate splice and readv cases"
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
01529e3f817908b394221b0a5d985ae3541641cc 07-Jul-2014 Kinglong Mee <kinglongmee@gmail.com> NFSD: Fix memory leak in encoding denied lock

Commit 8c7424cff6 (nfsd4: don't try to encode conflicting owner if low on space)
forgot to free conf->data in nfsd4_encode_lockt and before setting conf->data to NULL
in nfsd4_encode_lock_denied.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
b607664ee74313c7f3f657a044eda572051e560e 30-Jun-2014 Trond Myklebust <trond.myklebust@primarydata.com> nfsd: Cleanup nfs4svc_encode_compoundres

Move the slot return, put session etc into a helper in fs/nfsd/nfs4state.c
instead of open coding in nfs4svc_encode_compoundres.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
1055414fe19db2db6c8947c0b9ee9c8fe07beea1 29-Jun-2014 Kinglong Mee <kinglongmee@gmail.com> NFSD: Avoid warning message when compile at i686 arch

fs/nfsd/nfs4xdr.c: In function 'nfsd4_encode_readv':
>> fs/nfsd/nfs4xdr.c:3137:148: warning: comparison of distinct pointer types lacks a cast [enabled by default]
thislen = min(len, ((void *)xdr->end - (void *)xdr->p));
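
One common way to silence this kind of warning is to force both operands
to a single type before comparing. The sketch below uses a userspace
stand-in for the kernel's min_t() (sketch_min_t is a hypothetical name, and
unlike the kernel macro it evaluates its arguments more than once); it is
not necessarily the exact change made by this commit.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for min_t(): cast both sides to one type first. */
#define sketch_min_t(type, a, b) \
	((type)(a) < (type)(b) ? (type)(a) : (type)(b))

/* Bytes that can be copied now: the smaller of `len` and the room left. */
static inline long sketch_thislen(long len, const char *p, const char *end)
{
	assert(p <= end);
	/* end - p is a ptrdiff_t, typically int on i686 and long on x86_64 */
	return sketch_min_t(long, len, end - p);
}
```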

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
d5e2338324102dcf34aa25aeaf96064cc4d94dce 24-Jun-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: replace defer_free by svcxdr_tmpalloc

Avoid an extra allocation for the tmpbuf struct itself, and stop
ignoring some allocation failures.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
bcaab953b1d3790c724a211f2452b574fd49a7ce 24-Jun-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: remove nfs4_acl_new

This is a not-that-useful kmalloc wrapper. And I'd like one of the
callers to actually use something other than kmalloc.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
29c353b3fe54789706c0a37560ce4548a6362c2c 24-Jun-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: define svcxdr_dupstr to share some common code

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
ce043ac826f3ad224142f84d860316a5fd05f79c 24-Jun-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: remove unused defer_free argument

28e05dd8457c "knfsd: nfsd4: represent nfsv4 acl with array instead of
linked list" removed the last user that wanted a custom free function.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
7fb84306f55d6cc32ea894d47cbb2faa18c8f45b 24-Jun-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: rename cr_linkname->cr_data

The name of a link is currently stored in cr_name and cr_namelen, and
the content in cr_linkname and cr_linklen. That's confusing.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
b829e9197ad3d8b86dbd5dc1d9bbc5508d214cec 19-Jun-2014 J. Bruce Fields <bfields@redhat.com> nfsd: fix rare symlink decoding bug

An NFS operation that creates a new symlink includes the symlink data,
which is xdr-encoded as a length followed by the data plus 0 to 3 bytes
of zero-padding as required to reach a 4-byte boundary.

The vfs, on the other hand, wants null-terminated data.

The simple way to handle this would be by copying the data into a newly
allocated buffer with space for the final null.

The current nfsd_symlink code tries to be more clever by skipping that
step in the (likely) case where the byte following the string is already
0.

But that assumes that the byte following the string is ours to look at.
In fact, it might be the first byte of a page that we can't read, or of
some object that another task might modify.

Worse, the NFSv4 code tries to fix the problem by actually writing to
that byte.

In the NFSv2/v3 cases this actually appears to be safe:

- nfs3svc_decode_symlinkargs explicitly null-terminates the data
(after first checking its length and copying it to a new
page).
- NFSv2 limits symlinks to 1k. The buffer holding the rpc
request is always at least a page, and the link data (and
previous fields) have maximum lengths that prevent the request
from reaching the end of a page.

In the NFSv4 case the CREATE op is potentially just one part of a long
compound so can end up on the end of a page if you're unlucky.

The minimal fix here is to copy and null-terminate in the NFSv4 case.
The nfsd_symlink() interface here seems too fragile, though. It should
really either do the copy itself every time or just require a
null-terminated string.
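
The "copy and null-terminate" approach can be sketched in userspace as
follows (sketch_dup_linkdata is a hypothetical helper, not the kernel
function): the byte after the source data is never read or written.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy `len` bytes of link data into a fresh buffer and null-terminate
 * the copy, without touching the byte that follows the source data.
 * The caller frees the result. Returns NULL on allocation failure. */
static inline char *sketch_dup_linkdata(const char *data, size_t len)
{
	char *copy;

	assert(data != NULL || len == 0);
	copy = malloc(len + 1);
	if (!copy)
		return NULL;
	if (len)
		memcpy(copy, data, len);
	copy[len] = '\0';
	return copy;
}
```

The cost is one extra allocation per symlink creation, in exchange for not
depending on who owns the memory just past the decoded string.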

Reported-by: Jeff Layton <jlayton@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
c3a4561796cffae6996264876ffca147b5c3709a 06-Jul-2014 Kinglong Mee <kinglongmee@gmail.com> nfsd: Fix bad reserving space for encoding rdattr_error

Introduced by commit 561f0ed498 (nfsd4: allow large readdirs).

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
69bbd9c7b99974f3a701d4de6ef7010c37182a47 26-Jun-2014 Avi Kivity <avi@cloudius-systems.com> nfs: fix nfs4d readlink truncated packet

XDR requires 4-byte alignment; nfs4d READLINK reply writes out the padding,
but truncates the packet to the padding-less size.

Fix by taking the padding into consideration when truncating the packet.
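
A minimal sketch of the arithmetic involved (helper names are
hypothetical): the truncation point has to be computed from the padded
length, not the raw one.

```c
#include <assert.h>
#include <stddef.h>

/* Round a byte count up to the next multiple of 4, as XDR requires. */
static inline size_t sketch_xdr_align(size_t len)
{
	return (len + 3) & ~(size_t)3;
}

/* Offset at which to truncate a reply whose opaque data of `len` bytes
 * begins at `data_start`; the padding bytes must be kept. */
static inline size_t sketch_truncate_point(size_t data_start, size_t len)
{
	assert(data_start % 4 == 0);
	return data_start + sketch_xdr_align(len);
}
```

For a 6-byte link target like the one in the listing below, the data
occupies 8 bytes on the wire, so truncating after 6 cuts off the padding.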

Symptoms:

# ll /mnt/
ls: cannot read symbolic link /mnt/test: Input/output error
total 4
-rw-r--r--. 1 root root 0 Jun 14 01:21 123456
lrwxrwxrwx. 1 root root 6 Jul 2 03:33 test
drwxr-xr-x. 1 root root 0 Jul 2 23:50 tmp
drwxr-xr-x. 1 root root 60 Jul 2 23:44 tree

Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Fixes: 476a7b1f4b2c (nfsd4: don't treat readlink like a zero-copy operation)
Reviewed-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
76f47128f9b33af1e96819746550d789054c9664 19-Jun-2014 J. Bruce Fields <bfields@redhat.com> nfsd: fix rare symlink decoding bug

An NFS operation that creates a new symlink includes the symlink data,
which is xdr-encoded as a length followed by the data plus 0 to 3 bytes
of zero-padding as required to reach a 4-byte boundary.

The vfs, on the other hand, wants null-terminated data.

The simple way to handle this would be by copying the data into a newly
allocated buffer with space for the final null.

The current nfsd_symlink code tries to be more clever by skipping that
step in the (likely) case where the byte following the string is already
0.

But that assumes that the byte following the string is ours to look at.
In fact, it might be the first byte of a page that we can't read, or of
some object that another task might modify.

Worse, the NFSv4 code tries to fix the problem by actually writing to
that byte.

In the NFSv2/v3 cases this actually appears to be safe:

- nfs3svc_decode_symlinkargs explicitly null-terminates the data
(after first checking its length and copying it to a new
page).
- NFSv2 limits symlinks to 1k. The buffer holding the rpc
request is always at least a page, and the link data (and
previous fields) have maximum lengths that prevent the request
from reaching the end of a page.

In the NFSv4 case the CREATE op is potentially just one part of a long
compound so can end up on the end of a page if you're unlucky.

The minimal fix here is to copy and null-terminate in the NFSv4 case.
The nfsd_symlink() interface here seems too fragile, though. It should
really either do the copy itself every time or just require a
null-terminated string.

Reported-by: Jeff Layton <jlayton@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
3c7aa15d2073d81e56e8ba8771a4ab6f23be7ae2 10-Jun-2014 Kinglong Mee <kinglongmee@gmail.com> NFSD: Using min/max/min_t/max_t for calculate

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
f41c5ad2ff2657978a9712b9ea80cd812a7da2b0 13-Jun-2014 Kinglong Mee <kinglongmee@gmail.com> NFSD: fix bug for readdir of pseudofs

Commit 561f0ed498ca (nfsd4: allow large readdirs) introduced a bug
in readdir of the pseudofs root.

Call xdr_truncate_encode() to revert the encoded name when skipping.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
542d1ab3c7ce53be7d7122a83d016304af4e6345 02-Jun-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: kill READ64

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
06553991e7757c668efb3bce9dcc740f31aead60 02-Jun-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: kill READ32

While we're here, let's kill off a couple of the read-side macros.

Leaving the more complicated ones alone for now.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
da2ebce6a0f64cc01bd00aba998c0a4fa7c09843 30-May-2014 Jeff Layton <jlayton@primarydata.com> nfsd: make nfsd4_encode_fattr static

sparse says:

CHECK fs/nfsd/nfs4xdr.c
fs/nfsd/nfs4xdr.c:2043:1: warning: symbol 'nfsd4_encode_fattr' was not declared. Should it be static?

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12337901d654415d9f764b5f5ba50052e9700f37 28-May-2014 Christoph Hellwig <hch@lst.de> nfsd: getattr for FATTR4_WORD0_FILES_AVAIL needs the statfs buffer

Note nobody's ever noticed because the typical client probably never
requests FILES_AVAIL without also requesting something else on the list.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
94eb36892d727145794b80dceffc435d1d68edbb 23-May-2014 Kinglong Mee <kinglongmee@gmail.com> NFSD: Adds macro EX_UUID_LEN for exports uuid's length

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
a5cddc885b99458df963a75abbe0b40cbef56c48 13-May-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: better reservation of head space for krb5

RPC_MAX_AUTH_SIZE is scattered around several places. Better to set it
once in the auth code, where this kind of estimate should be made. And
while we're at it we can leave it zero when we're not using krb5i or
krb5p.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
d05d5744ef67879877dbe2e3d0fb9fcc27ee44e5 22-Mar-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: kill write32, write64

And switch a couple other functions from the encode(&p,...) convention
to the p = encode(p,...) convention mostly used elsewhere.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
0c0c267ba96f606b541ab8e4bcde54e6b3f0198f 22-Mar-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: kill WRITEMEM

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
b64c7f3bdfbb468d9026ca91d55c57675724f516 22-Mar-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: kill WRITE64

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
c373b0a4289ebf1ca6fbf4614d8b457b5f1b489f 22-Mar-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: kill WRITE32

These macros just obscure what's going on. Adopt the convention of the
client-side code.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
c8f13d977518e588ac89dcf8e841821569108109 08-May-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: really fix nfs4err_resource in 4.1 case

encode_getattr, for example, can return nfserr_resource to indicate it
ran out of buffer space. That's not a legal error in the 4.1 case.
And in the 4.1 case, if we ran out of buffer space, we should have
exceeded a session limit too.

(Note in 1bc49d83c37cfaf46be357757e592711e67f9809 "nfsd4: fix
nfs4err_resource in 4.1 case" we originally tried fixing this error
return before fixing the problem that we could error out while we still
had lots of available space. The result was to trade one illegal error
for another in those cases. We decided that was unhelpful, so reverted
the change in fc208d026be0c7d60db9118583fc62f6ca97743d, and are only
reinstating it now that we've eliminated almost all of those cases.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
b042098063849794d69b5322fcc6cf9fb5f2586e 18-Mar-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: allow exotic read compounds

I'm not sure why a client would want to stuff multiple reads in a
single compound rpc, but it's legal for them to do it, and we should
really support it.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
fec25fa4ad728dd9b063313f2a61ff65eae0d571 13-May-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: more read encoding cleanup

More cleanup, no change in functionality.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
34a78b488f144e011493fa51f10c01d034d47c8e 13-May-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: read encoding cleanup

Trivial cleanup, no change in functionality.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
dc97618ddda9a23e5211e800f0614e9612178200 18-Mar-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: separate splice and readv cases

The splice and readv cases are actually quite different--for example the
former case ignores the array of vectors we build up for the latter.

It is probably clearer to separate the two cases entirely.

There's some code duplication between the split out encoders, but this
is only temporary and will be fixed by a later patch.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
b0e35fda827e72cf4b065b52c4c472c28c004fca 04-Feb-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: turn off zero-copy-read in exotic cases

We currently allow only one read per compound, with operations before
and after whose responses will require no more than about a page to
encode.

While we don't expect clients to violate those limits any time soon,
this limitation isn't really condoned by the spec, so to future proof
the server we should lift the limitation.

At the same time we'd like to continue to support zero-copy reads.

Supporting multiple zero-copy-reads per compound would require a new
data structure to replace struct xdr_buf, which can represent only one
set of included pages.

So for now we plan to modify encode_read() to support either zero-copy
or non-zero-copy reads, and use some heuristics at the start of the
compound processing to decide whether a zero-copy read will work.

This will allow us to support more exotic compounds without introducing
a performance regression in the normal case.

Later patches handle those "exotic compounds", this one just makes sure
zero-copy is turned off in those cases.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
476a7b1f4b2c9c38255653fa55157565be8b14be 20-Jan-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: don't treat readlink like a zero-copy operation

There's no advantage to this zero-copy-style readlink encoding, and it
unnecessarily limits the kinds of compounds we can handle. (In practice
I can't see why a client would want e.g. multiple readlink calls in a
compound, but it's probably a spec violation for us not to handle it.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
3b299709091befc0e02aa33d55ddd5baef006853 21-Mar-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: enforce rd_dircount

As long as we're here, let's enforce the protocol's limit on the number
of directory entries to return in a readdir.

I don't think anyone's ever noticed our lack of enforcement, but maybe
there's more of a chance they will now that we allow larger readdirs.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
561f0ed498ca4342573a870779cc645d3fd7dfe7 20-Jan-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: allow large readdirs

Currently we limit readdir results to a single page. This can result in
a performance regression compared to NFSv3 when reading large
directories.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
47ee52986472dba068e8223cbaf1b65d74238781 13-Mar-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: adjust buflen to session channel limit

We can simplify session limit enforcement by restricting the xdr buflen
to the session size.

Also fix a preexisting bug: we should really have been taking into
account the auth-required space when comparing against session limits,
which are limits on the size of the entire rpc reply, including any krb5
overhead.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
30596768b31069a3ae08fc305f394efb8c42b473 19-May-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: fix buflen calculation after read encoding

We don't necessarily want to assume that the buflen is the same
as the number of bytes available in the pages. We may have some reason
to set it to something less (for example, later patches will use a
smaller buflen to enforce session limits).

So, calculate the buflen relative to the previous buflen instead of
recalculating it from scratch.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
89ff884ebbd0a667253dd61ade8a0e70b787c84a 11-Mar-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: nfsd4_check_resp_size should check against whole buffer

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
6ff9897d2bcf4036dfd139caeddd6f0a51c9ca06 11-Mar-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: minor encode_read cleanup

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
4f0cefbf389c28b0a2be34960797adb0c84ee43d 11-Mar-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: more precise nfsd4_max_reply

It will turn out to be useful to have a more accurate estimate of reply
size; so, piggyback on the existing op reply-size estimators.

Also move nfsd4_max_reply to nfs4proc.c to get easier access to struct
nfsd4_operation and friends. (Thanks to Christoph Hellwig for pointing
out that simplification.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
8c7424cff6bd33459945646cfcbf6dc6c899ab24 10-Mar-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: don't try to encode conflicting owner if low on space

I ran into this corner case in testing: in theory clients can provide
state owners up to 1024 bytes long. In the sessions case there might be
a risk of this pushing us over the DRC slot size.

The conflicting owner isn't really that important, so let's humor a
client that provides a small maxresponsesize_cached by allowing ourselves
to return without the conflicting owner instead of outright failing the
operation.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
f5236013a21c118e9d317e90c7a152dfe51fab93 21-Mar-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: convert 4.1 replay encoding

Limits on maxresp_sz mean that we only ever need to replay rpc's that
are contained entirely in the head.

The one exception is very small zero-copy reads. That's an odd corner
case as clients wouldn't normally ask those to be cached.

In any case, this seems a little more robust.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2825a7f90753012babe7ee292f4a1eadd3706f92 26-Aug-2013 J. Bruce Fields <bfields@redhat.com> nfsd4: allow encoding across page boundaries

After this we can handle for example getattr of very large ACLs.

Read, readdir, readlink are still special cases with their own limits.

Also we can't handle a new operation starting close to the end of a
page.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
a8095f7e80fbf3e0efe4ee5cd3f509113c56290f 11-Mar-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: size-checking cleanup

Better variable name, some comments, etc.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
ea8d7720b274607f48fb524337254a9c43dbc2df 08-Mar-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: remove redundant encode buffer size checking

Now that all op encoders can handle running out of space, we no longer
need to check the remaining size for every operation; only nonidempotent
operations need that check, and that can be done by
nfsd4_check_resp_size.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
67492c990300912c717bc95e9f705feb63de2df9 08-Mar-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: nfsd4_check_resp_size needn't recalculate length

We're keeping the length updated as we go now, so there's no need for
the extra calculation here.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
4e21ac4b6f1d09c56f7d10916eaa738361214ab7 22-Mar-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: reserve space before inlining 0-copy pages

Once we've included page-cache pages in the encoding it's difficult to
remove them and restart encoding. (xdr_truncate_encode doesn't handle
that case.) So, make sure we'll have adequate space to finish the
operation first.

For now COMPOUND_SLACK_SPACE checks should prevent this case happening,
but we want to remove those checks.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
d0a381dd0eda1cc769a5762d0eed4d0d662219f2 30-Jan-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: teach encoders to handle reserve_space failures

We've tried to prevent running out of space with COMPOUND_SLACK_SPACE
and special checking in those operations (getattr) whose result can vary
enormously.

However:
- COMPOUND_SLACK_SPACE may be difficult to maintain as we add
more protocol.
- BUG_ON or page faulting on failure seems overly fragile.
- Especially in the 4.1 case, we prefer not to fail compounds
just because the returned result came *close* to session
limits. (Though perfect enforcement here may be difficult.)
- I'd prefer encoding to be uniform for all encoders instead of
having special exceptions for encoders containing, for
example, attributes.
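
The pattern the list above argues for can be sketched in userspace like
this (struct sketch_stream and the helpers are hypothetical and much
simpler than the kernel's xdr_stream): the reserve step reports failure,
and every encoder turns that into an error return instead of crashing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_stream {
	unsigned char *buf;
	size_t len;   /* bytes encoded so far */
	size_t cap;   /* total bytes available */
};

/* Returns a pointer to `n` fresh bytes, or NULL if they do not fit. */
static inline unsigned char *sketch_reserve(struct sketch_stream *s, size_t n)
{
	unsigned char *p;

	assert(s->len <= s->cap);
	if (n > s->cap - s->len)
		return NULL;
	p = s->buf + s->len;
	s->len += n;
	return p;
}

/* Encode a big-endian 32-bit value; returns 0, or -1 when out of space. */
static inline int sketch_encode_u32(struct sketch_stream *s, uint32_t v)
{
	unsigned char *p = sketch_reserve(s, 4);

	if (!p)
		return -1;   /* hypothetical "out of space" status */
	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
	return 0;
}
```

A failed reservation leaves the stream length unchanged, so the caller can
still encode a short error reply in whatever space remains.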

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
082d4bd72a4527c6568f53f4a5de74e804666fa7 29-Aug-2013 J. Bruce Fields <bfields@redhat.com> nfsd4: "backfill" using write_bytes_to_xdr_buf

Normally xdr encoding proceeds in a single pass from start of a buffer
to end, but sometimes we have to write a few bytes to an earlier
position.

Use write_bytes_to_xdr_buf for these cases rather than saving a pointer
to write to. We plan to rewrite xdr_reserve_space to handle encoding
across page boundaries using a scratch buffer, and don't want to risk
writing to a pointer that was contained in a scratch buffer.

Also it will no longer be safe to calculate lengths by subtracting two
pointers, so use xdr_buf offsets instead.
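
The offset-based backfill can be sketched in userspace as follows (a flat
64-byte buffer with hypothetical helpers; the real write_bytes_to_xdr_buf
works on a multi-segment xdr_buf): remember where the placeholder was
written as an offset, then overwrite it once the final value is known.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sketch_buf {
	unsigned char data[64];
	size_t len;
};

/* Append `n` bytes and return the offset at which they start. */
static inline size_t sketch_append(struct sketch_buf *b, const void *src,
				   size_t n)
{
	size_t off = b->len;

	assert(n <= sizeof(b->data) - b->len);
	memcpy(b->data + off, src, n);
	b->len += n;
	return off;
}

/* Overwrite `n` already-encoded bytes starting at offset `off`. */
static inline void sketch_write_at(struct sketch_buf *b, size_t off,
				   const void *src, size_t n)
{
	assert(off <= b->len && n <= b->len - off);
	memcpy(b->data + off, src, n);
}
```

Lengths are likewise computed as differences of offsets (b->len before and
after), which stays valid even if the bytes were staged somewhere else.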

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
1fcea5b20b74cb856f5cd27161fea5329079dbd7 27-Feb-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: use xdr_truncate_encode

Now that lengths are reliable, we can use xdr_truncate_encode instead of
open-coding it everywhere.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
6ac90391c6e36c536cfcedbe4801a77e304205b1 26-Feb-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: keep xdr buf length updated

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
dd97fddedc251eb423408d89f2947eff9c4ea3c1 26-Feb-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: no need for encode_compoundres to adjust lengths

xdr_reserve_space should now be calculating the length correctly as we
go, so there's no longer any need to fix it up here.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
f46d382a749874e1b29cfb34d4ccf283eae4fffa 31-Jan-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: remove ADJUST_ARGS

It's just uninteresting debugging code at this point.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
d3f627c815b6eb5f6be388100617c36823d661c5 26-Feb-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: use xdr_stream throughout compound encoding

Note this makes ADJUST_ARGS useless; we'll remove it in the following
patch.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
ddd1ea56367202f6c99135cd59de7a97af4c4ffd 28-Aug-2013 J. Bruce Fields <bfields@redhat.com> nfsd4: use xdr_reserve_space in attribute encoding

This is a cosmetic change for now; no change in behavior.

Note we're just depending on xdr_reserve_space to do the bounds checking
for us, we're not really depending on its adjustment of iovec or xdr_buf
lengths yet, as those are fixed up as necessary after the fact by
read-like operations and by nfs4svc_encode_compoundres. However we do
have to update xdr->iov on read-like operations to prevent
xdr_reserve_space from messing with the already-fixed-up length of the
head.

When the attribute encoding fails partway through we have to undo the
length adjustments made so far. We do it manually for now, but later
patches will add an xdr_truncate_encode() helper to handle cases like
this.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
5f4ab9458755eddc66912a15319363bf311f7fc8 07-Mar-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: allow space for final error return

This post-encoding check should be taking into account the need to
encode at least an out-of-space error to the following op (if any).

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
07d1f8020738ba3180ea9992c4fa7dbc0685396a 07-Mar-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: fix encoding of out-of-space replies

If nfsd4_check_resp_size() returns an error then we should really be
truncating the reply here, otherwise we may leave extra garbage at the
end of the rpc reply.

Also add a warning to catch any cases where our reply-size estimates may
be wrong in the case of a non-idempotent operation.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
d518465866bfeaa41fb685d7dfc9983e0312232e 26-Aug-2013 J. Bruce Fields <bfields@redhat.com> nfsd4: tweak nfsd4_encode_getattr to take xdr_stream

Just change the nfsd4_encode_getattr api. Not changing any code or
adding any new functionality yet.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
4aea24b2ff7510932118ec9b06c35a11625194ea 15-Jan-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: embed xdr_stream in nfsd4_compoundres

This is a mechanical transformation with no change in behavior.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
e372ba60def1af33e1c0b9bbfa5c8f8559c1ad6b 19-May-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: decoding errors can still be cached and require space

Currently a non-idempotent op reply may be cached if it fails in the
proc code but not if it fails at xdr decoding. I doubt there are any
xdr-decoding-time errors that would make this a problem in practice, so
this probably isn't a serious bug.

The space estimates should also take into account space required for
encoding of error returns. Again, not a practical problem, though it
would become one after future patches which will tighten the space
estimates.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
fc208d026be0c7d60db9118583fc62f6ca97743d 09-Apr-2014 J. Bruce Fields <bfields@redhat.com> Revert "nfsd4: fix nfs4err_resource in 4.1 case"

Since we're still limiting attributes to a page, the result here is that
a large getattr result will return NFS4ERR_REP_TOO_BIG/TOO_BIG_TO_CACHE
instead of NFS4ERR_RESOURCE.

Both error returns are wrong, and the real bug here is the arbitrary
limit on getattr results, fixed by as-yet out-of-tree patches. But at a
minimum we can make life easier for clients by sticking to one broken
behavior in released kernels instead of two....

Trond says:

one immediate consequence of this patch will be that NFSv4.1
clients will now report EIO instead of EREMOTEIO if they hit the
problem. That may make debugging a little less obvious.

Another consequence will be that if we ever do try to add client
side handling of NFS4ERR_REP_TOO_BIG, then we now have to deal
with the “handle existing buggy server” syndrome.

Reported-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
18df11d0eacf67bbcd8dda755b568bbbd7264735 09-Mar-2014 Yan, Zheng <zheng.z.yan@intel.com> nfsd4: fix memory leak in nfsd4_encode_fattr()

fh_put() does not free the temporary file handle.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
1bc49d83c37cfaf46be357757e592711e67f9809 11-Mar-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: fix nfs4err_resource in 4.1 case

encode_getattr, for example, can return nfserr_resource to indicate it
ran out of buffer space. That's not a legal error in the 4.1 case. And
in the 4.1 case, if we ran out of buffer space, we should have exceeded
a session limit too.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
1bed92cb3c7663240992f4d97cbe4d21783113a0 21-Mar-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: remove redundant check from nfsd4_check_resp_size

cstate->slot and ->session are each set together in nfsd4_sequence. If
one is non-NULL, so is the other.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
067e1ace46834613a0543124981b0d54dcd87d49 21-Mar-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: update comments with obsolete function name

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
e874f9f8e04cb67351893894dfb9fbcd25e62fa2 10-Mar-2014 Jeff Layton <jlayton@redhat.com> svcrpc: explicitly reject compounds that are not padded out to 4-byte multiple

We have a WARN_ON in the nfsd4_decode_write() that tells us when the
client has sent a request that is not padded out properly according to
RFC4506. A WARN_ON really isn't appropriate in this case though since
this indicates a client bug, not a server one.

Move this check out to the top-level compound decoder and have it just
explicitly return an error. Also add a dprintk() that shows the client
address and xid to help track down clients and frames that trigger it.
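
A minimal sketch of the kind of alignment test involved, assuming only
that XDR lengths must be multiples of 4 bytes; toy_len_is_xdr_aligned is
a hypothetical helper, not a function in the kernel tree.

```c
#include <stdbool.h>
#include <stddef.h>

/* XDR (RFC 4506) encodes everything in 4-byte units. */
static bool toy_len_is_xdr_aligned(size_t len)
{
	/* The low two bits are zero exactly when len is a multiple of 4. */
	return (len & 3) == 0;
}
```

A decoder that finds this false for the request length can return an
error straight away rather than warn.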

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
a11fcce1544df08c723d950ff0edef3adac40405 03-Feb-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: fix test_stateid error reply encoding

If the entire operation fails then there's nothing to encode.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
798df3387971abf6071de77ca82b8e7775e74809 29-Jan-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: make set of large acl return efbig, not resource

If a client attempts to set an excessively large ACL, return
NFS4ERR_FBIG instead of NFS4ERR_RESOURCE. I'm not sure FBIG is correct,
but I'm positive RESOURCE is wrong (it isn't even a well-defined error
any more for NFS versions since 4.1).

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
de3997a7eeb9ea286b15879fdf8a95aae065b4f7 28-Jan-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: buffer-length check for SUPPATTR_EXCLCREAT

This was an omission from 8c18f2052e756e7d5dea712fc6e7ed70c00e8a39
"nfsd41: SUPPATTR_EXCLCREAT attribute".

Cc: Benny Halevy <bhalevy@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
d50e61361c68a05a9cd7d54617522f99f278ac8a 15-Jan-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: decrease nfsd4_encode_fattr stack usage

A struct svc_fh is 320 bytes on x86_64, it'd be better not to have these
on the stack.

kmalloc'ing them probably isn't ideal either, but this is the simplest
thing to do. If it turns out to be a problem in the readdir case then
we could add a svc_fh to nfsd4_readdir and pass that in.

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
3554116d3aae25353713f3d0131d86ae6c1e5674 08-Jan-2014 J. Bruce Fields <bfields@redhat.com> nfsd4: simplify xdr encoding of nfsv4 names

We can simplify the idmapping code if it does its own encoding and
returns nfs errors.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
87915c6472acbc5d7c809f3c9753808797da51a8 16-Jan-2013 J. Bruce Fields <bfields@redhat.com> nfsd4: encode_rdattr_error cleanup

There's a simpler way to write this.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
6b6d8137f1d3fc7a3970e1e384b8ce2d0967e087 16-Jan-2013 J. Bruce Fields <bfields@redhat.com> nfsd4: nfsd4_encode_fattr cleanup

Remove some pointless goto's.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
dfeecc829eb8e4ccbbab2ebc9b81b4cebec7fad4 09-Dec-2013 Kinglong Mee <kinglongmee@gmail.com> nfsd: get rid of unused macro definition

READTIME has not been used since it was defined in Linux-2.6.12-rc2.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
eba1c99ce4590506516ec801d991e36aa8b0d436 09-Dec-2013 Kinglong Mee <kinglongmee@gmail.com> nfsd: clean up unnecessary temporary variable in nfsd4_decode_fattr

host_err was only used for nfs4_acl_new.
This patch deletes it and returns nfserr_jukebox directly.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
43212cc7dfee0ca33d1f0f23652c70317ee031e6 09-Dec-2013 Kinglong Mee <kinglongmee@gmail.com> nfsd: using nfsd4_encode_noop for encoding destroy_session/free_stateid

Get rid of the extra code by using nfsd4_encode_noop for encoding destroy_session and free_stateid.
Also delete the unused argument (fr_status) in nfsd4_free_stateid.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
a9f7b4a06c9704fa3cfe0b0601347e03289a7407 09-Dec-2013 Kinglong Mee <kinglongmee@gmail.com> nfsd: clean up an xdr reserved space calculation

We should use XDR_LEN to calculate reserved space in case the oid is not
a multiple of 4.

RESERVE_SPACE actually rounds up for us, but it's probably better to be
careful here.
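
For illustration, rounding a byte count up to a multiple of 4 can be
written as below. toy_xdr_len is a hypothetical name chosen for this
sketch; it is meant to show the arithmetic, not to reproduce the kernel
macro.

```c
#include <stddef.h>

/* Round a byte count up to the next multiple of 4, the XDR unit size. */
static size_t toy_xdr_len(size_t n)
{
	return (n + 3) & ~(size_t)3;
}
```

With this, a 9-byte oid needs 12 bytes of reserved space, while an
8-byte one needs exactly 8.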

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
a8bb84bc9e57ad214024425d480a722f304df9e8 10-Dec-2013 Kinglong Mee <kinglongmee@gmail.com> nfsd: calculate the missing length of bitmap in EXCHANGE_ID

Commit 58cd57bfd9db3bc213bf9d6a10920f82095f0114
"nfsd: Fix SP4_MACH_CRED negotiation in EXCHANGE_ID"
missed calculating the length of the bitmap for spo_must_enforce and spo_must_allow.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2d8498dbf8041c51ca49a0be6be594501638e591 20-Nov-2013 Christoph Hellwig <hch@infradead.org> nfsd: start documenting some XDR handling functions

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
365da4adebb1c012febf81019ad3dc5bb52e2a13 19-Nov-2013 J. Bruce Fields <bfields@redhat.com> nfsd4: fix xdr decoding of large non-write compounds

This fixes a regression from 247500820ebd02ad87525db5d9b199e5b66f6636
"nfsd4: fix decoding of compounds across page boundaries". The previous
code was correct: argp->pagelist is initialized in
nfs4svc_decode_compoundargs to rqstp->rq_arg.pages, and is therefore a
pointer to the page *after* the page we are currently decoding.

The reason that patch nevertheless fixed a problem with decoding
compounds containing write was a bug in the write decoding introduced by
5a80a54d21c96590d013378d8c5f65f879451ab4 "nfsd4: reorganize write
decoding", after which write decoding no longer adhered to the rule that
argp->pagelist point to the next page.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
aea240f4162d50e0f2d8bd5ea3ba11b5f072add8 14-Nov-2013 Christoph Hellwig <hch@infradead.org> nfsd: export proper maximum file size to the client

I noticed that we export a way too high value for the maxfilesize
attribute when debugging a client issue. The issue didn't turn
out to be related to it, but I think we should export it, so that
clients can limit what write sizes they accept before hitting
the server.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
6ff40decff0ef35a5d755ec60182d7f803356dfb 05-Nov-2013 J. Bruce Fields <bfields@redhat.com> nfsd4: improve write performance with better sendspace reservations

Currently the rpc code conservatively refuses to accept rpc's from a
client if the sum of its worst-case estimates of the replies it owes
that client exceeds the send buffer space.

Unfortunately our estimate of the worst-case reply for an NFSv4 compound
is always the maximum read size. This can unnecessarily limit the
number of operations we handle concurrently, for example in the case
most operations are writes (which have small replies).

We can do a little better if we check which ops the compound contains.

This is still a rough estimate, we'll need to improve on it some day.
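
The idea of estimating from the ops present can be sketched as follows.
This is only an illustration: the op names, the two size constants and
toy_estimate_reply are all assumptions made for the example, and the
actual estimate in the patch may be computed differently.

```c
#include <stddef.h>

enum toy_op { TOY_OP_GETATTR, TOY_OP_WRITE, TOY_OP_READ };

#define TOY_SMALL_REPLY	512u		/* assumed size of a small reply */
#define TOY_MAX_READ	(1024u * 1024u)	/* assumed maximum read size */

/*
 * Sum a per-op worst case: only read ops are charged the maximum read
 * size, everything else is charged a small fixed amount.
 */
static size_t toy_estimate_reply(const enum toy_op *ops, size_t nops)
{
	size_t i, total = 0;

	for (i = 0; i < nops; i++)
		total += (ops[i] == TOY_OP_READ) ? TOY_MAX_READ
						 : TOY_SMALL_REPLY;
	return total;
}
```

Under these assumed numbers a getattr+write+getattr compound is charged
1536 bytes instead of a full maximum-size read.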

Reported-by: Shyam Kaushik <shyamnfs1@gmail.com>
Tested-by: Shyam Kaushik <shyamnfs1@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
3378b7f40d79930f0f447a164c7e8fcbe4480e40 01-Nov-2013 J. Bruce Fields <bfields@redhat.com> nfsd4: fix discarded security labels on setattr

Security labels in setattr calls are currently ignored because we forget
to set label->len.

Cc: stable@vger.kernel.org
Reported-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
8217d146ab98a1790349d79c436176658e311e3c 30-Oct-2013 Anna Schumaker <bjschuma@netapp.com> NFSD: Add support for NFS v4.2 operation checking

The server does allow NFS over v4.2, even if it doesn't add any new
operations yet.

I also switch to using constants to represent the last operation for
each minor version since this makes the code cleaner and easier to
understand at a quick glance.

Signed-off-by: Anna Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
e1a90ebd8b2349eb00ec22f0b8bf6ab8bbd06cc8 30-Oct-2013 Anna Schumaker <bjschuma@netapp.com> NFSD: Combine decode operations for v4 and v4.1

We were using a different array of function pointers to represent each
minor version. This makes adding a new minor version tedious, since it
needs a step to copy, paste and modify a new version of the same
functions.

This patch combines the v4 and v4.1 arrays into a single instance and
will check minor version support inside each decoder function.

Signed-off-by: Anna Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
301f0268b63d1b07268e46f5901fc51d6cac20eb 01-Sep-2013 Al Viro <viro@zeniv.linux.org.uk> nfsd: racy access to ->d_name in nfsd4_encode_path()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
58cd57bfd9db3bc213bf9d6a10920f82095f0114 05-Aug-2013 Weston Andros Adamson <dros@netapp.com> nfsd: Fix SP4_MACH_CRED negotiation in EXCHANGE_ID

- don't BUG_ON() when not SP4_NONE
- calculate recv and send reserve sizes correctly

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
f0f51f5cdd107971282ae18f00a6fa03d69407a0 18-Jun-2013 J. Bruce Fields <bfields@redhat.com> nfsd4: allow destroy_session over destroyed session

RFC 5661 allows a client to destroy a session using a compound
associated with the destroyed session, as long as the DESTROY_SESSION op
is the last op of the compound.

We attempt to allow this, but testing against a Solaris client (which
does destroy sessions in this way) showed that we were failing the
DESTROY_SESSION with NFS4ERR_DELAY, because we assumed the reference
count on the session (held by us) represented another rpc in progress
over this session.

Fix this by noting that in this case the expected reference count is 1,
not 0.

Also, note as long as the session holds a reference to the compound
we're destroying, we can't free it here--instead, delay the free till
the final put in nfs4svc_encode_compoundres.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
590b743143eae8db40abdfd1ab20bc51ee0ee5db 21-Jun-2013 J. Bruce Fields <bfields@redhat.com> nfsd4: minor read_buf cleanup

The code to step to the next page seems reasonably self-contained.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
247500820ebd02ad87525db5d9b199e5b66f6636 21-Jun-2013 J. Bruce Fields <bfields@redhat.com> nfsd4: fix decoding of compounds across page boundaries

A FreeBSD NFSv4.0 client was getting rare IO errors expanding a tarball.
A network trace showed the server returning BAD_XDR on the final getattr
of a getattr+write+getattr compound. The final getattr started on a
page boundary.

I believe the Linux client ignores errors on the post-write getattr, and
that that's why we haven't seen this before.

Cc: stable@vger.kernel.org
Reported-by: Rick Macklem <rmacklem@uoguelph.ca>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
57569a707082a337e4a61a657521d79cac3528bf 17-May-2013 J. Bruce Fields <bfields@redhat.com> nfsd4: allow client to send no cb_sec flavors

In testing I notice that some of the pynfs tests forget to send any
cb_sec flavors, and that we haven't necessarily errored out in that case
before.

I'll fix pynfs, but am also inclined to default to trying AUTH_NONE in
that case in case this is something clients actually do.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
57266a6e916e2522ea61758a3ee5576b60156791 13-Apr-2013 J. Bruce Fields <bfields@redhat.com> nfsd4: implement minimal SP4_MACH_CRED

Do a minimal SP4_MACH_CRED implementation suggested by Trond, ignoring
the client-provided spo_must_* arrays and just enforcing credential
checks for the minimum required operations.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
ba4e55bb67894136489b27372166416cd70b0756 15-May-2013 J. Bruce Fields <bfields@redhat.com> nfsd4: fix compile in !CONFIG_NFSD_V4_SECURITY_LABEL case

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
18032ca062e621e15683cb61c066ef3dc5414a7b 02-May-2013 David Quigley <dpquigl@davequigley.com> NFSD: Server implementation of MAC Labeling

Implement labeled NFS on the server: encoding and decoding, and writing
and reading, of file labels.

Enabled with CONFIG_NFSD_V4_SECURITY_LABEL.

Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
4bdc33ed5bd9fbaa243bda6fdccb22674aed6305 02-May-2013 Steve Dickson <steved@redhat.com> NFSDv4.2: Add NFS v4.2 support to the NFS server

This enables NFSv4.2 support for the server. To enable this
code do the following:
echo "+4.2" >/proc/fs/nfsd/versions

after the nfsd kernel module is loaded.

On its own this does nothing except allow the server to respond to
compounds with minorversion set to 2. All the new NFSv4.2 features are
optional, so this is perfectly legal.

Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
676e4ebd5f2c3b4fd1d2bff79b68385c23c5c105 01-May-2013 Chuck Lever <chuck.lever@oracle.com> NFSD: SECINFO doesn't handle unsupported pseudoflavors correctly

If nfsd4_do_encode_secinfo() can't find GSS info that matches an
export security flavor, it assumes the flavor is not a GSS
pseudoflavor, and simply puts it on the wire.

However, if this XDR encoding logic is given a legitimate GSS
pseudoflavor but the RPC layer says it does not support that
pseudoflavor for some reason, then the server leaks GSS pseudoflavor
numbers onto the wire.

I confirmed this happens by blacklisting rpcsec_gss_krb5, then
attempting a client transition from the pseudo-fs to a Kerberos-only
share. The client received a flavor list containing the Kerberos
pseudoflavor numbers, rather than GSS tuples.

The encoder logic can check that each pseudoflavor in flavs[] is
less than MAXFLAVOR before writing it into the buffer, to prevent
this. But after "nflavs" is written into the XDR buffer, the
encoder can't skip writing flavor information into the buffer when
it discovers the RPC layer doesn't support that flavor.

So count the number of valid flavors as they are written into the
XDR buffer, then write that count into a placeholder in the XDR
buffer when all recognized flavors have been encoded.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
ed9411a00464860cafe7e07224818cdf04fd9e89 01-May-2013 Chuck Lever <chuck.lever@oracle.com> NFSD: Simplify GSS flavor encoding in nfsd4_do_encode_secinfo()

Clean up.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
bf8d909705e9d9bac31d9b8eac6734d2b51332a7 19-Apr-2013 Bryan Schumaker <bjschuma@netapp.com> nfsd: Decode and send 64bit time values

The seconds field of an nfstime4 structure is 64bit, but we are assuming
that the first 32bits are zero-filled. So if the client tries to set
atime to a value before the epoch (touch -t 196001010101), then the
server will save the wrong value on disk.
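
The difference can be seen in a small userspace sketch. Both helper
names are hypothetical, and the inputs are assumed to be the two 32-bit
words of the seconds field already converted to host byte order.

```c
#include <stdint.h>

/*
 * Combine the two 32-bit words of an XDR hyper into a signed 64-bit
 * seconds value.  `hi` is the first word on the wire.  The final cast
 * relies on two's-complement conversion, which is what common
 * compilers provide.
 */
static int64_t toy_decode_seconds(uint32_t hi, uint32_t lo)
{
	return (int64_t)(((uint64_t)hi << 32) | lo);
}

/* The buggy variant: drop the high word and assume it was zero. */
static int64_t toy_decode_seconds_low_only(uint32_t hi, uint32_t lo)
{
	(void)hi;
	return (int64_t)lo;
}
```

For a pre-epoch time the high word is all ones, so ignoring it turns a
small negative number of seconds into a large positive one.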

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
9aeb5aeeb09d59794896ccefd60d58c44987f52f 17-Apr-2013 J. Bruce Fields <bfields@redhat.com> nfsd4: remove unused macro

Cleanup a piece I forgot to remove in
9411b1d4c7df26dca6bc6261b5dc87a5b4c81e5c "nfsd4: cleanup handling of
nfsv4.0 closed stateid's".

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
9411b1d4c7df26dca6bc6261b5dc87a5b4c81e5c 01-Apr-2013 J. Bruce Fields <bfields@redhat.com> nfsd4: cleanup handling of nfsv4.0 closed stateid's

Closed stateid's are kept around a little while to handle close replays
in the 4.0 case. So we stash them in the last-used stateid in the
oo_last_closed_stateid field of the open owner. We can free that in
encode_seqid_op_tail once the seqid on the open owner is next
incremented. But we don't want to do that on the close itself; so we
set the NFS4_OO_PURGE_CLOSE flag on the open owner, skip freeing it the
first time through encode_seqid_op_tail, then when we see that flag set
next time we free it.

This is unnecessarily baroque.

Instead, just move the logic that increments the seqid out of the xdr
code and into the operation code itself.

The justification given for the current placement is that we need to
wait till the last minute to be sure we know whether the status is a
sequence-id-mutating error or not, but examination of the code shows
that can't actually happen.

Reported-by: Yanchuan Nian <ycnian@gmail.com>
Tested-by: Yanchuan Nian <ycnian@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
221a68766973d7a3afe40a05abd8258b5de016a0 02-Apr-2013 J. Bruce Fields <bfields@redhat.com> nfsd4: don't destroy in-use clients

When a setclientid_confirm or create_session confirms a client after a
client reboot, it also destroys any previous state held by that client.

The shutdown of that previous state must be careful not to free the
client out from under threads processing other requests that refer to
the client.

This is a particular problem in the NFSv4.1 case when we hold a
reference to a session (hence a client) throughout compound processing.

The server attempts to handle this by unhashing the client at the time
it's destroyed, then delaying the final free to the end. But this still
leaves some races in the current code.

I believe it's simpler just to fail the attempt to destroy the client by
returning NFS4ERR_DELAY. This is a case that should never happen
anyway.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
b0a9d3ab577464529f6649ec54f8a0de160866e3 07-Mar-2013 J. Bruce Fields <bfields@redhat.com> nfsd4: fix race on client shutdown

Dropping the session's reference count after the client's means we leave
a window where the session's se_client pointer is NULL. An xpt_user
callback that encounters such a session may then crash:

[ 303.956011] BUG: unable to handle kernel NULL pointer dereference at 0000000000000318
[ 303.959061] IP: [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40
[ 303.959061] PGD 37811067 PUD 3d498067 PMD 0
[ 303.959061] Oops: 0002 [#8] PREEMPT SMP
[ 303.959061] Modules linked in: md5 nfsd auth_rpcgss nfs_acl snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc microcode psmouse snd_timer serio_raw pcspkr evdev snd soundcore i2c_piix4 i2c_core intel_agp intel_gtt processor button nfs lockd sunrpc fscache ata_generic pata_acpi ata_piix uhci_hcd libata btrfs usbcore usb_common crc32c scsi_mod libcrc32c zlib_deflate floppy virtio_balloon virtio_net virtio_pci virtio_blk virtio_ring virtio
[ 303.959061] CPU 0
[ 303.959061] Pid: 264, comm: nfsd Tainted: G D 3.8.0-ARCH+ #156 Bochs Bochs
[ 303.959061] RIP: 0010:[<ffffffff81481a8e>] [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40
[ 303.959061] RSP: 0018:ffff880037877dd8 EFLAGS: 00010202
[ 303.959061] RAX: 0000000000000100 RBX: ffff880037a2b698 RCX: ffff88003d879278
[ 303.959061] RDX: ffff88003d879278 RSI: dead000000100100 RDI: 0000000000000318
[ 303.959061] RBP: ffff880037877dd8 R08: ffff88003c5a0f00 R09: 0000000000000002
[ 303.959061] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[ 303.959061] R13: 0000000000000318 R14: ffff880037a2b680 R15: ffff88003c1cbe00
[ 303.959061] FS: 0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[ 303.959061] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 303.959061] CR2: 0000000000000318 CR3: 000000003d49c000 CR4: 00000000000006f0
[ 303.959061] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 303.959061] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 303.959061] Process nfsd (pid: 264, threadinfo ffff880037876000, task ffff88003c1fd0a0)
[ 303.959061] Stack:
[ 303.959061] ffff880037877e08 ffffffffa03772ec ffff88003d879000 ffff88003d879278
[ 303.959061] ffff88003d879080 0000000000000000 ffff880037877e38 ffffffffa0222a1f
[ 303.959061] 0000000000107ac0 ffff88003c22e000 ffff88003d879000 ffff88003c1cbe00
[ 303.959061] Call Trace:
[ 303.959061] [<ffffffffa03772ec>] nfsd4_conn_lost+0x3c/0xa0 [nfsd]
[ 303.959061] [<ffffffffa0222a1f>] svc_delete_xprt+0x10f/0x180 [sunrpc]
[ 303.959061] [<ffffffffa0223d96>] svc_recv+0xe6/0x580 [sunrpc]
[ 303.959061] [<ffffffffa03587c5>] nfsd+0xb5/0x140 [nfsd]
[ 303.959061] [<ffffffffa0358710>] ? nfsd_destroy+0x90/0x90 [nfsd]
[ 303.959061] [<ffffffff8107ae00>] kthread+0xc0/0xd0
[ 303.959061] [<ffffffff81010000>] ? perf_trace_xen_mmu_set_pte_at+0x50/0x100
[ 303.959061] [<ffffffff8107ad40>] ? kthread_freezable_should_stop+0x70/0x70
[ 303.959061] [<ffffffff814898ec>] ret_from_fork+0x7c/0xb0
[ 303.959061] [<ffffffff8107ad40>] ? kthread_freezable_should_stop+0x70/0x70
[ 303.959061] Code: ff ff 5d c3 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 65 48 8b 04 25 f0 c6 00 00 48 89 e5 83 80 44 e0 ff ff 01 b8 00 01 00 00 <3e> 66 0f c1 07 0f b6 d4 38 c2 74 0f 66 0f 1f 44 00 00 f3 90 0f
[ 303.959061] RIP [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40
[ 303.959061] RSP <ffff880037877dd8>
[ 303.959061] CR2: 0000000000000318
[ 304.001218] ---[ end trace 2d809cd4a7931f5a ]---
[ 304.001903] note: nfsd[264] exited with preempt_count 2

Reported-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
9d313b17db965ae42137c5d4dd3063037544c4cd 28-Feb-2013 J. Bruce Fields <bfields@redhat.com> nfsd4: handle seqid-mutating open errors from xdr decoding

If a client sets an owner (or group_owner or acl) attribute on open for
create, and the mapping of that owner to an id fails, then we return
BAD_OWNER. But BAD_OWNER is a seqid-mutating error, so we can't
shortcut the open processing in that case: we have to at least look up the
owner so we can find the seqid to bump.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
a77c806fb9d097bb7733b64207cf52fc2c6438bb 16-Mar-2013 Chuck Lever <chuck.lever@oracle.com> SUNRPC: Refactor nfsd4_do_encode_secinfo()

Clean up. This matches a similar API for the client side, and
keeps ULP fingers out of the GSS mech switch.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
64a817cfbded8674f345d1117b117f942a351a69 26-Mar-2013 J. Bruce Fields <bfields@redhat.com> nfsd4: reject "negative" acl lengths

Since we only enforce an upper bound, not a lower bound, a "negative"
length can get through here.

The symptom seen was a warning when we attempt a kmalloc with an
excessive size.
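
To illustrate how an upper-bound-only check lets such a value through,
here is a small sketch. TOY_MAX_ACES and both helpers are hypothetical;
comparing as unsigned is one way to close the gap and is not necessarily
the change this commit made.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_ACES 170	/* hypothetical limit for illustration */

/* Upper bound only: a count with the top bit set looks negative and passes. */
static bool toy_nace_ok_signed(int32_t nace)
{
	return nace <= TOY_MAX_ACES;
}

/* Treating the wire value as unsigned rejects the same input. */
static bool toy_nace_ok_unsigned(uint32_t nace)
{
	return nace <= TOY_MAX_ACES;
}
```

A wire value of 0xffffffff passes the signed check as -1, and a later
size computation based on it can become enormous.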

Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
3dadecce20603aa380023c65e6f55f108fd5e952 24-Jan-2013 Al Viro <viro@zeniv.linux.org.uk> switch vfs_getattr() to struct path

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
03bc6d1cc1759e6b5959cacc02a19ef36e95e741 02-Feb-2013 Eric W. Biederman <ebiederm@xmission.com> nfsd: Modify nfsd4_cb_sec to use kuids and kgids

Change uid and gid in struct nfsd4_cb_sec to be of type kuid_t and
kgid_t.

In nfsd4_decode_cb_sec when reading uids and gids off the wire convert
them to kuids and kgids, and if they don't convert to valid kuids or
valid kgids, ignore RPC_AUTH_UNIX and don't fill in any of the fields.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
ab8e4aee0a3f73d1b12e6d63b42075f0586ad4fd 02-Feb-2013 Eric W. Biederman <ebiederm@xmission.com> nfsd: Handle kuids and kgids in the nfs4acl to posix_acl conversion

In struct nfs4_ace remove the member who and replace it with an
anonymous union holding who_uid and who_gid, allowing typesafe
storage of uids and gids.

Add a helper pace_gt for sorting posix_acl_entries.

In struct posix_user_ace_state, replace uid with a union
of kuid_t uid and kgid_t gid.

Remove all initializations of the deprecated posix_acl_entry
e_id field, which is not present when user namespaces are enabled.

Split find_uid into two functions find_uid and find_gid that work
in a typesafe manner.

In nfs4xdr update nfsd4_encode_fattr to deal with the changes
in struct nfs4_ace.

Rewrite nfsd4_encode_name to take a kuid_t and a kgid_t instead
of a generic id and flag if it is a group or a uid. Replace
the group flag with a test for a valid gid.

Modify nfsd4_encode_user to take a kuid_t and call the modified
nfsd4_encode_name.

Modify nfsd4_encode_group to take a kgid_t and call the modified
nfsd4_encode_name.

Modify nfsd4_encode_aclname to take an ace instead of taking the
fields of an ace broken out. This allows it to detect if the ace is
for a user or a group and to pass the appropriate value while still
being typesafe.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
84822d0b3bc5a74a4290727dd1ab4fc7dcd6a348 14-Dec-2012 J. Bruce Fields <bfields@redhat.com> nfsd4: simplify nfsd4_encode_fattr interface slightly

It seems slightly simpler to make nfsd4_encode_fattr rather than its
callers responsible for advancing the write pointer on success.

(Also: the count == 0 check in the verify case looks superfluous.
Running out of buffer space is really the only reason fattr encoding
should fail with eresource.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
afc59400d6c65bad66d4ad0b2daf879cbff8e23e 11-Dec-2012 J. Bruce Fields <bfields@redhat.com> nfsd4: cleanup: replace rq_resused count by rq_next_page pointer

It may be a matter of personal taste, but I find this makes the code
clearer.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
d5f50b0c290431c65377c4afa1c764e2c3fe5305 05-Dec-2012 J. Bruce Fields <bfields@redhat.com> nfsd4: fix oops on unusual readlike compound

If the argument and reply together exceed the maximum payload size, then
a reply with a read-like operation can overflow the rq_pages array.

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
e5f9570319771bb0a5afc792b34fbd5564b935c8 30-Nov-2012 J. Bruce Fields <bfields@redhat.com> nfsd4: discard some unused nfsd4_verify xdr code

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
3d7337115d06f21970e23684f4d2e62e3a44c572 27-Nov-2012 Stanislav Kinsbursky <skinsbursky@parallels.com> nfsd: make NFSv4 lease time per net

Lease time is a part of NFSv4 state engine, which is constructed per network
namespace.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
a36b1725b342c8131a86a0238789d8e7bcb490dd 25-Nov-2012 J. Bruce Fields <bfields@redhat.com> nfsd4: return badname, not inval, on "." or "..", or "/"

The spec requires badname, not inval, in these cases.

Some callers want us to return enoent, but I can see no justification
for that.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
ffe1137ba743cdf1c2414d5a89690aec1daa6bba 15-Nov-2012 J. Bruce Fields <bfields@redhat.com> nfsd4: delay filling in write iovec array till after xdr decoding

Our server rejects compounds containing more than one write operation.
It's unclear whether this is really permitted by the spec; with 4.0,
it's possibly OK, with 4.1 (which has clearer limits on compound
parameters), it's probably not OK. No client that we're aware of has
ever done this, but in theory it could be useful.

The source of the limitation: we need an array of iovecs to pass to the
write operation. In the worst case that array of iovecs could have
hundreds of elements (the maximum rwsize divided by the page size), so
it's too big to put on the stack, or in each compound op. So we instead
keep a single such array in the compound argument.
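
As a worked example of the worst case, with assumed numbers only (a
1 MiB maximum rwsize and 4 KiB pages, neither taken from this commit),
and toy_max_iovecs being a hypothetical helper:

```c
#include <stddef.h>

/* Number of page-sized iovecs needed to cover max_rwsize bytes. */
static size_t toy_max_iovecs(size_t max_rwsize, size_t page_size)
{
	return (max_rwsize + page_size - 1) / page_size;
}
```

Under those assumptions the array needs 256 entries, which at 16 bytes
per struct iovec on a 64-bit machine is a full 4 KiB page.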

We fill in that array at the time we decode the xdr operation.

But we decode every op in the compound before executing any of them. So
once we've used that array we can't decode another write.

If we instead delay filling in that array till the time we actually
perform the write, we can reuse it.

Another option might be to switch to decoding compound ops one at a
time. I considered doing that, but it has a number of other side
effects, and I'd rather fix just this one problem for now.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
70cc7f75b1ee4161dfdea1012223db25712ab1a5 16-Nov-2012 J. Bruce Fields <bfields@redhat.com> nfsd4: move more write parameters into xdr argument

In preparation for moving some of this elsewhere.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
5a80a54d21c96590d013378d8c5f65f879451ab4 16-Nov-2012 J. Bruce Fields <bfields@redhat.com> nfsd4: reorganize write decoding

In preparation for moving some of it elsewhere.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
8a61b18c9b13987310d0f3ba13aa04af51f02a1c 17-Nov-2012 J. Bruce Fields <bfields@redhat.com> nfsd4: simplify reading of opnum

The comment here is totally bogus:
  - OP_WRITE + 1 is RELEASE_LOCKOWNER. Maybe there was some older
    version of the spec in which that served as a sort of
    OP_ILLEGAL? No idea, but it's clearly wrong now.
  - In any case, I can't see that the spec says anything about
    what to do if the client sends us fewer ops than promised.
    It's clearly nutty client behavior, and we should do
    whatever's easiest: returning an xdr error (even though it
    won't be consistent with the error on the last op returned)
    seems fine to me.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
447bfcc936ce28636833e89c4b82f424a291dde9 17-Nov-2012 J. Bruce Fields <bfields@redhat.com> nfsd4: no, we're not going to check tags for utf8

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12fc3e92d4b18b4e99af624586e1696479ff36ce 05-Nov-2012 J. Bruce Fields <bfields@redhat.com> nfsd4: backchannel should use client-provided security flavor

For now this only adds support for AUTH_NULL. (Previously we assumed
AUTH_UNIX.) We'll also need AUTH_GSS, which is trickier.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
cb73a9f4649bf63c0397e565a15abf8a91ecf56f 01-Nov-2012 J. Bruce Fields <bfields@redhat.com> nfsd4: implement backchannel_ctl operation

This operation is mandatory for servers to implement.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
acb2887e04c2140c2c63c8bf94e0b446efcc7001 27-Mar-2012 J. Bruce Fields <bfields@redhat.com> nfsd4: clean up callback security parsing

Move the callback parsing into a separate function.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
ae7095a7c44b4cda963e3d4059788ff60e119684 01-Oct-2012 J. Bruce Fields <bfields@redhat.com> nfsd4: helper function for getting mounted_on ino

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
6e67b5d1840b5788875ad561f2e76a1bf5facc86 13-Sep-2012 J. Bruce Fields <bfields@redhat.com> nfsd4: fix bind_conn_to_session xdr comment

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2930d381d22b9c56f40dd4c63a8fa59719ca2c3c 05-Jun-2012 J. Bruce Fields <bfields@redhat.com> nfsd4: our filesystems are normally case sensitive

Actually, xfs and jfs can optionally be case insensitive; we'll handle
that case in later patches.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
e7a0444aef4a1649bc155fd5c6d6ab3f8bdc88ab 24-Apr-2012 Weston Andros Adamson <dros@netapp.com> nfsd: add IPv6 addr escaping to fs_location hosts

The fs_location->hosts list is split on colons, but this doesn't work when
IPv6 addresses are used (they contain colons).
This patch adds the function nfsd4_encode_components_esc() to
allow the caller to specify escape characters when splitting on 'sep'.
In order to fix referrals, this patch must be used with the mountd patch
that similarly fixes IPv6 [] escaping.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
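
A minimal user-space sketch of splitting on a separator while honoring
escape characters, as the message above describes. The function name
count_components_esc() and its signature are hypothetical; this is not the
kernel's nfsd4_encode_components_esc(), it only counts components.

```c
#include <stddef.h>

/*
 * Count the components of `s` when split on `sep`, ignoring any `sep`
 * that appears between `esc_enter` and `esc_exit` (for example the
 * colons inside "[fe80::1]"). Empty components are not counted.
 */
static size_t count_components_esc(const char *s, char sep,
				   char esc_enter, char esc_exit)
{
	size_t count = 0;
	size_t len = 0;		/* length of the component being scanned */
	int escaped = 0;	/* nonzero while inside an escaped region */

	for (; *s != '\0'; s++) {
		if (!escaped && *s == esc_enter) {
			escaped = 1;
			len++;
		} else if (escaped && *s == esc_exit) {
			escaped = 0;
			len++;
		} else if (!escaped && *s == sep) {
			if (len > 0)
				count++;
			len = 0;
		} else {
			len++;
		}
	}
	if (len > 0)
		count++;
	return count;
}
```

With '[' and ']' as escapes, "[fe80::1]:192.168.0.1" yields two hosts.
Passing 0 for both escape characters disables escaping, and the same string
then breaks apart at every colon.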
45eaa1c1a16122a98bf995c004c23806759d2e5f 26-Apr-2012 J. Bruce Fields <bfields@redhat.com> nfsd4: fix change attribute endianness

Though actually this doesn't matter much, as NFSv4.0 clients are
required to treat the change attribute as opaque.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
d1829b38241394c0c66d407a165fbd6d9897c241 26-Apr-2012 J. Bruce Fields <bfields@redhat.com> nfsd4: fix free_stateid return endianness

Cc: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
57b7b43b403136dc18d067909050e8677f97aeed 25-Apr-2012 J. Bruce Fields <bfields@redhat.com> nfsd4: int/__be32 fixes

In each of these cases there's a simple unambiguous correct choice, and
no actual bug.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2355c59644def5950f982fc1509dd45037e79ded 25-Apr-2012 J. Bruce Fields <bfields@redhat.com> nfsd4: fix missing "static"

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
afcf6792afd66209161495f691e19d4fc5460a93 13-Apr-2012 Al Viro <viro@zeniv.linux.org.uk> nfsd: fix error value on allocation failure in nfsd4_decode_test_stateid()

PTR_ERR(NULL) is going to be 0...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
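
The point about PTR_ERR(NULL) can be illustrated in user space. The helpers
below are simplified stand-ins for the kernel's ERR_PTR()/PTR_ERR(), and
returning -ENOMEM in the fixed variant is an assumption about what a
sensible fix looks like, not a quote of the patch.

```c
#include <errno.h>
#include <stddef.h>

/* User-space stand-ins for the kernel's ERR_PTR()/PTR_ERR() helpers. */
static void *err_ptr(long error)
{
	return (void *)error;
}

static long ptr_err(const void *ptr)
{
	return (long)ptr;
}

/* Buggy pattern: a NULL allocation result is converted to "error" 0. */
static long status_buggy(const void *allocation)
{
	if (allocation == NULL)
		return ptr_err(allocation);	/* 0, which looks like success */
	return 0;
}

/* Fixed pattern: report the allocation failure explicitly. */
static long status_fixed(const void *allocation)
{
	if (allocation == NULL)
		return -ENOMEM;
	return 0;
}
```
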
02f5fde5df0ea930e70f93763dd48beff182b208 13-Apr-2012 Al Viro <viro@zeniv.linux.org.uk> nfsd: fix endianness breakage in TEST_STATEID handling

->ts_id_status gets nfs errno, i.e. it's already big-endian; no need
to apply htonl() to it. Broken by commit 174568 (NFSD: Added TEST_STATEID
operation) last year...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
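
A small user-space sketch of the double-conversion problem described above.
The function names are hypothetical and the value 0x2729 is just an example
status code.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Correct: convert a host-order value to network order exactly once. */
static uint32_t encode_once(uint32_t host_value)
{
	return htonl(host_value);
}

/* Buggy: the value is already big-endian, and htonl() is applied again. */
static uint32_t encode_twice(uint32_t host_value)
{
	return htonl(htonl(host_value));
}

/* Copy a 32-bit word into a byte buffer exactly as it sits in memory. */
static void put_word(unsigned char out[4], uint32_t word)
{
	memcpy(out, &word, 4);
}
```

On a little-endian host the second htonl() undoes the first, so the bytes
that reach the wire are in host order. On a big-endian host htonl() is a
no-op and the bug stays invisible.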
ab4684d1560f8d77f6ce82bd3f1f82937070d397 02-Mar-2012 Chuck Lever <chuck.lever@oracle.com> NFSD: Fix nfs4_verifier memory alignment

Clean up due to code review.

The nfs4_verifier's data field is not guaranteed to be u32-aligned.
Casting an array of chars to a u32 * is considered generally
hazardous.

We can fix most of this by using a __be32 array to generate the
verifier's contents and then byte-copying it into the verifier field.

However, there is one spot where there is a backwards compatibility
constraint: the do_nfsd_create() call expects a verifier which is
32-bit aligned. Fix this spot by forcing the alignment of the create
verifier in the nfsd4_open args structure.

Also, sizeof(nfs4_verifier) is the size of the in-core verifier data
structure, but NFS4_VERIFIER_SIZE is the number of octets in an XDR'd
verifier. The two are not interchangeable, even if they happen to
have the same value.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
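
The "build in an aligned array, then byte-copy" approach from the message
above might look like this in user space. The struct and function names are
hypothetical, and plain uint32_t stands in for the kernel's __be32.

```c
#include <stdint.h>
#include <string.h>

#define VERIFIER_SIZE 8	/* octets in an XDR verifier, as in NFS4_VERIFIER_SIZE */

struct verifier {
	char data[VERIFIER_SIZE];	/* no alignment guarantee beyond char */
};

/* Build the contents in an aligned array, then byte-copy into the field. */
static void gen_verifier(struct verifier *v, uint32_t secs, uint32_t counter)
{
	uint32_t words[2];

	words[0] = secs;
	words[1] = counter;
	/* memcpy() is safe regardless of how v->data happens to be aligned. */
	memcpy(v->data, words, sizeof(v->data));
}
```
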
d24433cdc91c0ed15938d2a6ee9e3e1b00fcfaa3 16-Feb-2012 Benny Halevy <benny@tonian.com> nfsd41: implement NFS4_SHARE_WANT_NO_DELEG, NFS4_OPEN_DELEGATE_NONE_EXT, why_no_deleg

Respect client request for not getting a delegation in NFSv4.1
Appropriately return delegation "type" NFS4_OPEN_DELEGATE_NONE_EXT
and WND4_NOT_WANTED reason.

[nfsd41: add missing break when encoding op_why_no_deleg]
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
03cfb42025a16dc45195dbdd6d368daaa8367429 27-Jan-2012 Bryan Schumaker <bjschuma@netapp.com> NFSD: Clean up the test_stateid function

When I initially wrote it, I didn't understand how lists worked so I
wrote something that didn't use them. I think making a list of stateids
to test is a more straightforward implementation, especially compared to
decoding stateids while simultaneously encoding
a reply to the client.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2c8bd7e0d1b66b2f8f267fd6ab62a30569c792c0 16-Feb-2012 Benny Halevy <benny@tonian.com> nfsd41: split out share_access want and signal flags while decoding

Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
73e79482b40fb6671915e3da0d178862a07ef254 13-Feb-2012 J. Bruce Fields <bfields@redhat.com> nfsd4: rearrange struct nfsd4_slot

Combine two booleans into a single flag field, move the smaller fields
to the end.

(In practice this doesn't make the struct any smaller. But we'll be
adding another flag here soon.)

Remove some debugging code that doesn't look useful, while we're in the
neighborhood.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
67114fe6103183190fb0a24f58e5695a80beb2ec 17-Nov-2011 Thomas Meyer <thomas@m3y3r.de> nfsd4: Use kmemdup rather than duplicating its implementation

The semantic patch that makes this change is available
in scripts/coccinelle/api/memdup.cocci.

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
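
For readers unfamiliar with kmemdup(), a user-space analogue is sketched
below. The name memdup_sketch() is hypothetical; it replaces the
allocate-then-memcpy sequence that the semantic patch looks for.

```c
#include <stdlib.h>
#include <string.h>

/* User-space analogue of kmemdup(): allocate len bytes and copy src into them. */
static void *memdup_sketch(const void *src, size_t len)
{
	void *copy = malloc(len);

	if (copy != NULL)
		memcpy(copy, src, len);
	return copy;	/* NULL on allocation failure; the caller frees it */
}
```
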
fc0d14fe2d6403eb21202fd0c1cf67cd2c85ca67 27-Oct-2011 Benny Halevy <bhalevy@tonian.com> nfsd4: typo logical vs bitwise negate in nfsd4_decode_share_access

Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
345c284290cabb5484df909303e73d6def8ec8ec 20-Oct-2011 Mi Jinlong <mijinlong@cn.fujitsu.com> nfs41: implement DESTROY_CLIENTID operation

According to rfc5661 18.50, implement DESTROY_CLIENTID operation.

Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
92bac8c5d60623167c6802b1f125e6d623708185 20-Oct-2011 Benny Halevy <bhalevy@tonian.com> nfsd4: typo logical vs bitwise negate for want_mask

Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
c668fc6dfcce98f8222ce1c997f4e5c4ac63f3d0 20-Oct-2011 Benny Halevy <bhalevy@tonian.com> nfsd4: allow NFS4_SHARE_SIGNAL_DELEG_WHEN_RESRC_AVAIL | NFS4_SHARE_PUSH_DELEG_WHEN_UNCONTENDED

RFC5661 says:
The client may set one or both of
OPEN4_SHARE_ACCESS_WANT_SIGNAL_DELEG_WHEN_RESRC_AVAIL and
OPEN4_SHARE_ACCESS_WANT_PUSH_DELEG_WHEN_UNCONTENDED.

Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
8b289b2c2355c3bea75f3e499b4aa251a3191382 19-Oct-2011 J. Bruce Fields <bfields@redhat.com> nfsd4: implement new 4.1 open reclaim types

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
01cd4afadbf376de07d364a632cc82a0fc5e8655 17-Oct-2011 Dan Carpenter <dan.carpenter@oracle.com> nfsd4: typo logical vs bitwise negate

This should be a bitwise negate here. It silences a Sparse warning:
fs/nfsd/nfs4xdr.c:693:16: warning: dubious: x & !y

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
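
The difference between the two negations is easy to show in isolation.
WANT_MASK and the function names below are hypothetical and chosen only for
this sketch.

```c
#include <stdint.h>

#define WANT_MASK 0xFF00u	/* hypothetical flag bits for this sketch */

/* Buggy: !WANT_MASK evaluates to 0, so the result is always 0. */
static uint32_t strip_want_buggy(uint32_t x)
{
	return x & !WANT_MASK;
}

/* Fixed: ~WANT_MASK clears only the masked bits and keeps the rest. */
static uint32_t strip_want_fixed(uint32_t x)
{
	return x & ~WANT_MASK;
}
```
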
a084daf512bb66fa3c8e21c7027daea521179cd0 10-Oct-2011 J. Bruce Fields <bfields@redhat.com> nfsd4: move name-length checks to xdr

Again, these checks are better in the xdr code.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
04f9e664b21c4440daf4d08f31db9b18517e4b8d 10-Oct-2011 J. Bruce Fields <bfields@redhat.com> nfsd4: move access/deny validity checks to xdr code

I'd rather put more of these sorts of checks into standardized xdr
decoders for the various types rather than have them cluttering up the
core logic in nfs4proc.c and nfs4state.c.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
38c2f4b12a455cb3a108fd5c79a10df2ba3ec9a7 23-Sep-2011 J. Bruce Fields <bfields@redhat.com> nfsd4: look up stateid's per clientid

Use a separate stateid idr per client, and lookup a stateid by first
finding the client, then looking up the stateid relative to that client.

Also some minor refactoring.

This allows us to improve error returns: we can return expired when the
clientid is not found and bad_stateid when the clientid is found but not
the stateid, as opposed to returning expired for both cases.

I hope this will also help to replace the state lock mostly by a
per-client lock, but that hasn't been done yet.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
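
The improved error returns can be sketched as a two-step lookup. Everything
here is hypothetical (plain arrays instead of an idr, invented type and enum
names); it only shows how finding the client first lets the two failure
cases be told apart.

```c
#include <stddef.h>
#include <stdint.h>

enum lookup_status { LOOKUP_OK, LOOKUP_EXPIRED, LOOKUP_BAD_STATEID };

struct client {
	uint32_t clientid;
	const uint32_t *stateids;	/* stateids owned by this client */
	size_t nr_stateids;
};

/* Find the client first, then search only that client's stateids. */
static enum lookup_status lookup_stateid(const struct client *clients,
					 size_t nr_clients,
					 uint32_t clientid, uint32_t stateid)
{
	size_t i, j;

	for (i = 0; i < nr_clients; i++) {
		if (clients[i].clientid != clientid)
			continue;
		for (j = 0; j < clients[i].nr_stateids; j++)
			if (clients[i].stateids[j] == stateid)
				return LOOKUP_OK;
		return LOOKUP_BAD_STATEID;	/* client known, stateid unknown */
	}
	return LOOKUP_EXPIRED;			/* client not found at all */
}
```
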
36279ac10c3d69372af875f1affafd375db687a9 26-Sep-2011 J. Bruce Fields <bfields@redhat.com> nfsd4: assume test_stateid always has session

Test_stateid is 4.1-only and only allowed after a sequence operation, so
this check is unnecessary.

Cc: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
38c387b52d8404f8fd29d8c26bebc83a80733657 16-Sep-2011 J. Bruce Fields <bfields@redhat.com> nfsd4: match close replays on stateid, not open owner id

Keep around an unhashed copy of the final stateid after the last close
using an openowner, and when identifying a replay, match against that
stateid instead of just against the open owner id. Free it the next
time the seqid is bumped or the stateowner is destroyed.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
58e7b33a58d0cd07c9294d5161553b204c75662d 28-Aug-2011 Mi Jinlong <mijinlong@cn.fujitsu.com> nfsd41: try to check reply size before operation

To check the size of the reply before calling an operation,
we need to get the maximum size of the operation's reply.

v3: using new method as Bruce said,

"we could handle operations in two different ways:

  - For operations that actually change something (write, rename,
    open, close, ...), do it the way we're doing it now: be
    very careful to estimate the size of the response before even
    processing the operation.
  - For operations that don't change anything (read, getattr, ...)
    just go ahead and do the operation. If you realize after the
    fact that the response is too large, then return the error at
    that point.

So we'd add another flag to op_flags: say, OP_MODIFIES_SOMETHING. And for
operations with OP_MODIFIES_SOMETHING set, we'd do the first thing. For
operations without it set, we'd do the second."

Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
[bfields@redhat.com: crash, don't attempt to handle, undefined op_rsize_bop]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
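
The two strategies quoted above can be sketched as follows. Only the flag
name OP_MODIFIES_SOMETHING comes from the message; the struct, the enum and
run_op() are hypothetical, and the "actual_reply" field merely simulates the
size an operation would produce.

```c
#include <stdbool.h>
#include <stddef.h>

#define OP_MODIFIES_SOMETHING 0x1u

struct op_desc {
	unsigned int flags;
	size_t max_reply;	/* worst-case reply size estimate */
	size_t actual_reply;	/* size produced when the op runs (simulated) */
};

enum outcome { DONE, REJECTED_BEFORE, REJECTED_AFTER };

/* Decide whether to run an op given the space left in the reply buffer. */
static enum outcome run_op(const struct op_desc *op, size_t space_left,
			   bool *executed)
{
	*executed = false;
	/* Modifying ops: check the estimate first, so nothing is changed. */
	if ((op->flags & OP_MODIFIES_SOMETHING) && op->max_reply > space_left)
		return REJECTED_BEFORE;
	*executed = true;
	/* Non-modifying ops: run first, fail afterwards if the reply is too big. */
	if (op->actual_reply > space_left)
		return REJECTED_AFTER;
	return DONE;
}
```
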
ed748aacb8e3318fa2cf24e1c197d35b5fd29605 13-Sep-2011 Trond Myklebust <Trond.Myklebust@netapp.com> NFSD: Cleanup for nfsd4_path()

The current code is sort of hackish in that it assumes a referral is always
matched to an export. When we add support for junctions that may not be the
case.
We can replace nfsd4_path() with a function that encodes the components
directly from the dentries. Since nfsd4_path is currently the only user of
the 'ex_pathname' field in struct svc_export, this has the added benefit
of allowing us to get rid of that.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
fe0750e5c43189adb6e6fc59837af7d5a588f413 31-Jul-2011 J. Bruce Fields <bfields@redhat.com> nfsd4: split stateowners into open and lockowners

The stateowner has some fields that only make sense for openowners, and
some that only make sense for lockowners, and I find it a lot clearer if
those are separated out.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
7c13f344cf8bec22301c5ed7ef1d90eecb57ba43 31-Aug-2011 J. Bruce Fields <bfields@redhat.com> nfsd4: drop most stateowner refcounting

Maybe we'll bring it back some day, but we don't have much real use for
it now.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
9072d5c66b17292e3cd055bc8057b2ce6af2fe34 24-Aug-2011 J. Bruce Fields <bfields@redhat.com> nfsd4: cleanup seqid op stateowner usage

Now that the replay owner is in the cstate we can remove it from a lot
of other individual operations and further simplify
nfs4_preprocess_seqid_op().

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
f3e4223751392b9bc0195a806a6e99b4cc399ac0 24-Aug-2011 J. Bruce Fields <bfields@redhat.com> nfsd4: centralize handling of replay owners

Set the stateowner associated with a replay in one spot in
nfs4_preprocess_seqid_op() and keep it in cstate. This allows removing
a few lines of boilerplate from all the nfs4_preprocess_seqid_op()
callers.

Also turn ENCODE_SEQID_OP_TAIL into a function while we're here.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
b7d7ca35807b4c8ca3271885b47e67c843376f77 31-Aug-2011 J. Bruce Fields <bfields@redhat.com> nfsd4: fix off-by-one-error in SEQUENCE reply

The values here represent highest slotid numbers. Since slotid's are
numbered starting from zero, the highest should be one less than the
number of slots.

Reported-by: Rick Macklem <rmacklem@uoguelph.ca>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
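
As a one-line illustration of the off-by-one, with a hypothetical helper
name:

```c
#include <assert.h>
#include <stdint.h>

/* Slot ids start at zero, so the highest valid id is the slot count minus one. */
static uint32_t highest_slotid(uint32_t nr_slots)
{
	assert(nr_slots > 0);	/* a session with no slots has no highest id */
	return nr_slots - 1;
}
```
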
a9004abc34239705840eaf6fe3d6cc9cb7656cba 23-Aug-2011 J. Bruce Fields <bfields@redhat.com> nfsd4: cleanup and consolidate seqid_mutating_err

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
75c096f753b273b59f1b9a0745e9e4b5d911a312 16-Aug-2011 J. Bruce Fields <bfields@redhat.com> nfsd4: it's OK to return nfserr_symlink

The nfsd4 code has a bunch of special exceptions for error returns which
map nfserr_symlink to other errors.

In fact, the spec makes it clear that nfserr_symlink is to be preferred
over less specific errors where possible.

The patch that introduced it back in 2.6.4 is "kNFSd: correct symlink
related error returns.", which claims that these special exceptions
represent an NFSv4 break from v2/v3 tradition--when in fact the symlink
error was introduced with v4.

I suspect what happened was pynfs tests were written that were overly
faithful to the (known-incomplete) rfc3530 error return lists, and then
code was fixed up mindlessly to make the tests pass.

Delete these unnecessary exceptions.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
3d2544b1e4909b6dffa0d140273628913e255e45 15-Aug-2011 J. Bruce Fields <bfields@redhat.com> nfsd4: clean up S_IS -> NF4 file type mapping

A slightly unconventional approach to make the code more compact I could
live with, but let's give the poor reader *some* chance.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
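
A readable mapping from S_IF* file types to NFSv4 types could be written as
a plain switch, sketched here for user space. The numeric values 1 to 7
follow the nfs_ftype4 enumeration in the NFSv4 specification; NF4BAD as 0
and the function name are assumptions made for this sketch.

```c
#define _XOPEN_SOURCE 700	/* make S_IFMT and friends visible in strict modes */
#include <sys/stat.h>
#include <sys/types.h>

enum nf4_type {
	NF4BAD = 0,	/* assumption: placeholder for "no NFSv4 type" */
	NF4REG = 1,
	NF4DIR = 2,
	NF4BLK = 3,
	NF4CHR = 4,
	NF4LNK = 5,
	NF4SOCK = 6,
	NF4FIFO = 7
};

/* Map the file-type bits of a mode to an NFSv4 file type. */
static enum nf4_type nf4_file_type(mode_t mode)
{
	switch (mode & S_IFMT) {
	case S_IFREG:	return NF4REG;
	case S_IFDIR:	return NF4DIR;
	case S_IFBLK:	return NF4BLK;
	case S_IFCHR:	return NF4CHR;
	case S_IFLNK:	return NF4LNK;
	case S_IFSOCK:	return NF4SOCK;
	case S_IFIFO:	return NF4FIFO;
	default:	return NF4BAD;
	}
}
```
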
576163005de286bbd418fcb99cfd0971523a0c6d 11-Aug-2011 J. Bruce Fields <bfields@redhat.com> nfsd4: fix seqid_mutating_error

The set of errors here does *not* agree with the set of errors specified
in the rfc!

While we're there, turn this macro into a function, for the usual
reasons, and move it to the one place where it's actually used.

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
1091006c5eb15cba56785bd5b498a8d0b9546903 24-Jan-2011 J. Bruce Fields <bfields@redhat.com> nfsd: turn on reply cache for NFSv4

It's sort of ridiculous that we've never had a working reply cache for
NFSv4.

On the other hand, we may still not: our current reply cache is likely
not very good, especially in the TCP case (which is the only case that
matters for v4). What we really need here is some serious testing.

Anyway, here's a start.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
3e98abffd1665b884a322aedcd528577842f762f 16-Jul-2011 J. Bruce Fields <bfields@redhat.com> nfsd4: call nfsd4_release_compoundargs from pc_release

This simplifies cleanup a bit.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
17456804546b78a1c13d2b934c8f50bbde141a38 13-Jul-2011 Bryan Schumaker <bjschuma@netapp.com> NFSD: Added TEST_STATEID operation

This operation is used by the client to check the validity of a list of
stateids.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
e1ca12dfb1be7fe8b82ca723a9b511f7d808bf81 13-Jul-2011 Bryan Schumaker <bjschuma@netapp.com> NFSD: added FREE_STATEID operation

This operation is used by the client to tell the server to free a
stateid.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
c47d832bc0155153920e507f075647519bad09a2 16-May-2011 Daniel Mack <zonque@gmail.com> nfsd: make local functions static

This also fixes a number of sparse warnings.

Signed-off-by: Daniel Mack <zonque@gmail.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
6ce2357f1e73e0da8ebeace6ec829f48a646bb8c 27-Apr-2011 Bryan Schumaker <bjschuma@netapp.com> NFSD: Remove unused variable from nfsd4_decode_bind_conn_to_session()

Compiling gave me this warning:
fs/nfsd/nfs4xdr.c: In function ‘nfsd4_decode_bind_conn_to_session’:
fs/nfsd/nfs4xdr.c:427:6: warning: variable ‘dummy’ set but not used
[-Wunused-but-set-variable]

The local variable "dummy" wasn't being used past the READ32() macro that
set it. READ_BUF() should ensure that the xdr buffer is pushed past the
data read into dummy already, so nothing needs to be read in.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
[bfields@redhat.com: minor comment fixup.]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
b7c66360dc34e64742edabf4dc410070aa883119 22-Apr-2011 Andy Adamson <andros@netapp.com> nfsd v4.1 LOCKT clientid field must be ignored

RFC 5661 Section 18.11.3

The clientid field of the owner MAY be set to any value by the client
and MUST be ignored by the server. The reason the server MUST ignore
the clientid field is that the server MUST derive the client ID from
the session ID from the SEQUENCE operation of the COMPOUND request.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
5a02ab7c3c4580f94d13c683721039855b67cda6 10-Mar-2011 Mi Jinlong <mijinlong@cn.fujitsu.com> nfsd: wrong index used in inner loop

We must not use dummy for index.
After the first index, READ32(dummy) will change dummy!!!!

Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
[bfields@redhat.com: Trond points out READ_BUF alone is sufficient.]
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
3ec07aa9522e3d5e9d5ede7bef946756e623a0a0 08-Mar-2011 roel <roel.kluin@gmail.com> nfsd: wrong index used in inner loop

Index i was already used in the outer loop

Cc: stable@kernel.org
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
47c85291d3dd1a51501555000b90f8e281a0458e 16-Feb-2011 NeilBrown <neilb@suse.de> nfsd: correctly handle return value from nfsd_map_name_to_*

These functions return an nfs status, not a host_err. So don't
try to convert before returning.

This is a regression introduced by
3c726023402a2f3b28f49b9d90ebf9e71151157d; I fixed up two of the callers,
but missed these two.

Cc: stable@kernel.org
Reported-by: Herbert Poetzl <herbert@13thfloor.at>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
0d7bb71907546b2baf15d78edd3e508e12963dbf 18-Nov-2010 J. Bruce Fields <bfields@redhat.com> nfsd4: set sequence flag when backchannel is down

Implement the SEQ4_STATUS_CB_PATH_DOWN flag.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
1d1bc8f2074f0b728dfca2a3c16f2f5a3f298ffc 05-Oct-2010 J. Bruce Fields <bfields@redhat.com> nfsd4: support BIND_CONN_TO_SESSION

Basic xdr and processing for BIND_CONN_TO_SESSION. This adds a
connection to the list of connections associated with a session.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
3c726023402a2f3b28f49b9d90ebf9e71151157d 04-Jan-2011 J. Bruce Fields <bfields@redhat.com> nfsd4: return nfs errno from name_to_id functions

This avoids the need for the confusing ESRCH mapping.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2ca72e17e5acb1052c35c9faba609c2289ce7a92 04-Jan-2011 J. Bruce Fields <bfields@redhat.com> nfsd4: move idmap and acl header files into fs/nfsd

These are internal nfsd interfaces.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
da165dd60e136d0609e0a2c0c2a9b9a5372200d6 03-Jan-2011 J. Bruce Fields <bfields@redhat.com> nfsd: remove some unnecessary dropit handling

We no longer need a few of these special cases.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
22b6dee842c6341b49bc09cc5728eb2f8f2b3766 27-Dec-2010 Mi Jinlong <mijinlong@cn.fujitsu.com> nfsd4: fix oops on secinfo_no_name result encoding

The secinfo_no_name code oopses on encoding with

BUG: unable to handle kernel NULL pointer dereference at 00000044
IP: [<e2bd239a>] nfsd4_encode_secinfo+0x1c/0x1c1 [nfsd]

We should implement an nfsd4_encode_secinfo_no_name() instead of using
nfsd4_encode_secinfo().

Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
04f4ad16b231abbfde34c762697ad035a3af0b5f 16-Dec-2010 J. Bruce Fields <bfields@redhat.com> nfsd4: implement secinfo_no_name

Implementation of this operation is mandatory for NFSv4.1.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
5afa040b307952bb804eba34b21646da2842e14d 09-Nov-2010 Mi Jinlong <mijinlong@cn.fujitsu.com> NFSv4.1: Make sure nfsd can decode SP4_SSV correctly at exchange_id

According to RFC, the argument of ssv_sp_parms4 is:

        struct ssv_sp_parms4 {
                state_protect_ops4      ssp_ops;
                sec_oid4                ssp_hash_algs<>;
                sec_oid4                ssp_encr_algs<>;
                uint32_t                ssp_window;
                uint32_t                ssp_num_gss_handles;
        };

If a client sends an exchange_id with SP4_SSV, the server can't decode
the SP4_SSV's ssp_hash_algs and ssp_encr_algs arguments correctly.

This is because the kernel treats each of the two arguments as a single
sec_oid4 struct, when each should be an array of sec_oid4 structs.

Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
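
To make the distinction concrete: in XDR, an array such as sec_oid4<> is a
32-bit element count followed by the elements, and each variable-length
opaque is a 32-bit length followed by data padded to a multiple of four
bytes. The sketch below walks such an array in user space; the function
names are hypothetical and this is not the kernel decoder.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian 32-bit value from four bytes. */
static uint32_t get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Skip an XDR array of variable-length opaques (such as sec_oid4<>).
 * Returns the number of bytes consumed, or 0 if the buffer is too short.
 */
static size_t skip_opaque_array(const unsigned char *buf, size_t len)
{
	size_t off;
	uint32_t count, i;

	if (len < 4)
		return 0;
	count = get_be32(buf);		/* number of elements in the array */
	off = 4;
	for (i = 0; i < count; i++) {
		uint32_t olen;
		size_t padded;

		if (len - off < 4)
			return 0;
		olen = get_be32(buf + off);	/* length of this opaque */
		off += 4;
		if (olen > len - off)
			return 0;
		padded = ((size_t)olen + 3) & ~(size_t)3;	/* XDR pads to 4 bytes */
		if (len - off < padded)
			return 0;
		off += padded;
	}
	return off;
}
```

A decoder that read only one length-prefixed opaque at this position would
stop after the first element (or misread the count as a length) and
mis-parse everything that follows.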
2b44f1ba40914777f4b1075254ba97663d4e2574 30-Sep-2010 Benny Halevy <bhalevy@panasas.com> nfsd4: adjust buflen for encoded attrs bitmap based on actual bitmap length

The existing code adjusted it based on the worst case scenario for the returned
bitmap and the best case scenario for the supported attrs attribute.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[bfields@redhat.com: removed likely/unlikely's]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
ebabe9a9001af0af56c0c2780ca1576246e7a74b 07-Jul-2010 Christoph Hellwig <hch@lst.de> pass a struct path to vfs_statfs

We'll need the path to implement the flags field for statvfs support.
We do have it available in all callers except:

 - ecryptfs_statfs. This one doesn't actually need vfs_statfs but just
   needs to do a call to the lower filesystem statfs method.
 - sys_ustat. Add a non-exported statfs_by_dentry helper for it which
   won't be able to fill out the flags field later on.

In addition rename the helpers for statfs vs fstatfs to do_*statfs instead
of the misleading vfs prefix.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
039a87ca536a85bc169ce092e44bd57adfa1f563 30-Jul-2010 J. Bruce Fields <bfields@redhat.com> nfsd: minor nfsd read api cleanup

Christoph points out that the NFSv2/v3 callers know which case they want
here, so we may as well just call the file=NULL case directly instead of
making this conditional.

Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
76407f76e0f71428f3c31faff004bff87fea51ba 22-Jun-2010 J. Bruce Fields <bfields@citi.umich.edu> nfsd4: fix session reference count leak

Note the session has to be put() here regardless of what happens to the
client.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
4dc6ec00f6347b72312fa41dfc587d5302b05544 19-Apr-2010 J. Bruce Fields <bfields@citi.umich.edu> nfsd4: implement reclaim_complete

This is a mandatory operation. Also, here (not in open) is where we
should be committing the reboot recovery information.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
d76829889ac4250a18cfcc1a606bb256bb9c570c 11-May-2010 Benny Halevy <bhalevy@panasas.com> nfsd4: keep a reference count on client while in use

Get a refcount on the client on SEQUENCE.
Release the refcount and renew the client when all respective compounds have completed.
Do not expire the client by the laundromat while in use.
If the client was expired via another path, free it when the compounds
complete and the refcount reaches 0.

Note that unhash_client_locked must call list_del_init on cl_lru as
it may be called twice for the same client (once from nfs4_laundromat
and then from expire_client)

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
dbd65a7e44fff4741a0b2c84bd6bace85d22c242 03-May-2010 Benny Halevy <bhalevy@panasas.com> nfsd4: use local variable in nfs4svc_encode_compoundres

'cs' is already computed, re-use it.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
26c0c75e69265961e891ed80b38fb62a548ab371 24-Apr-2010 J. Bruce Fields <bfields@citi.umich.edu> nfsd4: fix unlikely race in session replay case

In the replay case, the

renew_client(session->se_client);

happens after we've dropped the sessionid_lock, and without holding a
reference on the session; so there's nothing preventing the session
being freed before we get here.

Thanks to Benny Halevy for catching a bug in an earlier version of this
patch.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Acked-by: Benny Halevy <bhalevy@panasas.com>
2bc3c1179c781b359d4f2f3439cb3df72afc17fc 19-Apr-2010 Neil Brown <neilb@suse.de> nfsd4: bug in read_buf

When read_buf is called to move over to the next page in the pagelist
of an NFSv4 request, it sets argp->end to essentially a random
number, certainly not an address within the page which argp->p now
points to. So subsequent calls to READ_BUF will think there is much
more than a page of spare space (the cast to u32 ensures an unsigned
comparison) so we can expect to fall off the end of the second
page.

We never encountered this in testing because typically the only
operations which use more than two pages are write-like operations,
which have their own decoding logic. Something like a getattr after a
write may cross a page boundary, but it would be very unusual for it to
cross another boundary after that.

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
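
The effect of the unsigned cast can be reproduced in user space. has_room()
below is a hypothetical stand-in for a READ_BUF-style space check, written
under the assumption that the check compares the requested size against the
distance from the current position to the end pointer.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Space check with the pointer difference cast to unsigned: if `end`
 * is stale and lies below `p`, the negative difference wraps to a huge
 * value and the check passes.
 */
static bool has_room(const uint32_t *p, const uint32_t *end, uint32_t nbytes)
{
	return nbytes <= (uint32_t)((const char *)end - (const char *)p);
}
```

With a correct end pointer the check behaves as expected. With an end
pointer that sits before p, even a request for 4096 bytes is accepted, which
is how decoding could run off the end of the page.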
5a0e3ad6af8660be21ca98a971cd00f331318c05 24-Mar-2010 Tejun Heo <tj@kernel.org> include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there. ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and tries to put the new include such that its order conforms
  to its surrounding. It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
cf07d2ea43e5c22149435ee9002cb737eac20eca 01-Mar-2010 J. Bruce Fields <bfields@citi.umich.edu> nfsd4: simplify references to nfsd4 lease time

Instead of accessing the lease time directly, some users call
nfs4_lease_time(), and some a macro, NFSD_LEASE_TIME, defined as
nfs4_lease_time(). Neither layer of indirection serves any purpose.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
462d60577a997aa87c935ae4521bd303733a9f2b 30-Jan-2010 Al Viro <viro@zeniv.linux.org.uk> fix NFS4 handling of mountpoint stat

RFC says we need to follow the chain of mounts if there's more
than one stacked on that point.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3ad2f3fbb961429d2aa627465ae4829758bc7e07 03-Feb-2010 Daniel Mack <daniel@caiaq.de> tree-wide: Assorted spelling fixes

In particular, several occurrences of funny versions of 'success',
'unknown', 'therefore', 'acknowledge', 'argument', 'achieve', 'address',
'beginning', 'desirable', 'separate' and 'necessary' are fixed.

Signed-off-by: Daniel Mack <daniel@caiaq.de>
Cc: Joe Perches <joe@perches.com>
Cc: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
de3cab793c6a5c8505d66bee111edcc7098380ba 12-Dec-2009 Ricardo Labiaga <Ricardo.Labiaga@netapp.com> nfsd4: Use FIRST_NFS4_OP in nfsd4_decode_compound()

Since we're checking for LAST_NFS4_OP, use FIRST_NFS4_OP to be consistent.

Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
c551866e649bac66a5145d100f34086d6edb581e 12-Dec-2009 Ricardo Labiaga <Ricardo.Labiaga@netapp.com> nfsd41: nfsd4_decode_compound() does not recognize all ops

The server incorrectly assumes that the operations in the
array start with value 0. The first operation (OP_ACCESS)
has a value of 3, causing the check in nfsd4_decode_compound
to be off.

Instead of comparing that the operation number is less than
the number of elements in the array, the server should verify
that it is less than the maximum valid operation number
defined by LAST_NFS4_OP.
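
The difference between the two checks can be sketched as follows. This is an illustration only, not the kernel code: every name prefixed with `sketch_` or `SKETCH_` is hypothetical, the first opcode value 3 comes from the text above, the last opcode value 39 is an assumption made for the example, and the "old" check assumes a table holding exactly one element per valid opcode.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for FIRST_NFS4_OP and LAST_NFS4_OP. */
enum {
	SKETCH_FIRST_OP = 3,	/* OP_ACCESS, per the text above */
	SKETCH_LAST_OP = 39,	/* assumed value, for illustration only */
	SKETCH_NUM_OPS = SKETCH_LAST_OP - SKETCH_FIRST_OP + 1
};

/* Count-based check: treats the opcode as if numbering started at 0. */
static bool sketch_count_check(uint32_t opnum)
{
	return opnum < SKETCH_NUM_OPS;
}

/* Range check against the first and last valid opcode values. */
static bool sketch_range_check(uint32_t opnum)
{
	return opnum >= SKETCH_FIRST_OP && opnum <= SKETCH_LAST_OP;
}
```

Under these assumptions the count-based check rejects the highest valid opcodes and accepts values below the first one, while the range check does neither.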

Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
3227fa41abc191384fa81b3bcf52aa7fccb31536 26-Oct-2009 J. Bruce Fields <bfields@citi.umich.edu> nfsd: filter readdir results in V4ROOT case

As with lookup, we treat every object as a mountpoint and pretend it
doesn't exist if it isn't exported.

The preexisting code here is confusing, but I haven't yet figured out
how to make it clearer.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
9a74af21330c8d46efa977d088a62cc1bfa954e9 03-Dec-2009 Boaz Harrosh <bharrosh@panasas.com> nfsd: Move private headers to source directory

Lots of include/linux/nfsd/* headers are only used by
nfsd module. Move them to the source directory

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
341eb184469f8e4a6841fc49a57ad4a27e51c335 03-Dec-2009 Boaz Harrosh <bharrosh@panasas.com> nfsd: Source files #include cleanups

Now that the headers are fixed and carry their own weight, all fs/nfsd/
source files can include a minimal set of headers and still compile just
fine.

This patch should improve the compilation speed of the nfsd module.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
0a3adadee42f2865bb867b8c5f4955b7def9baad 05-Nov-2009 J. Bruce Fields <bfields@citi.umich.edu> nfsd: make fs/nfsd/vfs.h for common includes

None of this stuff is used outside nfsd, so move it out of the common
linux include directory.

Actually, probably none of the stuff in include/linux/nfsd/nfsd.h really
belongs there, so later we may remove that file entirely.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2671a4bf3516757ca028c139a7902a50f2bd994a 02-Sep-2009 Trond Myklebust <Trond.Myklebust@netapp.com> NFSd: Fix filehandle leak in exp_pseudoroot() and nfsd4_path()

nfsd4_path() allocates a temporary filehandle and then fails to free it
before the function exits, leaking reference counts to the dentry and
export that it refers to.

Also, nfsd4_lookupp() puts the result of exp_pseudoroot() in a temporary
filehandle which it releases on success of exp_pseudoroot() but not on
failure; fix exp_pseudoroot to ensure that on failure it releases the
filehandle before returning.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
557ce2646e775f6bda734dd92b10d4780874b9c7 28-Aug-2009 Andy Adamson <andros@netapp.com> nfsd41: replace page based DRC with buffer based DRC

Use NFSD_SLOT_CACHE_SIZE size buffers for sessions DRC instead of holding nfsd
pages in cache.

Connectathon testing has shown that 1024 bytes for encoded compound operation
responses past the sequence operation is sufficient, 512 bytes is a little too
small. Set NFSD_SLOT_CACHE_SIZE to 1024.

Allocate memory for the session DRC in the CREATE_SESSION operation
to guarantee that the memory resource is available for caching responses.
Allocate each slot individually in preparation for slot table size negotiation.

Remove struct nfsd4_cache_entry and helper functions for the old page-based
DRC.

The iov_len calculation in nfs4svc_encode_compoundres is now always
correct. Replay is now done in nfsd4_sequence under the state lock, so
the session ref count is only bumped on non-replay. Clean up the
nfs4svc_encode_compoundres session logic.

The nfsd4_compound_state statp pointer is also not used.
Remove nfsd4_set_statp().

Move useful nfsd4_cache_entry fields into nfsd4_slot.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
a06b1261bdb580b35967d0e055d1ab131b332254 31-Aug-2009 Trond Myklebust <Trond.Myklebust@netapp.com> NFSD: Fix a bug in the NFSv4 'supported attrs' mandatory attribute

The fact that the filesystem doesn't currently list any alternate
locations does _not_ imply that the fs_locations attribute should be
marked as "unsupported".

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
49557cc74c7bdf6a984be227ead9a84b3a26f053 24-Jul-2009 Andy Adamson <andros@netapp.com> nfsd41: Use separate DRC for setclientid

Instead of trying to share the generic 4.1 reply cache code for the
CREATE_SESSION reply cache, it's simpler to handle CREATE_SESSION
separately.

The nfs41 single slot clientid DRC holds the results of create session
processing. CREATE_SESSION can be preceded by a SEQUENCE operation
(an embedded CREATE_SESSION) and the create session single slot cache must be
maintained. nfsd4_replay_cache_entry() and nfsd4_store_cache_entry() do not
implement the replay of an embedded CREATE_SESSION.

The clientid DRC slot does not need the inuse, cachethis or other fields that
the multiple slot session cache uses. Replace the clientid DRC cache struct
nfs4_slot cache with a new nfsd4_clid_slot cache. Save the xdr struct
nfsd4_create_session into the cache at the end of processing, and on a replay,
replace the struct for the replay request with the cached version all while
under the state lock.

nfsd4_proc_compound will handle both the solo and embedded CREATE_SESSION case
via the normal use of encode_operation.

Errors that do not change the create session cache:
A create session NFS4ERR_STALE_CLIENTID error means that a client record
(and associated create session slot) could not be found and therefore can't
be changed. NFSERR_SEQ_MISORDERED errors do not change the slot cache.

All other errors get cached.

Remove the clientid DRC specific check in nfs4svc_encode_compoundres to
put the session only if cstate.session is set which will now always be true.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
6c18ba9f5e506b8115b89b1aa7bdc25178f40b0a 16-Jun-2009 Alexandros Batsakis <Alexandros.Batsakis@netapp.com> nfsd41: move channel attributes from nfsd4_session to a nfsd4_channel_attr struct

the change is valid for both the forechannel and the backchannel (currently dummy)

Signed-off-by: Alexandros Batsakis <Alexandros.Batsakis@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
3c8e03166ae234d16e7871f8009638e0946d303c 16-May-2009 Yu Zhiguo <yuzg@cn.fujitsu.com> NFSv4: do exact check about attribute specified

Server should return NFS4ERR_ATTRNOTSUPP if an attribute specified is
not supported in current environment.
Operations CREATE, NVERIFY, OPEN, SETATTR and VERIFY should do this check.

This bug was found when running the newpynfs tests. The names of the
tests that failed are as follows:
CR12 NVF7a NVF7b NVF7c NVF7d NVF7f NVF7r NVF7s
OPEN15 VF7a VF7b VF7c VF7d VF7f VF7r VF7s

Add function do_check_fattr() to do exact check:
1, Check attribute specified is supported by the NFSv4 server or not.
2, Check FATTR4_WORD0_ACL & FATTR4_WORD0_FS_LOCATIONS are supported
in current environment or not.
3, Check attribute specified is writable or not.

Steps 1 and 3 were done in function nfsd4_decode_fattr() but are now
moved to this function.
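
The three checks above can be sketched as a pair of bitmask tests. This is a minimal illustration, not the do_check_fattr() implementation: all names are hypothetical, the `supported` mask is assumed to already have the ACL and FS_LOCATIONS bits cleared when the current environment lacks them (step 2), and the status used for a non-writable attribute is a placeholder because the text above does not name it.

```c
#include <stdint.h>

enum { SKETCH_BM_WORDS = 3 };	/* assumed bitmap length in 32-bit words */

enum sketch_status {
	SKETCH_OK,
	SKETCH_ATTRNOTSUPP,	/* stands in for NFS4ERR_ATTRNOTSUPP */
	SKETCH_NOT_WRITABLE	/* placeholder; the real error is not stated above */
};

static enum sketch_status
sketch_check_fattr(const uint32_t *requested, const uint32_t *supported,
		   const uint32_t *writable)
{
	int i;

	/* Steps 1 and 2: every requested bit must be supported here. */
	for (i = 0; i < SKETCH_BM_WORDS; i++)
		if (requested[i] & ~supported[i])
			return SKETCH_ATTRNOTSUPP;

	/* Step 3: every requested bit must be writable. */
	for (i = 0; i < SKETCH_BM_WORDS; i++)
		if (requested[i] & ~writable[i])
			return SKETCH_NOT_WRITABLE;

	return SKETCH_OK;
}
```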

Signed-off-by: Yu Zhiguo <yuzg@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
b2c0cea6b1cb210e962f07047df602875564069e 06-May-2009 J. Bruce Fields <bfields@citi.umich.edu> nfsd4: check for negative dentry before use in nfsv4 readdir

After 2f9092e1020246168b1309b35e085ecd7ff9ff72 "Fix i_mutex vs. readdir
handling in nfsd" (and 14f7dd63 "Copy XFS readdir hack into nfsd code"),
an entry may be removed between the first mutex_unlock and the second
mutex_lock. In this case, lookup_one_len() will return a negative
dentry. Check for this case to avoid a NULL dereference.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Reviewed-by: J. R. Okajima <hooanon05@yahoo.co.jp>
Cc: stable@kernel.org
9064caae8f47bb7ed5d91712e81f01c1aba2fa3c 29-Apr-2009 Randy Dunlap <randy.dunlap@oracle.com> nfsd: use C99 struct initializers

Eliminate 56 sparse warnings like this one:

fs/nfsd/nfs4xdr.c:1331:15: warning: obsolete array initializer, use C99 syntax

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
c654b8a9cba6002aad1c01919e4928a79a4a6dcf 16-Apr-2009 J. Bruce Fields <bfields@citi.umich.edu> nfsd: support ext4 i_version

ext4 supports a real NFSv4 change attribute, which is bumped whenever
the ctime would be updated, including times when two updates arrive
within a jiffy of each other. (Note that although ext4 has space for
nanosecond-precision ctime, the real resolution is lower: it actually
uses jiffies as the time-source.) This ensures clients will invalidate
their caches when they need to.

There is some fear that keeping the i_version up-to-date could have
performance drawbacks, so for now it's turned on only by a mount option.
We hope to do something better eventually.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Theodore Tso <tytso@mit.edu>
3352d2c2d0540955a7bbb3421a28330af7f9d79c 08-Apr-2009 J. Bruce Fields <bfields@citi.umich.edu> nfsd4: delete obsolete xdr comments

We don't need comments to tell us these macros are ugly. And we're long
past trying to share any of this code with the BSD's.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
bc749ca4c405d507e6ec6e3f3e5475e9a09faf0a 08-Apr-2009 J. Bruce Fields <bfields@citi.umich.edu> nfsd: eliminate ENCODE_HEAD macro

This macro doesn't serve any useful purpose.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
79fb54abd285b442e1f30f851902f3ddf58e7704 03-Apr-2009 Benny Halevy <bhalevy@panasas.com> nfsd41: CREATE_EXCLUSIVE4_1

Implement the CREATE_EXCLUSIVE4_1 open mode conforming to
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26

This mode allows the client to atomically create a file
if it doesn't exist while setting some of its attributes.

It must be implemented if the server supports persistent
reply cache and/or pnfs.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
8c18f2052e756e7d5dea712fc6e7ed70c00e8a39 03-Apr-2009 Benny Halevy <bhalevy@panasas.com> nfsd41: SUPPATTR_EXCLCREAT attribute

Return bitmask for supported EXCLUSIVE4_1 create attributes.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
7e70570647827345352cf6c17461c9fa166f570a 03-Apr-2009 Andy Adamson <andros@netapp.com> nfsd41: support for 3-word long attribute bitmask

Also, use client minorversion to generate supported attrs

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
c0d6fc8a2d55a8235c301aeb6d5254d5992895d1 03-Apr-2009 Benny Halevy <bhalevy@panasas.com> nfsd41: pass writable attrs mask to nfsd4_decode_fattr

In preparation for EXCLUSIVE4_1

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
60adfc50de3855628dea8f8896a65f471f51301c 03-Apr-2009 Andy Adamson <andros@netapp.com> nfsd41: clientid handling

Extract the clientid from sessionid to set the op_clientid on open.
Verify that the clid for other stateful ops is zero for minorversion != 0
Do all other checks for stateful ops without sessions.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
[fixed whitespace indent]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41 remove sl_session from nfsd4_open]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
496c262cf01106a546ffb7df6fea84b8b881ee19 03-Apr-2009 Andy Adamson <andros@netapp.com> nfsd41: check encode size for sessions maxresponse cached

Calculate the space the compound response has taken after encoding the current
operation.

pad: add on 8 bytes for the next operation's op_code and status so that
there is room to cache a failure on the next operation.

Compare this length to the session se_fmaxresp_cached and return
nfserr_rep_too_big_to_cache if the length is too large.

Our se_fmaxresp_cached will always be a multiple of PAGE_SIZE, and so
will be at least a page and will therefore hold the xdr_buf head.
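
The size check described above can be sketched as follows. This is an illustration under assumptions, not the nfsd4_check_drc_limit() code: the names are hypothetical, and "too large" is taken to mean strictly greater than the session limit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Room for the next operation's op_code and status, per the text above. */
enum { SKETCH_NEXT_OP_PAD = 8 };

/*
 * Returns true when the response encoded so far, plus the pad, no longer
 * fits in the per-session cached-response limit.
 */
static bool sketch_too_big_to_cache(size_t encoded_len, size_t maxresp_cached)
{
	return encoded_len + SKETCH_NEXT_OP_PAD > maxresp_cached;
}
```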

Signed-off-by: Andy Adamson <andros@netapp.com>
[nfsd41: non-page DRC for solo sequence responses]
[fixed nfsd4_check_drc_limit cosmetics]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use cstate session in nfsd4_check_drc_limit]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
6668958fac1d05f55420de702f3678d46c1e93a5 03-Apr-2009 Andy Adamson <andros@netapp.com> nfsd41: stateid handling

When sessions are used, stateful operation sequenceid and stateid handling
are not used. When sessions are used, on the first open set the seqid to 1,
mark state confirmed and skip seqid processing.

When sessions are used the stateid generation number is ignored when it is zero
whereas without sessions bad_stateid or stale stateid is returned.

Add flags to propagate session use to all stateful ops and down to
check_stateid_generation.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
[nfsd4_has_session should return a boolean, not u32]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: pass nfsd4_compoundres * to nfsd4_process_open1]
[nfsd41: calculate HAS_SESSION in nfs4_preprocess_stateid_op]
[nfsd41: calculate HAS_SESSION in nfs4_preprocess_seqid_op]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
e10e0cfc2f27364c73b28adbd3c8688d97049e73 03-Apr-2009 Benny Halevy <bhalevy@panasas.com> nfsd41: destroy_session operation

Implement the destroy_session operation conforming to
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26

[use sessionid_lock spin lock]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
bf864a31d50e3e94d6e76537b97d664913906ff8 03-Apr-2009 Andy Adamson <andros@netapp.com> nfsd41: non-page DRC for solo sequence responses

A session inactivity time compound (lease renewal) or a compound where the
sequence operation has sa_cachethis set to FALSE does not require any pages
to be held in the v4.1 DRC. This is because struct nfsd4_slot is already
caching the session information.

Add logic to the nfs41 server to not cache response pages for solo sequence
responses.

Return nfserr_replay_uncached_rep on the operation following the sequence
operation when sa_cachethis is FALSE.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use cstate session in nfsd4_replay_cache_entry]
[nfsd41: rename nfsd4_no_page_in_cache]
[nfsd41 rename nfsd4_enc_no_page_replay]
[nfsd41 nfsd4_is_solo_sequence]
[nfsd41 change nfsd4_not_cached return]
Signed-off-by: Andy Adamson <andros@netapp.com>
[changed return type to bool]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41 drop parens in nfsd4_is_solo_sequence call]
Signed-off-by: Andy Adamson <andros@netapp.com>
[changed "== 0" to "!"]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
ec6b5d7b5064fde27aee798b81107ea3a830de85 03-Apr-2009 Andy Adamson <andros@netapp.com> nfsd41: create_session operation

Implement the create_session operation conforming to
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26

Look up the client id (generated by the server on exchange_id,
given by the client on create_session).
If neither a confirmed nor an unconfirmed client is found
then the client id is stale.
If a confirmed client is found (i.e. we already received
create_session for it) then compare the sequence id
to determine if it's a replay or possibly a mis-ordered rpc.
If the seqid is in order, update the confirmed client seqid
and proceed with updating the session parameters.

If an unconfirmed client_id is found then verify the creds
and seqid. If both match move the client id to confirmed state
and proceed with processing the create_session.

Currently, we do not support persistent sessions or RDMA.

alloc_init_session generates a new sessionid and creates
a session structure.

NFSD_PAGES_PER_SLOT is used for the max response cached calculation, and for
the counting of DRC pages using the hard limits set in struct svc_serv.

A note on NFSD_PAGES_PER_SLOT:

Other patches in this series allow for NFSD_PAGES_PER_SLOT + 1 pages to be
cached in a DRC slot when the response size is less than NFSD_PAGES_PER_SLOT *
PAGE_SIZE but xdr_buf pages are used. e.g. a READDIR operation will encode a
small amount of data in the xdr_buf head, and then the READDIR in the xdr_buf
pages. So, the hard limit calculation use of pages by a session is
underestimated by the number of cached operations using the xdr_buf pages.

Yet another patch caches no pages for the solo sequence operation, or any
compound where cache_this is False. So the hard limit calculation use of
pages by a session is overestimated by the number of these operations in the
cache.

TODO: improve resource pre-allocation and negotiate session
parameters accordingly. Respect and possibly adjust
backchannel attributes.

Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
[nfsd41: remove headerpadsz from channel attributes]
Our client and server only support a headerpadsz of 0.
[nfsd41: use DRC limits in fore channel init]
[nfsd41: do not change CREATE_SESSION back channel attrs]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[use sessionid_lock spin lock]
[nfsd41: use bool inuse for slot state]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41 remove sl_session from alloc_init_session]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[simplify nfsd4_encode_create_session error handling]
[nfsd41: fix comment style in init_forechannel_attrs]
[nfsd41: allocate struct nfsd4_session and slot table in one piece]
[nfsd41: no need to INIT_LIST_HEAD in alloc_init_session just prior to list_add]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
da3846a2866ddf239311766ff434a82e7b4ac701 03-Apr-2009 Andy Adamson <andros@netapp.com> nfsd41: nfsd DRC logic

Replay a request in nfsd4_sequence.
Add a minorversion to struct nfsd4_compound_state.

Pass the current slot to nfs4svc_encode_compoundres via struct
nfsd4_compoundres to set an NFSv4.1 DRC entry.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use bool inuse for slot state]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use cstate session in nfs4svc_encode_compoundres]
[nfsd41 replace nfsd4_set_cache_entry]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
b85d4c01b76f6969a085d07a767fa45225cb14be 03-Apr-2009 Benny Halevy <bhalevy@panasas.com> nfsd41: sequence operation

Implement the sequence operation conforming to
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26

Check for stale clientid (as derived from the sessionid).
Enforce slotid range and exactly-once semantics using
the slotid and seqid.

If everything went well renew the client lease and
mark the slot INPROGRESS.
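
The slot checks above can be sketched as follows. This is a minimal illustration, not the nfsd4_sequence() code: all names are hypothetical, and the seqid rule used here (the same seqid means a replay, the next seqid means a new request, anything else is misordered) is an assumption based on the session sequencing rule in the NFSv4.1 specification.

```c
#include <stdbool.h>
#include <stdint.h>

enum sketch_result {
	SKETCH_NEW_REQUEST,
	SKETCH_REPLAY,
	SKETCH_BAD_SLOT,
	SKETCH_MISORDERED
};

struct sketch_slot {
	uint32_t seqid;	/* seqid of the last request accepted on this slot */
	bool inuse;	/* set while a request is in progress */
};

static enum sketch_result
sketch_check_sequence(struct sketch_slot *slots, uint32_t nslots,
		      uint32_t slotid, uint32_t seqid)
{
	struct sketch_slot *slot;

	/* Enforce the slotid range first. */
	if (slotid >= nslots)
		return SKETCH_BAD_SLOT;
	slot = &slots[slotid];

	/* Same seqid as last time: the client is retransmitting. */
	if (seqid == slot->seqid)
		return SKETCH_REPLAY;

	/* Only the next seqid (with 32-bit wraparound) is in order. */
	if (seqid != (uint32_t)(slot->seqid + 1))
		return SKETCH_MISORDERED;

	slot->seqid = seqid;
	slot->inuse = true;
	return SKETCH_NEW_REQUEST;
}
```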

Add a struct nfsd4_slot pointer to struct nfsd4_compound_state.
To be used for sessions DRC replay.

[nfsd41: rename sequence catchthis to cachethis]
Signed-off-by: Andy Adamson <andros@netapp.com>
[pulled some code to set cstate->slot from "nfsd DRC logic"]
[use sessionid_lock spin lock]
[nfsd41: use bool inuse for slot state]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd: add a struct nfsd4_slot pointer to struct nfsd4_compound_state]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: add nfsd4_session pointer to nfsd4_compound_state]
[nfsd41: set cstate session]
[nfsd41: use cstate session in nfsd4_sequence]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[simplify nfsd4_encode_sequence error handling]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
0733d21338747483985a5964e852af160d88e429 03-Apr-2009 Andy Adamson <andros@netapp.com> nfsd41: exchange_id operation

Implement the exchange_id operation conforming to
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-28

Based on the client provided name, hash a client id.
If a confirmed one is found, compare the op's creds and
verifier. If the creds match and the verifier is different
then expire the old client (client re-incarnated), otherwise,
if both match, assume it's a replay and ignore it.
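
The confirmed-client case above can be sketched as a small decision function. This is an illustration only, not the nfsd4_exchange_id() code: the names are hypothetical, and the case where the creds do not match is not described in the paragraph above, so the sketch reports it as not covered.

```c
#include <stdbool.h>

enum sketch_action {
	SKETCH_EXPIRE_OLD_CLIENT,	/* client re-incarnated */
	SKETCH_TREAT_AS_REPLAY,
	SKETCH_NOT_COVERED		/* not described in the text above */
};

/* Decide what to do when a confirmed client record was found. */
static enum sketch_action
sketch_confirmed_client(bool creds_match, bool verifier_match)
{
	if (creds_match && !verifier_match)
		return SKETCH_EXPIRE_OLD_CLIENT;
	if (creds_match && verifier_match)
		return SKETCH_TREAT_AS_REPLAY;
	return SKETCH_NOT_COVERED;
}
```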

If an unconfirmed client is found, then copy the new creds
and verifier if they need updating, otherwise assume replay.

The client is moved to a confirmed state on create_session.

In the nfs41 branch set the exchange_id flags to
EXCHGID4_FLAG_USE_NON_PNFS | EXCHGID4_FLAG_SUPP_MOVED_REFER
(pNFS is not supported, Referrals are supported,
Migration is not.).

Address various scenarios from section 18.35 of the spec:

1. Check for EXCHGID4_FLAG_UPD_CONFIRMED_REC_A and set
EXCHGID4_FLAG_CONFIRMED_R as appropriate.

2. Return error codes per 18.35.4 scenarios.

3. Update client records or generate new client ids depending on
scenario.

Note: 18.35.4 case 3 probably still needs revisiting. The handling
seems not quite right.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use utsname for major_id (and copy to server_scope)]
[nfsd41: fix handling of various exchange id scenarios]
Signed-off-by: Mike Sager <sager@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: reverse use of EXCHGID4_INVAL_FLAG_MASK_A]
[simplify nfsd4_encode_exchange_id error handling]
[nfsd41: embed an xdr_netobj in nfsd4_exchange_id]
[nfsd41: return nfserr_serverfault for spa_how == SP4_MACH_CRED]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2db134eb3b39faefc7fbfb200156d175edba2f68 03-Apr-2009 Andy Adamson <andros@netapp.com> nfsd41: xdr infrastructure

Define nfsd41_dec_ops vector and add it to nfsd4_minorversion for
minorversion 1.

Note: nfsd4_enc_ops vector is shared for v4.0 and v4.1
since we don't need to filter out obsolete ops as this is
done in the decoding phase.

exchange_id, create_session, destroy_session, and sequence ops are
implemented as stubs returning nfserr_opnotsupp at this stage.

[was nfsd41: xdr stubs]
[get rid of CONFIG_NFSD_V4_1]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
a1c8c4d1ff54c6c86930ee3c4c73c69eeb9ede61 09-Mar-2009 J. Bruce Fields <bfields@citi.umich.edu> nfsd4: support putpubfh operation

Currently putpubfh returns NFSERR_OPNOTSUPP, which isn't actually
allowed for v4. The right error is probably NFSERR_NOTSUPP.

But let's just implement it; though rarely seen, it can be used by
Solaris (with a special mount option), is mandated by the rfc, and is
trivial for us to support.

Thanks to Yang Hongyang for pointing out the original problem, and to
Mike Eisler, Tom Talpey, Trond Myklebust, and Dave Noveck for further
argument....

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
1e685ec270cb97680be4eb8cf6b615f5f7f1403a 04-Mar-2009 Benny Halevy <bhalevy@panasas.com> NFSD: return nfsv4 error code nfserr_notsupp rather than nfsv[23]'s nfserr_opnotsupp

Thanks to Bill Baker at sun.com for catching this
at Connectathon 2009.

This bug was introduced in 2.6.27

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
84f09f46b4ee9e4e9b6381f8af31817516d2091b 04-Mar-2009 Benny Halevy <bhalevy@panasas.com> NFSD: provide encode routine for OP_OPENATTR

Although this operation is unsupported by our implementation
we still need to provide an encode routine for it to
merely encode its (error) status back in the compound reply.

Thanks to Bill Baker at sun.com for testing with the Sun
OpenSolaris' client, finding, and reporting this bug at
Connectathon 2009.

This bug was introduced in 2.6.27

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
4e65ebf08951326709817e654c149d0a94982e01 15-Dec-2008 Marc Eshel <eshel@almaden.ibm.com> nfsd: delete wrong file comment from nfsd/nfs4xdr.c

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
e31a1b662f40fd460e982ef87582c66c51596cd0 12-Aug-2008 Benny Halevy <bhalevy@panasas.com> nfsd: nfs4xdr decode_stateid helper function

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
5bf8c6911fe88bea4f9850007796fbacf9fbfb88 12-Aug-2008 Benny Halevy <bhalevy@panasas.com> nfsd: properly xdr-decode NFS4_OPEN_CLAIM_DELEGATE_CUR stateid

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
1b6b2257dc5a88aae375ff44485f8d6c4ee800e4 12-Aug-2008 Benny Halevy <bhalevy@panasas.com> nfsd: don't declare p in ENCODE_SEQID_OP_HEAD

After using the encode_stateid helper the "p" pointer declared
by ENCODE_SEQID_OP_HEAD is warned as unused.
In the single site where it is still needed it can be declared
separately using the ENCODE_HEAD macro.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
e2f282b9f0538e4f63255ffa35bf3b902f5fbde2 12-Aug-2008 Benny Halevy <bhalevy@panasas.com> nfsd: nfs4xdr encode_stateid helper function

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
5033b77a931a12bc7395c1834fa50f6d477be3ae 12-Aug-2008 Benny Halevy <bhalevy@panasas.com> nfsd: fix nfsd4_encode_open buffer space reservation

nfsd4_encode_open first reservation is currently for 36 + sizeof(stateid_t)
while it writes after the stateid a cinfo (20 bytes) and 5 more 4-byte
words, for a total of 40 + sizeof(stateid_t).
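
The arithmetic above can be written out as a small helper. This is a sketch with hypothetical names; the byte counts are the ones given in the text above, and the stateid size is left as a parameter rather than assumed.

```c
#include <stddef.h>

enum {
	SKETCH_XDR_WORD = 4,		/* bytes per XDR word */
	SKETCH_CINFO_BYTES = 20,	/* cinfo size, per the text above */
	SKETCH_EXTRA_WORDS = 5		/* further 4-byte words, per the text above */
};

/* Bytes to reserve for the first part of the reply, given the stateid size. */
static size_t sketch_open_reserve(size_t stateid_size)
{
	return stateid_size + SKETCH_CINFO_BYTES
		+ SKETCH_EXTRA_WORDS * SKETCH_XDR_WORD;
}
```

The fixed part comes to 20 + 5 * 4 = 40 bytes, which is 4 bytes more than the 36 reserved before this fix.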

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
c47b2ca42e848e2dce122acbefa0de59320f5683 12-Aug-2008 Benny Halevy <bhalevy@panasas.com> nfsd: properly xdr-encode deleg stateid returned from open

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
5108b27651727b5aba0826e8fd7be71b42428701 18-Jul-2008 Harvey Harrison <harvey.harrison@gmail.com> nfsd: nfs4xdr.c do-while is not a compound statement

The WRITEMEM macro produces sparse warnings of the form:
fs/nfsd/nfs4xdr.c:2668:2: warning: do-while statement is not a compound statement

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
ad1060c89cfe451de849373d98e42fad58dd25ae 18-Jul-2008 J. Bruce Fields <bfields@citi.umich.edu> nfsd: Use C99 initializers in fs/nfsd/nfs4xdr.c

Thanks to problem report and original patch from Harvey Harrison.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Benny Halevy <bhalevy@panasas.com>
695e12f8d2917378d3b93059209e17415de96204 04-Jul-2008 Benny Halevy <bhalevy@panasas.com> nfsd: tabulate nfs4 xdr encoding functions

In preparation for minorversion 1

All encoders now return an nfserr status (typically their
nfserr argument). Unsupported ops go through nfsd4_encode_operation
too, so use nfsd4_encode_noop to encode nothing for their reply body.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
f2feb96bc3d18e50cab7de9eab142f99d91aa5f6 02-Jul-2008 Benny Halevy <bhalevy@panasas.com> nfsd: nfs4 minorversion decoder vectors

Have separate vectors of operation decoders for each minorversion.
Obsolete ops in newer minorversions have default implementation returning
nfserr_opnotsupp.
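
The per-minorversion vectors can be sketched with C99 designated initializers. This is an illustration only, not the nfsd4_dec_ops tables: the opcodes, status values, and function names are all hypothetical, and the handling of unknown minor versions and of opcodes with no table entry is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

enum {
	SKETCH_OK,
	SKETCH_OPNOTSUPP,
	SKETCH_OP_ILLEGAL,
	SKETCH_MINOR_VERS_MISMATCH
};

/* Three made-up opcodes; numbering deliberately does not start at 0. */
enum { SKETCH_OP_A = 3, SKETCH_OP_B = 4, SKETCH_OP_C = 5, SKETCH_LAST_OP = 5 };

typedef int (*sketch_decoder)(void);

static int sketch_decode_ok(void) { return SKETCH_OK; }
static int sketch_decode_notsupp(void) { return SKETCH_OPNOTSUPP; }

/* Minor version 0: OP_C does not exist yet. */
static const sketch_decoder sketch_v0_ops[SKETCH_LAST_OP + 1] = {
	[SKETCH_OP_A] = sketch_decode_ok,
	[SKETCH_OP_B] = sketch_decode_ok,
	[SKETCH_OP_C] = sketch_decode_notsupp,
};

/* Minor version 1: OP_B is obsolete, so it gets the default decoder. */
static const sketch_decoder sketch_v1_ops[SKETCH_LAST_OP + 1] = {
	[SKETCH_OP_A] = sketch_decode_ok,
	[SKETCH_OP_B] = sketch_decode_notsupp,
	[SKETCH_OP_C] = sketch_decode_ok,
};

static const sketch_decoder *const sketch_versions[] = {
	sketch_v0_ops, sketch_v1_ops,
};

static int sketch_decode(uint32_t minorversion, uint32_t opnum)
{
	size_t nversions = sizeof(sketch_versions) / sizeof(sketch_versions[0]);

	if (minorversion >= nversions)
		return SKETCH_MINOR_VERS_MISMATCH;
	/* Slots left out of the initializer are NULL: no such opcode. */
	if (opnum > SKETCH_LAST_OP || !sketch_versions[minorversion][opnum])
		return SKETCH_OP_ILLEGAL;
	return sketch_versions[minorversion][opnum]();
}
```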

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
3c375c6f3a809d0d999d6dc933634f0b97ed7ae9 02-Jul-2008 Benny Halevy <bhalevy@panasas.com> nfsd: unsupported nfs4 ops should fail with nfserr_opnotsupp

nfserr_opnotsupp should be returned for unsupported nfs4 ops
rather than nfserr_op_illegal.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
347e0ad9c91b5bd7506d61f236048cc72b7fc151 02-Jul-2008 Benny Halevy <bhalevy@panasas.com> nfsd: tabulate nfs4 xdr decoding functions

In preparation for minorversion 1

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
30cff1ffff3981c8d96dc33870b652e70190ba37 02-Jul-2008 Benny Halevy <bhalevy@panasas.com> nfsd: return nfserr_minor_vers_mismatch when compound minorversion != 0

Check minorversion once before decoding any operation and reject with
nfserr_minor_vers_mismatch if != 0 (this still happens in nfsd4_proc_compound).
In this case return a zero length resultdata array as required by RFC3530.

minorversion 1 processing will have its own vector of decoders.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
13b1867cacbfe6d8203f432996bd8a2ee6b04e79 28-May-2008 Benny Halevy <bhalevy@panasas.com> nfsd: make nfs4xdr WRITEMEM safe against zero count

WRITEMEM zeroes the last word in the destination buffer
for padding purposes, but this must not be done if
no bytes are to be copied, as it would result
in zeroing of the word right before the array.

The current implementation works since it's always called
with non zero nbytes or it follows an encoding of the
string (or opaque) length which, if equal to zero,
can be overwritten with zero.

Nevertheless, it seems safer to check for this case.
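
A userspace sketch of the guarded copy, assuming a helper named `write_mem` and a local `xdr_quadlen` (both hypothetical stand-ins for the kernel macros, written as functions for clarity):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of 32-bit XDR words needed to hold nbytes. */
static size_t xdr_quadlen(size_t nbytes)
{
	return (nbytes + 3) >> 2;
}

/*
 * Copy nbytes from src into the word buffer at p, zero-padding up to the
 * next 4-byte boundary. Returns the position just past the padded data.
 */
static uint32_t *write_mem(uint32_t *p, const void *src, size_t nbytes)
{
	assert(p != NULL && src != NULL);
	if (nbytes > 0) {
		/* Zero the last word first so the pad bytes are clean. */
		p[xdr_quadlen(nbytes) - 1] = 0;
		memcpy(p, src, nbytes);
	}
	/* With nbytes == 0 nothing is touched, including the word before p. */
	return p + xdr_quadlen(nbytes);
}
```

Without the `nbytes > 0` test, the index `xdr_quadlen(0) - 1` wraps around and the store lands on the word immediately before `p`.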

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
e36cd4a2873c398ba188f16e4087cce7f00a1506 24-Apr-2008 J. Bruce Fields <bfields@citi.umich.edu> nfsd: don't allow setting ctime over v4

Presumably this is left over from earlier drafts of v4, which listed
TIME_METADATA as writeable. It's read-only in rfc 3530, and shouldn't
be modifiable anyway.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
c0ce6ec87c59d7a29438717b1f72f83fb408f416 11-Feb-2008 J. Bruce Fields <bfields@citi.umich.edu> nfsd: clarify readdir/mountpoint-crossing code

The code here is difficult to understand; attempt to clarify somewhat by
pulling out one of the more mystifying conditionals into a separate
function.

While we're here, also add lease_time to the list of attributes that we
don't really need to cross a mountpoint to fetch.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Peter Staubach <staubach@redhat.com>
5477549161480432d053565d2720f08626baf9e3 15-Feb-2008 Jan Blunck <jblunck@suse.de> Use struct path in struct svc_export

I'm embedding struct path into struct svc_export.

[akpm@linux-foundation.org: coding-style fixes]
[ezk@cs.sunysb.edu: NFSD: fix wrong mnt_writer count in rename]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Acked-by: J. Bruce Fields <bfields@citi.umich.edu>
Acked-by: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
406a7ea97d9dc1a9348ba92c4cd0e7c678185c4c 27-Nov-2007 Frank Filz <ffilzlnx@us.ibm.com> nfsd: Allow AIX client to read dir containing mountpoints

This patch addresses a compatibility issue with a Linux NFS server and
AIX NFS client.

I have exported /export as fsid=0 with sec=krb5:krb5i
I have mount --bind /home onto /export/home
I have exported /export/home with sec=krb5i

The AIX client mounts / -o sec=krb5:krb5i onto /mnt

If I do an ls /mnt, the AIX client gets a permission error. Looking at
the network trace, we see a READDIR looking for attributes
FATTR4_RDATTR_ERROR and FATTR4_MOUNTED_ON_FILEID. The response gives a
NFS4ERR_WRONGSEC which the AIX client is not expecting.

Since the AIX client is only asking for an attribute that is an
attribute of the parent file system (pseudo root in my example), it
seems reasonable that there should not be an error.

In discussing this issue with Bruce Fields, I initially proposed
ignoring the error in nfsd4_encode_dirent_fattr() if all that was being
asked for was FATTR4_RDATTR_ERROR and FATTR4_MOUNTED_ON_FILEID, however,
Bruce suggested that we avoid calling cross_mnt() if only these
attributes are requested.

The following patch implements bypassing cross_mnt() if only
FATTR4_RDATTR_ERROR and FATTR4_MOUNTED_ON_FILEID are called. Since there
is some complexity in the code in nfsd4_encode_fattr(), I didn't want to
duplicate code (and introduce a maintenance nightmare), so I added a
parameter to nfsd4_encode_fattr() that indicates whether it should
ignore cross mounts and simply fill in the attribute using the passed in
dentry as opposed to its parent.


Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
ca2a05aa7c72309ee65164c78fa2be7a5038215e 11-Nov-2007 J. Bruce Fields <bfields@citi.umich.edu> nfsd: Fix handling of negative lengths in read_buf()

The length "nbytes" passed into read_buf should never be negative, but
we check only for too-large values of "nbytes", not for too-small
values. Make nbytes unsigned, so it's clear that the former tests are
sufficient. (Despite this, read_buf() currently returns an xdr error
correctly in the case of a negative length, thanks to an unsigned
comparison with size_of() and bounds-checking in kmalloc(). This seems
very fragile, though.)
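
A small sketch of why an unsigned length makes the upper-bound test sufficient. The function names are hypothetical and the check is reduced to a single comparison; the real read_buf() does more:

```c
#include <assert.h>
#include <stdbool.h>

/* Signed length: a negative nbytes slips past an upper-bound-only test. */
static bool fits_signed(int nbytes, int avail)
{
	return nbytes <= avail;
}

/*
 * Unsigned length: the same single test rejects "negative" input,
 * because -1 converts to UINT_MAX before the comparison.
 */
static bool fits_unsigned(unsigned int nbytes, unsigned int avail)
{
	return nbytes <= avail;
}

/* Usage: the signed check accepts -1, the unsigned one refuses it. */
void read_len_demo(void)
{
	assert(fits_signed(-1, 4096));
	assert(!fits_unsigned((unsigned int)-1, 4096));
}
```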

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
a16e92edcd0a2846455a30823e1bac964e743baa 28-Sep-2007 J. Bruce Fields <bfields@citi.umich.edu> knfsd: query filesystem for NFSv4 getattr of FATTR4_MAXNAME

Without this we always return 2^32-1 as the maximum namelength.

Thanks to Andreas Gruenbacher for bug report and testing.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Andreas Gruenbacher <agruen@suse.de>
40ee5dc6af351c1b3de245abed4bd8e6a4a5646a 16-Aug-2007 Peter Staubach <staubach@redhat.com> knfsd: 64 bit ino support for NFS server

Modify the NFS server code to support 64 bit ino's, as
appropriate for the system and the NFS protocol version.

The gist of the changes is to query the underlying file system
for attributes and not just to use the cached attributes in the
inode. For this specific purpose, the inode only contains an
ino field which is unsigned long, which is large enough on 64 bit
platforms, but is not large enough on 32 bit platforms.

I haven't been able to find any reason why ->getattr can't be called
while i_mutex is held. The specification indicates that i_mutex is not
required to be held in order to invoke ->getattr, but it doesn't say
that i_mutex can't be held while invoking ->getattr.

I also haven't come to any conclusions regarding the value of
lease_get_mtime() and whether it should or should not be invoked
by fill_post_wcc() too. I chose not to change this because I
thought that it was safer to leave well enough alone. If we
decide to make a change, it can be done separately.

Signed-off-by: Peter Staubach <staubach@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Acked-by: Neil Brown <neilb@suse.de>
817cb9d43d4c330f9fc023d96e5beaa1abe8c4b7 12-Sep-2007 Chuck Lever <chuck.lever@oracle.com> NFSD: Convert printk's to dprintk's in NFSD's nfs4xdr

Due to recent edict to remove or replace printk's that can flood the system
log.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
ca5c8cde93d65db3139604ca6b91bf8ff3f775e2 26-Jul-2007 Al Viro <viro@ftp.linux.org.uk> lockd and nfsd endianness annotation fixes

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4796f45740bc6f2e3e6cc14e7ed481b38bd0bd39 17-Jul-2007 J. Bruce Fields <bfields@citi.umich.edu> knfsd: nfsd4: secinfo handling without secinfo= option

We could return some sort of error in the case where someone asks for secinfo
on an export without the secinfo= option set--that'd be no worse than what
we've been doing. But it's not really correct. So, hack up an approximate
secinfo response in that case--it may not be complete, but it'll tell the
client at least one acceptable security flavor.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
dcb488a3b7ac3987e21148f44f641c9b2e734232 17-Jul-2007 Andy Adamson <andros@citi.umich.edu> knfsd: nfsd4: implement secinfo

Implement the secinfo operation.

(Thanks to Usha Ketineni, who wrote an earlier version of this support.)

Cc: Usha Ketineni <uketinen@us.ibm.com>
Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
df547efb03e3e8f9ea726e1d07fbbd6fd0706cd7 17-Jul-2007 J. Bruce Fields <bfields@citi.umich.edu> knfsd: nfsd4: simplify exp_pseudoroot arguments

We're passing three arguments to exp_pseudoroot, two of which are just fields
of the svc_rqst. Soon we'll want to pass in a third field as well. So let's
just give up and pass in the whole struct svc_rqst.

Also sneak in some minor style cleanups while we're at it.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
e63340ae6b6205fef26b40a75673d1c9c0c8bb90 08-May-2007 Randy Dunlap <randy.dunlap@oracle.com> header cleaning: don't include smp_lock.h when not used

Remove includes of <linux/smp_lock.h> where it is not used/needed.
Suggested by Al Viro.

Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
f34f924274ad8f84c6d86ea9e52b0682347f5701 16-Feb-2007 J. Bruce Fields <bfields@citi.umich.edu> [PATCH] knfsd: nfsd4: fix error return on unsupported acl

We should be returning ATTRNOTSUPP, not NOTSUPP, when acls are unsupported.

Also fix a comment.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
a4db5fe5dfb3a5b5b550f1acd95ef3de01a3f063 16-Feb-2007 J. Bruce Fields <bfields@snoopy.citi.umich.edu> [PATCH] knfsd: nfsd4: fix memory leak on kmalloc failure in savemem

The wrong pointer is being kfree'd in savemem() when defer_free returns with
an error.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
28e05dd8457c7a7fa1c3faac169a95e0ce4b4a12 16-Feb-2007 J. Bruce Fields <bfields@citi.umich.edu> [PATCH] knfsd: nfsd4: represent nfsv4 acl with array instead of linked list

Simplify the memory management and code a bit by representing acls with an
array instead of a linked list.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
af6a4e280e3ff453653f39190b57b345ff0bec16 14-Feb-2007 NeilBrown <neilb@suse.de> [PATCH] knfsd: add some new fsid types

Add support for using a filesystem UUID to identify an export point in the
filehandle.

For NFSv2, this UUID is xor-ed down to 4 or 8 bytes so that it doesn't take up
too much room. For NFSv3+, we use the full 16 bytes, and possibly also a
64bit inode number for exports beneath the root of a filesystem.
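
One way to picture "xor-ed down" is the fold below. It is only a sketch under assumptions: `uuid_fold` is a hypothetical helper, and the byte-wise folding order shown here is not claimed to match what nfsd does:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Fold a 16-byte UUID down to out_len bytes (4 or 8) by XOR-ing
 * successive out_len-sized chunks on top of each other.
 */
static void uuid_fold(const uint8_t uuid[16], uint8_t *out, size_t out_len)
{
	assert(out_len == 4 || out_len == 8);
	memset(out, 0, out_len);
	for (size_t i = 0; i < 16; i++)
		out[i % out_len] ^= uuid[i];
}
```

Every input byte still influences the result, so two different UUIDs usually fold to different short ids, although collisions become possible once bits are discarded.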

When generating an fsid to return in 'stat' information, use the UUID (hashed
down to size) if it is available and a small 'fsid' was not specifically
provided.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
a0ad13ef643a5829d63c456ab6143bbda60b44a9 26-Jan-2007 NeilBrown <neilb@suse.de> [PATCH] knfsd: Fix type mismatch with filldir_t used by nfsd

nfsd defines a type 'encode_dent_fn' which is much like 'filldir_t' except
that the first pointer is 'struct readdir_cd *' rather than 'void *'. It
then casts encode_dent_fn pointers to 'filldir_t' as needed. This hides any
other type mismatches between the two such as the fact that the 'ino' arg
recently changed from ino_t to u64.

So: get rid of 'encode_dent_fn', get rid of the cast of the function type,
change the first arg of various functions from 'struct readdir_cd *' to
'void *', and live with the fact that we have a little less type checking
on the calling of these functions now. Less internal (to nfsd) checking
offset by more external checking, which is more important.

Thanks to Gabriel Paubert <paubert@iram.es> for discovering this and
providing an initial patch.

Signed-off-by: Gabriel Paubert <paubert@iram.es>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
021d3a72459191a76e8e482ee4937ba6bc9fd712 13-Dec-2006 J.Bruce Fields <bfields@fieldses.org> [PATCH] knfsd: nfsd4: handling more nfsd_cross_mnt errors in nfsd4 readdir

This patch on its own causes no change in behavior, since nfsd_cross_mnt()
only returns -EAGAIN; but in the future I'd like it to also be able to return
-ETIMEDOUT, so we may as well handle any possible error here.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
b8dd7b9ab194d9ab322881f49fde42954757efae 20-Oct-2006 Al Viro <viro@ftp.linux.org.uk> [PATCH] nfsd: NFSv4 errno endianness annotations

don't use the same variable to store NFS and host error values

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
b37ad28bcaa7c486a4ff0fb6c3bdaaacd67b86ce 20-Oct-2006 Al Viro <viro@ftp.linux.org.uk> [PATCH] nfsd: nfs4 code returns error values in net-endian

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2ebbc012a9433a252be7ab4ce54e94bf7b21e506 20-Oct-2006 Al Viro <viro@ftp.linux.org.uk> [PATCH] xdr annotations: NFSv4 server

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
cc45f0175088e000ac7493e5e3f05579b6f7d240 20-Oct-2006 Al Viro <viro@ftp.linux.org.uk> [PATCH] bug: nfsd/nfs4xdr.c misuse of ERR_PTR()

a) ERR_PTR(nfserr_something) is a bad idea;
IS_ERR() will be false for it.
b) mixing nfserr_.... with -EOPNOTSUPP is
even worse idea.

nfsd4_path() does both; caller expects to get NFS protocol error out of it if
anything goes wrong, but if it does we either do not notice (see (a)) or get
host-endian negative (see (b)).

IOW, that's a case when we can't use ERR_PTR() to return error, even though we
return a pointer in case of success.
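
Point (a) can be reproduced in userspace. The sketch below imitates ERR_PTR() and IS_ERR() with lowercase helpers; `find_path` and the value 10004 are hypothetical, chosen only to stand in for a positive protocol status:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095	/* only the top 4095 addresses count as errors */

/* Userspace imitations of the kernel's ERR_PTR() and IS_ERR(). */
static void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/*
 * Alternative shape: return the status and hand the pointer back through
 * an out-parameter. A nonzero return means *out was not written.
 */
static int find_path(bool ok, const char **out)
{
	assert(out != NULL);
	if (!ok)
		return 10004;	/* illustrative positive protocol status */
	*out = "/export";
	return 0;
}
```

A negative errno such as -95 lands in the error range, while any positive status looks like an ordinary pointer, which is why the caller never notices the failure.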

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
42ca09938157105c1f573c831a35e9c3e02eb354 04-Oct-2006 J.Bruce Fields <bfields@fieldses.org> [PATCH] knfsd: nfsd4: actually use all the pieces to implement referrals

Use all the pieces set up so far to implement referral support, allowing
return of NFS4ERR_MOVED and fs_locations attribute.

Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
81c3f4130202a1dcb2b28ab56684eb5e9d43d8c1 04-Oct-2006 J.Bruce Fields <bfields@fieldses.org> [PATCH] knfsd: nfsd4: xdr encoding for fs_locations

Encode fs_locations attribute.

Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
7adae489fe794e3e203ff168595f635d0b845e59 04-Oct-2006 Greg Banks <gnb@melbourne.sgi.com> [PATCH] knfsd: Prepare knfsd for support of rsize/wsize of up to 1MB, over TCP

The limit over UDP remains at 32K. Also, make some of the apparently
arbitrary sizing constants clearer.

The biggest change here involves replacing NFSSVC_MAXBLKSIZE by a function of
the rqstp. This allows it to be different for different protocols (udp/tcp)
and also allows it to depend on the server's declared sv_bufsz.

Note that we don't actually increase sv_bufsz for nfs yet. That comes next.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
3cc03b164cf01c6f36e64720b58610d292fb26f7 04-Oct-2006 NeilBrown <neilb@suse.de> [PATCH] knfsd: Avoid excess stack usage in svc_tcp_recvfrom

.. by allocating the array of 'kvec' in 'struct svc_rqst'.

As we plan to increase RPCSVC_MAXPAGES from 8 up to 256, we can no longer
allocate an array of this size on the stack. So we allocate it in 'struct
svc_rqst'.

However svc_rqst contains (indirectly) an array of the same type and size
(actually several, but they are in a union). So rather than waste space, we
move those arrays out of the separately allocated union and into svc_rqst to
share with the kvec moved out of svc_tcp_recvfrom (various arrays are used at
different times, so there is no conflict).

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
4452435948424e5322c2a2fefbdc2cf3732cc45d 04-Oct-2006 NeilBrown <neilb@suse.de> [PATCH] knfsd: Replace two page lists in struct svc_rqst with one

We are planning to increase RPCSVC_MAXPAGES from about 8 to about 256. This
means we need to be a bit careful about arrays of size RPCSVC_MAXPAGES.

struct svc_rqst contains two such arrays. However, there are never more
than RPCSVC_MAXPAGES pages in the two arrays together, so only one array is
needed.

The two arrays are for the pages holding the request, and the pages holding
the reply. Instead of two arrays, we can simply keep an index into where the
first reply page is.
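
The single-array idea can be sketched as follows. `struct rqst`, its field names, `MAXPAGES` and `next_reply_slot` are all hypothetical; the real struct svc_rqst is laid out differently:

```c
#include <assert.h>
#include <stddef.h>

#define MAXPAGES 8	/* stand-in for RPCSVC_MAXPAGES */

struct rqst {
	void *pages[MAXPAGES];	/* request pages first, reply pages after */
	size_t first_reply;	/* index of the first reply page */
	size_t reply_used;	/* number of reply pages claimed so far */
};

/* Claim the next reply page slot, or NULL if the shared array is full. */
static void **next_reply_slot(struct rqst *rq)
{
	size_t idx;

	assert(rq != NULL && rq->first_reply <= MAXPAGES);
	idx = rq->first_reply + rq->reply_used;
	if (idx >= MAXPAGES)
		return NULL;
	rq->reply_used++;
	return &rq->pages[idx];
}
```

Because request and reply share one array, the combined total can never exceed MAXPAGES, which is the invariant the commit relies on.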

This patch also removes a number of small inline functions that probably
serve to obscure what is going on rather than clarify it, and open-codes the
needed functionality.

Also remove the 'rq_restailpage' variable as it is *always* 0. i.e. if the
response 'xdr' structure has a non-empty tail it is always in the same pages
as the head.

check counters are initialised and incremented properly
check for consistent usage of ++ etc
maybe extract some inlines for common approach
general review

Signed-off-by: Neil Brown <neilb@suse.de>
Cc: Magnus Maatta <novell@kiruna.se>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
73dff8be9ea89df26bfb6a0443ad912de6e7bd00 03-Oct-2006 Eric Sesterhenn <snakebyte@gmx.de> BUG_ON() conversion in fs/nfsd/

This patch converts an if () BUG(); construct to BUG_ON();
which occupies less space, uses unlikely and is safer when
BUG() is disabled.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
726c334223180e3c0197cc980a432681370d4baf 23-Jun-2006 David Howells <dhowells@redhat.com> [PATCH] VFS: Permit filesystem to perform statfs with a known root dentry

Give the statfs superblock operation a dentry pointer rather than a superblock
pointer.

This complements the get_sb() patch. That reduced the significance of
sb->s_root, allowing NFS to place a fake root there. However, NFS does
require a dentry to use as a target for the statfs operation. This permits
the root in the vfsmount to be used instead.

linux/mount.h has been added where necessary to make allyesconfig build
successfully.

Interest has also been expressed for use with the FUSE and XFS filesystems.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nathan Scott <nathans@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
bb6e8a9f4005237401a45f1ea43db060ea5f9725 11-Apr-2006 NeilBrown <neilb@suse.de> [PATCH] knfsd: nfsd4: fix corruption on readdir encoding with 64k pages

Fix corruption on readdir encoding with 64k pages.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
6ed6decccf544970664757464cfb67e081775e6a 11-Apr-2006 NeilBrown <neilb@suse.de> [PATCH] knfsd: nfsd4: fix corruption of returned data when using 64k pages

In v4 we grab an extra page just for the padding of returned data. The
formula that the rpc server uses to allocate pages for the response doesn't
take into account this extra page.

Instead of adjusting those formulae, we adopt the same solution as v2 and v3,
and put the "tail" data in the same page as the "head" data.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
b905b7b0a054d2ab3e0c9304def998546c93f6b5 11-Apr-2006 NeilBrown <neilb@suse.de> [PATCH] knfsd: nfsd4: better nfs4acl errors

We're returning -1 in a few places in the NFSv4<->POSIX acl translation code
where we could return a reasonable error.

Also allows some minor simplification elsewhere.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
e8c96f8c29d89af0c13dc2819a9a00575846ca18 24-Mar-2006 Tobias Klauser <tklauser@nuerscht.ch> [PATCH] fs: Use ARRAY_SIZE macro

Use ARRAY_SIZE macro instead of sizeof(x)/sizeof(x[0]) and remove a
duplicate of ARRAY_SIZE. Some trailing whitespaces are also deleted.
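
For readers unfamiliar with the macro, a simplified userspace equivalent is shown below. The definition omits the compile-time "must be an array" check that recent kernels add, and `limits` and `limit_at` are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified definition: element count of a true array (not a pointer). */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

static const int limits[] = { 10, 20, 30 };

/* Bounds-checked lookup; the bound follows the array automatically. */
static int limit_at(size_t i)
{
	assert(i < ARRAY_SIZE(limits));
	return limits[i];
}
```

Adding an element to `limits` changes the bound without any other edit, which is the maintenance benefit over a hand-written count.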

Signed-off-by: Tobias Klauser <tklauser@nuerscht.ch>
Cc: David Howells <dhowells@redhat.com>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Cc: Chris Mason <mason@suse.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Nathan Scott <nathans@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
34081efc12aaaa12f20e5b59f3cb98ba6e27fb34 19-Jan-2006 Fred Isaman <iisaman@citi.umich.edu> [PATCH] nfsd4: Fix bug in rdattr_error return

Fix bug in rdattr_error return which causes correct error code to be
overwritten by nfserr_toosmall.

Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
3a65588adc4401622b204caa897123e16a4a0318 19-Jan-2006 J. Bruce Fields <bfields@citi.umich.edu> [PATCH] nfsd4: rename lk_stateowner

One of the things that's confusing about nfsd4_lock is that the lk_stateowner
field could be set to either of two different lockowners: the open owner or
the lock owner. Rename to lk_replay_owner and add a comment to make it clear
that it's used for whichever stateowner has its sequence id bumped for replay
detection.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
f99d49adf527fa6f7a9c42257fa76bca6b8df1e3 07-Nov-2005 Jesper Juhl <jesper.juhl@gmail.com> [PATCH] kfree cleanup: fs

This is the fs/ part of the big kfree cleanup patch.

Remove pointless checks for NULL prior to calling kfree() in fs/.
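
The cleanup relies on kfree(NULL) being a no-op, just as free(NULL) is in standard C. A userspace analogue, with the hypothetical names `struct buf` and `buf_release`:

```c
#include <assert.h>
#include <stdlib.h>

struct buf {
	char *data;
};

/* Release the buffer's storage; safe to call more than once. */
static void buf_release(struct buf *b)
{
	assert(b != NULL);
	/* No "if (b->data)" guard needed: free(NULL) does nothing. */
	free(b->data);
	b->data = NULL;
}
```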

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
e34ac862ee6644378bfe6ea65c2e0dda4545513d 08-Jul-2005 NeilBrown <neilb@cse.unsw.edu.au> [PATCH] nfsd4: fix fh_expire_type

After discussion at the recent NFSv4 bake-a-thon, I realized that my
assumption that NFS4_FH_PERSISTENT required filehandles to persist was a
misreading of the spec. This also fixes an interoperability problem with the
Solaris client.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
7fb64cee34f5dc743f697041717cafda8a94b5ac 08-Jul-2005 NeilBrown <neilb@cse.unsw.edu.au> [PATCH] nfsd4: seqid comments

Add some comments on the use of so_seqid, in an attempt to avoid some of the
confusion outlined in the previous patch....

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
bd9aac523b812d58e644fde5e59f5697fb9e3822 08-Jul-2005 NeilBrown <neilb@cse.unsw.edu.au> [PATCH] nfsd4: fix open_reclaim seqid

The sequence number we store in the sequence id is the last one we received
from the client. So on the next operation we'll check that the client gives
us the next higher number.

We increment sequence id's at the last moment, in encode, so that we're sure
of knowing the right error return. (The decision to increment the sequence id
depends on the exact error returned.)

However on the *first* use of a sequence number, if we set the sequence number
to the one received from the client and then let the increment happen on
encode, we'll be left with a sequence number one too high.

For that reason, ENCODE_SEQID_OP_TAIL only increments the sequence id on
*confirmed* stateowners.

This creates a problem for open reclaims, which are confirmed on first use.
Therefore the open reclaim code, as a special exception, *decrements* the
sequence id, cancelling out the undesired increment on encode. But this
prevents the sequence id from ever being incremented in the case where
multiple reclaims are sent with the same openowner. Yuch!

We could add another exception to the open reclaim code, decrementing the
sequence id only if this is the first use of the open owner.

But it's simpler by far to modify the meaning of the op_seqid field: instead
of representing the previous value sent by the client, we take op_seqid, after
encoding, to represent the *next* sequence id that we expect from the client.
This eliminates the need for special-case handling of the first use of a
stateowner.
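
A toy model of the new meaning may help. Everything here (`struct owner`, `seqid_accept`, `seqid_encode`) is hypothetical and ignores replay handling and error-dependent increments; it only shows that storing the next expected value removes the first-use special case:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state owner: seqid holds the NEXT value expected. */
struct owner {
	bool seen;	/* has this owner been used before? */
	uint32_t seqid;	/* next sequence id expected from the client */
};

/* Check an incoming seqid; the first use adopts what the client sent. */
static bool seqid_accept(struct owner *o, uint32_t client_seqid)
{
	if (!o->seen) {
		o->seen = true;
		o->seqid = client_seqid;
	}
	return client_seqid == o->seqid;
}

/* Called at encode time for operations that bump the sequence id. */
static void seqid_encode(struct owner *o)
{
	assert(o->seen);
	o->seqid++;	/* same step for first use, reclaim, and later ops */
}
```

In this model two reclaims sent with the same open owner advance the counter normally, with no decrement needed to cancel the increment.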

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
fd39ca9a808c6026989bc2188868a0574eb37108 24-Jun-2005 NeilBrown <neilb@cse.unsw.edu.au> [PATCH] knfsd: nfsd4: make needlessly global code static

This patch contains the following possible cleanups:

- make needlessly global code static

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
7b190fecfa33d72bcf74c9473134c2ad14ae9545 24-Jun-2005 NeilBrown <neilb@cse.unsw.edu.au> [PATCH] knfsd: nfsd4: delegation recovery

Allow recovery of delegations after reboot.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
496400014f22c4dbdbc1e89249a2feba46939708 24-Jun-2005 NeilBrown <neilb@cse.unsw.edu.au> [PATCH] nfsd4: fix fh_expire_type

We're returning NFS4_FH_NOEXPIRE_WITH_OPEN | NFS4_FH_VOL_RENAME for the
fh_expire_type attribute. This is incorrect:
1. The spec actually only allows NOEXPIRE_WITH_OPEN when
VOLATILE_ANY is also set.
2. Filehandles for open files can expire, if the file is removed
and there is a reboot.
3. Filehandles are only volatile on rename in the nosubtree check
case.

Unfortunately, there's no way to indicate that we only expire on remove. So
our only choice is FH4_VOLATILE_ANY. Although it's redundant, we also set
FH4_VOL_RENAME in the subtree check case, since subtreecheck does actually
cause problems in practice and it seems possibly useful to give clients some
way to distinguish that case.

Fix a misspelled #define while we're at it.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 17-Apr-2005 Linus Torvalds <torvalds@ppc970.osdl.org> Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!