Cross Reference: /fs/nfsd/nfs4xdr.c

History log of /fs/nfsd/nfs4xdr.c
Revision	Date	Author	Comments
15b23ef5d348ea51c5e7573e2ef4116fbc7cb099	24-Sep-2014	J. Bruce Fields <bfields@redhat.com>	nfsd4: fix corruption of NFSv4 read data The calculation of page_ptr here is wrong in the case the read doesn't start at an offset that is a multiple of a page. The result is that nfs4svc_encode_compoundres sets rq_next_page to a value one too small, and then the loop in svc_free_res_pages may incorrectly fail to clear a page pointer in rq_respages[]. Pages left in rq_respages[] are available for the next rpc request to use, so xdr data may be written to that page, which may hold data still waiting to be transmitted to the client or data in the page cache. The observed result was silent data corruption seen on an NFSv4 client. We tag this as "fixing" 05638dc73af2 because that commit exposed this bug, though the incorrect calculation predates it. Particular thanks to Andrea Arcangeli and David Gilbert for analysis and testing. Fixes: 05638dc73af2 "nfsd4: simplify server xdr->next_page use" Cc: stable@vger.kernel.org Reported-by: Andrea Arcangeli <aarcange@redhat.com> Tested-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
24bab491220faa446d945624086d838af41d616c	26-Sep-2014	Anna Schumaker <Anna.Schumaker@netapp.com>	NFSD: Implement SEEK This patch adds server support for the NFS v4.2 operation SEEK, which returns the position of the next hole or data segment in a file. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
87a15a8090c0e5284c0e53528d9defa5d9237866	26-Sep-2014	Anna Schumaker <Anna.Schumaker@netapp.com>	NFSD: Add generic v4.2 infrastructure It's cleaner to introduce everything at once and have the server reply with "not supported" than it would be to introduce extra operations when implementing a specific one in the middle of the list. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
aee3776441461c14ba6d8ed9e2149933e65abb6e	20-Aug-2014	J. Bruce Fields <bfields@redhat.com>	nfsd4: fix rd_dircount enforcement Commit 3b299709091b "nfsd4: enforce rd_dircount" totally misunderstood rd_dircount; it refers to total non-attribute bytes returned, not number of directory entries returned. Bring the code into agreement with RFC 3530 section 14.2.24. Cc: stable@vger.kernel.org Fixes: 3b299709091b "nfsd4: enforce rd_dircount" Signed-off-by: J. Bruce Fields <bfields@redhat.com>
f7b43d0c992c3ec3e8d9285c3fb5e1e0eb0d031a	12-Aug-2014	J. Bruce Fields <bfields@redhat.com>	nfsd4: reserve adequate space for LOCK op As of 8c7424cff6 "nfsd4: don't try to encode conflicting owner if low on space", we permit the server to process a LOCK operation even if there might not be space to return the conflicting lockowner, because we've made returning the conflicting lockowner optional. However, the rpc server still wants to know the most we might possibly return, so we need to take into account the possible conflicting lockowner in the svc_reserve_space() call here. Symptoms were log messages like "RPC request reserved 88 but used 108". Fixes: 8c7424cff6 "nfsd4: don't try to encode conflicting owner if low on space" Reported-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
1383bf37ce2554d7632f21ee03f3ea815edaf933	11-Aug-2014	J. Bruce Fields <bfields@redhat.com>	nfsd4: remove obsolete comment We do what Neil suggests now. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
58fb12e6a42f30adf209f8f41385a3bbb2c82420	30-Jul-2014	Jeff Layton <jlayton@primarydata.com>	nfsd: Add a mutex to protect the NFSv4.0 open owner replay cache We don't want to rely on the client_mutex for protection in the case of NFSv4 open owners. Instead, we add a mutex that will only be taken for NFSv4.0 state mutating operations, and that will be released once the entire compound is done. Also, ensure that nfsd4_cstate_assign_replay/nfsd4_cstate_clear_replay take a reference to the stateowner when they are using it for NFSv4.0 open and lock replay caching. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
f98bac5a30b60a2fca854dd5ee7256221d8ccf0a	07-Jul-2014	Kinglong Mee <kinglongmee@gmail.com>	NFSD: Fix crash encoding lock reply on 32-bit Commit 8c7424cff6 "nfsd4: don't try to encode conflicting owner if low on space" forgot to free conf->data in nfsd4_encode_lockt, and before setting conf->data to NULL in nfsd4_encode_lock_denied, causing a leak. Worse, kfree() can be called on an uninitialized pointer in the case of a successful lock (or one that fails for a reason other than a conflict). (Note that lock->lk_denied.ld_owner.data appears as if it should be zero here, until you notice that it's one arm of a union, the other arm of which is written to in the successful case by the memcpy(&lock->lk_resp_stateid, &lock_stp->st_stid.sc_stateid, sizeof(stateid_t)); in nfsd4_lock(). In the 32-bit case this overwrites ld_owner.data.) Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Fixes: 8c7424cff6 "nfsd4: don't try to encode conflicting owner if low on space" Signed-off-by: J. Bruce Fields <bfields@redhat.com>
5d6031ca742f9f07b9c9d9322538619f3bd155ac	17-Jul-2014	J. Bruce Fields <bfields@redhat.com>	nfsd4: zero op arguments beyond the 8th compound op The first 8 ops of the compound are zeroed since they're a part of the argument that's zeroed by the memset(rqstp->rq_argp, 0, procp->pc_argsize); in svc_process_common(). But we handle larger compounds by allocating the memory on the fly in nfsd4_decode_compound(). Other than code recently fixed by 01529e3f8179 "NFSD: Fix memory leak in encoding denied lock", I don't know of any examples of code depending on this initialization. But it definitely seems possible, and I'd rather be safe. Compounds this long are unusual, so I'm much more worried about failures in these poorly tested cases than about an insignificant performance hit. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
d5d5c304b13bc3cade13b8a1b5833c8b3a0975f1	09-Jul-2014	Kinglong Mee <kinglongmee@gmail.com>	NFSD: Fix bad checking of space for padding in splice read Note that the caller has already reserved space for count and eof, so xdr->p has already moved past them, only the padding remains. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Fixes dc97618ddd (nfsd4: separate splice and readv cases) Signed-off-by: J. Bruce Fields <bfields@redhat.com>
01529e3f817908b394221b0a5d985ae3541641cc	07-Jul-2014	Kinglong Mee <kinglongmee@gmail.com>	NFSD: Fix memory leak in encoding denied lock Commit 8c7424cff6 (nfsd4: don't try to encode conflicting owner if low on space) forgot to free conf->data in nfsd4_encode_lockt, and before setting conf->data to NULL in nfsd4_encode_lock_denied. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
b607664ee74313c7f3f657a044eda572051e560e	30-Jun-2014	Trond Myklebust <trond.myklebust@primarydata.com>	nfsd: Cleanup nfs4svc_encode_compoundres Move the slot return, put session etc into a helper in fs/nfsd/nfs4state.c instead of open coding in nfs4svc_encode_compoundres. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
1055414fe19db2db6c8947c0b9ee9c8fe07beea1	29-Jun-2014	Kinglong Mee <kinglongmee@gmail.com>	NFSD: Avoid warning message when compile at i686 arch fs/nfsd/nfs4xdr.c: In function 'nfsd4_encode_readv': >> fs/nfsd/nfs4xdr.c:3137:148: warning: comparison of distinct pointer types lacks a cast [enabled by default] thislen = min(len, ((void *)xdr->end - (void *)xdr->p)); Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
d5e2338324102dcf34aa25aeaf96064cc4d94dce	24-Jun-2014	J. Bruce Fields <bfields@redhat.com>	nfsd4: replace defer_free by svcxdr_tmpalloc Avoid an extra allocation for the tmpbuf struct itself, and stop ignoring some allocation failures. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
bcaab953b1d3790c724a211f2452b574fd49a7ce	24-Jun-2014	J. Bruce Fields <bfields@redhat.com>	nfsd4: remove nfs4_acl_new This is a not-that-useful kmalloc wrapper. And I'd like one of the callers to actually use something other than kmalloc. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
29c353b3fe54789706c0a37560ce4548a6362c2c	24-Jun-2014	J. Bruce Fields <bfields@redhat.com>	nfsd4: define svcxdr_dupstr to share some common code Signed-off-by: J. Bruce Fields <bfields@redhat.com>
ce043ac826f3ad224142f84d860316a5fd05f79c	24-Jun-2014	J. Bruce Fields <bfields@redhat.com>	nfsd4: remove unused defer_free argument 28e05dd8457c "knfsd: nfsd4: represent nfsv4 acl with array instead of linked list" removed the last user that wanted a custom free function. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
7fb84306f55d6cc32ea894d47cbb2faa18c8f45b	24-Jun-2014	J. Bruce Fields <bfields@redhat.com>	nfsd4: rename cr_linkname->cr_data The name of a link is currently stored in cr_name and cr_namelen, and the content in cr_linkname and cr_linklen. That's confusing. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
b829e9197ad3d8b86dbd5dc1d9bbc5508d214cec	19-Jun-2014	J. Bruce Fields <bfields@redhat.com>	nfsd: fix rare symlink decoding bug An NFS operation that creates a new symlink includes the symlink data, which is xdr-encoded as a length followed by the data plus 0 to 3 bytes of zero-padding as required to reach a 4-byte boundary. The vfs, on the other hand, wants null-terminated data. The simple way to handle this would be by copying the data into a newly allocated buffer with space for the final null. The current nfsd_symlink code tries to be more clever by skipping that step in the (likely) case where the byte following the string is already 0. But that assumes that the byte following the string is ours to look at. In fact, it might be the first byte of a page that we can't read, or of some object that another task might modify. Worse, the NFSv4 code tries to fix the problem by actually writing to that byte. In the NFSv2/v3 cases this actually appears to be safe: - nfs3svc_decode_symlinkargs explicitly null-terminates the data (after first checking its length and copying it to a new page). - NFSv2 limits symlinks to 1k. The buffer holding the rpc request is always at least a page, and the link data (and previous fields) have maximum lengths that prevent the request from reaching the end of a page. In the NFSv4 case the CREATE op is potentially just one part of a long compound so can end up on the end of a page if you're unlucky. The minimal fix here is to copy and null-terminate in the NFSv4 case. The nfsd_symlink() interface here seems too fragile, though. It should really either do the copy itself every time or just require a null-terminated string. Reported-by: Jeff Layton <jlayton@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
c3a4561796cffae6996264876ffca147b5c3709a	06-Jul-2014	Kinglong Mee <kinglongmee@gmail.com>	nfsd: Fix bad reserving space for encoding rdattr_error Introduced by commit 561f0ed498 (nfsd4: allow large readdirs). Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
69bbd9c7b99974f3a701d4de6ef7010c37182a47	26-Jun-2014	Avi Kivity <avi@cloudius-systems.com>	nfs: fix nfs4d readlink truncated packet XDR requires 4-byte alignment; nfs4d READLINK reply writes out the padding, but truncates the packet to the padding-less size. Fix by taking the padding into consideration when truncating the packet. Symptoms: # ll /mnt/ ls: cannot read symbolic link /mnt/test: Input/output error total 4 -rw-r--r--. 1 root root 0 Jun 14 01:21 123456 lrwxrwxrwx. 1 root root 6 Jul 2 03:33 test drwxr-xr-x. 1 root root 0 Jul 2 23:50 tmp drwxr-xr-x. 1 root root 60 Jul 2 23:44 tree Signed-off-by: Avi Kivity <avi@cloudius-systems.com> Fixes: 476a7b1f4b2c (nfsd4: don't treat readlink like a zero-copy operation) Reviewed-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
76f47128f9b33af1e96819746550d789054c9664	19-Jun-2014	J. Bruce Fields <bfields@redhat.com>	nfsd: fix rare symlink decoding bug An NFS operation that creates a new symlink includes the symlink data, which is xdr-encoded as a length followed by the data plus 0 to 3 bytes of zero-padding as required to reach a 4-byte boundary. The vfs, on the other hand, wants null-terminated data. The simple way to handle this would be by copying the data into a newly allocated buffer with space for the final null. The current nfsd_symlink code tries to be more clever by skipping that step in the (likely) case where the byte following the string is already 0. But that assumes that the byte following the string is ours to look at. In fact, it might be the first byte of a page that we can't read, or of some object that another task might modify. Worse, the NFSv4 code tries to fix the problem by actually writing to that byte. In the NFSv2/v3 cases this actually appears to be safe: - nfs3svc_decode_symlinkargs explicitly null-terminates the data (after first checking its length and copying it to a new page). - NFSv2 limits symlinks to 1k. The buffer holding the rpc request is always at least a page, and the link data (and previous fields) have maximum lengths that prevent the request from reaching the end of a page. In the NFSv4 case the CREATE op is potentially just one part of a long compound so can end up on the end of a page if you're unlucky. The minimal fix here is to copy and null-terminate in the NFSv4 case. The nfsd_symlink() interface here seems too fragile, though. It should really either do the copy itself every time or just require a null-terminated string. Reported-by: Jeff Layton <jlayton@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
3c7aa15d2073d81e56e8ba8771a4ab6f23be7ae2	10-Jun-2014	Kinglong Mee <kinglongmee@gmail.com>	NFSD: Using min/max/min_t/max_t for calculate Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
f41c5ad2ff2657978a9712b9ea80cd812a7da2b0	13-Jun-2014	Kinglong Mee <kinglongmee@gmail.com>	NFSD: fix bug for readdir of pseudofs Commit 561f0ed498ca (nfsd4: allow large readdirs) introduced a bug in readdir of the pseudofs root. Call xdr_truncate_encode() to revert the encoded name when skipping an entry. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
542d1ab3c7ce53be7d7122a83d016304af4e6345	02-Jun-2014	J. Bruce Fields <bfields@redhat.com>	nfsd4: kill READ64 Signed-off-by: J. Bruce Fields <bfields@redhat.com>
06553991e7757c668efb3bce9dcc740f31aead60	02-Jun-2014	J. Bruce Fields <bfields@redhat.com>	nfsd4: kill READ32 While we're here, let's kill off a couple of the read-side macros. Leaving the more complicated ones alone for now. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
da2ebce6a0f64cc01bd00aba998c0a4fa7c09843	30-May-2014	Jeff Layton <jlayton@primarydata.com>	nfsd: make nfsd4_encode_fattr static sparse says: CHECK fs/nfsd/nfs4xdr.c fs/nfsd/nfs4xdr.c:2043:1: warning: symbol 'nfsd4_encode_fattr' was not declared. Should it be static? Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12337901d654415d9f764b5f5ba50052e9700f37	28-May-2014	Christoph Hellwig <hch@lst.de>	nfsd: getattr for FATTR4_WORD0_FILES_AVAIL needs the statfs buffer Note nobody's ever noticed because the typical client probably never requests FILES_AVAIL without also requesting something else on the list. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
94eb36892d727145794b80dceffc435d1d68edbb	23-May-2014	Kinglong Mee <kinglongmee@gmail.com>	NFSD: Adds macro EX_UUID_LEN for exports uuid's length Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
a5cddc885b99458df963a75abbe0b40cbef56c48	13-May-2014	J. Bruce Fields <bfields@redhat.com>	nfsd4: better reservation of head space for krb5 RPC_MAX_AUTH_SIZE is scattered around several places. Better to set it once in the auth code, where this kind of estimate should be made. And while we're at it we can leave it zero when we're not using krb5i or krb5p. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
d05d5744ef67879877dbe2e3d0fb9fcc27ee44e5	22-Mar-2014	J. Bruce Fields <bfields@redhat.com>	nfsd4: kill write32, write64 And switch a couple other functions from the encode(&p,...) convention to the p = encode(p,...) convention mostly used elsewhere. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
0c0c267ba96f606b541ab8e4bcde54e6b3f0198f	22-Mar-2014	J. Bruce Fields <bfields@redhat.com>	nfsd4: kill WRITEMEM Signed-off-by: J. Bruce Fields <bfields@redhat.com>
b64c7f3bdfbb468d9026ca91d55c57675724f516	22-Mar-2014	J. Bruce Fields <bfields@redhat.com>	nfsd4: kill WRITE64 Signed-off-by: J. Bruce Fields <bfields@redhat.com>
c373b0a4289ebf1ca6fbf4614d8b457b5f1b489f	22-Mar-2014	J. Bruce Fields <bfields@redhat.com>	nfsd4: kill WRITE32 These macros just obscure what's going on. Adopt the convention of the client-side code. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
c8f13d977518e588ac89dcf8e841821569108109	08-May-2014	J. Bruce Fields <bfields@redhat.com>	nfsd4: really fix nfs4err_resource in 4.1 case encode_getattr, for example, can return nfserr_resource to indicate it ran out of buffer space. That's not a legal error in the 4.1 case. And in the 4.1 case, if we ran out of buffer space, we should have exceeded a session limit too. (Note in 1bc49d83c37cfaf46be357757e592711e67f9809 "nfsd4: fix nfs4err_resource in 4.1 case" we originally tried fixing this error return before fixing the problem that we could error out while we still had lots of available space. The result was to trade one illegal error for another in those cases. We decided that was unhelpful, so reverted the change in fc208d026be0c7d60db9118583fc62f6ca97743d, and are only reinstating it now that we've eliminated almost all of those cases.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>
b042098063849794d69b5322fcc6cf9fb5f2586e	18-Mar-2014	J. Bruce Fields <bfields@redhat.com>	nfsd4: allow exotic read compounds I'm not sure why a client would want to stuff multiple reads in a single compound rpc, but it's legal for them to do it, and we should really support it. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
fec25fa4ad728dd9b063313f2a61ff65eae0d571	13-May-2014	J. Bruce Fields <bfields@redhat.com>	nfsd4: more read encoding cleanup More cleanup, no change in functionality. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
34a78b488f144e011493fa51f10c01d034d47c8e	13-May-2014	J. Bruce Fields <bfields@redhat.com>	nfsd4: read encoding cleanup Trivial cleanup, no change in functionality. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
dc97618ddda9a23e5211e800f0614e9612178200	18-Mar-2014	J. Bruce Fields <bfields@redhat.com>	nfsd4: separate splice and readv cases The splice and readv cases are actually quite different--for example the former case ignores the array of vectors we build up for the latter. It is probably clearer to separate the two cases entirely. There's some code duplication between the split out encoders, but this is only temporary and will be fixed by a later patch. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
b0e35fda827e72cf4b065b52c4c472c28c004fca	04-Feb-2014	J. Bruce Fields <bfields@redhat.com>	nfsd4: turn off zero-copy-read in exotic cases We currently allow only one read per compound, with operations before and after whose responses will require no more than about a page to encode. While we don't expect clients to violate those limits any time soon, this limitation isn't really condoned by the spec, so to future proof the server we should lift the limitation. At the same time we'd like to continue to support zero-copy reads. Supporting multiple zero-copy-reads per compound would require a new data structure to replace struct xdr_buf, which can represent only one set of included pages. So for now we plan to modify encode_read() to support either zero-copy or non-zero-copy reads, and use some heuristics at the start of the compound processing to decide whether a zero-copy read will work. This will allow us to support more exotic compounds without introducing a performance regression in the normal case. Later patches handle those "exotic compounds", this one just makes sure zero-copy is turned off in those cases. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
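
Two entries in this log ("nfsd: fix rare symlink decoding bug" and "nfs: fix nfs4d readlink truncated packet") depend on the XDR rule quoted there: variable-length data is encoded as a length followed by the data plus 0 to 3 bytes of zero padding, so the body ends on a 4-byte boundary. The user-space sketch below illustrates only that arithmetic and the copy-and-null-terminate approach the symlink fix describes. It is not the kernel code; the helper names xdr_padded_len and dup_terminated are hypothetical and chosen for illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel function): number of bytes an XDR
 * opaque body of `len` bytes occupies on the wire, i.e. `len` rounded
 * up to the next multiple of 4.
 */
static size_t xdr_padded_len(size_t len)
{
    return (len + 3) & ~(size_t)3;
}

/*
 * Hypothetical helper (not a kernel function): copy `len` bytes into a
 * freshly allocated buffer and null-terminate the copy. It never reads
 * or writes src[len], the byte that the log entry says may belong to
 * another page or another object. Returns NULL on allocation failure;
 * the caller frees the result.
 */
static char *dup_terminated(const char *src, size_t len)
{
    char *copy = malloc(len + 1);

    if (copy == NULL)
        return NULL;
    memcpy(copy, src, len);
    copy[len] = '\0';
    return copy;
}
```

For a 6-byte link target, xdr_padded_len(6) is 8, so a reply truncated to the unpadded size would be 2 bytes short of what was written, which is the kind of mismatch the READLINK entry describes.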