History log of /include/linux/nfs_page.h
Revision Date Author Comments
7c3af975257383ece54b83c0505d3e0656cb7daf 08-Aug-2014 Weston Andros Adamson <dros@primarydata.com> nfs: don't sleep with inode lock in lock_and_join_requests

This handles the 'nonblock=false' case in nfs_lock_and_join_requests.
If the group is already locked and blocking is allowed, drop the inode lock
and wait for the group lock to be cleared before trying it all again.
This should fix warnings found in peterz's tree (sched/wait branch), where
might_sleep() checks are added to wait.[ch].

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Reviewed-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
b412ddf0661e11485876a202c48868143e3a01cf 18-Jul-2014 Weston Andros Adamson <dros@primarydata.com> nfs: fix comment and add warn_on for PG_INODE_REF

Fix the comment in nfs_page.h for PG_INODE_REF to reflect that it's no longer
set only on head requests. Also add a WARN_ON_ONCE in nfs_inode_remove_request
as PG_INODE_REF should always be set.

Suggested-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
e7029206ff43f6cf7d6fcb741adb126f47200516 18-Jul-2014 Weston Andros Adamson <dros@primarydata.com> nfs: check wait_on_bit_lock err in page_group_lock

Return errors from wait_on_bit_lock from nfs_page_group_lock.

Add a bool argument @wait to nfs_page_group_lock. If true, loop over
wait_on_bit_lock until it returns cleanly. If false, return the error
from wait_on_bit_lock.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
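
The @wait semantics described in the entry above can be sketched in userspace C. This is only an illustration under stated assumptions: `fake_bit_lock`, `fake_wait_on_bit_lock` and `group_lock` are hypothetical names, and the fake wait function merely simulates a few interrupted attempts; it is not the kernel's wait_on_bit_lock.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a bit lock; not a kernel structure. */
struct fake_bit_lock {
	int interruptions;	/* simulated interrupted attempts remaining */
	bool held;		/* true once the lock has been taken */
};

/* Assumed behaviour: fail with -EINTR while interruptions remain. */
static int fake_wait_on_bit_lock(struct fake_bit_lock *l)
{
	if (l->interruptions > 0) {
		l->interruptions--;
		return -EINTR;
	}
	l->held = true;
	return 0;
}

/*
 * wait == true:  retry until the lock is acquired cleanly.
 * wait == false: hand the first error back to the caller.
 */
static int group_lock(struct fake_bit_lock *l, bool wait)
{
	int ret;

	do {
		ret = fake_wait_on_bit_lock(l);
	} while (wait && ret != 0);

	return ret;
}
```

With wait set to false the caller sees the error from the first attempt and has to handle it; with wait set to true the function only returns once the lock is held.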
53113ad35e4b9ce82d949c7c67c7b666fad5d907 09-Jun-2014 Weston Andros Adamson <dros@primarydata.com> pnfs: clean up *_resend_to_mds

Clean up pnfs_read_done_resend_to_mds and pnfs_write_done_resend_to_mds:
- instead of passing all arguments from a nfs_pgio_header, just pass the header
- share the common code

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
d45f60c67848b9f19160692581d78e5b4757a000 09-Jun-2014 Weston Andros Adamson <dros@primarydata.com> nfs: merge nfs_pgio_data into _header

struct nfs_pgio_data only exists as a member of nfs_pgio_header, but is
passed around everywhere, because there used to be multiple _data structs
per _header. Many of these functions then use the _data to find a pointer
to the _header. This patch cleans this up by merging the nfs_pgio_data
structure into nfs_pgio_header and passing nfs_pgio_header around instead.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
1e7f3a485922211b6e4a082ebc6bf05810b0b6ea 09-Jun-2014 Weston Andros Adamson <dros@primarydata.com> nfs: move nfs_pgio_data and remove nfs_rw_header

nfs_rw_header was used to allocate an nfs_pgio_header along with an
nfs_pgio_data, because a _header would need at least one _data.

Now there is only ever one nfs_pgio_data for each nfs_pgio_header -- move
it to nfs_pgio_header and get rid of nfs_rw_header.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
20633f042fd0907300069714b98aaf607a8b5bf8 15-May-2014 Weston Andros Adamson <dros@primarydata.com> nfs: page group syncing in write path

Operations that modify state for a whole page must be synchronized across
all requests within a page group. In the write path, this is calling
end_page_writeback and removing the head request from an inode.
Both of these operations should not be called until all requests
in a page group have reached the point where they would call them.

This patch should have no effect yet since all page groups currently
have one request, but will come into play when pg_test functions are
modified to split pages into sub-page regions.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
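
A rough sketch of the "wait until every request in the group has reached the same point" idea from the entry above. The struct layout and the names `req`, `this_page` and `group_sync_on_bit` are assumptions made for illustration, and the sketch ignores locking entirely; it is not the kernel implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request: a flags word plus a link to the next request
 * that references the same page (the list is circular). */
struct req {
	unsigned long flags;
	struct req *this_page;
};

/*
 * Mark @r as having reached the sync point @bit, then report whether
 * every other request in its group has reached that point as well.
 * Only the last request to arrive gets 'true' and may perform the
 * whole-page operation.
 */
static bool group_sync_on_bit(struct req *r, unsigned int bit)
{
	struct req *tmp;

	r->flags |= 1UL << bit;
	for (tmp = r->this_page; tmp != r; tmp = tmp->this_page)
		if (!(tmp->flags & (1UL << bit)))
			return false;
	return true;
}
```

A group with a single request always returns true on the first call, which matches the note above that the patch has no effect while every page group has one request.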
67d0338edd71db9a4f406d8778f7c525d31e9f7f 15-May-2014 Weston Andros Adamson <dros@primarydata.com> nfs: page group syncing in read path

Operations that modify state for a whole page must be synchronized across
all requests within a page group. In the read path, this is calling
unlock_page and SetPageUptodate. Both of these functions should not be
called until all requests in a page group have reached the point where
they would call them.

This patch should have no effect yet since all page groups currently
have one request, but will come into play when pg_test functions are
modified to split pages into sub-page regions.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2bfc6e566daa8386c9cffef2f7de17fc330d3835 15-May-2014 Weston Andros Adamson <dros@primarydata.com> nfs: add support for multiple nfs reqs per page

Add "page groups" - a circular list of nfs requests (struct nfs_page)
that all reference the same page. This gives nfs read and write paths
the ability to account for sub-page regions independently. This
somewhat follows the design of struct buffer_head's sub-page
accounting.

Only "head" requests are ever added/removed from the inode list in
the buffered write path. "head" and "sub" requests are treated the
same through the read path and the rest of the write/commit path.
Requests are given an extra reference across the life of the list.

Page groups are never rejoined after being split. If the read/write
request fails and the client falls back to another path (i.e. reverts
to the MDS in the pNFS case), the already split requests are pushed through
the recoalescing code again, which may split them further and then
coalesce them into properly sized requests on the wire. Fragmentation
shouldn't be a problem with the current design, because we flush all
requests in a page group when a non-contiguous request is added, so
the only time resplitting should occur is on a resend of a read or
write.

This patch lays the groundwork for sub-page splitting, but does not
actually do any splitting. For now all page groups have one request
as pg_test functions don't yet split pages. There are several related
patches that are needed to support multiple requests per page group.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
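
To make the "circular list of requests that all reference the same page" from the entry above concrete, here is a minimal userspace sketch. The field and function names (`page_req`, `head`, `this_page`, `group_init`, `group_bytes`) are illustrative assumptions, and reference counting and locking are left out.

```c
#include <stddef.h>

/* Hypothetical sub-page request covering [offset, offset + bytes). */
struct page_req {
	unsigned int offset;
	unsigned int bytes;
	struct page_req *head;		/* first request of the group */
	struct page_req *this_page;	/* next request, circular */
};

/*
 * Start a new group when @prev is NULL (the request becomes its own
 * head and points at itself); otherwise link @req in right after @prev.
 */
static void group_init(struct page_req *req, struct page_req *prev)
{
	if (prev == NULL) {
		req->head = req;
		req->this_page = req;
	} else {
		req->head = prev->head;
		req->this_page = prev->this_page;
		prev->this_page = req;
	}
}

/* Walk the circle once, starting anywhere, and sum the covered bytes. */
static size_t group_bytes(const struct page_req *req)
{
	const struct page_req *tmp = req;
	size_t total = 0;

	do {
		total += tmp->bytes;
		tmp = tmp->this_page;
	} while (tmp != req);

	return total;
}
```

Because the list is circular, a walk can start from the head or from any sub request and still visit every member exactly once.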
b4fdac1a5150174df0847a45dc6612ce5ce3daeb 15-May-2014 Weston Andros Adamson <dros@primarydata.com> nfs: modify pg_test interface to return size_t

This is a step toward allowing pg_test to inform the
coalescing code to reduce the size of requests so they may fit in
whatever scheme the pg_test callback wants to define.

For now, just return the size of the request if there is space, or 0
if there is not. This shouldn't change any behavior as it acts
the same as when the pg_test functions returned bool.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
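
The size_t convention described in the entry above (return the request size if it fits, 0 if it does not) might look roughly like this. The `desc` struct and `generic_pg_test` are hypothetical simplifications, not the kernel's descriptor or callback signature.

```c
#include <stddef.h>

/* Hypothetical descriptor: bytes already queued and the size limit. */
struct desc {
	size_t count;	/* bytes coalesced so far */
	size_t bsize;	/* maximum bytes per request on the wire */
};

/*
 * Return the number of bytes of the new request that may be added.
 * In this first step the answer is all or nothing: the full size when
 * there is room, 0 otherwise, mirroring the earlier bool result.
 */
static size_t generic_pg_test(const struct desc *d, size_t req_bytes)
{
	if (d->count > d->bsize)
		return 0;
	/* Compare against the remaining space to avoid overflow. */
	if (req_bytes <= d->bsize - d->count)
		return req_bytes;
	return 0;
}
```

A later callback could return a value between 0 and the request size to ask the coalescing code to split the request, which is the direction the entry describes.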
8c8f1ac109726e4ed44a920f5c962c84610d4a17 15-May-2014 Weston Andros Adamson <dros@primarydata.com> nfs: remove unused arg from nfs_create_request

@inode is passed but not used.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
12c05792599ec57ebab33096b2c75b863dfe6ea4 15-May-2014 Weston Andros Adamson <dros@primarydata.com> nfs: clean up PG_* flags

Remove unused flags PG_NEED_COMMIT and PG_NEED_RESCHED.
Add comments describing how each flag is used.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
1ed26f33008e954a8e91d26f97d4380dea8145db 06-May-2014 Anna Schumaker <Anna.Schumaker@netapp.com> NFS: Create a common initiate_pgio() function

Most of this code is the same for both the read and write paths, so
combine everything and use the rw_ops when necessary.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
0eecb2145c1ce18e36617008424a93836ad0a3bd 06-May-2014 Anna Schumaker <Anna.Schumaker@netapp.com> NFS: Create a common nfs_pgio_result_common function

Combining these functions will let me make a single nfs_rw_common_ops
struct (see the next patch).

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
a4cdda59111f92000297e0d3edb1e0e08ba3549b 06-May-2014 Anna Schumaker <Anna.Schumaker@netapp.com> NFS: Create a common pgio_rpc_prepare function

The read and write paths do exactly the same thing for the rpc_prepare
rpc_op. This patch combines them together into a single function.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
4a0de55c565a36cac8422b76a948c4634a90781e 06-May-2014 Anna Schumaker <Anna.Schumaker@netapp.com> NFS: Create a common rw_header_alloc and rw_header_free function

I create a new struct nfs_rw_ops to decide the differences between reads
and writes. This struct will be set when initializing a new
nfs_pgio_descriptor, and then passed on to the nfs_rw_header when a new
header is allocated.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
f6166384095b7ecf77752b5e9096e6d03d75f7ae 02-Aug-2012 Peng Tao <bergwolf@gmail.com> NFS41: add pg_layout_private to nfs_pageio_descriptor

To allow layout driver to pass private information around
pg_init/pg_doio.

Signed-off-by: Peng Tao <tao.peng@emc.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2f2c63bc221c5fcded24de2704575d0abf96b910 08-Jun-2012 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Cleanup - only store the write verifier in struct nfs_page

The 'committed' field is not needed once we have put the struct nfs_page
on the right list.

Also correct the type of the verifier: it is not an array of __be32, but
simply an 8 byte long opaque array.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
1d1afcbc294cc7c788eb5c7b6b98e8d63caf002c 09-May-2012 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Clean up - Rename nfs_unlock_request and nfs_unlock_request_dont_release

Function rename to ensure that the functionality of nfs_unlock_request()
mirrors that of nfs_lock_request(). Then let nfs_unlock_and_release_request()
do the work of what used to be called nfs_unlock_request()...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Fred Isaman <iisaman@netapp.com>
7ad84aa9448571678c243f0c5ef383fbe5b50f4f 09-May-2012 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Clean up - simplify nfs_lock_request()

We only have two places where we need to grab a reference when trying
to lock the nfs_page. We're better off making that explicit.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Fred Isaman <iisaman@netapp.com>
3aff4ebb95b20ad8db2c1447e8c52097d89af5a7 09-May-2012 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Prevent a deadlock in the new writeback code

We have to unlock the nfs_page before we call nfs_end_page_writeback
to avoid races with functions that expect the page to be unlocked
when PG_locked and PG_writeback are not set.
The problem is that nfs_unlock_request also releases the nfs_page,
causing a deadlock if the release of the nfs_open_context
triggers an iput() while the PG_writeback flag is still set...

The solution is to separate the unlocking and release of the nfs_page,
so that we can do the former before nfs_end_page_writeback and the
latter after.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Fred Isaman <iisaman@netapp.com>
584aa810b6240d88c28113a90c5029449814a3b5 20-Apr-2012 Fred Isaman <iisaman@netapp.com> NFS: rewrite directio read to use async coalesce code

This also has the advantage that it allows directio to use pnfs.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
9533da2979757258d3fd5429d830a297013d69ed 20-Apr-2012 Fred Isaman <iisaman@netapp.com> NFS: remove unused wb_complete field from struct nfs_page

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
061ae2edb7375ab6776468b075da71008a098b55 20-Apr-2012 Fred Isaman <iisaman@netapp.com> NFS: create completion structure to pass into page_init functions

Factors out the code that will need to change when directio
starts using these code paths. This will allow directio to use
the generic pagein and flush routines

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
4db6e0b74c0f6dfc2f9c0690e8df512e3b635983 20-Apr-2012 Fred Isaman <iisaman@netapp.com> NFS: merge _full and _partial read rpc_ops

Decouple nfs_pgio_header and nfs_read_data, and have (possibly
multiple) nfs_read_datas each take a refcount on nfs_pgio_header.

For the moment keeps nfs_read_header as a way to preallocate a single
nfs_read_data with the nfs_pgio_header. The code doesn't need this,
and would be prettier without, but given the amount of churn I am
already introducing I didn't want to play with tuning new mempools.

This also fixes a bug in pnfs_ld_handle_read_error. In the case of
desc->pg_bsize < PAGE_CACHE_SIZE, the pages list was empty, causing
the replay attempt to do nothing.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
8dd3775889345850ecddd689b5c200cdd91bd8c9 15-Mar-2012 Trond Myklebust <Trond.Myklebust@netapp.com> NFSv4.1: Clean ups and bugfixes for the pNFS read/writeback/commit code

Move more pnfs-isms out of the generic commit code.

Bugfixes:

- filelayout_scan_commit_lists doesn't need to get/put the lseg.
In fact since it is run under the inode->i_lock, the lseg_put()
can deadlock.

- Ensure that we distinguish between what needs to be done for
commit-to-data server and what needs to be done for commit-to-MDS
using the new flag PG_COMMIT_TO_DS. Otherwise we may end up calling
put_lseg() on a bucket for a struct nfs_page that got written
through the MDS.

- Fix a case where we were using list_del() on an nfs_page->wb_list
instead of list_del_init().

- filelayout_initiate_commit needs to call filelayout_commit_release
on error instead of the mds_ops->rpc_release(). Otherwise it won't
clear the commit lock.

Cleanups:

- Let the files layout manage the commit lists for the pNFS case.
Don't expose stuff like pnfs_choose_commit_list, and the fact
that the commit buckets hold references to the layout segment
in common code.

- Cast out the put_lseg() calls for the struct nfs_read/write_data->lseg
into the pNFS layer from whence they came.

- Let the pNFS layer manage the NFS_INO_PNFS_COMMIT bit.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Fred Isaman <iisaman@netapp.com>
d6d6dc7cdfda7c8f49a89a7b7261846f319da6d1 08-Mar-2012 Fred Isaman <iisaman@netapp.com> NFS: remove nfs_inode radix tree

The radix tree is only being used to compile lists of reqs needing commit.
It is simpler to just put the reqs directly into a list.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
9994b62b5621f88828d442fcd03fe3ce4c43344b 08-Mar-2012 Fred Isaman <iisaman@netapp.com> NFS: remove NFS_PAGE_TAG_LOCKED

The last real use of this tag was removed by
commit 7f2f12d963 NFS: Simplify nfs_wb_page()

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
fba730050d1246d0e6ef44e026e0b584732fec2b 19-Oct-2011 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Don't rely on PageError in nfs_readpage_release_partial

Don't rely on the PageError flag to tell us if one of the partial reads of
the page failed. Instead, replace that with a dedicated flag in the
struct nfs_page.

Then clean out redundant uses of the PageError flag: the VM no longer
checks it for reads.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
dce81290eed64d24493989bb7a08f9e20495e184 13-Jul-2011 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Move the pnfs write code into pnfs.c

...and ensure that we recoalesce to take into account differences in
block sizes when falling back to write through the MDS.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
493292ddc78d18ee2ad2d5c24c2b7dd6a24641d2 13-Jul-2011 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Move the pnfs read code into pnfs.c

...and ensure that we recoalesce to take into account differences in
block sizes when falling back to read through the MDS.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
d9156f9f364897e93bdd98b4ad22138de18f7c24 12-Jul-2011 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Allow the nfs_pageio_descriptor to signal that a re-coalesce is needed

If an attempt to do pNFS fails, and we have to fall back to writing through
the MDS, then we may want to re-coalesce the requests that we already have
since the block size for the MDS read/writes may be different to that of
the DS read/writes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
50828d7e6767a92726708bc0666e2b8b84575808 12-Jul-2011 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Cache rpc_ops in struct nfs_pageio_descriptor

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
d8007d4dd6ff8749cc8a4063c3ec87442db76d82 10-Jun-2011 Trond Myklebust <Trond.Myklebust@netapp.com> NFSv4.1: Add an initialisation callback for pNFS

Ensure that we always get a layout before setting up the i/o request.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
1751c3638f2a07a8c66a803a31791bab9bd3fced 10-Jun-2011 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Cleanup of the nfs_pageio code in preparation for a pnfs bugfix

We need to ensure that the layouts are set up before we can decide to
coalesce requests. To do so, we want to further split up the struct
nfs_pageio_descriptor operations into an initialisation callback, a
coalescing test callback, and a 'do i/o' callback.

This patch cleans up the existing callback methods before adding the
'initialisation' callback.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
19345cb299e8234006c5125151ab723e851a1d24 20-Jun-2011 Benny Halevy <benny@tonian.com> NFSv4.1: file layout must consider pg_bsize for coalescing

Otherwise we end up overflowing the rpc buffer size on the receive end.

Signed-off-by: Benny Halevy <benny@tonian.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18ad0a9f2ccd260d37dd6bc5fa04c7819def4c84 25-May-2011 Benny Halevy <bhalevy@panasas.com> NFSv4.1: change pg_test return type to bool

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
4d65c520fb4abed970069d18c119cfe85624f46d 25-Mar-2011 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Fix a hang in the writeback path

Now that the inode scalability patches have been merged, it is no longer
safe to call igrab() under the inode->i_lock.
Now that we no longer call nfs_clear_request() until the nfs_page is
being freed, we know that we are always holding a reference to the
nfs_open_context, which again holds a reference to the path, and so
the inode cannot be freed until the last nfs_page has been removed
from the radix tree and freed.

We can therefore skip the igrab()/iput() altogether.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
a861a1e1c398fe34701569fd8ac9225dfe0a9a7e 23-Mar-2011 Fred Isaman <iisaman@netapp.com> NFSv4.1: add generic layer hooks for pnfs COMMIT

We create three major hooks for the pnfs code.

pnfs_mark_request_commit() is called during writeback_done from
nfs_mark_request_commit, which gives the driver an opportunity to
claim it wants control over committing a particular req.

pnfs_choose_commit_list() is called from nfs_scan_list
to choose which list a given req should be added to, based on
where we intend to send it for COMMIT. It is up to the driver
to have preallocated list headers for each destination it may need.

pnfs_commit_list() is how the driver actually takes control; it is
used instead of nfs_commit_list().

In order to pass information between the above functions, we create
a union in nfs_page to hold a lseg (which is possible because the req is
not on any list while in transition), and add some flags to indicate
if we need to use the pnfs code.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
b31268ac793fd300da66b9c28bbf0a200339ab96 21-Mar-2011 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Use stable writes when not doing a bulk flush

If we're only doing a single write, and there are no other unstable
writes being queued up, we might want to just flip to using a stable
write RPC call.

Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
c76069bda0f17cd3e153e54d9ac01242909c6b15 03-Mar-2011 Fred Isaman <iisaman@netapp.com> NFSv4.1: rearrange ->doio args

This will make it possible to clear the lseg pointer in the same
function as it is put, instead of in the caller nfs_pageio_doio().

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
bae724ef95b0d0a1f4518f5451e7c8aabc41f820 01-Mar-2011 Fred Isaman <iisaman@netapp.com> NFSv4.1: shift pnfs_update_layout locations

Move the pnfs_update_layout call location to nfs_pageio_do_add_request().
Grab the lseg sent in the doio function to nfs_read_rpcsetup and attach
it to each nfs_read_data so it can be sent to the layout driver.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
94ad1c80e28f9700c84b4d28d1e5302ddf63a6fd 01-Mar-2011 Fred Isaman <iisaman@netapp.com> NFSv4.1: coalesce across layout stripes

Add a pg_test layout driver hook which is used to avoid coalescing I/O across
layout stripes.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2df485a774ba59c3f43bfe84107672c1d9b731a0 08-Dec-2010 Trond Myklebust <Trond.Myklebust@netapp.com> nfs: remove extraneous and problematic calls to nfs_clear_request

When a nfs_page is freed, nfs_free_request is called which also calls
nfs_clear_request to clean out the lock and open contexts and free the
pagecache page.

However, a couple of places in the nfs code call nfs_clear_request
themselves. What happens here if the refcount on the request is still high?
We'll be releasing contexts and freeing pointers while the request is
possibly still in use.

Remove those bare calls to nfs_clear_request. That should only be done when
the request is being freed.

Note that when doing this, we need to watch out for tests of req->wb_page.
Previously, nfs_set_page_tag_locked() and nfs_clear_page_tag_locked()
would check the value of req->wb_page to figure out if the page is mapped
into the nfsi->nfs_page_tree. We now indicate the page is mapped using
the new bit PG_MAPPED in req->wb_flags .

Reported-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
f11ac8db5d07b6e99d41ff4aa39d878ee5cef1c5 25-Jun-2010 Trond Myklebust <Trond.Myklebust@netapp.com> NFSv4: Ensure that we track the NFSv4 lock state in read/write requests.

This patch fixes bugzilla entry 14501:
https://bugzilla.kernel.org/show_bug.cgi?id=14501

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
e468bae97d243fe0e1515abaa1f7d0edf1476ad0 13-Jun-2008 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Allow redirtying of a completed unstable write.

Currently, if an unstable write completes, we cannot redirty the page in
order to reflect a new change in the page data until after we've sent a
COMMIT request.

This patch allows a page rewrite to proceed without the unnecessary COMMIT
step, putting it immediately back onto the dirty page list, undoing the
VM unstable write accounting, and removing the NFS_PAGE_TAG_COMMIT tag from
the NFS radix tree.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
acee478afc6ff7e1b8852d9a4dca1ff36021414d 22-Jan-2008 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Clean up the write request locking.

Ensure that we set/clear NFS_PAGE_TAG_LOCKED when the nfs_page is hashed.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
9cccef95052c7169040c3577e17d4f6fa230cc28 22-Jul-2007 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Clean up write code...

The addition of nfs_page_mkwrite means that we should no longer need to
create requests inside nfs_writepage().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2aefa104313996d1a9582476cee53d1296c834bf 17-Jun-2007 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Remove the redundant 'dirty' and 'commit' lists from nfs_inode

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
5c36968343fcd013a3f7ae93f246c2e75596780b 17-Jun-2007 Trond Myklebust <Trond.Myklebust@netapp.com> NFS cleanup: speed up nfs_scan_commit using radix tree tags

Add a tag for requests that are waiting for a COMMIT

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
9fd367f0f376ccfb2592eed9be0eece70429894f 17-Jun-2007 Trond Myklebust <Trond.Myklebust@netapp.com> NFS cleanup: Rename NFS_PAGE_TAG_WRITEBACK to NFS_PAGE_TAG_LOCKED

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
c03b40246123b2ced79e2620d1d2c089bb12369a 17-Jun-2007 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Convert struct nfs_page to use krefs

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
7fe7f8487ae742239dd8c66596e2311c30d057d1 20-May-2007 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Avoid a deadlock situation on write

When processes are allowed to attempt to lock a non-contiguous range of nfs
write requests, it is possible for generic_writepages to 'wrap round' the
address space, and call writepage() on a request that is already locked by
the same process.

We avoid the deadlock by checking if the page index is contiguous with the
list of nfs write requests that is already held in our
nfs_pageio_descriptor prior to attempting to lock a new request.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
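
The contiguity check described in the entry above can be illustrated with a small sketch. The `held_range` struct and `index_is_contiguous` are hypothetical; the assumption here is that a caller would flush the requests it already holds whenever the check fails, and only then try to lock the new request.

```c
#include <stdbool.h>

/* Hypothetical summary of the requests already held in a descriptor:
 * a run of @npages consecutive page indexes starting at @base. */
struct held_range {
	unsigned long base;
	unsigned long npages;
};

/*
 * A new page index is safe to lock without flushing first only if
 * nothing is held yet, or if it directly follows the held run.
 */
static bool index_is_contiguous(const struct held_range *h,
				unsigned long index)
{
	if (h->npages == 0)
		return true;
	return index == h->base + h->npages;
}
```

An index that wraps round to the start of the address space fails this check, so the held requests get sent out before the process can block on one of its own locked requests.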
ca52fec152282ef73e5e882b847b36b1febbb1c6 17-Apr-2007 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Use pgoff_t in structures and functions that pass page cache offsets

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
8d5658c949e6d89edc579a1f112aeee3bc232a8e 10-Apr-2007 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Fix a buffer overflow in the allocation of struct nfs_read/writedata

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
c63c7b051395368573779c8309aa5c990dcf2f96 03-Apr-2007 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Fix a race when doing NFS write coalescing

Currently we do write coalescing in a very inefficient manner: one pass in
generic_writepages() in order to lock the pages for writing, then one pass
in nfs_flush_mapping() and/or nfs_sync_mapping_wait() in order to gather
the locked pages for coalescing into RPC requests of size "wsize".

In fact, it turns out there is actually a deadlock possible here since we
only start I/O on the second pass. If the user signals the process while
we're in nfs_sync_mapping_wait(), for instance, then we may exit before
starting I/O on all the requests that have been queued up.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
8b09bee3083897e375bd0bf9d60f48daedfab3e0 03-Apr-2007 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Cleanup for nfs_readpages()

Do the coalescing of read requests into block sized requests at start of
I/O as we scan through the pages instead of going through a second pass.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
bcb71bba7e64f0442d0ca339d7d3117a7060589f 03-Apr-2007 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Another cleanup of the read/write request coalescing code

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
d8a5ad75cc4d577987964e37a4c43b1c648c201e 03-Apr-2007 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Cleanup the coalescing code

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
8e821cad12e80cd1a8a3fbadf91f62f17f32549e 20-Apr-2007 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: clean up the unstable write code

Get rid of the inlined #ifdefs.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5a6d41b32a17ca902ef50fdfa170d7f23264bad5 15-Apr-2007 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Ensure PG_writeback is cleared when writeback fails

If the writebacks are cancelled via nfs_cancel_dirty_list, or due to the
memory allocation failing in nfs_flush_one/nfs_flush_multi, then we must
ensure that the PG_writeback flag is cleared.

Also ensure that we actually own the PG_writeback flag whenever we
schedule a new writeback by making nfs_set_page_writeback() return the
value of test_set_page_writeback().
The PG_writeback page flag ends up replacing the functionality of the
PG_FLUSHING nfs_page flag, so we rip that out too.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
e261f51f25b98c213e0b3d7f2109b117d714f69d 05-Dec-2006 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Make nfs_updatepage() mark the page as dirty.

This will ensure that we can call set_page_writeback() from within
nfs_writepage(), which is always called with the page lock set.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
1a54533ec8d92a5edae97ec6ae10023ee71c4b46 05-Dec-2006 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Add nfs_set_page_dirty()

We will want to allow nfs_writepage() to distinguish between pages that
have been marked as dirty by the VM, and those that have been marked as
dirty by nfs_updatepage().
In the former case, the entire page will want to be written out, and so any
requests that were pending need to be flushed out first.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
3f442547b76bf9fb70d7aecc41cf1980459253c9 17-Sep-2006 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Clean up nfs_scan_dirty()

Pass down struct writeback control.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
8b4bdcf8995dd92b23d2ec22b32aee8fbbb50e1c 09-Jun-2006 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Store the file system "fsid" value in the NFS super block.

This should enable us to detect if we are crossing a mountpoint in the
case where the server is exporting "nohide" mounts.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
d2ccddf042c403b146159beea438c6bfc4a445e2 31-May-2006 Trond Myklebust <Trond.Myklebust@netapp.com> NFS: Flesh out nfs_invalidate_page()

In the case of a call to truncate_inode_pages(), we should really try to
cancel any pending writes on the page.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
a911fd9a6046200e439b4af172e8379c0942eec3 01-Dec-2005 Chuck Lever <cel@netapp.com> NFS: simplify inlined bit ops in nfs_page.h

Minor cleanup: inlined bit ops in nfs_page.h can be simpler.

Test plan:
Write-intensive workload against a server that requires COMMITs.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
3da28eb1c6545fe73263a24eba0996217490e1eb 22-Jun-2005 Trond Myklebust <Trond.Myklebust@netapp.com> [PATCH] NFS: Replace nfs_page insertion sort with a radix sort

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
c6a556b88adfacd2af90be84357c8165d716c27d 22-Jun-2005 Trond Myklebust <Trond.Myklebust@netapp.com> [PATCH] NFS: Make searching and waiting on busy writeback requests more efficient.

Basically copies the VFS's method for tracking writebacks and applies
it to the struct nfs_page.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 17-Apr-2005 Linus Torvalds <torvalds@ppc970.osdl.org> Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!