History log of /fs/xfs/xfs_dquot.c
Revision Date Author Comments
5fd364fee81a7888af806e42ed8a91c845894f2d 03-Aug-2014 Dave Chinner <dchinner@redhat.com> xfs: quotacheck leaves dquot buffers without verifiers

When running xfs/305, I noticed that quotacheck was flushing dquot
buffers that did not have the xfs_dquot_buf_ops verifiers attached:

XFS (vdb): _xfs_buf_ioapply: no ops on block 0x1dc8/0x1dc8
ffff880052489000: 44 51 01 04 00 00 65 b8 00 00 00 00 00 00 00 00 DQ....e.........
ffff880052489010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
ffff880052489020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
ffff880052489030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
CPU: 1 PID: 2376 Comm: mount Not tainted 3.16.0-rc2-dgc+ #306
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
ffff88006fe38000 ffff88004a0ffae8 ffffffff81cf1cca 0000000000000001
ffff88004a0ffb88 ffffffff814d50ca 000010004a0ffc70 0000000000000000
ffff88006be56dc4 0000000000000021 0000000000001dc8 ffff88007c773d80
Call Trace:
[<ffffffff81cf1cca>] dump_stack+0x45/0x56
[<ffffffff814d50ca>] _xfs_buf_ioapply+0x3ca/0x3d0
[<ffffffff810db520>] ? wake_up_state+0x20/0x20
[<ffffffff814d51f5>] ? xfs_bdstrat_cb+0x55/0xb0
[<ffffffff814d513b>] xfs_buf_iorequest+0x6b/0xd0
[<ffffffff814d51f5>] xfs_bdstrat_cb+0x55/0xb0
[<ffffffff814d53ab>] __xfs_buf_delwri_submit+0x15b/0x220
[<ffffffff814d6040>] ? xfs_buf_delwri_submit+0x30/0x90
[<ffffffff814d6040>] xfs_buf_delwri_submit+0x30/0x90
[<ffffffff8150f89d>] xfs_qm_quotacheck+0x17d/0x3c0
[<ffffffff81510591>] xfs_qm_mount_quotas+0x151/0x1e0
[<ffffffff814ed01c>] xfs_mountfs+0x56c/0x7d0
[<ffffffff814f0f12>] xfs_fs_fill_super+0x2c2/0x340
[<ffffffff811c9fe4>] mount_bdev+0x194/0x1d0
[<ffffffff814f0c50>] ? xfs_finish_flags+0x170/0x170
[<ffffffff814ef0f5>] xfs_fs_mount+0x15/0x20
[<ffffffff811ca8c9>] mount_fs+0x39/0x1b0
[<ffffffff811e4d67>] vfs_kern_mount+0x67/0x120
[<ffffffff811e757e>] do_mount+0x23e/0xad0
[<ffffffff8117abde>] ? __get_free_pages+0xe/0x50
[<ffffffff811e71e6>] ? copy_mount_options+0x36/0x150
[<ffffffff811e8103>] SyS_mount+0x83/0xc0
[<ffffffff81cfd40b>] tracesys+0xdd/0xe2

This was caused by dquot buffer readahead not attaching a verifier
structure to the buffer when readahead was issued, resulting in the
followup read of the buffer finding a valid buffer and so not
attaching new verifiers to the buffer as part of the read.
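
The failure mode described above can be sketched in userspace C. This is a simplified model, not the kernel code: `struct demo_buf`, `demo_buf_read()` and `demo_buf_readahead()` are hypothetical stand-ins for the XFS buffer cache, and the only behaviour modelled is the assumption that ops are attached when a buffer is first read in and left alone on a cache hit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct xfs_buf_ops and struct xfs_buf. */
struct demo_buf_ops {
	const char *name;
};

struct demo_buf {
	int cached;                      /* non-zero once the buffer is in cache */
	const struct demo_buf_ops *ops;  /* verifiers, NULL if none attached */
};

const struct demo_buf_ops demo_dquot_buf_ops = { "dquot" };

/* Ops are attached only when the buffer is first read in (cache miss). */
void demo_buf_read(struct demo_buf *bp, const struct demo_buf_ops *ops)
{
	if (bp->cached)
		return;         /* cache hit: whatever ops are there stay as-is */
	bp->cached = 1;
	bp->ops = ops;
}

/* Readahead populates the cache through the same path. */
void demo_buf_readahead(struct demo_buf *bp, const struct demo_buf_ops *ops)
{
	demo_buf_read(bp, ops);
}
```

In this model, readahead issued with NULL ops leaves the later read with a cached buffer that has no verifier, while passing the ops to readahead avoids that.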

Also, when a verifier failure occurs, we then read the buffer
without verifiers. Attach the verifiers manually after this read so
that if the buffer is then written it will be verified that the
corruption has been repaired.

Further, when flushing a dquot we don't ask for a verifier when
reading in the dquot buffer the dquot belongs to. Most of the time
this isn't an issue because the buffer is still cached, but when it
is not cached it will result in writing the dquot buffer without
having the verifier attached.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2451337dd043901b5270b7586942abe564443e3d 25-Jun-2014 Dave Chinner <dchinner@redhat.com> xfs: global error sign conversion

Convert all the errors in the core XFS code to negative error signs
like the rest of the kernel and remove all the sign conversion we
do in the interface layers.
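
A minimal sketch of the two conventions, assuming a core routine and an interface-layer wrapper; all four function names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Old convention: the core returns a positive errno and the boundary negates it. */
int demo_old_core_op(int fail)
{
	return fail ? EIO : 0;
}

int demo_old_vfs_entry(int fail)
{
	return -demo_old_core_op(fail);   /* sign conversion at the interface */
}

/* New convention: negative errno everywhere, nothing to convert. */
int demo_new_core_op(int fail)
{
	return fail ? -EIO : 0;
}

int demo_new_vfs_entry(int fail)
{
	return demo_new_core_op(fail);
}
```

Callers above the interface layer see the same value either way; what changes is that the core code no longer needs a negation at every boundary.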

Errors for conversion (and comparison) found via searches like:

$ git grep " E" fs/xfs
$ git grep "return E" fs/xfs
$ git grep " E[A-Z].*;$" fs/xfs

Negation points found via searches like:

$ git grep "= -[a-z,A-Z]" fs/xfs
$ git grep "return -[a-z,A-D,F-Z]" fs/xfs
$ git grep " -[a-z].*;" fs/xfs

[ with some bits I missed from Brian Foster ]

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
b474c7ae4395ba684e85fde8f55c8cf44a39afaf 22-Jun-2014 Eric Sandeen <sandeen@sandeen.net> xfs: Nuke XFS_ERROR macro

XFS_ERROR was designed long ago to trap return values, but it's not
runtime configurable, it's not consistently used, and we can do
similar error trapping with ftrace scripts and triggers from
userspace.

Just nuke XFS_ERROR and associated bits.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
d99831ff393ff2e28d6110b41f24d9fecf986222 22-Jun-2014 Eric Sandeen <sandeen@sandeen.net> xfs: return is not a function

return is not a function. "return(EIO);" is silly;
"return (EIO);" moreso. return is not a function.
Nuke the pointless parens.

[dchinner: catch a couple of extra cases in xfs_attr_list.c,
xfs_acl.c and xfs_linux.h.]

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
36de95567f910f5544060f50346d8677ae13ad22 06-Jun-2014 Dave Chinner <dchinner@redhat.com> xfs: kill xfs_buf_geterror()

Most of the callers are just calling ASSERT(!xfs_buf_geterror())
which means they are checking for bp->b_error == 0. If bp is null in
this case, we will assert fail, and hence it's no different in
result to oopsing because of a null bp. In some cases, errors have
already been checked for or the function returning the buffer can't
return a buffer with an error, so it's just a redundant assert.
Either way, the assert can be removed.

The other two non-assert callers can just test for a buffer and
error properly.
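
As a rough illustration, the open-coded check might look like the following. `struct demo_buf` is a hypothetical stand-in for struct xfs_buf, and the shape of the removed helper (a NULL buffer reported as ENOMEM) is an assumption, consistent with the message's remark that a NULL bp makes the assert fail.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct xfs_buf. */
struct demo_buf {
	int b_error;
};

/* Assumed shape of the removed helper: a NULL buffer counts as an error. */
int demo_buf_geterror(const struct demo_buf *bp)
{
	return bp ? bp->b_error : ENOMEM;
}

/* Open-coded replacement: test for a buffer first, then for its error. */
int demo_check_buf(const struct demo_buf *bp)
{
	if (!bp)
		return ENOMEM;
	if (bp->b_error)
		return bp->b_error;
	return 0;
}
```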

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
3c353375761d81abfb66eb054aacceef31658e24 05-May-2014 Dave Chinner <dchinner@redhat.com> xfs: remove dquot hints

group and project quota hints are currently stored on the user
dquot. If we are attaching quotas to the inode, then the group and
project dquots are stored as hints on the user dquot to save having
to look them up again later.

The thing is, the hints are not used for that inode for the rest of
the life of the inode - the dquots are attached directly to the
inode itself - so the only time the hints are used is when an inode
first has dquots attached.

When the hints on the user dquot don't match the dquots being
attached to the inode, they are then removed and replaced with the
new hints. If a user is concurrently modifying files in different
group and/or project contexts, then this leads to thrashing of the
hints attached to user dquot.

If user quotas are not enabled, then hints are never even used.

So, if the hints are used to avoid the cost of the lookup, is the
cost of the lookup significant enough to justify the hint
infrastructure? Maybe it was once, when there was a global quota
manager shared between all XFS filesystems and was hash table based.

However, lookups are now much simpler, requiring only a single lock and
radix tree lookup local to the filesystem and no hash or LRU
manipulations to be made. Hence the cost of lookup is much lower
than when hints were implemented. Turns out that benchmarks show
that, too, with there being no difference in performance when doing
file creation workloads as a single user with user, group and
project quotas enabled - the hints do not make the code go any
faster. In fact, removing the hints shows a 2-3% reduction in the
time it takes to create 50 million inodes....

So, let's just get rid of the hints and the complexity around them.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
410b11a675dca827e893f07c3155691eda3b5887 07-Feb-2014 Brian Foster <bfoster@redhat.com> xfs: use tr_qm_dqalloc log reservation for dquot alloc

The dquot allocation path in xfs_qm_dqread() currently uses the
attribute set log reservation, which appears to be incorrect. We
have reports of transaction reservation overruns with the current
code. E.g., a repeated run of xfstests test generic/270 on a 512b
block size fs occasionally produces the following in dmesg:

XFS (sdN): xlog_write: reservation summary:
trans type = QM_DQALLOC (30)
unit res = 7080 bytes
current res = -632 bytes
total reg = 0 bytes (o/flow = 0 bytes)
ophdrs = 0 (ophdr space = 0 bytes)
ophdr + reg = 0 bytes
num regions = 0

XFS (sdN): xlog_write: reservation ran out. Need to up reservation

The dquot allocation case should consist of a write reservation
(i.e., we are allocating a range of the internal quota file) plus
the size of the actual dquots. We already have a log reservation
definition for this operation (tr_qm_dqalloc). Use it in
xfs_qm_dqread() and update the log reservation calculation function
to use the write res. calculation function rather than reading the
assumed to be pre-calculated value directly.
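
The difference between reading a cached value and calling the calculation function can be sketched as follows. The structure, the field names and the `8 * blocksize` formula are all made up for illustration; they are not the real XFS reservation calculations.

```c
#include <assert.h>

/* Hypothetical mount structure: reservations are filled in at mount time. */
struct demo_mount {
	unsigned int blocksize;
	unsigned int resv_write;    /* cached write reservation, 0 until set */
};

/* Made-up formula standing in for the real write reservation calculation. */
unsigned int demo_calc_write_res(const struct demo_mount *mp)
{
	return 8 * mp->blocksize;
}

/* Fragile: relies on resv_write already having been filled in. */
unsigned int demo_calc_dqalloc_res_cached(const struct demo_mount *mp,
					  unsigned int dquot_bytes)
{
	return mp->resv_write + dquot_bytes;
}

/* Robust: calls the calculation function, so ordering does not matter. */
unsigned int demo_calc_dqalloc_res(const struct demo_mount *mp,
				   unsigned int dquot_bytes)
{
	return demo_calc_write_res(mp) + dquot_bytes;
}
```

If the cached field has not been populated yet, the first variant silently under-reserves, which is the kind of overrun the dmesg output above reports.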

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
f4df8adc8325127ff015ef9c2a8f005edaaedd07 06-Dec-2013 Christoph Hellwig <hch@infradead.org> xfs: use xfs_ilock_data_map_shared in xfs_qm_dqtobp

We might not have read in the extent list at this point, so make sure we
take the ilock exclusively if we have to do so.
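
A minimal sketch of the lock-mode decision, assuming that only a btree-format data fork can have an extent list that is not yet in memory. The types and the function name are hypothetical, and the "lock" is just a recorded mode.

```c
#include <assert.h>

enum demo_lock_mode { DEMO_ILOCK_SHARED, DEMO_ILOCK_EXCL };

/* Hypothetical stand-in for the parts of the inode this decision looks at. */
struct demo_inode {
	int btree_format;          /* data fork is in btree format */
	int extents_loaded;        /* in-core extent list has been read in */
	enum demo_lock_mode held;  /* mode recorded in place of a real lock */
};

enum demo_lock_mode demo_ilock_data_map_shared(struct demo_inode *ip)
{
	enum demo_lock_mode mode = DEMO_ILOCK_SHARED;

	/* Reading the extent list in modifies the inode, so it needs EXCL. */
	if (ip->btree_format && !ip->extents_loaded)
		mode = DEMO_ILOCK_EXCL;

	ip->held = mode;           /* stand-in for actually taking the lock */
	return mode;               /* caller unlocks with the same mode */
}
```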

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
a4fbe6ab1e7abecf42b75e9c73701ed33b4ab03b 23-Oct-2013 Dave Chinner <dchinner@redhat.com> xfs: decouple inode and bmap btree header files

Currently the xfs_inode.h header has a dependency on the definition
of the BMAP btree records as the inode fork includes an array of
xfs_bmbt_rec_host_t objects in its definition.

Move all the btree format definitions from xfs_btree.h,
xfs_bmap_btree.h, xfs_alloc_btree.h and xfs_ialloc_btree.h to
xfs_format.h to continue the process of centralising the on-disk
format definitions. With this done, the xfs inode definitions are no
longer dependent on btree header files.

This enables a massive culling of unnecessary includes, with close to
200 #include directives removed from the XFS kernel code base.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
239880ef6454ccff2ba8d762c3f86e8278f0ce1c 23-Oct-2013 Dave Chinner <dchinner@redhat.com> xfs: decouple log and transaction headers

xfs_trans.h has a dependency on xfs_log.h for a couple of
structures. Most code that does transactions doesn't need to know
anything about the log, but this dependency means that they have to
include xfs_log.h. Decouple the xfs_trans.h and xfs_log.h header
files and clean up the includes to be in dependency order.

In doing this, remove the direct include of xfs_trans_reserve.h from
xfs_trans.h so that we remove the dependency between xfs_trans.h and
xfs_mount.h. Hence the xfs_trans.h include can be moved to
indicate the actual dependencies other header files have on it.

Note that these are kernel only header files, so this does not
translate to any userspace changes at all.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
9aede1d81b12a95771db0e2f9196dd37e10aaf20 15-Oct-2013 Dave Chinner <dchinner@redhat.com> xfs: split dquot buffer operations out

Parts of userspace want to be able to read and modify dquot buffers
(e.g. xfs_db) so we need to split out the reading and writing of
these buffers so it is easy to share code with libxfs in userspace.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
70a9883c5f34b215b8a77665cefd0398edc5a9ef 23-Oct-2013 Dave Chinner <dchinner@redhat.com> xfs: create a shared header file for format-related information

All of the buffer operations structures need to be exported
for xfs_db, so move them all to a common location rather than
spreading them all over the place. They are verifying the on-disk
format, so while xfs_format.h might be a good place, it is not part
of the on disk format.

Hence we need to create a new header file in which we centralise these
related definitions. Start by moving the buffer operations
structures, and then also move all the other definitions that have
crept into xfs_log_format.h and xfs_format.h as there was no other
shared header file to put them in.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
89c6c89af2ef41cb127c9694ef7783e585e96337 30-Sep-2013 Dave Chinner <dchinner@redhat.com> xfs: lockdep needs to know about 3 dquot-deep nesting

Michael Semon reported that xfs/299 generated this lockdep warning:

=============================================
[ INFO: possible recursive locking detected ]
3.12.0-rc2+ #2 Not tainted
---------------------------------------------
touch/21072 is trying to acquire lock:
(&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64

but task is already holding lock:
(&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(&xfs_dquot_other_class);
lock(&xfs_dquot_other_class);

*** DEADLOCK ***

May be due to missing lock nesting notation

7 locks held by touch/21072:
#0: (sb_writers#10){++++.+}, at: [<c11185b6>] mnt_want_write+0x1e/0x3e
#1: (&type->i_mutex_dir_key#4){+.+.+.}, at: [<c11078ee>] do_last+0x245/0xe40
#2: (sb_internal#2){++++.+}, at: [<c122c9e0>] xfs_trans_alloc+0x1f/0x35
#3: (&(&ip->i_lock)->mr_lock/1){+.+...}, at: [<c126cd1b>] xfs_ilock+0x100/0x1f1
#4: (&(&ip->i_lock)->mr_lock){++++-.}, at: [<c126cf52>] xfs_ilock_nowait+0x105/0x22f
#5: (&dqp->q_qlock){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64
#6: (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64

The lockdep annotation for dquot lock nesting only understands
locking for user and "other" dquots, not user, group and project
dquots. Fix the annotations to match the locking hierarchy we now
have.
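
The warning can be reproduced in a toy model of lock classes. Everything here is hypothetical: the class numbers, the function names, and the assumption that the three dquot locks are taken in user, group, project order.

```c
#include <assert.h>

enum demo_dqtype { DEMO_DQ_USER, DEMO_DQ_GROUP, DEMO_DQ_PROJ };

/* Old scheme from the report: group and project share one "other" class. */
int demo_lock_class_old(enum demo_dqtype t)
{
	return t == DEMO_DQ_USER ? 0 : 1;
}

/* Fixed scheme: one class per dquot type. */
int demo_lock_class_new(enum demo_dqtype t)
{
	return (int)t;
}

/*
 * Returns 1 if taking the user, group and project locks in order would
 * acquire the same class twice, which is what gets flagged as possible
 * recursive locking.
 */
int demo_would_warn(int (*class_of)(enum demo_dqtype))
{
	int seen[3] = { 0, 0, 0 };
	int t;

	for (t = DEMO_DQ_USER; t <= DEMO_DQ_PROJ; t++) {
		int c = class_of((enum demo_dqtype)t);

		if (seen[c])
			return 1;
		seen[c] = 1;
	}
	return 0;
}
```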

Reported-by: Michael L. Semon <mlsemon35@gmail.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

(cherry picked from commit f112a049712a5c07de25d511c3c6587a2b1a015e)
f112a049712a5c07de25d511c3c6587a2b1a015e 30-Sep-2013 Dave Chinner <dchinner@redhat.com> xfs: lockdep needs to know about 3 dquot-deep nesting

Michael Semon reported that xfs/299 generated this lockdep warning:

=============================================
[ INFO: possible recursive locking detected ]
3.12.0-rc2+ #2 Not tainted
---------------------------------------------
touch/21072 is trying to acquire lock:
(&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64

but task is already holding lock:
(&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(&xfs_dquot_other_class);
lock(&xfs_dquot_other_class);

*** DEADLOCK ***

May be due to missing lock nesting notation

7 locks held by touch/21072:
#0: (sb_writers#10){++++.+}, at: [<c11185b6>] mnt_want_write+0x1e/0x3e
#1: (&type->i_mutex_dir_key#4){+.+.+.}, at: [<c11078ee>] do_last+0x245/0xe40
#2: (sb_internal#2){++++.+}, at: [<c122c9e0>] xfs_trans_alloc+0x1f/0x35
#3: (&(&ip->i_lock)->mr_lock/1){+.+...}, at: [<c126cd1b>] xfs_ilock+0x100/0x1f1
#4: (&(&ip->i_lock)->mr_lock){++++-.}, at: [<c126cf52>] xfs_ilock_nowait+0x105/0x22f
#5: (&dqp->q_qlock){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64
#6: (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64

The lockdep annotation for dquot lock nesting only understands
locking for user and "other" dquots, not user, group and project
dquots. Fix the annotations to match the locking hierarchy we now
have.

Reported-by: Michael L. Semon <mlsemon35@gmail.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
cd56a39a59868911bbf8832725630c1cf43a7b09 28-Aug-2013 Dave Chinner <dchinner@redhat.com> xfs: convert dquot cache lru to list_lru

Convert the XFS dquot lru to use the list_lru construct and convert the
shrinker to being node aware.

[glommer@openvz.org: edited for conflicts + warning fixes]
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Glauber Costa <glommer@openvz.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Carlos Maiolino <cmaiolino@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Rientjes <rientjes@google.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: J. Bruce Fields <bfields@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3d3c8b5222b92447bffaa4127ee18c757f32a460 12-Aug-2013 Jie Liu <jeff.liu@oracle.com> xfs: refactor xfs_trans_reserve() interface

Now that the new xfs_trans_res structure has been introduced, the log
reservation size, log count and log flags are pre-initialized
at mount time. So it's time to refine the xfs_trans_reserve() interface
to be neater.

Also, introduce a new helper M_RES() to return a pointer to the
mp->m_resv structure to simplify the input.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
6898811459ff523ed256c07d1d5c54ba915d51ac 12-Aug-2013 Dave Chinner <dchinner@redhat.com> xfs: create xfs_bmap_util.[ch]

There is a bunch of code in xfs_bmap.c that is kernel specific and
not shared with userspace. To minimise the difference between the
kernel and userspace code, shift this unshared code to
xfs_bmap_util.c, and the declarations to xfs_bmap_util.h.

The biggest issue here is xfs_bmap_finish() - userspace has its own
definition of this function, and so we need to move it out of
xfs_bmap.[ch]. This means several other files need to include
xfs_bmap_util.h as well.

It also introduces an interesting dance for the stack switching
code in xfs_bmapi_allocate(). The stack switching/workqueue code is
actually moved to xfs_bmap_util.c, so that userspace can simply use
a #define in a header file to connect the dots without needing to
know about the stack switch code at all.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
6ca1c9063d1952b20c61136e01e6a8987371616b 12-Aug-2013 Dave Chinner <dchinner@redhat.com> xfs: separate dquot on disk format definitions out of xfs_quota.h

The on disk format definitions of the on-disk dquot, log formats and
quota off log formats are all intertwined with other definitions for
quotas. Separate them out into their own header file so they can
easily be shared with userspace.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
92f8ff73f18672b03ec8b92197cdddf2b5de7ea0 11-Jul-2013 Chandra Seetharaman <sekharan@us.ibm.com> xfs: Add pquota fields where gquota is used.

Add project quota changes to all the places where group quota field
is used:
* add separate project quota members into various structures
* split project quota and group quotas so that instead of overriding
the group quota members incore, the new project quota members are
used instead
* get rid of usage of the OQUOTA flag incore, in favor of separate
group and project quota flags.
* add a project dquot argument to various functions.

Not using the pquotino field from superblock yet.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
113a56835d938d5cf9b4599053da7afb80d6f710 28-Jun-2013 Chandra Seetharaman <sekharan@us.ibm.com> xfs: Code cleanup and removal of some typedef usage

In preparation for combined pquota/gquota support, for the sake
of readability, do some code cleanup surrounding the affected
code.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
995961c4510460d9eef9b5ae46789aa2315545fe 28-Jun-2013 Chandra Seetharaman <sekharan@us.ibm.com> xfs: Replace macro XFS_DQ_TO_QIP with a function

In preparation for combined pquota/gquota support, for the sake
of readability, change the macro to an inline function.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
329e0875286984df9053d410df83f839f85bea6e 28-Jun-2013 Chandra Seetharaman <sekharan@us.ibm.com> xfs: Replace macro XFS_DQUOT_TREE with a function

In preparation for combined pquota/gquota support, for the sake
of readability, change the macro to an inline function.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
bb9b8e86ad083ecb2567ae909c1d6cb0bbaa60fe 03-Jun-2013 Dave Chinner <dchinner@redhat.com> xfs: rework dquot CRCs

Calculating dquot CRCs when the backing buffer is written back just
doesn't work reliably. There are several places which manipulate
dquots directly in the buffers, and they don't calculate CRCs
appropriately, nor do they always set the buffer up to calculate
CRCs appropriately.

Firstly, if we log a dquot buffer (e.g. during allocation) it gets
logged without valid CRC, and so on recovery we end up with a dquot
that is not valid.

Secondly, if we recover/repair a dquot, we don't have a verifier
attached to the buffer and hence CRCs are not calculated on the way
down to disk.

Thirdly, calculating the CRC after we've changed the contents means
that if we re-read the dquot from the buffer, we cannot verify the
contents of the dquot are valid, as the CRC is invalid.

So, to avoid all the dquot CRC errors that are being detected by the
read verifier, change to using the same model as for inodes. That
is, dquot CRCs are calculated and written to the backing buffer at
the time the dquot is flushed to the backing buffer. If we modify
the dquot directly in the backing buffer, calculate the CRC
immediately after the modification is complete. Hence the dquot in
the on-disk buffer should always have a valid CRC.
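
The model described above, where the checksum is recalculated every time the buffer copy changes, can be sketched like this. The structure layout and function names are hypothetical, and the checksum is a toy FNV-style mix used only to keep the example short; it is not the CRC the filesystem uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical on-disk dquot: two payload fields followed by a checksum. */
struct demo_dqblk {
	uint32_t id;
	uint64_t bcount;
	uint32_t crc;
};

/* Toy checksum over the payload fields (NOT a real CRC). */
uint32_t demo_csum(const struct demo_dqblk *d)
{
	uint32_t h = 2166136261u;

	h = (h ^ d->id) * 16777619u;
	h = (h ^ (uint32_t)d->bcount) * 16777619u;
	h = (h ^ (uint32_t)(d->bcount >> 32)) * 16777619u;
	return h;
}

/* Flush in-core values into the buffer copy and checksum it right away. */
void demo_dqflush(struct demo_dqblk *ondisk, uint32_t id, uint64_t bcount)
{
	ondisk->id = id;
	ondisk->bcount = bcount;
	ondisk->crc = demo_csum(ondisk);
}

/* Direct in-buffer modification: recalculate immediately afterwards. */
void demo_dq_reset_in_buffer(struct demo_dqblk *ondisk)
{
	ondisk->bcount = 0;
	ondisk->crc = demo_csum(ondisk);
}

/* Read-side check: non-zero when the stored checksum matches the payload. */
int demo_dq_verify(const struct demo_dqblk *ondisk)
{
	return ondisk->crc == demo_csum(ondisk);
}
```

Because every path that touches the buffer copy updates the checksum before returning, a re-read of the buffer verifies cleanly, whereas a modification that skips the update is detected.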

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

(cherry picked from commit 6fcdc59de28817d1fbf1bd58cc01f4f3fac858fb)
6fcdc59de28817d1fbf1bd58cc01f4f3fac858fb 03-Jun-2013 Dave Chinner <dchinner@redhat.com> xfs: rework dquot CRCs

Calculating dquot CRCs when the backing buffer is written back just
doesn't work reliably. There are several places which manipulate
dquots directly in the buffers, and they don't calculate CRCs
appropriately, nor do they always set the buffer up to calculate
CRCs appropriately.

Firstly, if we log a dquot buffer (e.g. during allocation) it gets
logged without valid CRC, and so on recovery we end up with a dquot
that is not valid.

Secondly, if we recover/repair a dquot, we don't have a verifier
attached to the buffer and hence CRCs are not calculated on the way
down to disk.

Thirdly, calculating the CRC after we've changed the contents means
that if we re-read the dquot from the buffer, we cannot verify the
contents of the dquot are valid, as the CRC is invalid.

So, to avoid all the dquot CRC errors that are being detected by the
read verifier, change to using the same model as for inodes. That
is, dquot CRCs are calculated and written to the backing buffer at
the time the dquot is flushed to the backing buffer. If we modify
the dquot directly in the backing buffer, calculate the CRC
immediately after the modification is complete. Hence the dquot in
the on-disk buffer should always have a valid CRC.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
3fe58f30b4fc3f8a9084b035a02bc0c67bee8d00 03-Apr-2013 Christoph Hellwig <hch@lst.de> xfs: add CRC checks for quota blocks

Use the reserved space in struct xfs_dqblk to store a UUID and a crc
for the quota blocks.

[dchinner@redhat.com] Add a LSN field and update for current verifier
infrastructure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
b136645116e5471cf0b037a1759dc83620236631 18-Mar-2013 Brian Foster <bfoster@redhat.com> xfs: xfs_dquot prealloc throttling watermarks and low free space

Enable tracking of high and low watermarks for preallocation
throttling of files under quota restrictions. These values are
calculated when the quota limit is read from disk or modified and
cached for later use by the throttling algorithm.

The high watermark specifies when preallocation is disabled, the
low watermark specifies when throttling is enabled and the low free
space data structure contains precalculated low free space limits
to serve as input to determine the level of throttling required.

Note that the low free space data structure is based on the
existing global low free space data structure with the exception of
using three stages (5%, 3% and 1%) rather than five to reduce the
impact of xfs_dquot memory overhead.
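
One way to picture how the watermarks and the three-stage table fit together is the sketch below. It is built on assumptions: the stages are taken to be 1%, 3% and 5% of the high watermark, and the mapping from stages to a numeric throttle level is invented here. The structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct demo_prealloc_limits {
	uint64_t hi_wmark;      /* usage at or above this: no preallocation */
	uint64_t lo_wmark;      /* usage at or above this: start throttling */
	uint64_t low_space[3];  /* assumed 1%, 3% and 5% of hi_wmark */
};

/* Precalculate the limits once, when the quota limit is read or changed. */
void demo_set_prealloc_limits(struct demo_prealloc_limits *l,
			      uint64_t hi, uint64_t lo)
{
	uint64_t one_pct = hi / 100;

	l->hi_wmark = hi;
	l->lo_wmark = lo;
	l->low_space[0] = one_pct;
	l->low_space[1] = one_pct * 3;
	l->low_space[2] = one_pct * 5;
}

/*
 * Returns -1 when preallocation is disabled, 0 when it is unthrottled,
 * and 1..4 for increasingly heavy throttling as free quota space drops
 * below each precalculated stage.
 */
int demo_throttle_level(const struct demo_prealloc_limits *l, uint64_t used)
{
	uint64_t free_space;
	int level = 1;
	int i;

	if (used >= l->hi_wmark)
		return -1;
	if (used < l->lo_wmark)
		return 0;

	free_space = l->hi_wmark - used;
	for (i = 0; i < 3; i++)
		if (free_space < l->low_space[i])
			level++;
	return level;
}
```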

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
4b6eae2e6ac8a6671839ccaea1c2e3dd5684f5df 18-Mar-2013 Brian Foster <bfoster@redhat.com> xfs: pass xfs_dquot to xfs_qm_adjust_dqlimits() instead of xfs_disk_dquot_t

Modify xfs_qm_adjust_dqlimits() to take the xfs_dquot as a
parameter instead of just the xfs_disk_dquot_t so we can update
in-memory fields if necessary.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
4800104438a4467ffa5ae1e51d5a59c0f64e5f9a 28-Jan-2013 Jeff Liu <jeff.liu@oracle.com> xfs: calculate XFS_TRANS_QM_DQALLOC space log reservation at mount time

The disk quota allocation log space reservation is calculated at runtime;
this patch does it at mount time.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
CC: Dave Chinner <david@fromorbit.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
1813dd64057490e7a0678a885c4fe6d02f78bdc1 14-Nov-2012 Dave Chinner <dchinner@redhat.com> xfs: convert buffer verifiers to an ops structure.

To separate the verifiers from iodone functions and associate read
and write verifiers at the same time, introduce a buffer verifier
operations structure to the xfs_buf.

This avoids the need for assigning the write verifier, clearing the
iodone function and re-running ioend processing in the read
verifier, and gets rid of the nasty "b_pre_io" name for the write
verifier function pointer. If we ever need to, it will also be
easier to add further content specific callbacks to a buffer with an
ops structure in place.
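
A small userspace sketch of the idea: one ops structure carries both verifiers, so a buffer gets both or neither. `struct demo_buf`, the magic value check and `demo_buf_write()` are hypothetical simplifications, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

struct demo_buf;

/* Both verifiers travel together in one structure. */
struct demo_buf_ops {
	void (*verify_read)(struct demo_buf *bp);
	void (*verify_write)(struct demo_buf *bp);
};

/* Hypothetical, heavily reduced buffer. */
struct demo_buf {
	unsigned int magic;
	int error;
	const struct demo_buf_ops *ops;
};

#define DEMO_MAGIC 0x4451u  /* illustrative magic value */

/* Shared check used by both the read and the write verifier. */
void demo_verify(struct demo_buf *bp)
{
	bp->error = (bp->magic == DEMO_MAGIC) ? 0 : -1;
}

const struct demo_buf_ops demo_ops = {
	.verify_read = demo_verify,
	.verify_write = demo_verify,
};

/* Write path: run the write verifier if ops are attached. */
int demo_buf_write(struct demo_buf *bp)
{
	if (!bp->ops)
		return 0;       /* nothing to check: the "no ops" case */
	bp->ops->verify_write(bp);
	return bp->error;
}
```

Exporting `demo_ops` rather than the verifier functions mirrors the point made above about only needing to export the ops structures.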

We also avoid needing to export verifier functions, instead we
can simply export the ops structures for those that are needed
outside the function they are defined in.

This patch also fixes a directory block readahead verifier issue
it exposed.

This patch also adds ops callbacks to the inode/alloc btree blocks
initialised by growfs. These will need more work before they will
work with CRCs.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Phil White <pwhite@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
b0f539de9fcc543a3ffa40bc22bf51aca6ea6183 14-Nov-2012 Dave Chinner <dchinner@redhat.com> xfs: connect up write verifiers to new buffers

Metadata buffers that are read from disk have write verifiers
already attached to them, but newly allocated buffers do not. Add
appropriate write verifiers to all new metadata buffers.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
612cfbfe174a89d565363fff7f3961a2dda5fb71 14-Nov-2012 Dave Chinner <dchinner@redhat.com> xfs: add pre-write metadata buffer verifier callbacks

These verifiers are essentially the same code as the read verifiers,
but do not require ioend processing. Hence factor the read verifier
functions and add a new write verifier wrapper that is used as the
callback.

This is done as one large patch for all verifiers rather than one
patch per verifier as the change is largely mechanical. This
includes hooking up the write verifier via the read verifier
function.

Hooking up the write verifier for buffers obtained via
xfs_trans_get_buf() will be done in a separate patch as that touches
code in many different places rather than just the verifier
functions.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
c6319198702350a2215a8c0cacd6cc4283728a1b 14-Nov-2012 Dave Chinner <dchinner@redhat.com> xfs: verify dquot blocks as they are read from disk

Add a dquot buffer verify callback function and pass it into the
buffer read functions. This checks all the dquots in a buffer, but
cannot completely verify the dquot ids are correct. Also, errors
cannot be repaired, so an additional function is added to repair bad
dquots in the buffer if such an error is detected in a context where
repair is allowed.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Phil White <pwhite@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
c3f8fc73ac97b76a12692088ef9cace9af8422c0 12-Nov-2012 Dave Chinner <dchinner@redhat.com> xfs: make buffer read verification an IO completion function

Add a verifier function callback capability to the buffer read
interfaces. This will be used by the callers to supply a function
that verifies the contents of the buffer when it is read from disk.
This patch does not provide callback functions, but simply modifies
the interfaces to allow them to be called.

The reason for adding this to the read interfaces is that it is very
difficult to tell from the outside if a buffer was just read from
disk or whether we just pulled it out of cache. Supplying a callback
allows the buffer cache to use its internal knowledge of the buffer
to execute it only when the buffer is read from disk.

It is intended that the verifier functions will mark the buffer with
an EFSCORRUPTED error when verification fails. This allows the
reading context to distinguish a verification error from an IO
error, and potentially take further actions on the buffer (e.g.
attempt repair) based on the error reported.
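
The callback arrangement can be sketched as below. This is a toy model under stated assumptions: `struct demo_buf` and `demo_buf_read()` are hypothetical, the "disk read" is simulated by a flag, and `DEMO_EFSCORRUPTED` is an arbitrary stand-in value for the kernel's EFSCORRUPTED.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_EFSCORRUPTED 117  /* hypothetical stand-in for EFSCORRUPTED */

/* Hypothetical, heavily reduced buffer. */
struct demo_buf {
	int in_cache;      /* non-zero once the buffer has been read in */
	int contents_ok;   /* what a real verifier would work out itself */
	int error;
	int verify_calls;  /* counts how often the verifier actually ran */
};

typedef void (*demo_verify_fn)(struct demo_buf *bp);

/* Verifier: marks the buffer with a distinct error on bad contents. */
void demo_verify(struct demo_buf *bp)
{
	bp->verify_calls++;
	if (!bp->contents_ok)
		bp->error = DEMO_EFSCORRUPTED;
}

/* Only the cache knows whether this call really went to disk. */
int demo_buf_read(struct demo_buf *bp, demo_verify_fn verify)
{
	if (!bp->in_cache) {
		bp->in_cache = 1;       /* simulate the read from disk */
		if (verify)
			verify(bp);
	}
	return bp->error;
}
```

The caller passes the verifier on every read, but it runs only on the read that actually populates the buffer, and a verification failure is distinguishable from an IO error by its error value.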

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Phil White <pwhite@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
60a34607b26b60d6b5c5c928ede7fc84b0f06b85 23-Apr-2012 Dave Chinner <dchinner@redhat.com> xfs: move xfs_agino_t to xfs_types.h

Untangle the header file includes a bit by moving the definition of
xfs_agino_t to xfs_types.h. This removes the dependency that xfs_ag.h has on
xfs_inum.h, meaning we don't need to include xfs_inum.h everywhere we include
xfs_ag.h.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
04913fdd91f342e537005ef1233f98068b925a7f 23-Apr-2012 Dave Chinner <dchinner@redhat.com> xfs: pass shutdown method into xfs_trans_ail_delete_bulk

xfs_trans_ail_delete_bulk() can be called from different contexts so
if the item is not in the AIL we need different shutdown for each
context. Pass in the shutdown method needed so the correct action
can be taken.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
43ff2122e6492bcc88b065c433453dce88223b30 23-Apr-2012 Christoph Hellwig <hch@infradead.org> xfs: on-stack delayed write buffer lists

Queue delwri buffers on a local on-stack list instead of a per-buftarg one,
and write back the buffers per-process instead of by waking up xfsbufd.

This is now easily doable given that we have very few places left that write
delwri buffers:

- log recovery:
Only done at mount time, and already forcing out the buffers
synchronously using xfs_flush_buftarg

- quotacheck:
Same story.

- dquot reclaim:
Writes out dirty dquots on the LRU under memory pressure. We might
want to look into doing more of this via xfsaild, but it's already
more optimal than the synchronous inode reclaim that writes each
buffer synchronously.

- xfsaild:
This is the main beneficiary of the change. By keeping a local list
of buffers to write we reduce latency of writing out buffers, and
more importantly we can remove all the delwri list promotions which
were hitting the buffer cache hard under sustained metadata loads.

The implementation is very straightforward - xfs_buf_delwri_queue now gets
a new list_head pointer that it adds the delwri buffers to, and all callers
need to eventually submit the list using xfs_buf_delwri_submit or
xfs_buf_delwri_submit_nowait. Buffers that already are on a delwri list are
skipped in xfs_buf_delwri_queue, assuming they already are on another delwri
list. The biggest change to pass down the buffer list was done to the AIL
pushing. Now that we operate on buffers the trylock, push and pushbuf log
item methods are merged into a single push routine, which tries to lock the
item, and if possible add the buffer that needs writeback to the buffer list.
This leads to much simpler code than the previous split but requires the
individual IOP_PUSH instances to unlock and reacquire the AIL around calls
to blocking routines.

Given that xfsailds now also handle writing out buffers, the conditions for
log forcing and the sleep times needed some small changes. The most
important one is that we consider an AIL busy as long as we still have buffers
to push, and the other one is that we do increment the pushed LSN for
buffers that are under flushing at this moment, but still count them towards
the stuck items for restart purposes. Without this we could hammer on stuck
items without ever forcing the log and not make progress under heavy random
delete workloads on fast flash storage devices.

[ Dave Chinner:
- rebase on previous patches.
- improved comments for XBF_DELWRI_Q handling
- fix XBF_ASYNC handling in queue submission (test 106 failure)
- rename delwri submit function buffer list parameters for clarity
- xfs_efd_item_push() should return XFS_ITEM_PINNED ]

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
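
The queue-then-submit pattern described in the commit above can be sketched in user space as follows. This is an illustration under stated assumptions, not the kernel code: the `toy_*` types and functions are hypothetical, a singly linked list stands in for `list_head`, and "writing" a buffer is just setting a flag. The point it tries to capture is that the list belongs to the caller, and that a buffer already queued on some delwri list is skipped.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_dbuf {
    int              blkno;
    bool             delwri_queued; /* plays the role of a "queued" flag */
    bool             written;
    struct toy_dbuf *next;
};

/* Caller-owned list, typically a local variable on the caller's stack. */
struct toy_dbuf_list {
    struct toy_dbuf *head;
    struct toy_dbuf *tail;
};

/* Add bp to the caller's list unless it is already on some delwri list. */
bool toy_delwri_queue(struct toy_dbuf *bp, struct toy_dbuf_list *list)
{
    if (bp->delwri_queued)
        return false;               /* assumed to be on another list: skip */

    bp->delwri_queued = true;
    bp->next = NULL;
    if (list->tail)
        list->tail->next = bp;
    else
        list->head = bp;
    list->tail = bp;
    return true;
}

/* "Write" every queued buffer and empty the list; returns the count. */
int toy_delwri_submit(struct toy_dbuf_list *list)
{
    int nwritten = 0;
    struct toy_dbuf *bp = list->head;

    while (bp) {
        struct toy_dbuf *next = bp->next;

        bp->written = true;
        bp->delwri_queued = false;
        bp->next = NULL;
        nwritten++;
        bp = next;
    }
    list->head = list->tail = NULL;
    return nwritten;
}
```

In this sketch a caller declares a `struct toy_dbuf_list` locally, queues buffers onto it while doing its work, and must call `toy_delwri_submit()` before returning, which mirrors the requirement that all callers eventually submit their list.
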
fe7257fd4b8ae9a3e354d9edb61890973e373ef0 23-Apr-2012 Christoph Hellwig <hch@infradead.org> xfs: do not write the buffer from xfs_qm_dqflush

Instead of writing the buffer directly from inside xfs_qm_dqflush return it
to the caller and let the caller decide what to do with the buffer. Also
remove the pincount check in xfs_qm_dqflush that all non-blocking callers
already implement and the now unused flags parameter and the XFS_DQ_IS_DIRTY
check that all callers already perform.

[ Dave Chinner: fixed build error caused by missing '{'. ]

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
32ce90a4b79155a155de2b284d8b69023e5e8fea 23-Apr-2012 Christoph Hellwig <hch@infradead.org> xfs: remove log item from AIL in xfs_iflush after a shutdown

If a filesystem has been forced shutdown we are never going to write inodes
to disk, which means the inode items will stay in the AIL until we free
the inode. Currently that is not a problem, but a pending change requires us
to empty the AIL before shutting down the filesystem. In that case leaving
the inode in the AIL is lethal. Make sure to remove the log item from the AIL
to allow emptying the AIL on shutdown filesystems.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
dea9609527a55b65638a6323894269334dfe6ec5 23-Apr-2012 Christoph Hellwig <hch@infradead.org> xfs: remove log item from AIL in xfs_qm_dqflush after a shutdown

If a filesystem has been forced shutdown we are never going to write dquots
to disk, which means the dquot items will stay in the AIL forever.
Currently that is not a problem, but a pending change requires us to
empty the AIL before shutting down the filesystem, in which case this
behaviour is lethal. Make sure to remove the log item from the AIL
to allow emptying the AIL on shutdown filesystems.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
1c2ccc66bcef992bec7bad6d52cade66d632d7fb 16-Mar-2012 Gerard Snitselaar <dev@snitselaar.org> fs: xfs: fix section mismatch in linux-next

xfs_qm_exit() is called in init_xfs_fs().

Signed-off-by: Gerard Snitselaar <dev@snitselaar.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
a05931ceb0160deadbd7798d60d01b17f2d81b09 13-Mar-2012 Christoph Hellwig <hch@infradead.org> xfs: remove the global xfs_Gqm structure

If we initialize the slab caches for the quota code when XFS is loaded there
is no need for a global and reference counted quota manager structure. Drop
all this overhead and also fix the error handling during quota initialization.

Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
b84a3a96751f93071c1863f2962273973c8b8f5e 14-Mar-2012 Christoph Hellwig <hch@lst.de> xfs: remove the per-filesystem list of dquots

Instead of keeping a separate per-filesystem list of dquots we can walk
the radix tree for the two places where we need to iterate all quota
structures.

Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
9f920f116426806bfa34c1422742e1bf7b7a2b4b 13-Mar-2012 Christoph Hellwig <hch@infradead.org> xfs: use per-filesystem radix trees for dquot lookup

Replace the global hash tables for looking up in-memory dquot structures
with per-filesystem radix trees to allow scaling to a large number of
in-memory dquot structures.

Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
f8739c3ce2ac9a01515b56026b6a066c0808234b 13-Mar-2012 Christoph Hellwig <hch@infradead.org> xfs: per-filesystem dquot LRU lists

Replace the global dquot lru lists with a per-filesystem one.

Note that the shrinker isn't wired up to the per-superblock VFS shrinker
infrastructure as it would have problems summing up and splitting the counts
for inodes and dquots. I don't think this is a major problem as the quota
cache isn't as intertwined with the inode cache as the dentry cache is,
because an inode that is dropped from the cache will generally release
a dquot reference, but most of the time it won't be the last one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
48776fd22344ad80adcbac0abc9c0da60c6481d2 13-Mar-2012 Christoph Hellwig <hch@infradead.org> xfs: use common code for quota statistics

Switch the quota code over to use the generic XFS statistics infrastructure.
While the legacy /proc/fs/xfs/xqm and /proc/fs/xfs/xqmstats interfaces are
preserved for now the statistics that still have a meaning with the current
code are now also available from /proc/fs/xfs/stats.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
5b03ff1b2444ddf7b8084b7505101e97257aff5a 20-Feb-2012 Christoph Hellwig <hch@infradead.org> xfs: remove xfs_trans_unlocked_item

There is no reason to wake up log space waiters when unlocking inodes or
dquots, and the commit log has no explanation for this function either.

Given that we now have exact log space wakeups everywhere we can assume
the reason for this function was to paper over log space races in earlier
XFS versions.

Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
d0a3fe67e30261bb2018d2a06f33ff3303438c8e 06-Feb-2012 Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com> xfs: change available ranges of softlimit and hardlimit in quota check

In general, quota allows us to use disk blocks and inodes up to each
limit; that is, usage is allowed as long as it does not exceed the limit.
Current xfs treats usage that merely reaches a limit as over it, except in
the disk inode quota check. This patch changes the checks so that usage up
to and including the limit is allowed.

Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Cc: Ben Myers <bpm@sgi.com>
Cc: Alex Elder <elder@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

(cherry picked from commit 20f12d8ac01917d96860f352f67eddd912df0afb)
20f12d8ac01917d96860f352f67eddd912df0afb 06-Feb-2012 Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com> xfs: change available ranges of softlimit and hardlimit in quota check

In general, quota allows us to use disk blocks and inodes up to each
limit; that is, usage is allowed as long as it does not exceed the limit.
Current xfs treats usage that merely reaches a limit as over it, except in
the disk inode quota check. This patch changes the checks so that usage up
to and including the limit is allowed.

Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Cc: Ben Myers <bpm@sgi.com>
Cc: Alex Elder <elder@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
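
The boundary change described in the two entries above (the original commit and its cherry-pick) amounts to the difference between a `>=` and a `>` comparison. The sketch below is only a reading of the commit message, not the patched XFS code: both function names are hypothetical, and treating a limit of 0 as "no limit set" is an assumption of this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Behaviour described as "current": reaching the limit already counts as over. */
bool toy_over_limit_old(uint64_t count, uint64_t limit)
{
    return limit != 0 && count >= limit;
}

/* Behaviour after the change: only usage beyond the limit counts as over. */
bool toy_over_limit_new(uint64_t count, uint64_t limit)
{
    return limit != 0 && count > limit;
}
```

With a limit of 100, a count of exactly 100 is rejected by the first form and accepted by the second; the two forms agree for every other count.
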
92b2e5b31dd2ad2c9273578c2289d17f417fe32d 01-Feb-2012 Christoph Hellwig <hch@infradead.org> xfs: use a normal shrinker for the dquot freelist

Stop reusing dquots from the freelist when allocating new ones directly, and
implement a shrinker that actually follows the specifications for the
interface. The shrinker implementation is still highly suboptimal at this
point, but we can gradually work on it.

This also fixes a bug in the previous lock ordering, where we would take
the hash and dqlist locks inside of the freelist lock against the normal
lock ordering. This is only solvable by introducing the dispose list,
and thus not when using direct reclaim of unused dquots for new allocations.

As a side-effect the quota upper bound and used to free ratio values in
/proc/fs/xfs/xqm are set to 0 as these values don't make any sense in the
new world order.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>

(cherry picked from commit 04da0c8196ac0b12fb6b84f4b7a51ad2fa56d869)
04da0c8196ac0b12fb6b84f4b7a51ad2fa56d869 01-Feb-2012 Christoph Hellwig <hch@infradead.org> xfs: use a normal shrinker for the dquot freelist

Stop reusing dquots from the freelist when allocating new ones directly, and
implement a shrinker that actually follows the specifications for the
interface. The shrinker implementation is still highly suboptimal at this
point, but we can gradually work on it.

This also fixes a bug in the previous lock ordering, where we would take
the hash and dqlist locks inside of the freelist lock against the normal
lock ordering. This is only solvable by introducing the dispose list,
and thus not when using direct reclaim of unused dquots for new allocations.

As a side-effect the quota upper bound and used to free ratio values in
/proc/fs/xfs/xqm are set to 0 as these values don't make any sense in the
new world order.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
36731410834e08c7d15c3980abd6cc4c563c2e87 23-Jan-2012 Chandra Seetharaman <sekharan@us.ibm.com> Define a new function xfs_inode_dquot()

Define a new function xfs_inode_dquot() that takes an inode pointer
and a disk quota type and returns the quota pointer for the specified
quota type.

This simplifies the xfs_qm_dqget() error path significantly.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
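
A helper of the shape described above might look like the following user-space sketch. It is an assumption-laden illustration rather than the function added by the commit: the `toy_*` types, the enum values, and the choice of one dquot pointer per quota type in the inode are all invented for this example.

```c
#include <stddef.h>

enum toy_dqtype { TOY_DQ_USER, TOY_DQ_GROUP, TOY_DQ_PROJ };

struct toy_dquot {
    unsigned int id;
};

/* Hypothetical inode with one attached dquot pointer per quota type. */
struct toy_inode {
    struct toy_dquot *udquot;
    struct toy_dquot *gdquot;
    struct toy_dquot *pdquot;
};

/* Return the dquot attached to ip for the given type, or NULL if none. */
struct toy_dquot *toy_inode_dquot(struct toy_inode *ip, enum toy_dqtype type)
{
    switch (type) {
    case TOY_DQ_USER:
        return ip->udquot;
    case TOY_DQ_GROUP:
        return ip->gdquot;
    case TOY_DQ_PROJ:
        return ip->pdquot;
    }
    return NULL;                    /* unknown type */
}
```

Centralising the type-to-pointer mapping this way lets callers avoid repeating the per-type branching, which is the kind of simplification the commit message attributes to the xfs_qm_dqget() error path.
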
6967b964c1012231f338445f20f877e680cd4cb8 23-Jan-2012 Chandra Seetharaman <sekharan@us.ibm.com> Define a new function xfs_this_quota_on()

Create a new function xfs_this_quota_on() that takes an xfs_mount
data structure and a disk quota type and returns true if the specified
type of quota is ON in the xfs_mount data structure.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
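
The predicate described above can be pictured as a per-type flag test on the mount structure. The sketch below is hypothetical throughout: the `toy_*` names, the flag values, and the single `qflags` field are assumptions for illustration and are not taken from the XFS headers.

```c
#include <stdbool.h>

enum toy_qtype { TOY_Q_USER, TOY_Q_GROUP, TOY_Q_PROJ };

/* Hypothetical "quota is on" bits, one per quota type. */
#define TOY_UQUOTA_ON 0x1u
#define TOY_GQUOTA_ON 0x2u
#define TOY_PQUOTA_ON 0x4u

struct toy_mount {
    unsigned int qflags;            /* which quota types are currently on */
};

/* Return true if quota of the given type is on for this mount. */
bool toy_this_quota_on(const struct toy_mount *mp, enum toy_qtype type)
{
    switch (type) {
    case TOY_Q_USER:
        return (mp->qflags & TOY_UQUOTA_ON) != 0;
    case TOY_Q_GROUP:
        return (mp->qflags & TOY_GQUOTA_ON) != 0;
    case TOY_Q_PROJ:
        return (mp->qflags & TOY_PQUOTA_ON) != 0;
    }
    return false;                   /* unknown type */
}
```
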
7ae4440723a413c7a52edd27f654c34680dd4ea2 06-Dec-2011 Christoph Hellwig <hch@infradead.org> xfs: remove XFS_QMOPT_DQSUSER

Just read the id 0 dquot from disk directly in xfs_qm_init_quotainfo instead
of going through dqget and requiring a special flag to not add the dquot to
any lists.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
97e7ade506cdd7157d8b64c77696c082fb997476 06-Dec-2011 Christoph Hellwig <hch@infradead.org> xfs: kill xfs_qm_idtodq

This function doesn't help the code flow, so merge the dquot allocation and
transaction handling into xfs_qm_dqread.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
49d35a5cf115d9273edb8aa7e527502411b77712 06-Dec-2011 Christoph Hellwig <hch@infradead.org> xfs: merge xfs_qm_dqinit_core into the only caller

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
78e55892d65ea69fbf252e086375d0d8f081b6c8 06-Dec-2011 Christoph Hellwig <hch@infradead.org> xfs: add a xfs_dqhold helper

Factor the common pattern of:

xfs_dqlock(dqp);
XFS_DQHOLD(dqp);
xfs_dqunlock(dqp);

into a new helper, and remove XFS_DQHOLD now that only one other caller
is left.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
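
For readers unfamiliar with the pattern, a helper that wraps the lock / take reference / unlock sequence quoted above could be sketched like this. The sketch is not the helper from the commit: the `toy_*` names are hypothetical, the lock is a plain flag rather than a real mutex, and returning the dquot pointer is an assumed convenience.

```c
struct toy_dquot {
    int          locked;            /* toy stand-in for the dquot lock */
    unsigned int nrefs;             /* reference count protected by the lock */
};

static void toy_dqlock(struct toy_dquot *dqp)
{
    dqp->locked = 1;
}

static void toy_dqunlock(struct toy_dquot *dqp)
{
    dqp->locked = 0;
}

/* Take one reference under the dquot lock and hand the dquot back. */
struct toy_dquot *toy_dqhold(struct toy_dquot *dqp)
{
    toy_dqlock(dqp);
    dqp->nrefs++;
    toy_dqunlock(dqp);
    return dqp;
}
```
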
bf72de3194e73fa210a904b0bd951135286bb385 06-Dec-2011 Christoph Hellwig <hch@infradead.org> xfs: nest qm_dqfrlist_lock inside the dquot qlock

Allow xfs_qm_dqput to work without trylock loops by nesting the freelist lock
inside the dquot qlock. In turn that requires trylocks in the reclaim path
instead, but that is a classic tradeoff between the fast and slow paths, and
it follows the model of the inode and dentry caches.

Document our new lock order now that it has settled.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
92678554abfc2a2f2727ad168da87d8d434ac904 06-Dec-2011 Christoph Hellwig <hch@infradead.org> xfs: flatten the dquot lock ordering

Introduce a new XFS_DQ_FREEING flag that tells lookup and mplist walks
to skip a dquot that is being freed, and use this to avoid the trylock
on the hash and mplist locks in xfs_qm_dqreclaim_one. Also simplify
xfs_dqpurge by moving the inodes to a dispose list after marking them
XFS_DQ_FREEING and avoid the lock ordering constraints.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
be7ffc38a80a78e6b68d0f51fae8e8d57b55324c 06-Dec-2011 Christoph Hellwig <hch@infradead.org> xfs: implement lazy removal for the dquot freelist

Do not remove dquots from the freelist when we grab a reference to them in
xfs_qm_dqlookup, but leave them on the freelist until scanning notices that
they have a reference. This speeds up the lookup fastpath, and greatly
simplifies the lock ordering constraints. Note that the same scheme is
used by the VFS inode and dentry caches.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
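
The lazy-removal scheme described above can be illustrated with a small user-space sketch. Everything here is hypothetical (the `toy_*` names, the array used as a freelist, the `freed` flag), and locking is left out entirely; the sketch only shows the division of labour: lookup takes a reference and leaves the freelist alone, and the later scan is what drops referenced entries from the list.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_dq {
    unsigned int nrefs;
    bool         on_freelist;
    bool         freed;
};

/* Lookup fast path: take a reference, do not touch the freelist. */
void toy_dq_lookup(struct toy_dq *dq)
{
    dq->nrefs++;
}

/*
 * Reclaim scan: walk the freelist, lazily drop entries that gained a
 * reference since they were added, and free the rest. Returns the
 * number of entries freed.
 */
int toy_dq_reclaim_scan(struct toy_dq **freelist, size_t n)
{
    int nfreed = 0;

    for (size_t i = 0; i < n; i++) {
        struct toy_dq *dq = freelist[i];

        if (!dq->on_freelist)
            continue;
        dq->on_freelist = false;    /* leaves the list either way */
        if (dq->nrefs > 0)
            continue;               /* in use again: just unlink it */
        dq->freed = true;
        nfreed++;
    }
    return nfreed;
}
```

The cost of noticing that an entry is back in use moves from every lookup to the comparatively rare scan, which is the fast-path benefit the commit message describes.
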
80a376bfb7f8ff8f1942cb1bdd0052e908918252 06-Dec-2011 Christoph Hellwig <hch@infradead.org> xfs: remove XFS_DQ_INACTIVE

Free dquots when purging them during umount instead of keeping them around
on the freelist in a degraded state. The out of order locking in
xfs_qm_dqpurge will be removed again later in this series.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
497507b9ee63499d364ad7149c584285cd925dfc 06-Dec-2011 Christoph Hellwig <hch@infradead.org> xfs: cleanup xfs_qm_dqlookup

Rearrange the code to avoid the conditional locking around the flist_locked
variable. This means we lose a (rather pointless) assert, and hold the
freelist lock a bit longer for one corner case.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
800b484ec0262946262ad20561a8081fd31f4a6f 06-Dec-2011 Christoph Hellwig <hch@infradead.org> xfs: cleanup dquot locking helpers

Mark the trivial lock wrappers as inline, and make the naming consistent
for all of them.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
fdedf28b9492d69976110d12cc0d02d33c8ea7ea 06-Dec-2011 Christoph Hellwig <hch@infradead.org> xfs: untangle SYNC_WAIT and SYNC_TRYLOCK meanings for xfs_qm_dqflush

Only skip pinned dquots if SYNC_TRYLOCK is specified, and adjust the callers
to keep the behaviour unchanged.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
38f23232449c9d2c0bc8e9541cb8ab08b7c2b9ce 10-Oct-2011 Christoph Hellwig <hch@infradead.org> xfs: remove XFS_BUF_SET_VTYPE and XFS_BUF_SET_VTYPE_REF

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
ddc3415aba1cb2f86d1fcad720cea834ee178f54 19-Sep-2011 Christoph Hellwig <hch@infradead.org> xfs: simplify xfs_trans_ijoin* again

There is no reason to keep a reference to the inode even if we unlock
it during transaction commit because we never drop a reference between
the ijoin and commit. Also use this fact to merge xfs_trans_ijoin_ref
back into xfs_trans_ijoin - the third argument decides if an unlock
is needed now.

I'm actually starting to wonder if allowing inodes to be unlocked
at transaction commit really is worth the effort. The only real
benefit is that they can be unlocked earlier when committing
synchronous transactions, but that could be solved by doing the
log force manually after the unlock, too.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
c0dc7828af6952643219292be29e482ef74cb261 18-Sep-2011 Dave Chinner <dchinner@redhat.com> xfs: rename xfs_bmapi to xfs_bmapi_write

Now that all the read-only users of xfs_bmapi have been converted to
use xfs_bmapi_read(), we can remove all the read-only handling cases
from xfs_bmapi().

Once this is done, rename xfs_bmapi to xfs_bmapi_write to reflect
the fact it is for allocation only. This enables us to kill the
XFS_BMAPI_WRITE flag as well.

Also clean up xfs_bmapi_write to the style used in the newly added
xfs_bmapi_read/delay functions.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
5c8ed2021ff291f5e399a9b43c4f699b2fc58fbb 18-Sep-2011 Dave Chinner <dchinner@redhat.com> xfs: introduce xfs_bmapi_read()

xfs_bmapi() currently handles both extent map reading and
allocation. As a result, the code is littered with "if (wr)"
branches to conditionally do allocation operations if required.
This makes the code much harder to follow and causes significant
indent issues with the code.

Given that read mapping is much simpler than allocation, we can
split out read mapping from xfs_bmapi() and reuse the logic that
we have already factored out to do all the hard work of handling the
extent map manipulations. This results in a much simpler function for
the common extent read operations, and will allow the allocation
code to be simplified in another commit.

Once xfs_bmapi_read() is implemented, convert all the callers of
xfs_bmapi() that are only reading extents to use the new function.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
2a30f36d9069b0646dcdd73def5fd7ab674bffd6 20-Sep-2011 Chandra Seetharaman <sekharan@us.ibm.com> xfs: Check the return value of xfs_trans_get_buf()

Check the return value of xfs_trans_get_buf() and fail
appropriately.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
c2b006c1da1602551def200e4661535f02b82488 23-Aug-2011 Christoph Hellwig <hch@infradead.org> xfs: let xfs_bwrite callers handle the xfs_buf_relse

Remove the xfs_buf_relse from xfs_bwrite and let the caller handle it to
mirror the delwri and read paths.

Also remove the mount pointer passed to xfs_bwrite, which is superfluous now
that we have a mount pointer in the buftarg.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
61551f1ee536289084a4a8f1c4f187e2f371c440 23-Aug-2011 Christoph Hellwig <hch@infradead.org> xfs: call xfs_buf_delwri_queue directly

Unify the ways we add buffers to the delwri queue by always calling
xfs_buf_delwri_queue directly. The xfs_bdwrite function is removed and
opencoded in its callers, and the two places setting XBF_DELWRI while a
buffer is locked and expecting xfs_buf_unlock to pick it up are converted
to call xfs_buf_delwri_queue directly, too. Also replace the
XFS_BUF_UNDELAYWRITE macro with direct calls to xfs_buf_delwri_dequeue
to make the explicit queuing/dequeuing more obvious.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
c59d87c460767bc35dafd490139d3cfe78fb8da4 12-Aug-2011 Christoph Hellwig <hch@infradead.org> xfs: remove subdirectories

Use the move from Linux 2.6 to Linux 3.x as an excuse to kill the
annoying subdirectories in the XFS source code. Besides the large
number of file renames the only changes are to the Makefile, a few
files including headers with the subdirectory prefix, and the binary
sysctl compat code that includes a header under fs/xfs/ from
kernel/.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>