History log of /drivers/md/dm-snap.c
Revision Date Author Comments
743162013d40ca612b4cb53d3a200dff2d9ab26e 07-Jul-2014 NeilBrown <neilb@suse.de> sched: Remove proliferation of wait_on_bit() action functions

The current "wait_on_bit" interface requires an 'action'
function to be provided which does the actual waiting.
There are over 20 such functions, many of them identical.
Most cases can be satisfied by one of just two functions, one
which uses io_schedule() and one which just uses schedule().

So:
Rename wait_on_bit and wait_on_bit_lock to
wait_on_bit_action and wait_on_bit_lock_action
to make it explicit that they need an action function.

Introduce new wait_on_bit{,_lock} and wait_on_bit{,_lock}_io
which are *not* given an action function but implicitly use
a standard one.
The decision to error-out if a signal is pending is now made
based on the 'mode' argument rather than being encoded in the action
function.

All instances of the old wait_on_bit and wait_on_bit_lock which
can use the new version have been changed accordingly and their
action functions have been discarded.
wait_on_bit{_lock} does not return any specific error code in the
event of a signal so the caller must check for non-zero and
interpolate their own error code as appropriate.

The wait_on_bit() call in __fscache_wait_on_invalidate() was
ambiguous as it specified TASK_UNINTERRUPTIBLE but used
fscache_wait_bit_interruptible as an action function.
David Howells confirms this should be uniformly
"uninterruptible"

The main remaining user of wait_on_bit{,_lock}_action is NFS
which needs to use a freezer-aware schedule() call.

A comment in fs/gfs2/glock.c notes that having multiple 'action'
functions is useful as they display differently in the 'wchan'
field of 'ps'. (and /proc/$PID/wchan).
As the new bit_wait{,_io} functions are tagged "__sched", they
will not show up at all, but something higher in the stack. So
the distinction will still be visible, only with different
function names (gds2_glock_wait versus gfs2_glock_dq_wait in the
gfs2/glock.c case).

Since first version of this patch (against 3.15) two new action
functions appeared, on in NFS and one in CIFS. CIFS also now
uses an action function that makes the same freezer aware
schedule call as NFS.

Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: David Howells <dhowells@redhat.com> (fscache, keys)
Acked-by: Steven Whitehouse <swhiteho@redhat.com> (gfs2)
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steve French <sfrench@samba.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140707051603.28027.72349.stgit@notabene.brown
Signed-off-by: Ingo Molnar <mingo@kernel.org>
298eaa89b02e88dc9081f8761a957f7cd5e8b201 14-Mar-2014 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: do not split read bios sent to snapshot-origin target

Change the snapshot-origin target so that only write bios are split on
chunk boundary. Read bios are passed unchanged to the underlying
device, so they don't have to be split.

Later, we could change the target so that it accepts a larger write bio
if it spans an area that is completely covered by snapshot exceptions.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
599cdf3bfbe21fe94f4416c9e54363b285c9532a 14-Mar-2014 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: allocate a per-target structure for snapshot-origin target

Allocate a per-target dm_origin structure. This is a prerequisite for
the next commit ("dm snapshot: do not split read bios sent to
snapshot-origin target") which adds a new member to this structure.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4e857c58efeb99393cba5a5d0d8ec7117183137c 17-Mar-2014 Peter Zijlstra <peterz@infradead.org> arch: Mass conversion of smp_mb__*()

Mostly scripted conversion of the smp_mb__* barriers.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-arch@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
119bc547362e5252074f81f56b8fcdac45cedff4 14-Jan-2014 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: use GFP_KERNEL when initializing exceptions

The list of initial exceptions is loaded in the target constructor. We
are allowed to allocate memory with GFP_KERNEL at this point. So,
change alloc_completed_exception to use GFP_KERNEL when being called
from the constructor.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
230c83afdd9cd384348475bea1e14b80b3b6b1b8 30-Nov-2013 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: avoid snapshot space leak on crash

There is a possible leak of snapshot space in case of crash.

The reason for space leaking is that chunks in the snapshot device are
allocated sequentially, but they are finished (and stored in the metadata)
out of order, depending on the order in which copying finished.

For example, supposed that the metadata contains the following records
SUPERBLOCK
METADATA (blocks 0 ... 250)
DATA 0
DATA 1
DATA 2
...
DATA 250

Now suppose that you allocate 10 new data blocks 251-260. Suppose that
copying of these blocks finish out of order (block 260 finished first
and the block 251 finished last). Now, the snapshot device looks like
this:
SUPERBLOCK
METADATA (blocks 0 ... 250, 260, 259, 258, 257, 256)
DATA 0
DATA 1
DATA 2
...
DATA 250
DATA 251
DATA 252
DATA 253
DATA 254
DATA 255
METADATA (blocks 255, 254, 253, 252, 251)
DATA 256
DATA 257
DATA 258
DATA 259
DATA 260

Now, if the machine crashes after writing the first metadata block but
before writing the second metadata block, the space for areas DATA 250-255
is leaked, it contains no valid data and it will never be used in the
future.

This patch makes dm-snapshot complete exceptions in the same order they
were allocated, thus fixing this bug.

Note: when backporting this patch to the stable kernel, change the version
field in the following way:
* if version in the stable kernel is {1, 11, 1}, change it to {1, 12, 0}
* if version in the stable kernel is {1, 10, 0} or {1, 10, 1}, change it
to {1, 10, 2}
Userspace reads the version to determine if the bug was fixed, so the
version change is needed.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
196d38bccfcfa32faed8c561868336fdfa0fe8e4 24-Nov-2013 Kent Overstreet <kmo@daterainc.com> block: Generic bio chaining

This adds a generic mechanism for chaining bio completions. This is
going to be used for a bio_split() replacement, and it turns out to be
very useful in a fair amount of driver code - a fair number of drivers
were implementing this in their own roundabout ways, often painfully.

Note that this means it's no longer to call bio_endio() more than once
on the same bio! This can cause problems for drivers that save/restore
bi_end_io. Arguably they shouldn't be saving/restoring bi_end_io at all
- in all but the simplest cases they'd be better off just cloning the
bio, and immutable biovecs is making bio cloning cheaper. But for now,
we add a bio_endio_nodec() for these cases.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
4f024f3797c43cb4b73cd2c50cec728842d0e49e 12-Oct-2013 Kent Overstreet <kmo@daterainc.com> block: Abstract out bvec iterator

Immutable biovecs are going to require an explicit iterator. To
implement immutable bvecs, a later patch is going to add a bi_bvec_done
member to this struct; for now, this patch effectively just renames
things.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Yehuda Sadeh <yehuda@inktank.com>
Cc: Sage Weil <sage@inktank.com>
Cc: Alex Elder <elder@inktank.com>
Cc: ceph-devel@vger.kernel.org
Cc: Joshua Morris <josh.h.morris@us.ibm.com>
Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: dm-devel@redhat.com
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: Boaz Harrosh <bharrosh@panasas.com>
Cc: Benny Halevy <bhalevy@tonian.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Chris Mason <chris.mason@fusionio.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Dave Kleikamp <shaggy@kernel.org>
Cc: Joern Engel <joern@logfs.org>
Cc: Prasad Joshi <prasadjoshi.linux@gmail.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Ben Myers <bpm@sgi.com>
Cc: xfs@oss.sgi.com
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jerome Marchand <jmarchand@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Peng Tao <tao.peng@emc.com>
Cc: Andy Adamson <andros@netapp.com>
Cc: fanchaoting <fanchaoting@cn.fujitsu.com>
Cc: Jie Liu <jeff.liu@oracle.com>
Cc: Sunil Mushran <sunil.mushran@gmail.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Namjae Jeon <namjae.jeon@samsung.com>
Cc: Pankaj Kumar <pankaj.km@samsung.com>
Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Mel Gorman <mgorman@suse.de>6
60e356f381954d79088d0455e357db48cfdd6857 19-Sep-2013 Mikulas Patocka <mpatocka@redhat.com> dm-snapshot: fix performance degradation due to small hash size

LVM2, since version 2.02.96, creates origin with zero size, then loads
the snapshot driver and then loads the origin. Consequently, the
snapshot driver sees the origin size zero and sets the hash size to the
lower bound 64. Such small hash table causes performance degradation.

This patch changes it so that the hash size is determined by the size of
snapshot volume, not minimum of origin and snapshot size. It doesn't
make sense to set the snapshot size significantly larger than the origin
size, so we do not need to take origin size into account when
calculating the hash size.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
09e8b813897a0f85bb401435d009228644c81214 10-May-2013 Wei Yongjun <yongjun_wei@trendmicro.com.cn> dm snapshot: fix error return code in snapshot_ctr

Return -ENOMEM instead of success if unable to allocate pending
exception mempool in snapshot_ctr.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: stable@vger.kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
df5d2e9089c7d5b8c46f767e4278610ea3e815b9 01-Mar-2013 Mikulas Patocka <mpatocka@redhat.com> dm kcopyd: introduce configurable throttling

This patch allows the administrator to reduce the rate at which kcopyd
issues I/O.

Each module that uses kcopyd acquires a throttle parameter that can be
set in /sys/module/*/parameters.

We maintain a history of kcopyd usage by each module in the variables
io_period and total_period in struct dm_kcopyd_throttle. The actual
kcopyd activity is calculated as a percentage of time equal to
"(100 * io_period / total_period)". This is compared with the user-defined
throttle percentage threshold and if it is exceeded, we sleep.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
23cb21092eb9dcec9d3604b68d95192b79915890 01-Mar-2013 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: add missing module aliases

Add module aliases so that autoloading works correctly if the user
tries to activate "snapshot-origin" or "snapshot-merge" targets.

Reference: https://bugzilla.redhat.com/889973

Reported-by: Chao Yang <chyang@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
55a62eef8d1b50ceff3b7bf46851103bdcc7e5b0 01-Mar-2013 Alasdair G Kergon <agk@redhat.com> dm: rename request variables to bios

Use 'bio' in the name of variables and functions that deal with
bios rather than 'request' to avoid confusion with the normal
block layer use of 'request'.

No functional changes.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
fd7c092e711ebab55b2688d3859d95dfd0301f73 01-Mar-2013 Mikulas Patocka <mpatocka@redhat.com> dm: fix truncated status strings

Avoid returning a truncated table or status string instead of setting
the DM_BUFFER_FULL_FLAG when the last target of a table fills the
buffer.

When processing a table or status request, the function retrieve_status
calls ti->type->status. If ti->type->status returns non-zero,
retrieve_status assumes that the buffer overflowed and sets
DM_BUFFER_FULL_FLAG.

However, targets don't return non-zero values from their status method
on overflow. Most targets returns always zero.

If a buffer overflow happens in a target that is not the last in the
table, it gets noticed during the next iteration of the loop in
retrieve_status; but if a buffer overflow happens in the last target, it
goes unnoticed and erroneously truncated data is returned.

In the current code, the targets behave in the following way:
* dm-crypt returns -ENOMEM if there is not enough space to store the
key, but it returns 0 on all other overflows.
* dm-thin returns errors from the status method if a disk error happened.
This is incorrect because retrieve_status doesn't check the error
code, it assumes that all non-zero values mean buffer overflow.
* all the other targets always return 0.

This patch changes the ti->type->status function to return void (because
most targets don't use the return code). Overflow is detected in
retrieve_status: if the status method fills up the remaining space
completely, it is assumed that buffer overflow happened.

Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
b67bfe0d42cac56c512dd5da4b1b347a23f4b70a 28-Feb-2013 Sasha Levin <sasha.levin@oracle.com> hlist: drop the node parameter from iterators

I'm not sure why, but the hlist for each entry iterators were conceived

list_for_each_entry(pos, head, member)

The hlist ones were greedy and wanted an extra parameter:

hlist_for_each_entry(tpos, pos, head, member)

Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.

Besides the semantic patch, there was some manual work required:

- Fix up the actual hlist iterators in linux/list.h
- Fix up the declaration of other iterators based on the hlist ones.
- A very small amount of places were using the 'node' parameter, this
was modified to use 'obj->member' instead.
- Coccinelle didn't handle the hlist_for_each_entry_safe iterator
properly, so those had to be fixed up manually.

The semantic patch which is mostly the work of Peter Senna Tschudin is here:

@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

type T;
expression a,c,d,e;
identifier b;
statement S;
@@

-T b;
<+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
...+>

[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7de3ee57da4b717050e79c9313a9bf66ccc72519 21-Dec-2012 Mikulas Patocka <mpatocka@redhat.com> dm: remove map_info

This patch removes map_info from bio-based device mapper targets.
map_info is still used for request-based targets.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
ee18026ac69efba804144541171299efd41747d2 21-Dec-2012 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: do not use map_context

Eliminate struct map_info from dm-snap.

map_info->ptr was used in dm-snap to indicate if the bio was tracked.
If map_info->ptr was non-NULL, the bio was linked in tracked_chunk_hash.

This patch removes the use of map_info->ptr. We determine if the bio was
tracked based on hlist_unhashed(&c->node). If hlist_unhashed is true,
the bio is not tracked, if it is false, the bio is tracked.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
ddbd658f6446a35e4d6ba84812fd71023320cae2 21-Dec-2012 Mikulas Patocka <mpatocka@redhat.com> dm: move target request nr to dm_target_io

This patch moves target_request_nr from map_info to dm_target_io and
makes it accessible with dm_bio_get_target_request_nr.

This patch is a preparation for the next patch that removes map_info.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
42bc954f2a4525c9034667dedc9bd1c342208013 21-Dec-2012 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: use per_bio_data

Replace tracked_chunk_pool with per_bio_data in dm-snap.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
9aa0c0e60ffc2594acaad23127dbea9f3b61821c 21-Dec-2012 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: optimize track_chunk

track_chunk is always called with interrupts enabled. Consequently, we
do not need to save and restore interrupt state in "flags" variable.
This patch changes spin_lock_irqsave to spin_lock_irq and
spin_unlock_irqrestore to spin_unlock_irq.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
1f4e0ff07980820977f45d6a5dbc81d3bb9ce4d3 27-Jul-2012 Alasdair G Kergon <agk@redhat.com> dm thin: commit before gathering status

Commit outstanding metadata before returning the status for a dm thin
pool so that the numbers reported are as up-to-date as possible.

The commit is not performed if the device is suspended or if
the DM_NOFLUSH_FLAG is supplied by userspace and passed to the target
through a new 'status_flags' parameter in the target's dm_status_fn.

The userspace dmsetup tool will support the --noflush flag with the
'dmsetup status' and 'dmsetup wait' commands from version 1.02.76
onwards.

Tested-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
542f90381422676544382d4071ba44a2de90a0c1 27-Jul-2012 Mike Snitzer <snitzer@redhat.com> dm: support non power of two target max_io_len

Remove the restriction that limits a target's specified maximum incoming
I/O size to be a power of 2.

Rename this setting from 'split_io' to the less-ambiguous 'max_io_len'.
Change it from sector_t to uint32_t, which is plenty big enough, and
introduce a wrapper function dm_set_target_max_io_len() to set it.
Use sector_div() to process it now that it is not necessarily a power of 2.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
70c48611024791ccf83aca6195b58a5db9325485 27-Jul-2012 Alasdair G Kergon <agk@redhat.com> dm snapshot: remove redundant assignment in merge fn

Remove redundant bvm->bi_sector self-assignment in dm snapshot's
origin_merge().

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
a6e50b409d3f9e0833e69c3c9cca822e8fa4adbb 02-Aug-2011 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: skip reading origin when overwriting complete chunk

If we write a full chunk in the snapshot, skip reading the origin device
because the whole chunk will be overwritten anyway.

This patch changes the snapshot write logic when a full chunk is written.
In this case:
1. allocate the exception
2. dispatch the bio (but don't report the bio completion to device mapper)
3. write the exception record
4. report bio completed

Callbacks must be done through the kcopyd thread, because callbacks must not
race with each other. So we create two new functions:

dm_kcopyd_prepare_callback: allocate a job structure and prepare the callback.
(This function must not be called from interrupt context.)

dm_kcopyd_do_callback: submit callback.
(This function may be called from interrupt context.)

Performance test (on snapshots with 4k chunk size):
without the patch:
non-direct-io sequential write (dd): 17.7MB/s
direct-io sequential write (dd): 20.9MB/s
non-direct-io random write (mkfs.ext2): 0.44s

with the patch:
non-direct-io sequential write (dd): 26.5MB/s
direct-io sequential write (dd): 33.2MB/s
non-direct-io random write (mkfs.ext2): 0.27s

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
a2d2b0345a0f30c169b7d08b8cebdd4853fcb0f8 02-Aug-2011 Jonathan Brassow <jbrassow@redhat.com> dm snapshot: style cleanups

Coding style cleanups.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
aa3f0794d279cd154ac100f92ff3904ea1f56de2 02-Aug-2011 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: remove unused definitions

Remove a couple of unused #defines.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
fa34ce73072f90ecd90dcc43f29d82e70e5f8676 29-May-2011 Mikulas Patocka <mpatocka@redhat.com> dm kcopyd: return client directly and not through a pointer

Return client directly from dm_kcopyd_client_create, not through a
parameter, making it consistent with dm_io_client_create.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
5f43ba2950414dc0abf4ac44c397d88069056746 29-May-2011 Mikulas Patocka <mpatocka@redhat.com> dm kcopyd: reserve fewer pages

Reserve just the minimum of pages needed to process one job.

Because we allocate pages from page allocator, we don't need to reserve
a large number of pages. The maximum job size is SUB_JOB_SIZE and we
calculate the number of reserved pages based on this.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
024d37e95ec4a7ccc256973ab2feab01f4fbdd2d 24-Mar-2011 Milan Broz <mbroz@redhat.com> dm: fix opening log and cow devices for read only tables

If a table is read-only, also open any log and cow devices it uses read-only.

Previously, even read-only devices were opened read-write internally.
After patch 75f1dc0d076d1c1168f2115f1941ea627d38bd5a
block: check bdev_read_only() from blkdev_get()
was applied, loading such tables began to fail. The patch
was reverted by e51900f7d38cbcfb481d84567fd92540e7e1d23a
block: revert block_dev read-only check
but this patch fixes this part of the code to work with the original patch.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
b83b2f295aec418c7501c05df4dfd168a79d165a 13-Jan-2011 Mike Snitzer <snitzer@redhat.com> dm snapshot: avoid storing private suspended state

Use dm_suspended() rather than having each snapshot target maintain a
private 'suspended' flag in struct dm_snapshot.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
fecec20e55ec117a09857ac1a455e2e6e2f17df4 13-Jan-2011 Tejun Heo <tj@kernel.org> dm snapshot: remove unused dm_snapshot queued_bios_work

dm_snapshot->queued_bios_work isn't used. Remove ->queued_bios[_work]
from dm_snapshot structure, the flush_queued_bios work function and
ksnapd workqueue.

The DM snapshot changes that were going to use the ksnapd workqueue were
either superseded (fix for origin write races) or never completed
(deallocation of invalid snapshot's memory via workqueue).

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
c8bf1336824ebd698d37b71763e1c43190f2229a 10-Sep-2010 Martin K. Petersen <martin.petersen@oracle.com> Consolidate min_not_zero

We have several users of min_not_zero, each of them using their own
definition. Move the define to kernel.h.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>
d87f4c14f27dc82d215108d8392a7d26687148a1 03-Sep-2010 Tejun Heo <tj@kernel.org> dm: implement REQ_FLUSH/FUA support for bio-based dm

This patch converts bio-based dm to support REQ_FLUSH/FUA instead of
now deprecated REQ_HARDBARRIER.

* -EOPNOTSUPP handling logic dropped.

* Preflush is handled as before but postflush is dropped and replaced
with passing down REQ_FUA to member request_queues. This replaces
one array wide cache flush w/ member specific FUA writes.

* __split_and_process_bio() now calls __clone_and_map_flush() directly
for flushes and guarantees all FLUSH bio's going to targets are zero
` length.

* It's now guaranteed that all FLUSH bio's which are passed onto dm
targets are zero length. bio_empty_barrier() tests are replaced
with REQ_FLUSH tests.

* Empty WRITE_BARRIERs are replaced with WRITE_FLUSHes.

* Dropped unlikely() around REQ_FLUSH tests. Flushes are not unlikely
enough to be marked with unlikely().

* Block layer now filters out REQ_FLUSH/FUA bio's if the request_queue
doesn't support cache flushing. Advertise REQ_FLUSH | REQ_FUA
capability.

* Request based dm isn't converted yet. dm_init_request_based_queue()
resets flush support to 0 for now. To avoid disturbing request
based dm code, dm->flush_error is added for bio based dm while
requested based dm continues to use dm->barrier_error.

Lightly tested linear, stripe, raid1, snap and crypt targets. Please
proceed with caution as I'm not familiar with the code base.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: dm-devel@redhat.com
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
57cba5d3658d9fdc019c6af14a2d80aefa651e56 12-Aug-2010 Mike Snitzer <snitzer@redhat.com> dm: rename map_info flush_request to target_request_nr

'target_request_nr' is a more generic name that reflects the fact that
it will be used for both flush and discard support.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
b1d5552838334c600b068c9c8cc18638e5a8cb47 12-Aug-2010 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: implement merge

Implement merge method for the snapshot origin to improve read
performance.

Without merge method, dm asks the upper layers to submit smallest possible
bios --- one page. Submitting such small bios impacts performance negatively
when reading or writing the origin device.

Without this patch, CPU consumption when reading the origin on lvm on md-raid0
was 6 to 12%, with this patch, it drops to 1 to 4%.

Note: in my testing, it actually degraded performance in some settings, I
traced it to Maxtor disks having problems with > 512-sector requests.
Reducing the number of sectors to /sys/block/sd*/queue/max_sectors_kb to
256 fixed the read performance. I think we don't have to care about weird
disks that actually degrade performance because of large requests being
sent to them.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
c24110450650f17f7d3ba4fbe01f01ac5a115456 12-Aug-2010 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: test chunk size against both origin and snapshot

Validate chunk size against both origin and snapshot sector size

Don't allow chunk size smaller than either origin or snapshot logical
sector size. Reading or writing data not aligned to sector size is not
allowed and causes immediate errors.

This requires us to open the origin before initialising the
exception store and to export dm_snap_origin.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
1e5554c8428bc7209a83e2d07ca724be4d981ce3 12-Aug-2010 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: iterate origin and cow devices

Iterate both origin and snapshot devices

iterate_devices method should call the callback for all the devices where
the bio may be remapped. Thus, snapshot_iterate_devices should call the callback
for both snapshot and origin underlying devices because it remaps some bios
to the snapshot and some to the origin.

snapshot_iterate_devices called the callback only for the origin device.
This led to badly calculated device limits if snapshot and origin were placed
on different types of disks.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
924e600d417ead9ef67043988055ba236f114718 06-Mar-2010 Mike Snitzer <snitzer@redhat.com> dm: eliminate some holes data structures

Eliminate a 4-byte hole in 'struct dm_io_memory' by moving 'offset' above the
'ptr' to which it applies (size reduced from 24 to 16 bytes). And by
association, 1-4 byte hole is eliminated in 'struct dm_io_request' (size
reduced from 56 to 48 bytes).

Eliminate all 6 4-byte holes and 1 cache-line in 'struct dm_snapshot' (size
reduced from 392 to 368 bytes).

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
8215d6ec5fee1e76545decea2cd73717efb5cb42 06-Mar-2010 Nikanth Karthikesan <knikanth@novell.com> dm table: remove unused dm_get_device range parameters

Remove unused parameters(start and len) of dm_get_device()
and fix the callers.

Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
d2fdb776e08d4231d7e86a879cc663a93913c202 11-Dec-2009 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: use merge origin if snapshot invalid

If the snapshot we are merging became invalid (e.g. it ran out of
space) redirect all I/O directly to the origin device.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
d8ddb1cfff0070479c1f4a07c1d4a48ef8cb188e 11-Dec-2009 Mike Snitzer <snitzer@redhat.com> dm snapshot: report merge failure in status

Set 'merge_failed' flag if a snapshot fails to merge. Update
snapshot_status() to report "Merge failed" if 'merge_failed' is set.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
8a2d528620e228ddfd0df9cec0a16e034ff8db1d 11-Dec-2009 Mike Snitzer <snitzer@redhat.com> dm snapshot: merge consecutive chunks together

s->store->type->prepare_merge returns the number of chunks that can be
copied linearly working backwards from the returned chunk number.

For example, if it returns 3 chunks with old_chunk == 10 and new_chunk
== 20, then chunk 20 can be copied to 10, chunk 19 to 9 and 18 to 8.

Until now kcopyd only copied one chunk at a time. This patch now copies
the full set at once.

Consequently, snapshot_merge_process() needs to delay the merging of all
chunks if any have writes in progress, not just the first chunk in the
region that is to be merged.

snapshot-merge's performance is now comparable to the original
snapshot-origin target.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
73dfd078cf8bfee4018fb22f1e2a24f2e05b69dc 11-Dec-2009 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: trigger exceptions in remaining snapshots during merge

When there is one merging snapshot and other non-merging snapshots,
snapshot_merge_process() must make exceptions in the non-merging
snapshots.

Use a sequence count to resolve the race between I/O to chunks that are
about to be merged. The count increases each time an exception
reallocation finishes. Use wait_event() to wait until the count
changes.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
17aa03326d40614db94bc51fbbc92df628a5c97c 11-Dec-2009 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: delay merging a chunk until writes to it complete

Track writes to chunks that are currently being merged and delay merging
a chunk until all writes to that chunk finish.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
9fe862548821b0c206c58e8057b782530a173703 11-Dec-2009 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: queue writes to chunks being merged

While a set of chunks is being merged, any overlapping writes need to be
queued.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
1e03f97e4301f75a2f3b649787f7876516764929 11-Dec-2009 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: add merging

Merging is started when origin is resumed and it is stopped when
origin is suspended or when the merging snapshot is destroyed or
errors are detected.

Merging is not yet interlocked with writes: this will be handled in
subsequent patches.

The code relies on callbacks from a private kcopyd thread.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
9d3b15c4c776b041f9ee81810cbd375275411829 11-Dec-2009 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: permit only one merge at once

Merging more than one snapshot is not supported, so prevent
this happening.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
10b8106a70433e14153469ebbdd189776500e238 11-Dec-2009 Mike Snitzer <snitzer@redhat.com> dm snapshot: support barriers in snapshot merge target

Sets num_flush_requests=2 to support flushing both the origin and cow
devices used by the snapshot-merge target.

Also, snapshot_ctr() now gets the origin device using FMODE_WRITE if the
target is snapshot-merge (which writes to the origin device).

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
3452c2a1eb5b93c1b9fb0d22bd5b07c0cee4dc29 11-Dec-2009 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: avoid allocating exceptions in merge

The snapshot-merge target should not allocate new exceptions because the
intent is to merge all of its exceptions as quickly and safely as
possible.

This patch introduces the snapshot-merge mapping function and updates
__origin_write() so that it doesn't allocate exceptions on any snapshots
that are being merged.

If a write request to a merging snapshot device is to be dispatched
directly to the origin (because the chunk is not remapped or was already
merged), snapshot_merge_map() must make exceptions in other snapshots so
calls do_origin().

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
515ad66cc4c82f210d726340349c8f7c1ec6b125 11-Dec-2009 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: rework writing to origin

To track the completion of exceptions relating to the same location on
the device, the current code selects one exception as primary_pe, links
the other exceptions to it and uses reference counting to wait until all
the reallocations are complete.

It is considered too complicated to extend this code to handle the new
snapshot-merge target, where sets of non-overlapping chunks would also
need to become linked.

Instead, a simpler (but less efficient) approach is taken. Bios are
linked to one exception. When it completes, bios are simply retried,
and if other related exceptions are still outstanding, they'll get
queued again to wait for another one.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
d698aa4500aa3ca9559142060caf0f79da998744 11-Dec-2009 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: add merge target

The snapshot-merge target allows a snapshot to be merged back into the
snapshot's origin device.

One anticipated use of snapshot merging is the rollback of filesystems
to back out problematic system upgrades.

This patch adds snapshot-merge target management to both
dm_snapshot_init() and dm_snapshot_exit(). As an initial place-holder,
snapshot-merge is identical to the snapshot target. Documentation is
provided.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
615d1eb9cad9b34ed17c18c604254e8db533ac6f 11-Dec-2009 Mike Snitzer <snitzer@redhat.com> dm snapshot: create function for chunk_is_tracked wait

Move the __chunk_is_tracked() loop into a separate function as we will
also need to call it from the write path in the rare case of conflicting
writes to the same chunk.

Originally introduced in commit a8d41b59f3f5a7ac19452ef442a7fc1b5fa17366
("dm snapshot: fix race during exception creation").

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
9eaae8ffbc340fc034fed1e5d0dc9ca0e943f817 11-Dec-2009 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: make bio optional in __origin_write

To support the merging of snapshots back into their origin we need
to trigger exceptions in other snapshots not being merged without
any incoming bio on the origin device. The bio parameter to
__origin_write() becomes optional and the sector needs supplying
separately.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
c1f0c183f6acc6d32c5a1d0249ec68bf783af7b1 11-Dec-2009 Mike Snitzer <snitzer@redhat.com> dm snapshot: allow live exception store handover between tables

Permit in-use snapshot exception data to be 'handed over' from one
snapshot instance to another. This is a pre-requisite for patches
that allow the changes made in a snapshot device to be merged back into
its origin device and also allows device resizing.

The basic call sequence is:

dmsetup load new_snapshot (referencing the existing in-use cow device)
- the ctr code detects that the cow is already in use and allows the
two snapshot target instances to be linked together
dmsetup suspend original_snapshot
dmsetup resume new_snapshot
- the new_snapshot becomes live, and if anything now tries to access
the original one it will receive -EIO
dmsetup remove original_snapshot

(There can only be two snapshot targets referencing the same cow device
simultaneously.)

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
c26655ca3ca7550740a63820ee981e5c7c797523 11-Dec-2009 Mike Snitzer <snitzer@redhat.com> dm snapshot: track suspended state in target

Keep track of whether or not the device is suspended within the snapshot
target module, the same as we do in dm-raid1.

We will use this later to enforce the correct sequence of ioctls to
transfer the in-core exceptions from a snapshot target instance in
one table to a replacement one capable of merging them back
into the origin.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
fc56f6fbcca3672c63c93c65f45105faacfc13cb 11-Dec-2009 Mike Snitzer <snitzer@redhat.com> dm snapshot: move cow ref from exception store to snap core

Store the reference to the snapshot cow device in the core snapshot
code instead of each exception store. It can be accessed through the
new function dm_snap_cow(). Exception stores should each now maintain a
reference to their parent snapshot struct.

This is cleaner and makes part of the forthcoming snapshot merge code simpler.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Reviewed-by: Jonathan Brassow <jbrassow@redhat.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>
985903bb3a6d98623360ab6c855417f638840029 11-Dec-2009 Mike Snitzer <snitzer@redhat.com> dm snapshot: add allocated metadata to snapshot status

Add number of sectors used by metadata to the end of the snapshot's status
line.

Renamed dm_exception_store_type's 'fraction_full' to 'usage'. Renamed
arguments to be clearer about what is being returned. Also added
'metadata_sectors'.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
3510cb94ff7b04b016bd22bfee913e2c1c05c066 11-Dec-2009 Jon Brassow <jbrassow@redhat.com> dm snapshot: rename exception functions

Rename exception functions. Preparing to pull them out of
dm-snap.c for broader use.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
191437a53c8269df3a2c6199206781e742c57bb5 11-Dec-2009 Jon Brassow <jbrassow@redhat.com> dm snapshot: rename exception_table to dm_exception_table

Rename exception_table for broader use outside dm-snap.c

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
1d4989c858093bda0426be536fc7f9c415857836 11-Dec-2009 Jon Brassow <jbrassow@redhat.com> dm snapshot: rename dm_snap_exception to dm_exception

The exception structure is not necessarily just a snapshot
element (especially after we pull it out of dm-snap.c).

Renaming appropriately.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
d32a6ea65fbc33621f9c790da3dff10201640b2a 11-Dec-2009 Jon Brassow <jbrassow@redhat.com> dm snapshot: consolidate insert exception functions

Consolidate the insert_*exception functions. 'insert_completed_exception'
already contains all the logic to handle 'insert_exception' (via
check for a hash_shift of 0), so remove redundant function.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
7e201b35132a1f02c931a0a06760766c846bb49b 11-Dec-2009 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: abstract minimum_chunk_size fn

The origin needs to find minimum chunksize of all snapshots. This logic is
moved to a separate function because it will be used at another place in
the snapshot merge patches.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
8e87b9b81b3c370f7e53c1ab6e1c3519ef37a644 11-Dec-2009 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: cope with chunk size larger than origin

Under some special conditions the snapshot hash_size is calculated as zero.
This patch instead sets a minimum value of 64, the same as for the
pending exception table.

rounddown_pow_of_two(0) is an undefined operation (it expands to shift
by -1). init_exception_table with an argument of 0 would fail with -ENOMEM.

The way to trigger the problem is to create a snapshot with a chunk size
that is larger than the origin device.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
94e76572b5dd37b1f0f4b3742ee8a565daead932 11-Dec-2009 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: only take lock for statustype info not table

Take snapshot lock only for STATUSTYPE_INFO, not STATUSTYPE_TABLE.

Commit 4c6fff445d7aa753957856278d4d93bcad6e2c14
(dm-snapshot-lock-snapshot-while-supplying-status.patch)
introduced this use of the lock, but userspace applications using
libdevmapper have been found to request STATUSTYPE_TABLE while the device
is suspended and the lock is already held, leading to deadlock. Since
the lock is not necessary in this case, don't try to take it.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
df96eee679ba28c98cf722fa7c9f4286ee1ed0bd 17-Oct-2009 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: use unsigned integer chunk size

Use unsigned integer chunk size.

Maximum chunk size is 512kB, there won't ever be need to use 4GB chunk size,
so the number can be 32-bit. This fixes compiler failure on 32-bit systems
with large block devices.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
4c6fff445d7aa753957856278d4d93bcad6e2c14 17-Oct-2009 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: lock snapshot while supplying status

This patch locks the snapshot when returning status. It fixes a race
when it could return an invalid number of free chunks if someone
was simultaneously modifying it.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
3f2412dc85260e5aae7ebb03bf50d5b1407e3083 17-Oct-2009 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: require non zero chunk size by end of ctr

If we are creating snapshot with memory-stored exception store, fail if
the user didn't specify chunk size. Zero chunk size would probably crash
a lot of places in the rest of snapshot code.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Jonathan Brassow <jbrassow@redhat.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
034a186d29dbcef099e57ab23ec39440596be911 17-Oct-2009 Jonathan Brassow <jbrassow@redhat.com> dm snapshot: free exception store on init failure

While initializing the snapshot module, if we fail to register
the snapshot target then we must back-out the exception store
module initialization.

Cc: stable@kernel.org
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
6d45d93ead319423099b82a4efd775bc0f159121 17-Oct-2009 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: sort by chunk size to fix race

Avoid a race causing corruption when snapshots of the same origin have
different chunk sizes by sorting the internal list of snapshots by chunk
size, largest first.
https://bugzilla.redhat.com/show_bug.cgi?id=182659

For example, let's have two snapshots with different chunk sizes. The
first snapshot (1) has small chunk size and the second snapshot (2) has
large chunk size. Let's have chunks A, B, C in these snapshots:
snapshot1: ====A==== ====B====
snapshot2: ==========C==========

(Chunk size is a power of 2. Chunks are aligned.)

A write to the origin at a position within A and C comes along. It
triggers reallocation of A, then reallocation of C and links them
together using A as the 'primary' exception.

Then another write to the origin comes along at a position within B and
C. It creates pending exception for B. C already has a reallocation in
progress and it already has a primary exception (A), so nothing is done
to it: B and C are not linked.

If the reallocation of B finishes before the reallocation of C, because
there is no link with the pending exception for C it does not know to
wait for it and, the second write is dispatched to the origin and causes
data corruption in the chunk C in snapshot2.

To avoid this situation, we maintain snapshots sorted in descending
order of chunk size. This leads to a guaranteed ordering on the links
between the pending exceptions and avoids the problem explained above -
both A and B now get linked to C.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
8811f46c1f9386fc7017150de9d52359e5b1826e 04-Sep-2009 Mike Snitzer <snitzer@redhat.com> dm snapshot: implement iterate devices

Implement the .iterate_devices for the origin and snapshot targets.
dm-snapshot's lack of .iterate_devices resulted in the inability to
properly establish queue_limits for both targets.

With 4K sector drives: an unfortunate side-effect of not establishing
proper limits in either targets' DM device was that IO to the devices
would fail even though both had been created without error.

Commit af4874e03ed82f050d5872d8c39ce64bf16b5c38 ("dm target:s introduce
iterate devices fn") in 2.6.31-rc1 should have implemented .iterate_devices
for dm-snap.c's origin and snapshot targets.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
494b3ee7d4f69210def80aecce28d08c3f0755d5 22-Jun-2009 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: support barriers

Flush support for dm-snapshot target.

This patch just forwards the flush request to either the origin or the snapshot
device. (It doesn't flush exception store metadata.)

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
8f3d8ba20e67991b531e9c0227dcd1f99271a32c 07-Apr-2009 Christoph Hellwig <hch@lst.de> block: move bio list helpers into bio.h

It's used by DM and MD and generally useful, so move the bio list
helpers into bio.h.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
1e302a929e2da6e8448e2058e4b07b07252b57fe 02-Apr-2009 Jonathan Brassow <jbrassow@redhat.com> dm snapshot: move status to exception store

Let the exception store types print out their status through
the new API, rather than having the snapshot code do it.

Adjust the buffer position to allow for the preceding DMEMIT in the
arguments to type->status().

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
fee1998e9c690f9920671e1e0ef187a48cfbbde4 02-Apr-2009 Jonathan Brassow <jbrassow@redhat.com> dm snapshot: move ctr parsing to exception store

First step of having the exception stores parse their own arguments -
generalizing the interface.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2e4a31df2b10cbcaf43c333112f6f7440a035c69 02-Apr-2009 Jonathan Brassow <jbrassow@redhat.com> dm snapshot: use DMEMIT macro for status

Use DMEMIT in place of snprintf. This makes it easier later when
other modules are helping to populate our status output.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
ccc45ea8aeffec49fa5985efc3649aa67bb4fcb7 02-Apr-2009 Jonathan Brassow <jbrassow@redhat.com> dm snapshot: remove dm_snap header

Move some of the last bits from dm-snap.h into dm-snap.c where they
belong and remove dm-snap.h.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
71fab00a6bef7fb53119271a8abdbaf40970d28a 02-Apr-2009 Jonathan Brassow <jbrassow@redhat.com> dm snapshot: remove dm_snap header use

Move useful functions out of dm-snap.h and stop using dm-snap.h.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
49beb2b87a972a994ff77633234ca3bf0d30a1d8 02-Apr-2009 Jonathan Brassow <jbrassow@redhat.com> dm exception store: move cow pointer

Move COW device from snapshot to exception store.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
d0216849519bec8dc96301a3cd80316e71243839 02-Apr-2009 Jonathan Brassow <jbrassow@redhat.com> dm exception store: move chunk_fields

Move chunk fields from snapshot to exception store.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
0cea9c78270cdf1d2ad74ce0e083d5555a0842e8 02-Apr-2009 Jonathan Brassow <jbrassow@redhat.com> dm exception store: move dm_target pointer

Move target pointer from snapshot to exception store.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
493df71c6420b211a68ae82b889c1e8a5fe701be 02-Apr-2009 Jonathan Brassow <jbrassow@redhat.com> dm exception store: introduce registry

Move exception stores into a registry.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
b2a114652940ccf7e9668ad447ca78bf16a31139 02-Apr-2009 Jonathan Brassow <jbrassow@redhat.com> dm exception store: separate type from instance

Introduce struct dm_exception_store_type.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
35bf659b008e83e725dcd30f542e38461dbb867c 02-Apr-2009 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: avoid having two exceptions for the same chunk

We need to check if the exception was completed after dropping the lock.

After regaining the lock, __find_pending_exception checks if the exception
was already placed into &s->pending hash.

But we don't check if the exception was already completed and placed into
&s->complete hash. If the process waiting in alloc_pending_exception was
delayed at this point because of a scheduling latency and the exception
was meanwhile completed, we'd miss that and allocate another pending
exception for already completed chunk.

It would lead to a situation where two records for the same chunk exist
and potential data corruption because multiple snapshot I/Os to the
affected chunk could be redirected to different locations in the
snapshot.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
c66213921c816f6b1b16a84911618ba9a363b134 02-Apr-2009 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: avoid dropping lock in __find_pending_exception

It is uncommon and bug-prone to drop a lock in a function that is called with
the lock held, so this is moved to the caller.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2913808eb56a6445a7b277eb8d17651c8defb035 02-Apr-2009 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: refactor __find_pending_exception

Move looking-up of a pending exception from __find_pending_exception to another
function.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
a159c1ac5f33c6cf0f5aa3c9d1ccdc82c907ee46 06-Jan-2009 Jonathan Brassow <jbrassow@redhat.com> dm snapshot: extend exception store functions

Supply dm_add_exception as a callback to the read_metadata function.
Add a status function ready for a later patch and name the functions
consistently.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
4db6bfe02bdc7dc5048f46dd682a94801d029adc 06-Jan-2009 Alasdair G Kergon <agk@redhat.com> dm snapshot: split out exception store implementations

Move the existing snapshot exception store implementations out into
separate files. Later patches will place these behind a new
interface in preparation for alternative implementations.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
aea53d92f70eeb00ae480e399a997dd55fd5055d 06-Jan-2009 Jonathan Brassow <jbrassow@redhat.com> dm snapshot: separate out exception store interface

Pull structures that bridge the gap between snapshot and
exception store out of dm-snap.h and put them in a new
.h file - dm-exception-store.h. This file will define the
API for new exception stores.

Ultimately, dm-snap.h is unnecessary, since only dm-snap.c
should be using it.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
10d3bd09a3c25df114f74f7f86e1b58d070bef32 06-Jan-2009 Mikulas Patocka <mpatocka@redhat.com> dm: consolidate target deregistration error handling

Change dm_unregister_target to return void and use BUG() for error
reporting.

dm_unregister_target can only fail because of programming bug in the
target driver. It can't fail because of user's behavior or disk errors.

This patch changes unregister_target to return void and use BUG if
someone tries to unregister non-registered target or unregister target
that is in use.

This patch removes code duplication (testing of error codes in all dm
targets) and reports bugs in just one place, in dm_unregister_target. In
some target drivers, these return codes were ignored, which could lead
to a situation where bugs could be missed.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
90fa1527bddc7147dc0d590ee6184ca88bc50ecf 06-Jan-2009 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: change yield to msleep

Change yield() to msleep(1). If the thread had realtime priority,
yield() doesn't really yield, so the yielding process would loop
indefinitely and cause machine lockup.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
879129d208f725267366296b631aef31409cf304 30-Oct-2008 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: wait for chunks in destructor

If there are several snapshots sharing an origin and one is removed
while the origin is being written to, the snapshot's mempool may get
deleted while elements are still referenced.

Prior to dm-snapshot-use-per-device-mempools.patch the pending
exceptions may still have been referenced after the snapshot was
destroyed, but this was not a problem because the shared mempool
was still there.

This patch fixes the problem by tracking the number of mempool elements
in use.

The scenario:
- You have an origin and two snapshots 1 and 2.
- Someone writes to the origin.
- It creates two exceptions in the snapshots, snapshot 1 will be primary
exception, snapshot 2's pending_exception->primary_pe will point to the
exception in snapshot 1.
- The exceptions are being relocated, relocation of exception 1 finishes
(but it's pending_exception is still allocated, because it is referenced
by an exception from snapshot 2)
- The user lvremoves snapshot 1 --- it calls just suspend (does nothing)
and destructor. md->pending is zero (there is no I/O submitted to the
snapshot by md layer), so it won't help us.
- The destructor waits for kcopyd jobs to finish on snapshot 1 --- but
there are none.
- The destructor on snapshot 1 cleans up everything.
- The relocation of exception on snapshot 2 finishes, it drops reference
on primary_pe. This frees its primary_pe pointer. Primary_pe points to
pending exception created for snapshot 1. So it frees memory into
non-existing mempool.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
60c856c8e2f57a3f69c505735ef66e3719ea0bd6 30-Oct-2008 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: fix register_snapshot deadlock

register_snapshot() performs a GFP_KERNEL allocation while holding
_origins_lock for write, but that could write out dirty pages onto a
device that attempts to acquire _origins_lock for read, resulting in
deadlock.

So move the allocation up before taking the lock.

This path is not performance-critical, so it doesn't matter that we
allocate memory and free it if we find that we won't need it.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
f68d4f3d394da5b1f69d855b8513f4256ccc803e 21-Oct-2008 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: drop unused last_percent

The last_percent field is unused - remove it.
(It dates from when events were triggered as each X% filled up.)

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
7c5f78b9d7f21937e46c26db82976df4b459c95c 21-Oct-2008 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: fix primary_pe race

Fix a race condition with primary_pe ref_count handling.

put_pending_exception runs under dm_snapshot->lock, it does atomic_dec_and_test
on primary_pe->ref_count, and later does atomic_read primary_pe->ref_count.

__origin_write does atomic_dec_and_test on primary_pe->ref_count without holding
dm_snapshot->lock.

This opens the following race condition:
Assume two CPUs, CPU1 is executing put_pending_exception (and holding
dm_snapshot->lock). CPU2 is executing __origin_write in parallel.
primary_pe->ref_count == 2.

CPU1:
if (primary_pe && atomic_dec_and_test(&primary_pe->ref_count))
origin_bios = bio_list_get(&primary_pe->origin_bios);
... decrements primary_pe->ref_count to 1. Doesn't load origin_bios

CPU2:
if (first && atomic_dec_and_test(&primary_pe->ref_count)) {
flush_bios(bio_list_get(&primary_pe->origin_bios));
free_pending_exception(primary_pe);
/* If we got here, pe_queue is necessarily empty. */
return r;
}
... decrements primary_pe->ref_count to 0, submits pending bios, frees
primary_pe.

CPU1:
if (!primary_pe || primary_pe != pe)
free_pending_exception(pe);
... this has no effect.
if (primary_pe && !atomic_read(&primary_pe->ref_count))
free_pending_exception(primary_pe);
... sees ref_count == 0 (written by CPU 2), does double free !!

This bug can happen only if someone is simultaneously writing to both the
origin and the snapshot.

If someone is writing only to the origin, __origin_write will submit kcopyd
request after it decrements primary_pe->ref_count (so it can't happen that the
finished copy races with primary_pe->ref_count decrementation).

If someone is writing only to the snapshot, __origin_write isn't invoked at all
and the race can't happen.

The race happens when someone writes to the snapshot --- this creates
pending_exception with primary_pe == NULL and starts copying. Then, someone
writes to the same chunk in the snapshot, and __origin_write races with
termination of already submitted request in pending_complete (that calls
put_pending_exception).

This race may be reason for bugs:
http://bugzilla.kernel.org/show_bug.cgi?id=11636
https://bugzilla.redhat.com/show_bug.cgi?id=465825

The patch fixes the code to make sure that:
1. If atomic_dec_and_test(&primary_pe->ref_count) returns false, the process
must no longer dereference primary_pe (because someone else may free it under
us).
2. If atomic_dec_and_test(&primary_pe->ref_count) returns true, the process
is responsible for freeing primary_pe.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: stable@kernel.org
92e868122edf08b9fc06b112e7e0c80ab94c1f93 21-Jul-2008 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: use per device mempools

Change snapshot per-module mempool to per-device mempool.

Per-module mempools could cause a deadlock if multiple
snapshot devices are stacked above each other.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
a8d41b59f3f5a7ac19452ef442a7fc1b5fa17366 21-Jul-2008 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: fix race during exception creation

Fix a race condition that returns incorrect data when a write causes an
exception to be allocated whilst a read is still in flight.

The race condition happens as follows:
* A read to non-reallocated sector in the snapshot is submitted so that the
read is routed to the original device.
* A write to the original device is submitted. The write causes an exception
that reallocates the block. The write proceeds.
* The original read is dequeued and reads the wrong data.

This race can be triggered with CFQ scheduler and one thread writing and
multiple threads reading simultaneously.

(This patch relies upon the earlier dm-kcopyd-per-device.patch to avoid a
deadlock.)

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
cd45daffd1f7b53aac0835b23e97f814ec3f10dc 21-Jul-2008 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: track snapshot reads

Whenever a snapshot read gets mapped through to the origin, track it in
a per-snapshot hash table indexed by chunk number, using memory allocated
from a new per-snapshot mempool.

We need to track these reads to avoid race conditions which will be fixed
by patches that follow.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
a765e20eeb423d0fa6a02ffab51141e53bbd93cb 24-Apr-2008 Alasdair G Kergon <agk@redhat.com> dm: move include files

Publish the dm-io, dm-log and dm-kcopyd headers in include/linux.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
eb69aca5d3370b81450d68edeebc2bb9a3eb9689 24-Apr-2008 Heinz Mauelshagen <hjm@redhat.com> dm kcopyd: clean interface

Clean up the kcopyd interface to prepare for publishing it in include/linux.

Signed-off-by: Heinz Mauelshagen <hjm@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
22a1ceb1e6a7fbce95a1531ff10bb4fb036d4a37 24-Apr-2008 Heinz Mauelshagen <hjm@redhat.com> dm io: clean interface

Clean up the dm-io interface to prepare for publishing it in include/linux.

Signed-off-by: Heinz Mauelshagen <hjm@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
72727bad544b4ce0a3f7853bfd7ae939f398007d 24-Apr-2008 Mikulas Patocka <mpatocka@redhat.com> dm snapshot: store pointer to target instance

Save pointer to dm_target in dm_snapshot structure.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
8ee2767a5903fde549fd3597690f64c8d951051a 24-Apr-2008 Milan Broz <mbroz@redhat.com> dm snapshot: reduce default memory allocation

Limit the amount of memory allocated per snapshot on systems
with a large page size. (The larger default chunk size on
these systems compensates for the smaller number of pages reserved.)

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
4cdc1d1fa5c5ac14dc21be19832f02fd0b83867e 28-Mar-2008 Alasdair G Kergon <agk@redhat.com> dm io: write error bits form long not int

write_err is an unsigned long used with set_bit() so should not be passed
around as unsigned int.

http://bugzilla.kernel.org/show_bug.cgi?id=10271

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
d74f81f8adc504a23be3babf347b9f69e9389924 08-Feb-2008 Milan Broz <mbroz@redhat.com> dm snapshot: combine consecutive exceptions in memory

Provided sector_t is 64 bits, reduce the in-memory footprint of the
snapshot exception table by the simple method of using unused bits of
the chunk number to combine consecutive entries.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
8defd83084c3ce46d314c038f7c0f0ed7156d6f8 08-Feb-2008 Robert P. J. Day <rpjday@crashcourse.ca> dm snapshot: use rounddown_pow_of_two

Since the source file already includes the log2.h header file, it
seems pointless to re-invent the necessary routine.

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
6f3c3f0afa50782dc1742c968646c491657d255a 19-Oct-2007 vignesh babu <vignesh.babu@wipro.com> dm: use is_power_of_2

Replacing n & (n - 1) for power of 2 check by is_power_of_2(n)

Signed-off-by: vignesh babu <vignesh.babu@wipro.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
6712ecf8f648118c3363c142196418f89a510b90 27-Sep-2007 NeilBrown <neilb@suse.de> Drop 'size' argument from bio_endio and bi_end_io

As bi_end_io is only called once when the reqeust is complete,
the 'size' argument is now redundant. Remove it.

Now there is no need for bio_endio to subtract the size completed
from bi_size. So don't do that either.

While we are at it, change bi_end_io to return void.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
07a83c47cfc00ba5f0f090ccddd3a0703be0eec9 12-Jul-2007 Stefan Bader <shbader@de.ibm.com> dm: disable barriers

This patch causes device-mapper to reject any barrier requests. This is done
since most of the targets won't handle this correctly anyway. So until the
situation improves it is better to reject these requests at the first place.
Since barrier requests won't get to the targets, the checks there can be
removed.

Cc: stable@kernel.org
Signed-off-by: Stefan Bader <shbader@de.ibm.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
0764147b111b8ca886e4f2e9c9e019106b09b657 12-Jul-2007 Milan Broz <mbroz@redhat.com> dm snapshot: permit invalid activation

Allow invalid snapshots to be activated instead of failing.

This allows userspace to reinstate any given snapshot state - for
example after an unscheduled reboot - and clean up the invalid snapshot
at its leisure.

Cc: stable@kernel.org
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
028867ac28e51afc834a5931e7545c022557eded 12-Jul-2007 Alasdair G Kergon <agk@redhat.com> dm: use kmem_cache macro

Use new KMEM_CACHE() macro and make the newly-exposed structure names more
meaningful. Also remove some superfluous casts and inlines (let a modern
compiler be the judge).

Acked-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
c642f9e03b3ca04fc806ba5d8d34cc821382e525 08-Dec-2006 Adrian Bunk <bunk@stusta.de> [PATCH] make drivers/md/dm-snap.c:ksnapd static

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
31c93a0c29bf96efd806ccf4ee81cacf04f255de 08-Dec-2006 Milan Broz <mbroz@redhat.com> [PATCH] dm: snapshot: abstract memory release

Move the code that releases memory used by a snapshot into a separate function.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: dm-devel@redhat.com
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
d2a7ad29a810441e9dacbaddcc2f0c6045390008 08-Dec-2006 Kiyoshi Ueda <k-ueda@ct.jp.nec.com> [PATCH] dm: map and endio symbolic return codes

Update existing targets to use the new symbols for return values from target
map and end_io functions.

There is no effect on behaviour.

Test results:
Done build test without errors.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: dm-devel@redhat.com
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
e18b890bb0881bbab6f4f1a6cd20d9c60d66b003 07-Dec-2006 Christoph Lameter <clameter@sgi.com> [PATCH] slab: remove kmem_cache_t

Replace all uses of kmem_cache_t with struct kmem_cache.

The patch was generated using the following script:

#!/bin/sh
#
# Replace one string by another in all the kernel sources.
#

set -e

for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
quilt add $file
sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
mv /tmp/$$ $file
quilt refresh
done

The script was run like this

sh replace kmem_cache_t "struct kmem_cache"

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
c4028958b6ecad064b1a6303a6a5906d4fe48d73 22-Nov-2006 David Howells <dhowells@redhat.com> WorkStruct: make allyesconfig

Fix up for make allyesconfig.

Signed-Off-By: David Howells <dhowells@redhat.com>
695368ac3302174531429a90d55c3f7f9b83906e 03-Oct-2006 Alasdair G Kergon <agk@redhat.com> [PATCH] dm snapshot: fix freeing pending exception

If a snapshot became invalid while there are outstanding pending_exceptions,
when pending_complete() processes each one it forgets to remove the
corresponding exception from its exception table before freeing it.

Fix this by moving the 'out:' label up one statement so that
remove_exception() is always called. Then __invalidate_exception() no longer
needs to call it and its 'pe' argument become superfluous.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
4b832e8de22726206eb886f6dbff47a0f3fe5168 03-Oct-2006 Alasdair G Kergon <agk@redhat.com> [PATCH] dm snapshot: tidy pe ref counting

Rename sibling_count to ref_count and introduce get and put functions.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
ca3a931fd33b841cbcc5932f8eac7c43e0909242 03-Oct-2006 Alasdair G Kergon <agk@redhat.com> [PATCH] dm snapshot: add workqueue

Add a workqueue so that I/O can be queued up to be flushed from a separate
thread (e.g. if local interrupts are disabled).

A new per-snapshot spinlock pe_lock is introduced to protect queued_bios.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
9d493fa8c943ed4ec6e42b7ebfd8f0b7657d54f8 03-Oct-2006 Alasdair G Kergon <agk@redhat.com> [PATCH] dm snapshot: tidy pending_complete

This patch rearranges the pending_complete() code so that the functional
changes in subsequent patches are clearer.

By consolidating the error and the non-error paths, we can move
error_snapshot_bios() and __flush_bios() in line.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
ba40a2aa6e6f3d084cf35c8b872fc9f18f91231f 03-Oct-2006 Alasdair G Kergon <agk@redhat.com> [PATCH] dm snapshot: tidy snapshot_map

This patch rearranges the snapshot_map code so that the functional changes in
subsequent patches are clearer.

The only functional change is to replace the existing read lock with a write
lock which the next patch needs.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
f9cea4f70734f743e0beb55552a9794fa5032645 03-Oct-2006 Mark McLoughlin <markmc@redhat.com> [PATCH] dm snapshot: fix metadata error handling

Fix the error handling when store.read_metadata is called: the error should be
returned immediately.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
4c7e3bf44d0ae227ea1ee87c2197212e65d043d7 03-Oct-2006 Mark McLoughlin <markmc@redhat.com> [PATCH] dm snapshot: allow zero chunk_size

The chunk size of snapshots cannot be changed so it is redundant to require it
as a parameter when activating an existing snapshot. Allow a value of zero in
this case and ignore it. For a new snapshot, use a default value if zero is
specified.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
92c060a692a0c3482cdfcaf346cb2f7572368895 03-Oct-2006 Milan Broz <mbroz@redhat.com> [PATCH] dm snapshot: fix invalidation ENOMEM

Fix ENOMEM error sign.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
6ab3d5624e172c553004ecc862bfeac16d9d68b7 30-Jun-2006 Jörn Engel <joern@wohnheim.fh-wedel.de> Remove obsolete #include <linux/config.h>

Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
72d9486169a2a8353e022813185ba2f32d7dde69 26-Jun-2006 Alasdair G Kergon <agk@redhat.com> [PATCH] dm: improve error message consistency

Tidy device-mapper error messages to include context information
automatically.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
c51c2752491e5e771de6c8861a85ba46752d7888 26-Jun-2006 Alasdair G Kergon <agk@redhat.com> [PATCH] dm snapshot: unify chunk_size

Persistent snapshots currently store a private copy of the chunk size.
Userspace also supplies the chunk size when loading a snapshot. Ensure
consistency by only storing the chunk_size in one place instead of two.

Currently the two sizes will differ if the chunk size supplied by userspace
does not match the chunk size an existing snapshot actually uses. Amongst
other problems, this causes an incorrect 'percentage full' to be reported.

The patch ensures consistency by only storing the chunk_size in one place,
removing it from struct pstore. Some initialisation is delayed until the
correct chunk_size is known. If read_header() discovers that the wrong chunk
size was supplied, the 'area' buffer (which the header already got read into)
is reinitialised to the correct size.

[akpm: too late for 2.6.17 - suitable for 2.6.17.x after it has settled]

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
138728dc96529f20dfe970c470e51885a60e329f 27-Mar-2006 Alasdair G Kergon <agk@redhat.com> [PATCH] dm snapshot: fix kcopyd destructor

Before removing a snapshot, wait for the completion of any kcopyd jobs using
it.

Do this by maintaining a count (nr_jobs) of how many outstanding jobs each
kcopyd_client has.

The snapshot destructor first unregisters the snapshot so that no new kcopyd
jobs (created by writes to the origin) will reference that particular
snapshot. kcopyd_client_destroy() is now run next to wait for the completion
of any outstanding jobs before the snapshot exception structures (that those
jobs reference) are freed.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
4ee218cd67b385759993a6c840ea45f0ee0a8b30 27-Mar-2006 Andrew Morton <akpm@osdl.org> [PATCH] dm: remove SECTOR_FORMAT

We don't know what type sector_t has. Sometimes it's unsigned long, sometimes
it's unsigned long long. For example on ppc64 it's unsigned long with
CONFIG_LBD=n and on x86_64 it's unsigned long long with CONFIG_LBD=n.

The way to handle all of this is to always use unsigned long long and to
always typecast the sector_t when printing it.

Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
76df1c651b66bdf07d60b3d60789feb5f58d73e3 27-Mar-2006 Alasdair G Kergon <agk@redhat.com> [PATCH] device-mapper snapshot: fix invalidation

When a snapshot becomes invalid, s->valid is set to 0. In this state, a
snapshot can no longer be accessed.

When s->lock is acquired, before doing anything else, s->valid must be checked
to ensure the snapshot remains valid.

This patch eliminates some races (that may cause panics) by adding some
missing checks. At the same time, some unnecessary levels of indentation are
removed and snapshot invalidation is moved into a single function that always
generates a device-mapper event.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
b4b610f684d13bf8691feeae5d4d7a8bd1f1033e 27-Mar-2006 Alasdair G Kergon <agk@redhat.com> [PATCH] device-mapper snapshot: replace sibling list

The siblings "list" is used unsafely at the moment.

Firstly, only the element on the list being changed gets locked (via the
snapshot lock), not the next and previous elements which have pointers that
are also being changed.

Secondly, if you have two or more snapshots and write to the same chunk a
second time before every snapshot has finished making its private copy of the
data, if you're unlucky, _origin_write() could attempt its list_merge() and
dereference a 'last' pointer to a pending_exception structure that has just
been freed.

Analysis reveals that the list is actually only there for reference counting.
If 5 pending_exceptions are needed in origin_write, then the 5 are joined
together into a 5-element list - without a separate list head because there's
nowhere suitable to store it. As the pending_exceptions complete, they are
removed from the list one-by-one and any contents of origin_bios get moved
across to one of the remaining pending_exceptions on the list. Whichever one
is last is detected because list_empty() is then true and the origin_bios get
submitted.

The fix proposed here uses an alternative reference counting mechanism by
choosing one of the pending_exceptions as primary and maintaining an atomic
counter there.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
eccf081799be8d83852f183838bf26e1ca099db4 27-Mar-2006 Alasdair G Kergon <agk@redhat.com> [PATCH] device-mapper snapshot: fix origin_write pending_exception submission

Say you have several snapshots of the same origin and then you issue a write
to some place in the origin for the first time.

Before the device-mapper snapshot target lets the write go through to the
underlying device, it needs to make a copy of the data that is about to be
overwritten. Each snapshot is independent, so it makes one copy for each
snapshot.

__origin_write() loops through each snapshot and checks to see whether a copy
is needed for that snapshot. (A copy is only needed the first time that data
changes.)

If a copy is needed, the code allocates a 'pending_exception' structure
holding the details. It links these together for all the snapshots, then
works its way through this list and submits the copying requests to the kcopyd
thread by calling start_copy(). When each request is completed, the original
pending_exception structure gets freed in pending_complete().

If you're very unlucky, this structure can get freed *before* the submission
process has finished walking the list.

This patch:

1) Creates a new temporary list pe_queue to hold the pending exception
structures;

2) Does all the bookkeeping up-front, then walks through the new list
safely and calls start_copy() for each pending_exception that needed it;

3) Avoids attempting to add pe->siblings to the list if it's already
connected.

[NB This does not fix all the races in this code. More patches will follow.]

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
93d2341c750cda0df48a6cc67b35fe25f1ec47df 26-Mar-2006 Matthew Dobson <colpatch@us.ibm.com> [PATCH] mempool: use mempool_create_slab_pool()

Modify well over a dozen mempool users to call mempool_create_slab_pool()
rather than calling mempool_create() with extra arguments, saving about 30
lines of code and increasing readability.

Signed-off-by: Matthew Dobson <colpatch@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
4aac0a63fe8d418a2b74e43708f59380ba379a3b 01-Feb-2006 Alasdair G Kergon <agk@redhat.com> [PATCH] device-mapper snapshot: barriers not supported

The snapshot and origin targets are incapable of handling barriers and need to
indicate this.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
aa14edeb994f8f7e223d02ad14780bf2fa719f6d 01-Feb-2006 Alasdair G Kergon <agk@redhat.com> [PATCH] device-mapper snapshot: load metadata on creation

Move snapshot metadata loading to happen when the table is created instead of
when the device is resumed. Writes to the origin device don't trigger
exceptions until each snapshot table becomes active when resume() is called on
each snapshot.

If you're using lvm2, for this patch to work properly you should update to
lvm2 version 2.02.01 or later and device-mapper version 1.02.02 or later.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
858119e159384308a5dde67776691a2ebf70df0f 14-Jan-2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Unlinline a bunch of other functions

Remove the "inline" keyword from a bunch of big functions in the kernel with
the goal of shrinking it by 30kb to 40kb

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2d38fe204461dc542bb38f2b01a9cd115b367b36 06-Jan-2006 Alasdair G Kergon <agk@redhat.com> [PATCH] device-mapper snapshot: metadata reading separation

More snapshot metadata reading into separate function, to prepare for changing
the place it gets called from.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
d5e404c10a98fc2979643476851e9cbdb1944812 13-Jul-2005 Alasdair G Kergon <agk@redhat.com> [PATCH] device-mapper snapshots: Handle origin extension

Handle writes to a snapshot-origin device that has been extended since the
snapshot was taken.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
f6a80ea8ed44de0b19c42d41928be37a186a3f41 13-Jul-2005 Alasdair G Kergon <agk@redhat.com> [PATCH] device-mapper multipath: Barriers not supported

dm multipath will report barriers as not supported with this patch.

Signed-off-by: Lars Marowsky-Bree <lmb@suse.de>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 17-Apr-2005 Linus Torvalds <torvalds@ppc970.osdl.org> Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!