6b708de64adb6dc8319e7aeac922b46904fbeeec |
|
03-Jun-2014 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Fix an infinite loop in journal replay When running with multiple cache devices, if one of the devices has a completely empty journal but we'd already found some journal entries on a previous device, we'd go into an infinite loop. Change-Id: I1dcdc0d738192746de28f40e8b08825b0dea5e2b Signed-off-by: Kent Overstreet <kmo@daterainc.com>
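A minimal sketch of the failure mode (illustrative names, not bcache's actual code): a replay loop over several devices must still terminate when one device's journal is completely empty.

```c
#include <stddef.h>

/* Hypothetical model of multi-device journal replay.  The inner loop's
 * condition guarantees forward progress even when n_entries == 0, which
 * is the case the fix above addresses. */
struct journal_dev { size_t n_entries; };

/* Returns the total number of entries replayed across all devices. */
size_t replay_all(const struct journal_dev *devs, size_t n_devs)
{
    size_t replayed = 0;
    for (size_t i = 0; i < n_devs; i++)
        for (size_t cursor = 0; cursor < devs[i].n_entries; cursor++)
            replayed++;         /* stand-in for replaying one entry */
    return replayed;
}
```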
|
dbd810ab678d262d3772d29b65844d7b20dc47bc |
|
11-Apr-2014 |
Surbhi Palande <sap@daterainc.com> |
bcache: Fix to remove the rcu_sched stalls. The while loop was executing infinitely; this fix ends the while loop gracefully. Signed-off-by: Surbhi Palande <sap@daterainc.com> Signed-off-by: Kent Overstreet <kmo@daterainc.com>
|
9aa61a992acceeec0d1de2cd99938421498659d5 |
|
11-Apr-2014 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Fix a journal replay bug Journal replay wasn't validating pointers with bch_extent_invalid() before dereferencing them; fixed. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
|
2a285686c109816ba71a00b9278262cf02648258 |
|
05-Mar-2014 |
Kent Overstreet <kmo@daterainc.com> |
bcache: btree locking rework Add a new lock, b->write_lock, which is required to actually modify - or write - a btree node; this lock is only held for short durations. This means we can write out a btree node without taking b->lock, which _is_ held for long durations - solving a deadlock when btree_flush_write() (from the journalling code) is called with a btree node locked. Right now this just occurs in bch_btree_set_root(), but with an upcoming journalling rework it's going to happen a lot more. This also turns b->lock into more of a read/intent lock than a read/write lock - but not completely, since it still blocks readers. May turn it into a real intent lock at some point in the future. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
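A toy model of the two-lock scheme described above (booleans stand in for real locks; names are illustrative, not bcache's): b->lock is held for long traversals, while a separate b->write_lock guards only the brief window in which the node is actually modified or written out.

```c
/* Hypothetical model of a btree node with the two locks. */
struct btree_node_model {
    int lock_held;        /* long-held read/intent lock              */
    int write_lock_held;  /* brief lock around modification/writeout */
    int dirty;
};

/* The flusher needs only write_lock, so it can make progress even
 * while another thread holds ->lock - which is what breaks the
 * btree_flush_write() deadlock described above.  Returns 0 on success,
 * -1 if a modifier currently holds write_lock. */
int model_write_out(struct btree_node_model *b)
{
    if (b->write_lock_held)
        return -1;                    /* a modifier is active; back off */
    b->write_lock_held = 1;
    b->dirty = 0;                     /* stand-in for writing to disk   */
    b->write_lock_held = 0;
    return 0;
}
```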
|
c13f3af9247db929fe1be86c0442ef161e615ac4 |
|
09-Jan-2014 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Add bch_keylist_init_single() This will potentially save us an allocation when we've got inode/dirent bkeys that don't fit in the keylist's inline keys. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
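A sketch of the idea with hypothetical types (the real bch_keylist and bkey layouts differ): point the list directly at one caller-provided key instead of allocating list storage for it.

```c
#include <stddef.h>

/* Illustrative stand-ins for bkey / bch_keylist. */
struct skey { size_t u64s; };

struct skeylist {
    struct skey *keys;   /* first key in the list */
    struct skey *top;    /* one past the last key */
};

/* Initialize a keylist containing exactly the given key - no heap
 * allocation needed even if the key wouldn't fit inline storage. */
void skeylist_init_single(struct skeylist *l, struct skey *k)
{
    l->keys = k;
    l->top = k + 1;
}
```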
|
487dded86ea065317aea121bec8f1816f2f235c9 |
|
17-Mar-2014 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Fix another bug recovering from unclean shutdown The on disk bucket gens are allowed to be out of date, when we reuse buckets that didn't have any live data in them. To deal with this, the initial gc has to update the bucket gen when we find a pointer gen newer than the bucket's gen. Unfortunately we weren't doing this for pointers in the journal that we're about to replay. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
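The rule the fix applies can be sketched as follows (illustrative names; real bcache compares generations modulo 256, while this sketch assumes no wraparound): during initial GC, a pointer generation newer than the bucket's generation means the on-disk bucket gen is stale and must be bumped forward.

```c
#include <stdint.h>

/* Hypothetical bucket with an 8-bit generation counter. */
struct sbucket { uint8_t gen; };

/* Bring the bucket gen up to date when a live pointer is newer.  The
 * bug above was that pointers found in the journal skipped this step. */
void gc_mark_pointer(struct sbucket *g, uint8_t ptr_gen)
{
    if (ptr_gen > g->gen)
        g->gen = ptr_gen;
}
```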
|
27201cfdaa2aeb571191494c1bae6863ffb04108 |
|
13-Mar-2014 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Fix a journalling reclaim after recovery bug On recovery we weren't correctly keeping track of what journal buckets had open journal entries, thus it was possible for them to be overwritten until we'd written all new journal entries. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
|
65ddf45a3102916fb622c71f7af158b19d49dc7f |
|
25-Feb-2014 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Fix a null ptr deref in journal replay Signed-off-by: Kent Overstreet <kmo@daterainc.com>
|
dabb44334060b4b84051b34c58573e57cc7432b2 |
|
20-Feb-2014 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Fix a shutdown bug Shutdown wasn't cancelling/waiting on journal_write_work() Signed-off-by: Kent Overstreet <kmo@daterainc.com>
|
ee811287c9f241641899788cbfc9d70ed96ba3a5 |
|
18-Dec-2013 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Rename/shuffle various code around More work to disentangle bset.c from the rest of the code. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
|
fafff81cead78157099df1ee10af16cc51893ddc |
|
18-Dec-2013 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Bkey indexing renaming More refactoring: node() -> bset_bkey_idx() end() -> bset_bkey_last() Signed-off-by: Kent Overstreet <kmo@daterainc.com>
|
cb7a583e6a6ace661a5890803e115d2292a293df |
|
17-Dec-2013 |
Kent Overstreet <kmo@daterainc.com> |
bcache: kill closure locking usage Signed-off-by: Kent Overstreet <kmo@daterainc.com>
|
5775e2133dfa0dc1f4c7f233e2144d32cb516f54 |
|
11-Dec-2013 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Performance fix for when journal entry is full We were unnecessarily waiting on a journal write to complete when we just needed to start a journal write and start setting up the next one. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
|
b3fa7e77e67e647db3db2166b65083a427d84ed3 |
|
05-Aug-2013 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Minor journal fix The real fix is where we check the bytes we need against how much is remaining - we also need to check for a journal entry bigger than our buffer; we'll never write those, and it would be bad if we tried to read one. Also improve the diagnostic messages. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
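The two checks described above can be sketched as one predicate (hypothetical function, not bcache's): an entry read from disk is only plausible if it fits both in the space remaining in its bucket and in the read buffer itself - an entry bigger than the buffer could never have been written.

```c
#include <stdbool.h>
#include <stddef.h>

/* Reject implausible on-disk journal entry sizes before reading. */
bool journal_entry_plausible(size_t entry_bytes,
                             size_t bytes_left_in_bucket,
                             size_t buffer_bytes)
{
    if (entry_bytes > buffer_bytes)
        return false;                 /* bigger than we ever write */
    return entry_bytes <= bytes_left_in_bucket;
}
```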
|
4f024f3797c43cb4b73cd2c50cec728842d0e49e |
|
12-Oct-2013 |
Kent Overstreet <kmo@daterainc.com> |
block: Abstract out bvec iterator Immutable biovecs are going to require an explicit iterator. To implement immutable bvecs, a later patch is going to add a bi_bvec_done member to this struct; for now, this patch effectively just renames things. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Ed L. Cashin" <ecashin@coraid.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Lars Ellenberg <drbd-dev@lists.linbit.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Yehuda Sadeh <yehuda@inktank.com> Cc: Sage Weil <sage@inktank.com> Cc: Alex Elder <elder@inktank.com> Cc: ceph-devel@vger.kernel.org Cc: Joshua Morris <josh.h.morris@us.ibm.com> Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Neil Brown <neilb@suse.de> Cc: Alasdair Kergon <agk@redhat.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: dm-devel@redhat.com Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux390@de.ibm.com Cc: Boaz Harrosh <bharrosh@panasas.com> Cc: Benny Halevy <bhalevy@tonian.com> Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Nicholas A. 
Bellinger" <nab@linux-iscsi.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Chris Mason <chris.mason@fusionio.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Dave Kleikamp <shaggy@kernel.org> Cc: Joern Engel <joern@logfs.org> Cc: Prasad Joshi <prasadjoshi.linux@gmail.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Ben Myers <bpm@sgi.com> Cc: xfs@oss.sgi.com Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Len Brown <len.brown@intel.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Guo Chao <yan@linux.vnet.ibm.com> Cc: Tejun Heo <tj@kernel.org> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: "Roger Pau Monné" <roger.pau@citrix.com> Cc: Jan Beulich <jbeulich@suse.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Ian Campbell <Ian.Campbell@citrix.com> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Jerome Marchand <jmarchand@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Peng Tao <tao.peng@emc.com> Cc: Andy Adamson <andros@netapp.com> Cc: fanchaoting <fanchaoting@cn.fujitsu.com> Cc: Jie Liu <jeff.liu@oracle.com> Cc: Sunil Mushran <sunil.mushran@gmail.com> Cc: "Martin K. 
Petersen" <martin.petersen@oracle.com> Cc: Namjae Jeon <namjae.jeon@samsung.com> Cc: Pankaj Kumar <pankaj.km@samsung.com> Cc: Dan Magenheimer <dan.magenheimer@oracle.com> Cc: Mel Gorman <mgorman@suse.de>
|
81ab4190ac17df41686a37c97f701623276b652a |
|
31-Oct-2013 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Pull on disk data structures out into a separate header Now, the on disk data structures are in a header that can be exported to userspace - and having them all centralized is nice too. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
|
cc7b8819212f437fc82f0f9cdc24deb0fb5d775f |
|
25-Jul-2013 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Convert bch_btree_insert() to bch_btree_map_leaf_nodes() Last of the btree_map() conversions. The main visible effect is that bch_btree_insert() no longer takes a struct btree_op as an argument - there's no fancy state machine stuff going on; it's just a normal function. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
|
1b207d80d5b986fb305bc899357435d319319513 |
|
11-Sep-2013 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Kill op->replace This is prep work for converting bch_btree_insert to bch_btree_map_leaf_nodes() - we have to convert all its arguments to actual arguments. Bunch of churn, but should be straightforward. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
|
b54d6934da7857f87b092df9b77dc1f42818ba94 |
|
25-Jul-2013 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Kill op->cl This isn't used for waiting asynchronously anymore - so this is a fairly trivial refactoring. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
|
c18536a72ddd7fe30d63e6c1500b5c930ac14594 |
|
25-Jul-2013 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Prune struct btree_op Eventual goal is for struct btree_op to contain only what is necessary for traversing the btree. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
|
2c1953e201a05ddfb1ea53f23d81a492c6513028 |
|
25-Jul-2013 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Convert bch_btree_read_async() to bch_btree_map_keys() This is a fairly straightforward conversion, mostly reshuffling - op->lookup_done goes away, replaced by MAP_DONE/MAP_CONTINUE. And the code for handling cache hits and misses wasn't really btree code, so it gets moved to request.c. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
|
0b93207abb40d3c42bb83eba1e1e7edc1da77810 |
|
25-Jul-2013 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Move keylist out of btree_op Slowly working on pruning struct btree_op - the aim is for it to only contain things that are actually necessary for traversing the btree. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
|
a34a8bfd4e6358c646928320d37b0425c0762f8a |
|
25-Oct-2013 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Refactor journalling flow control Make things synchronous where they don't need to be asynchronous - bch_journal() only has to block when the journal or journal entry is full, which is emphatically not a fast path. So make it a normal function that just returns when it finishes, to make the code and control flow easier to follow. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
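The resulting control flow can be sketched like this (illustrative names, not bcache's): the journal append is a plain blocking function that loops, and only in the rare "entry full" case does it wait and retry.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical journal entry with a byte budget. */
struct sjournal { size_t used, capacity; };

static bool journal_try_add(struct sjournal *j, size_t bytes)
{
    if (j->used + bytes > j->capacity)
        return false;                   /* current entry is full */
    j->used += bytes;
    return true;
}

/* Returns how many times we had to wait; in the real code the reset
 * below would be "block until the full entry is written, start next". */
int journal_add_blocking(struct sjournal *j, size_t bytes)
{
    int waits = 0;
    while (!journal_try_add(j, bytes)) {
        j->used = 0;                    /* stand-in for the blocking write */
        waits++;
    }
    return waits;
}
```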
|
c2f95ae2ebbe1ab61b1d4437f5923fdf720d4d4d |
|
25-Jul-2013 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Clean up keylist code More random refactoring. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
|
4f3d40147b8d0ce7055e241e1d263e0aa2b2b46d |
|
11-Sep-2013 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Add explicit keylist arg to btree_insert() Some refactoring - better to explicitly pass stuff around instead of having it all in the "big bag of state", struct btree_op. Going to prune struct btree_op quite a bit over time. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
|
77c320eb46e216c17aee5c943949229ccfed6904 |
|
12-Jul-2013 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Add on error panic/unregister setting Works kind of like the ext4 setting, to panic or remount read only on errors. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
|
7857d5d470ec53bae187d144c69065ad3c0ebc21 |
|
09-Oct-2013 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Fix a journalling performance bug
|
1394d6761b6e9e15ee7c632a6d48791188727b40 |
|
24-Sep-2013 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Fix a flush/fua performance bug bch_journal_meta() was missing the flush to make the journal write actually go down (instead of waiting up to journal_delay_ms)... Whoops Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
c426c4fd46f709ade2bddd51c5738729c7ae1db5 |
|
24-Sep-2013 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Fix for when no journal entries are found The journal replay code didn't handle this case, causing it to go into an infinite loop... Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
6d9d21e35fbfa2934339e96934f862d118abac23 |
|
24-Sep-2013 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Fix a dumb journal discard bug That switch statement was obviously wrong, leading to some sort of weird spinning on rare occasion with discards enabled... Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
faa5673617656ee58369a3cfe4a312cfcdc59c81 |
|
12-Jul-2013 |
Kent Overstreet <kmo@daterainc.com> |
bcache: Journal replay fix The journal replay code starts by finding something that looks like a valid journal entry, then it does a binary search over the unchecked region of the journal for the journal entries with the highest sequence numbers. Trouble is, the logic was wrong - journal_read_bucket() returns true if it found journal entries we need, but if the range of journal entries we're looking for loops around the end of the journal - in that case journal_read_bucket() could return true when it hadn't found the highest sequence number we'd seen yet, and in that case the binary search did the wrong thing. Whoops. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
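The distinction the fix hinges on can be sketched as a predicate (illustrative names, not bcache's): "this bucket had entries we need" is not the same as "this bucket advanced the highest sequence number seen so far", and the binary search over the journal must test the latter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Running maximum of journal sequence numbers seen during the search. */
struct replay_state { uint64_t max_seq_seen; };

/* Returns true only when this bucket advanced the highest sequence
 * number - the predicate the binary search should key off. */
bool bucket_advanced_seq(struct replay_state *s, uint64_t bucket_max_seq)
{
    if (bucket_max_seq > s->max_seq_seen) {
        s->max_seq_seen = bucket_max_seq;
        return true;
    }
    return false;   /* entries may still be needed, but seq didn't advance */
}
```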
|
e49c7c374e7aacd1f04ecbc21d9dbbeeea4a77d6 |
|
27-Jun-2013 |
Kent Overstreet <koverstreet@google.com> |
bcache: FUA fixes Journal writes need to be marked FUA, not just REQ_FLUSH. And btree node writes have... weird ordering requirements. Signed-off-by: Kent Overstreet <koverstreet@google.com>
|
c37511b863f36c1cc6e18440717fd4cc0e881b8a |
|
27-Apr-2013 |
Kent Overstreet <koverstreet@google.com> |
bcache: Fix/revamp tracepoints Reworked the tracepoints to be more sensible, and fixed a null pointer deref in one of them. Converted some of the pr_debug()s to tracepoints - this is partly a performance optimization; it used to be that without DEBUG or CONFIG_DYNAMIC_DEBUG, pr_debug() was an empty macro, but at some point it was changed to an empty inline function. Some of the pr_debug() statements had rather expensive function calls as part of the arguments, so this code was getting run unnecessarily even on non-debug kernels - in some fast paths, too. Signed-off-by: Kent Overstreet <koverstreet@google.com>
|
5794351146199b9ac67a5ab1beab82be8bfd7b5d |
|
25-Apr-2013 |
Kent Overstreet <koverstreet@google.com> |
bcache: Refactor btree io The most significant change is that btree reads are now done synchronously, instead of asynchronously and doing the post read stuff from a workqueue. This was originally done because we can't block on IO under generic_make_request(). But - we already have a mechanism to punt cache lookups to workqueue if needed, so if we just use that we don't have to deal with the complexity of doing things asynchronously. The main benefit is this makes the locking situation saner; we can hold our write lock on the btree node until we're finished reading it, and we don't need that btree_node_read_done() flag anymore. Also, for writes, btree_write() was broken out into btree_node_write() and btree_leaf_dirty() - the old code with the boolean argument was dumb and confusing. The prio_blocked mechanism was improved a bit too; now the only counter is in struct btree_write, and we don't mess with transferring a count from struct btree anymore. This required changing garbage collection to block prios at the start and unblock when it finishes, which is cleaner than what it was doing anyway (the old code had mostly the same effect, but was doing it in a convoluted way). And the btree iter btree_node_read_done() uses was converted to a real mempool. Signed-off-by: Kent Overstreet <koverstreet@google.com>
|
c19ed23a0b1848eca6b6f22c1ee233abe54d37f9 |
|
26-Mar-2013 |
Kent Overstreet <koverstreet@google.com> |
bcache: Sparse fixes Signed-off-by: Kent Overstreet <koverstreet@google.com>
|
169ef1cf6171d35550fef85645b83b960e241cff |
|
28-Mar-2013 |
Kent Overstreet <koverstreet@google.com> |
bcache: Don't export utility code, prefix with bch_ Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: linux-bcache@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
b1a67b0f4c747ca10c96ebb24f04e2a74b3c298d |
|
25-Mar-2013 |
Kent Overstreet <koverstreet@google.com> |
bcache: Style/checkpatch fixes Took out some nested functions, and fixed some more checkpatch complaints. Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: linux-bcache@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
cafe563591446cf80bfbc2fe3bc72a2e36cf1060 |
|
24-Mar-2013 |
Kent Overstreet <koverstreet@google.com> |
bcache: A block layer cache Does writethrough and writeback caching, handles unclean shutdown, and has a bunch of other nifty features motivated by real world usage. See the wiki at http://bcache.evilpiepirate.org for more. Signed-off-by: Kent Overstreet <koverstreet@google.com>
|