24b961f811a3e790a9b93604d2594bfb6cce4fa4 |
|
01-Apr-2012 |
Jes Sorensen <Jes.Sorensen@redhat.com> |
md: Avoid OOPS when reshaping raid1 to raid0 raid1 arrays do not have the notion of chunk size. Calculate the largest chunk sector size we can use to avoid a divide by zero OOPS when aligning the size of the new array to the chunk size. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>
|
0366ef847581d692e197b88825867ca9ee00e358 |
|
02-Apr-2012 |
majianpeng <majianpeng@gmail.com> |
md/raid0: If md_integrity_register() fails, raid0_run() must free the mem. Signed-off-by: majianpeng <majianpeng@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
|
ba13da47ffa202784355561f72160a41350e95cc |
|
18-Mar-2012 |
NeilBrown <neilb@suse.de> |
md: add proper merge_bvec handling to RAID0 and Linear. These personalities currently set a max request size of one page when any member device has a merge_bvec_fn because they don't bother to call that function. This causes extra works in splitting and combining requests. So make the extra effort to call the merge_bvec_fn when it exists so that we end up with larger requests out the bottom. Signed-off-by: NeilBrown <neilb@suse.de>
|
dafb20fa34320a472deb7442f25a0c086e0feb33 |
|
18-Mar-2012 |
NeilBrown <neilb@suse.de> |
md: tidy up rdev_for_each usage. md.h has an 'rdev_for_each()' macro for iterating the rdevs in an mddev. However it uses the 'safe' version of list_for_each_entry, and so requires the extra variable, but doesn't include 'safe' in the name, which is useful documentation. Consequently some places use this safe version without needing it, and many use an explicity list_for_each entry. So: - rename rdev_for_each to rdev_for_each_safe - create a new rdev_for_each which uses the plain list_for_each_entry, - use the 'safe' version only where needed, and convert all other list_for_each_entry calls to use rdev_for_each. Signed-off-by: NeilBrown <neilb@suse.de>
|
056075c76417b112b4924e7b6386fdc6dfc9ac03 |
|
03-Jul-2011 |
Paul Gortmaker <paul.gortmaker@windriver.com> |
md: Add module.h to all files using it implicitly A pending cleanup will mean that module.h won't be implicitly everywhere anymore. Make sure the modular drivers in md dir are actually calling out for <module.h> explicitly in advance. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
84fc4b56db85cb9e05326424049973a2036c9940 |
|
11-Oct-2011 |
NeilBrown <neilb@suse.de> |
md: rename "mdk_personality" to "md_personality" "mdk" doesn't mean anything any more. Signed-off-by: NeilBrown <neilb@suse.de>
|
e373ab109172abc2d821bd3b5c1b400acddef5a5 |
|
11-Oct-2011 |
NeilBrown <neilb@suse.de> |
md/raid0: typedef removal: raid0_conf_t -> struct r0conf Signed-off-by: NeilBrown <neilb@suse.de>
|
fd01b88c75a718020ff77e7f560d33835e9b58de |
|
11-Oct-2011 |
NeilBrown <neilb@suse.de> |
md: remove typedefs: mddev_t -> struct mddev Having mddev_t and 'struct mddev_s' is ugly and not preferred Signed-off-by: NeilBrown <neilb@suse.de>
|
3cb03002000f133f9f97269edefd73611eafc873 |
|
11-Oct-2011 |
NeilBrown <neilb@suse.de> |
md: removing typedefs: mdk_rdev_t -> struct md_rdev The typedefs are just annoying. 'mdk' probably refers to 'md_k.h' which used to be an include file that defined this thing. Signed-off-by: NeilBrown <neilb@suse.de>
|
50de8df4abca1b27dbf7b2f81a56451bd8b5a7d8 |
|
07-Oct-2011 |
NeilBrown <neilb@suse.de> |
md/raid0: convert some printks to pr_debug. When md assembles a RAID0 array it prints out lots of info which is really just for debugging, so convert that to pr_debug. It also prints out the resulting configuration which could be interesting, so keep that as 'printk' but tidy it up a bit. Signed-off-by: NeilBrown <neilb@suse.de>
|
bdc04e6b15f70a8f96d8cdfe21df159a6466b49a |
|
07-Oct-2011 |
NeilBrown <neilb@suse.de> |
md: remove some old DEBUGging code. This code is not really helpful and is hard to maintain, so just discard it. Signed-off-by: NeilBrown <neilb@suse.de>
|
5a7bbad27a410350e64a2d7f5ec18fc73836c14f |
|
12-Sep-2011 |
Christoph Hellwig <hch@infradead.org> |
block: remove support for bio remapping from ->make_request There is very little benefit in allowing to let a ->make_request instance update the bios device and sector and loop around it in __generic_make_request when we can archive the same through calling generic_make_request from the driver and letting the loop in generic_make_request handle it. Note that various drivers got the return value from ->make_request and returned non-zero values for errors. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
a91a2785b200864aef2270ed6a3babac7a253a20 |
|
17-Mar-2011 |
Martin K. Petersen <martin.petersen@oracle.com> |
block: Require subsystems to explicitly allocate bio_set integrity mempool MD and DM create a new bio_set for every metadevice. Each bio_set has an integrity mempool attached regardless of whether the metadevice is capable of passing integrity metadata. This is a waste of memory. Instead we defer the allocation decision to MD and DM since we know at metadevice creation time whether integrity passthrough is needed or not. Automatic integrity mempool allocation can then be removed from bioset_create() and we make an explicit integrity allocation for the fs_bio_set. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reported-by: Zdenek Kabelac <zkabelac@redhat.com> Acked-by: Mike Snitzer <snizer@redhat.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
7eaceaccab5f40bbfda044629a6298616aeaed50 |
|
10-Mar-2011 |
Jens Axboe <jaxboe@fusionio.com> |
block: remove per-queue plugging Code has been converted over to the new explicit on-stack plugging, and delay users have been converted to use the new API for that. So lets kill off the old plugging along with aops->sync_page(). Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
da9cf5050a2e3dbc3cf26a8d908482eb4485ed49 |
|
21-Feb-2011 |
NeilBrown <neilb@suse.de> |
md: avoid spinlock problem in blk_throtl_exit blk_throtl_exit assumes that ->queue_lock still exists, so make sure that it does. To do this, we stop redirecting ->queue_lock to conf->device_lock and leave it pointing where it is initialised - __queue_lock. As the blk_plug functions check the ->queue_lock is held, we now take that spin_lock explicitly around the plug functions. We don't need the locking, just the warning removal. This is needed for any kernel with the blk_throtl code, which is which is 2.6.37 and later. Cc: stable@kernel.org Signed-off-by: NeilBrown <neilb@suse.de>
|
f7bee80945155ad0326916486dabc38428c6cdef |
|
14-Feb-2011 |
Krzysztof Wojcik <krzysztof.wojcik@intel.com> |
md: Fix raid1->raid0 takeover Takeover raid1->raid0 not succeded. Kernel message is shown: "md/raid0:md126: too few disks (1 of 2) - aborting!" Problem was that we weren't updating ->raid_disks for that takeover, unlike all the others. Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
|
fc3a08b85b7a4f6c1069e5f71f6ad40d925ff55b |
|
31-Jan-2011 |
Krzysztof Wojcik <krzysztof.wojcik@intel.com> |
Add raid1->raid0 takeover support This patch introduces raid 1 to raid0 takeover operation in kernel space. Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com> Signed-off-by: Neil Brown <neilb@nbeee.brown>
|
e9c7469bb4f502dafc092166201bea1ad5fc0fbf |
|
03-Sep-2010 |
Tejun Heo <tj@kernel.org> |
md: implment REQ_FLUSH/FUA support This patch converts md to support REQ_FLUSH/FUA instead of now deprecated REQ_HARDBARRIER. In the core part (md.c), the following changes are notable. * Unlike REQ_HARDBARRIER, REQ_FLUSH/FUA don't interfere with processing of other requests and thus there is no reason to mark the queue congested while FLUSH/FUA is in progress. * REQ_FLUSH/FUA failures are final and its users don't need retry logic. Retry logic is removed. * Preflush needs to be issued to all member devices but FUA writes can be handled the same way as other writes - their processing can be deferred to request_queue of member devices. md_barrier_request() is renamed to md_flush_request() and simplified accordingly. For linear, raid0 and multipath, the core changes are enough. raid1, 5 and 10 need the following conversions. * raid1: Handling of FLUSH/FUA bio's can simply be deferred to request_queues of member devices. Barrier related logic removed. * raid5: Queue draining logic dropped. FUA bit is propagated through biodrain and stripe resconstruction such that all the updated parts of the stripe are written out with FUA writes if any of the dirtying writes was FUA. preread_active_stripes handling in make_request() is updated as suggested by Neil Brown. * raid10: FUA bit needs to be propagated to write clones. linear, raid0, 1, 5 and 10 tested. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Neil Brown <neilb@suse.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
7b6d91daee5cac6402186ff224c3af39d79f4a0e |
|
07-Aug-2010 |
Christoph Hellwig <hch@lst.de> |
block: unify flags for struct bio and struct request Remove the current bio flags and reuse the request flags for the bio, too. This allows to more easily trace the type of I/O from the filesystem down to the block driver. There were two flags in the bio that were missing in the requests: BIO_RW_UNPLUG and BIO_RW_AHEAD. Also I've renamed two request flags that had a superflous RW in them. Note that the flags are in bio.h despite having the REQ_ name - as blkdev.h includes bio.h that is the only way to go for now. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
049d6c1ef983c9ac43aa423dfd752071a5b0002d |
|
16-Jun-2010 |
Maciej Trela <maciej.trela@intel.com> |
md: enable raid4->raid0 takeover Only level 5 with layout=PARITY_N can be taken over to raid0 now. Lets allow level 4 either. Signed-off-by: Maciej Trela <maciej.trela@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
|
001048a318d48e93cb6a1246f3b20335b2a7c855 |
|
16-Jun-2010 |
Maciej Trela <maciej.trela@intel.com> |
md: clear layout after ->raid0 takeover After takeover from raid5/10 -> raid0 mddev->layout is not cleared. Signed-off-by: Maciej Trela <maciej.trela@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
|
e93f68a1fc6244c05ad8fae28e75835ec74ab34e |
|
15-Jun-2010 |
NeilBrown <neilb@suse.de> |
md: fix handling of array level takeover that re-arranges devices. Most array level changes leave the list of devices largely unchanged, possibly causing one at the end to become redundant. However conversions between RAID0 and RAID10 need to renumber all devices (except 0). This renumbering is currently being done in the ->run method when the new personality takes over. However this is too late as the common code in md.c might already have invalidated some of the devices if they had a ->raid_disk number that appeared to high. Moving it into the ->takeover method is too early as the array is still active at that time and wrong ->raid_disk numbers could cause confusion. So add a ->new_raid_disk field to mdk_rdev_s and use it to communicate the new raid_disk number. Now the common code knows exactly which devices need to be renumbered, and which can be invalidated, and can do it all at a convenient time when the array is suspend. It can also update some symlinks in sysfs which previously were not be updated correctly. Reported-by: Maciej Trela <maciej.trela@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
|
b5a20961f3479dda48bdc340354ee5469997839d |
|
03-May-2010 |
NeilBrown <neilb@suse.de> |
md/raid0: tidy up printk messages. All messages now start md/raid0:md-device-name: Signed-off-by: NeilBrown <neilb@suse.de>
|
21a52c6d05c15f862797736393915bfa8cd40ee9 |
|
01-Apr-2010 |
NeilBrown <neilb@suse.de> |
md: pass mddev to make_request functions rather than request_queue We used to pass the personality make_request function direct to the block layer so the first argument had to be a queue. But now we have the intermediary md_make_request so it makes at lot more sense to pass a struct mddev_s. It makes it possible to have an mddev without its own queue too. Signed-off-by: NeilBrown <neilb@suse.de>
|
490773268cf64f68da2470e07b52c7944da6312d |
|
25-Mar-2010 |
NeilBrown <neilb@suse.de> |
md: move io accounting out of personalities into md_make_request While I generally prefer letting personalities do as much as possible, given that we have a central md_make_request anyway we may as well use it to simplify code. Also this centralises knowledge of ->gendisk which will help later. Signed-off-by: NeilBrown <neilb@suse.de>
|
9af204cf720cedf369cf823bbd806c350201f7ea |
|
08-Mar-2010 |
Trela, Maciej <Maciej.Trela@intel.com> |
md: Add support for Raid5->Raid0 and Raid10->Raid0 takeover Signed-off-by: Maciej Trela <maciej.trela@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
|
84707f38e767ac470fd82af6c45a8cafe2bd1b9a |
|
16-Mar-2010 |
NeilBrown <neilb@suse.de> |
md: don't use mddev->raid_disks in raid0 or raid10 while array is active. In a subsequent patch we will make it possible to change mddev->raid_disks while a RAID0 or RAID10 array is active. This is part of the process of reshaping such an array. This means that we cannot use this value while processes requests (it is OK to use it during initialisation as we are locked against changes then). Both RAID0 and RAID10 have the same value stored in the private data structure, so use that value instead. Signed-off-by: NeilBrown <neilb@suse.de>
|
5a0e3ad6af8660be21ca98a971cd00f331318c05 |
|
24-Mar-2010 |
Tejun Heo <tj@kernel.org> |
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
|
627a2d3c29427637f4c5d31ccc7fcbd8d312cd71 |
|
08-Mar-2010 |
NeilBrown <neilb@suse.de> |
md: deal with merge_bvec_fn in component devices better. If a component device has a merge_bvec_fn then as we never call it we must ensure we never need to. Currently this is done by setting max_sector to 1 PAGE, however this does not stop a bio being created with several sub-page iovecs that would violate the merge_bvec_fn. So instead set max_segments to 1 and set the segment boundary to the same as a page boundary to ensure there is only ever one single-page segment of IO requested at a time. This can particularly be an issue when 'xen' is used as it is known to submit multiple small buffers in a single bio. Signed-off-by: NeilBrown <neilb@suse.de> Cc: stable@kernel.org
|
086fa5ff0854c676ec333760f4c0154b3b242616 |
|
26-Feb-2010 |
Martin K. Petersen <martin.petersen@oracle.com> |
block: Rename blk_queue_max_sectors to blk_queue_max_hw_sectors The block layer calling convention is blk_queue_<limit name>. blk_queue_max_sectors predates this practice, leading to some confusion. Rename the function to appropriately reflect that its intended use is to set max_hw_sectors. Also introduce a temporary wrapper for backwards compability. This can be removed after the merge window is closed. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
0efb9e6191e1d3d34c1db90b829b742bc36d532e |
|
13-Dec-2009 |
NeilBrown <neilb@suse.de> |
md: add MODULE_DESCRIPTION for all md related modules. Suggested by Oren Held <orenhe@il.ibm.com> Signed-off-by: NeilBrown <neilb@suse.de>
|
a2826aa92e2e14db372eda01d333267258944033 |
|
13-Dec-2009 |
NeilBrown <neilb@suse.de> |
md: support barrier requests on all personalities. Previously barriers were only supported on RAID1. This is because other levels requires synchronisation across all devices and so needed a different approach. Here is that approach. When a barrier arrives, we send a zero-length barrier to every active device. When that completes - and if the original request was not empty - we submit the barrier request itself (with the barrier flag cleared) and then submit a fresh load of zero length barriers. The barrier request itself is asynchronous, but any subsequent request will block until the barrier completes. The reason for clearing the barrier flag is that a barrier request is allowed to fail. If we pass a non-empty barrier through a striping raid level it is conceivable that part of it could succeed and part could fail. That would be way too hard to deal with. So if the first run of zero length barriers succeed, we assume all is sufficiently well that we send the request and ignore errors in the second run of barriers. RAID5 needs extra care as write requests may not have been submitted to the underlying devices yet. So we flush the stripe cache before proceeding with the barrier. Note that the second set of zero-length barriers are submitted immediately after the original request is submitted. Thus when a personality finds mddev->barrier to be set during make_request, it should not return from make_request until the corresponding per-device request(s) have been queued. That will be done in later patches. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Andre Noll <maan@systemlinux.org>
|
3fa841d7e7266f6fcc1b3885b905f5153ba897d8 |
|
23-Sep-2009 |
NeilBrown <neilb@suse.de> |
md: report device as congested when suspended This should writeback from coming when the device is temporarily suspended. Signed-off-by: NeilBrown <neilb@suse.de>
|
a9f326ebf22a0de776815240fb76dabe139397ea |
|
23-Sep-2009 |
NeilBrown <neilb@suse.de> |
md: remove sparse waring "symbol xxx shadows an earlier one" Rename some variable and remove some duplicate definitions to avoid there warnings. None of them are actual errors. Signed-off-by: NeilBrown <neilb@suse.de>
|
1f98a13f623e0ef666690a18c1250335fc6d7ef1 |
|
11-Sep-2009 |
Jens Axboe <jens.axboe@oracle.com> |
bio: first step in sanitizing the bio->bi_rw flag testing Get rid of any functions that test for these bits and make callers use bio_rw_flagged() directly. Then it is at least directly apparent what variable and flag they check. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
ac5e7113e74872928844d00085bd47c988f12728 |
|
03-Aug-2009 |
Andre Noll <maan@systemlinux.org> |
md: Push down data integrity code to personalities. This patch replaces md_integrity_check() by two new public functions: md_integrity_register() and md_integrity_add_rdev() which are both personality-independent. md_integrity_register() is called from the ->run and ->hot_remove methods of all personalities that support data integrity. The function iterates over the component devices of the array and determines if all active devices are integrity capable and if their profiles match. If this is the case, the common profile is registered for the mddev via blk_integrity_register(). The second new function, md_integrity_add_rdev() is called from the ->hot_add_disk methods, i.e. whenever a new device is being added to a raid array. If the new device does not support data integrity, or has a profile different from the one already registered, data integrity for the mddev is disabled. For raid0 and linear, only the call to md_integrity_register() from the ->run method is necessary. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
|
8f6c2e4b325a8e9f8f47febb2fd0ed4fae7d45a9 |
|
01-Jul-2009 |
Martin K. Petersen <martin.petersen@oracle.com> |
md: Use new topology calls to indicate alignment and I/O sizes Switch MD over to the new disk_stack_limits() function which checks for aligment and adjusts preferred I/O sizes when stacking. Also indicate preferred I/O sizes where applicable. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>
|
0894cc3066aaa3e75a99383c0d25feebf9b688ac |
|
18-Jun-2009 |
Andre Noll <maan@systemlinux.org> |
md: Move check for bitmap presence to personality code. If the superblock of a component device indicates the presence of a bitmap but the corresponding raid personality does not support bitmaps (raid0, linear, multipath, faulty), then something is seriously wrong and we'd better refuse to run such an array. Currently, this check is performed while the superblocks are examined, i.e. before entering personality code. Therefore the generic md layer must know which raid levels support bitmaps and which do not. This patch avoids this layer violation without adding identical code to various personalities. This is accomplished by introducing a new public function to md.c, md_check_no_bitmap(), which replaces the hard-coded checks in the superblock loading functions. A call to md_check_no_bitmap() is added to the ->run method of each personality which does not support bitmaps and assembly is aborted if at least one component device contains a bitmap. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
|
13f2682b7216ebebd72b3d5868fe7fccec91a92d |
|
18-Jun-2009 |
NeilBrown <neilb@suse.de> |
md: raid0/linear: ensure device sizes are rounded to chunk size. This is currently ensured by common code, but it is more reliable to ensure it where it is needed in personality code. All the other personalities that care already round the size to the chunk_size. raid0 and linear are the only hold-outs. Signed-off-by: NeilBrown <neilb@suse.de>
|
d6e412eaa52db82010f12ea7d2c9b9468e933c44 |
|
18-Jun-2009 |
NeilBrown <neilb@suse.de> |
md: raid0: chunk_sectors cleanups. following the conversion to chunk_sectors, there is room for cleaning up a little. Signed-off-by: NeilBrown <neilb@suse.de>
|
9d8f0363623b3da12c43007cf77f5e1a4e8a5964 |
|
18-Jun-2009 |
Andre Noll <maan@systemlinux.org> |
md: Make mddev->chunk_size sector-based. This patch renames the chunk_size field to chunk_sectors with the implied change of semantics. Since is_power_of_2(chunk_size) = is_power_of_2(chunk_sectors << 9) = is_power_of_2(chunk_sectors) these bits don't need an adjustment for the shift. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
|
fbb704efb784e2c8418e34dc3013af76bdd58101 |
|
16-Jun-2009 |
raz ben yehuda <raziebe@gmail.com> |
md: raid0 :Enables chunk size other than powers of 2. Maintain two flows, one for pow2 chunk sizes (which uses masks and shift), and a flow for the general case (which uses sector_div). This is for the sake of performance. - introduce map_sector and is_io_in_chunk_boundary to encapsulate those two flows better for raid0_make_request - fix blk_mergeable to support the two flows. Signed-off-by: raziebe@gmail.com Signed-off-by: NeilBrown <neilb@suse.de>
|
92e59b6ba21845fadd2cce725010a9351740b76e |
|
16-Jun-2009 |
raz ben yehuda <raziebe@gmail.com> |
md: raid0: chunk size check in raid0_run have raid0 check chunk size in run method instead of in md. This is part of a series moving the checks from common code to the personalities where they belong. hardsect is short and chunksize is an int, so it is safe to use %. Signed-off-by: raziebe@gmail.com Signed-off-by: NeilBrown <neilb@suse.de>
|
46994191ae8fdf1cbcc1f29282576b269a638c69 |
|
16-Jun-2009 |
raz ben yehuda <raziebe@gmail.com> |
md: have raid0 report its formation Report to the user what are the raid zones Signed-off-by: raziebe@gmail.com Signed-off-by: NeilBrown <neilb@suse.de>
|
1b9614291eb319fad96de45392eb4452ad39f0ee |
|
16-Jun-2009 |
raz ben yehuda <raziebe@gmail.com> |
md: have raid0 compile with MD_DEBUG on Because of the removal of the device list from the strips raid0 did not compile with MD_DEBUG flag on Signed-off-by: NeilBrown <neilb@suse.de>
|
070ec55d07157a3041f92654135c3c6e2eaaf901 |
|
16-Jun-2009 |
NeilBrown <neilb@suse.de> |
md: remove mddev_to_conf "helper" macro Having a macro just to cast a void* isn't really helpful. I would must rather see that we are simply de-referencing ->private, than have to know what the macro does. So open code the macro everywhere and remove the pointless cast. Signed-off-by: NeilBrown <neilb@suse.de>
|
a6b3deafe0c50e3e873e8ed5cc8abfcb25c05eff |
|
16-Jun-2009 |
NeilBrown <neilb@suse.de> |
md: raid0: remove setting of segment boundary. This setting doesn't seem to make sense (half the chunk size??) and shouldn't be needed. The segment boundary exported by raid0 should simply be the minimum of the segment boundary of all component devices. And we already get that right. Signed-off-by: NeilBrown <neilb@suse.de>
|
b414579f4573b6dc8583e31b01dcffd13f49fd62 |
|
16-Jun-2009 |
NeilBrown <neilb@suse.de> |
md: raid0: remove ->dev pointer from strip_zone structure If we treat conf->devlist more like a 2 dimensional array, we can get the devlist for a particular zone simply by indexing that array, so we don't need to store the pointers to subarrays in strip_zone. This makes strip_zone smaller and so (hopefully) searches faster. Signed-of-by: NeilBrown <neilb@suse.de>
|
49f357a22b3fa3eeac042dfa0a6cae920c174e48 |
|
16-Jun-2009 |
NeilBrown <neilb@suse.de> |
md: raid0: remove ->sectors from the strip_zone structure. storing ->sectors is redundant as is can be computed from the difference z->zone_end - (z-1)->zone_end The one place where it is used, it is just as efficient to use a zone_end value instead. And removing it makes strip_zone smaller, so they array of these that is searched on every request has a better chance to say in cache. So discard the field and get the value from elsewhere. Signed-off-by: NeilBrown <neilb@suse.de>
|
fb5ab4b5d6e16fd5006c9f800d0116f3547cb760 |
|
16-Jun-2009 |
Andre Noll <maan@systemlinux.org> |
md: raid0: Fix a memory leak when stopping a raid0 array. raid0_stop() removes all references to the raid0 configuration but misses to free the ->devlist buffer. This patch closes this leak, removes a pointless initialization and fixes a coding style issue in raid0_stop(). Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
|
ed7b00380d957ec770b5e90380d012c6062c13cc |
|
16-Jun-2009 |
Andre Noll <maan@systemlinux.org> |
md: raid0: Allocate all buffers for the raid0 configuration in one function. Currently the raid0 configuration is allocated in raid0_run() while the buffers for the strip_zone and the dev_list arrays are allocated in create_strip_zones(). On errors, all three buffers are freed in raid0_run(). It's easier and more readable to do the allocation and cleanup within a single function. So move that code into create_strip_zones(). Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
|
5568a6035d9fca2cd8f1ef7005e215eae4e65fab |
|
16-Jun-2009 |
Andre Noll <maan@systemlinux.org> |
md: raid0: Make raid0_run() return a proper error code. Currently raid0_run() always returns -ENOMEM on errors. This is incorrect as running the array might fail for other reasons, for example because not all component devices were available. This patch changes create_strip_zones() so that it returns a proper error code (either -ENOMEM or -EINVAL) rather than 1 on errors and makes raid0_run(), its single caller, return that value instead of -ENOMEM. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
|
8f79cfcdb65472f1504ade2f53e5f2bfdaeb95da |
|
16-Jun-2009 |
Andre Noll <maan@systemlinux.org> |
md: raid0: Remove hash spacing and sector shift. The "sector_shift" and "spacing" fields of struct raid0_private_data were only used for the hash table lookups. So the removal of the hash table allows get rid of these fields as well which simplifies create_strip_zones() and raid0_run() quite a bit. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
|
09770e0b6ee649313611a2d6a9b44f456072dbd6 |
|
16-Jun-2009 |
Andre Noll <maan@systemlinux.org> |
md: raid0: Remove hash table. The raid0 hash table has become unused due to the changes in the previous patch. This patch removes the hash table allocation and setup code and kills the hash_table field of struct raid0_private_data. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
|
d27a43abd7be0ab4b2337e4587feca8c7340e5f9 |
|
16-Jun-2009 |
NeilBrown <neilb@suse.de> |
md/raid0: two cleanups in create_stripe_zones. 1/ remove current_start. The same value is available in zone->dev_start and storing it separately doesn't gain anything. 2/ rename curr_zone_start to curr_zone_end as we are now more focused on the 'end' of each zone. We end up storing the same number though - the old name was a little confusing (and what does 'current' mean in this context anyway). Signed-off-by: NeilBrown <neilb@suse.de>
|
dc58266385e51420298275c90a616c34f1473a73 |
|
16-Jun-2009 |
Andre Noll <maan@systemlinux.org> |
md: raid0: Replace hash table lookup by looping over all strip_zones. The number of strip_zones of a raid0 array is bounded by the number of drives in the array and is in fact much smaller for typical setups. For example, any raid0 array containing identical disks will have only a single strip_zone. Therefore, the hash tables which are used for quickly finding the strip_zone that holds a particular sector are of questionable value and add quite a bit of unnecessary complexity. This patch replaces the hash table lookup by equivalent code which simply loops over all strip zones to find the zone that holds the given sector. In order to make this loop as fast as possible, the zone->start field of struct strip_zone has been renamed to zone_end, and it now stores the beginning of the next zone in sectors. This allows to save one addition in the loop. Subsequent cleanup patches will remove the hash table structure. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
|
ae03bf639a5027d27270123f5f6e3ee6a412781d |
|
22-May-2009 |
Martin K. Petersen <martin.petersen@oracle.com> |
block: Use accessor functions for queue limits Convert all external users of queue limits to using wrapper functions instead of poking the request queue variables directly. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
b522adcde9c4d3fb7b579cfa9160d8bde7744be8 |
|
31-Mar-2009 |
Dan Williams <dan.j.williams@intel.com> |
md: 'array_size' sysfs attribute Allow userspace to set the size of the array according to the following semantics: 1/ size must be <= to the size returned by mddev->pers->size(mddev, 0, 0) a) If size is set before the array is running, do_md_run will fail if size is greater than the default size b) A reshape attempt that reduces the default size to less than the set array size should be blocked 2/ once userspace sets the size the kernel will not change it 3/ writing 'default' to this attribute returns control of the size to the kernel and reverts to the size reported by the personality Also, convert locations that need to know the default size from directly reading ->array_sectors to <pers>_size. Resync/reshape operations always follow the default size. Finally, fixup other locations that read a number of 1k-blocks from userspace to use strict_blocks_to_sectors() which checks for unsigned long long to sector_t overflow and blocks to sectors overflow. Reviewed-by: Andre Noll <maan@systemlinux.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
1f403624bde3c678a166984b1e6a727a0ce06f2b |
|
31-Mar-2009 |
Dan Williams <dan.j.williams@intel.com> |
md: centralize ->array_sectors modifications Get personalities out of the business of directly modifying ->array_sectors. Lays groundwork to introduce policy on when ->array_sectors can be modified. Reviewed-by: Andre Noll <maan@systemlinux.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
80c3a6ce4ba4470379b9e6a4d9bcd9d2ee26ae03 |
|
18-Mar-2009 |
Dan Williams <dan.j.williams@intel.com> |
md: add 'size' as a personality method In preparation for giving userspace control over ->array_sectors we need to be able to retrieve the 'default' size, and the 'anticipated' size when a reshape is requested. For personalities that do not reshape emit a warning if anything but the default size is requested. In the raid5 case we need to update ->previous_raid_disks to make the new 'default' size available. Reviewed-by: Andre Noll <maan@systemlinux.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
dd8ac336c13fd8afdb082ebacb1cddd5cf727889 |
|
31-Mar-2009 |
Andre Noll <maan@systemlinux.org> |
md: Represent raid device size in sectors. This patch renames the "size" field of struct mdk_rdev_s to "sectors" and changes this field to store sectors instead of blocks. All users of this field, linear.c, raid0.c and md.c, are fixed up accordingly which gets rid of many multiplications and divisions. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
|
43b2e5d86d8bdd77386226db0bc961529492c043 |
|
31-Mar-2009 |
NeilBrown <neilb@suse.de> |
md: move md_k.h from include/linux/raid/ to drivers/md/ It really is nicer to keep related code together.. Signed-off-by: NeilBrown <neilb@suse.de>
|
bff61975b3d6c18ee31457cc5b4d73042f44915f |
|
31-Mar-2009 |
NeilBrown <neilb@suse.de> |
md: move lots of #include lines out of .h files and into .c This makes the includes more explicit, and is preparation for moving md_k.h to drivers/md/md.h Remove include/raid/md.h as its only remaining use was to #include other files. Signed-off-by: NeilBrown <neilb@suse.de>
|
ef740c372dfd80e706dbf955d4e4aedda6c0c148 |
|
31-Mar-2009 |
Christoph Hellwig <hch@lst.de> |
md: move headers out of include/linux/raid/ Move the headers with the local structures for the disciplines and bitmap.h into drivers/md/ so that they are more easily grepable for hacking and not far away. md.h is left where it is for now as there are some uses from the outside. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: NeilBrown <neilb@suse.de>
|
159ec1fc060ab22b157a62364045f5e98749c4d3 |
|
08-Jan-2009 |
Cheng Renquan <crquan@gmail.com> |
md: use list_for_each_entry macro directly The rdev_for_each macro defined in <linux/raid/md_k.h> is identical to list_for_each_entry_safe, from <linux/list.h>, it should be defined to use list_for_each_entry_safe, instead of reinventing the wheel. But some calls to each_entry_safe don't really need a safe version, just a direct list_for_each_entry is enough, this could save a temp variable (tmp) in every function that used rdev_for_each. In this patch, most rdev_for_each loops are replaced by list_for_each_entry, totally save many tmp vars; and only in the other situations that will call list_del to delete an entry, the safe version is used. Signed-off-by: Cheng Renquan <crquan@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
|
ccacc7d2cf03114a24ab903f710118e9e5d43273 |
|
08-Jan-2009 |
Andre Noll <maan@systemlinux.org> |
md: raid0: make hash_spacing and preshift sector-based. This patch renames the hash_spacing and preshift members of struct raid0_private_data to spacing and sector_shift respectively and changes the semantics as follows: We always have spacing = 2 * hash_spacing. In case sizeof(sector_t) > sizeof(u32) we also have sector_shift = preshift + 1 while sector_shift = preshift = 0 otherwise. Note that the values of nb_zone and zone are unaffected by these changes because in the sector_div() preceeding the assignement of these two variables both arguments double. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
|
83838ed87898e0a8ff8dbf001e54e6c017f0a011 |
|
08-Jan-2009 |
Andre Noll <maan@systemlinux.org> |
md: raid0: Represent the size of strip zones in sectors. This completes the block -> sector conversion of struct strip_zone. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
|
0825b87a7dd9645c7e16489fec839a3cb5c40a08 |
|
08-Jan-2009 |
Andre Noll <maan@systemlinux.org> |
md: raid0 create_strip_zones(): Add KERN_INFO/KERN_ERR to printk's. This patch consists only of these trivial changes. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
|
6b8796cc3decb43c7c67a13a0699b6a21b754c78 |
|
08-Jan-2009 |
Andre Noll <maan@systemlinux.org> |
md: raid0 create_strip_zones(): Make two local variables sector-based. current_offset and curr_zone_offset stored the corresponding offsets as 1K quantities. Rename them to current_start and curr_zone_start to match the naming of struct strip_zone and store the offsets as sector counts. Also, add KERN_INFO to the printk() affected by this change to make checkpatch happy. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
|
6199d3db0fc34f8ada895879d04a353a6ae632bc |
|
08-Jan-2009 |
Andre Noll <maan@systemlinux.org> |
md: raid0: Represent zone->zone_offset in sectors. For the same reason as in the previous patch, rename it from zone_offset to zone_start. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
|
019c4e2f3e02aac4b44003913b54ca4b332e4371 |
|
08-Jan-2009 |
Andre Noll <maan@systemlinux.org> |
md: raid0: Represent device offset in sectors. Rename zone->dev_offset to zone->dev_start to make sure all users have been converted. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
|
e0f06868341700c5c1964a04f6c5b51d0a2d5bca |
|
08-Jan-2009 |
Andre Noll <maan@systemlinux.org> |
md: raid0_make_request(): Replace local variable block by sector. This change already simplifies the code a bit. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
|
a471200595b24fb1907ad12107a6a66db02c63f2 |
|
08-Jan-2009 |
Andre Noll <maan@systemlinux.org> |
md: raid0_make_request(): Remove local variable chunk_size. We might as well use chunk_sects instead. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
|
1b7fdf8ff7c0e3fba9c679def4e98d5701d2949e |
|
08-Jan-2009 |
Andre Noll <maan@systemlinux.org> |
md: raid0_make_request(): Replace chunksize_bits by chunksect_bits. As ffz(~(2 * x)) = ffz(~x) + 1, we have chunksect_bits = chunksize_bits + 1. Fixup all users accordingly. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
|
fb4d8c76e56a887b9eee99fbc55fe82b18625d30 |
|
13-Oct-2008 |
NeilBrown <neilb@suse.de> |
md: Remove unnecessary #includes, #defines, and function declarations. A lot of cruft has gathered over the years. Time to remove it. Signed-off-by: NeilBrown <neilb@suse.de>
|
6feef531f55cf4a20fd9eb39f5352e5745203603 |
|
09-Oct-2008 |
Denis ChengRq <crquan@gmail.com> |
block: mark bio_split_pool static Since all bio_split calls refer the same single bio_split_pool, the bio_split function can use bio_split_pool directly instead of the mempool_t parameter; then the mempool_t parameter can be removed from bio_split param list, and bio_split_pool is only referred in fs/bio.c file, can be marked static. Signed-off-by: Denis ChengRq <crquan@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
074a7aca7afa6f230104e8e65eba3420263714a5 |
|
25-Aug-2008 |
Tejun Heo <tj@kernel.org> |
block: move stats from disk to part0 Move stats related fields - stamp, in_flight, dkstats - from disk to part0 and unify stat handling such that... * part_stat_*() now updates part0 together if the specified partition is not part0. ie. part_stat_*() are now essentially all_stat_*(). * {disk|all}_stat_*() are gone. * part_round_stats() is updated similary. It handles part0 stats automatically and disk_round_stats() is killed. * part_{inc|dec}_in_fligh() is implemented which automatically updates part0 stats for parts other than part0. * disk_map_sector_rcu() is updated to return part0 if no part matches. Combined with the above changes, this makes NULL special case handling in callers unnecessary. * Separate stats show code paths for disk are collapsed into part stats show code paths. * Rename disk_stat_lock/unlock() to part_stat_lock/unlock() While at it, reposition stat handling macros a bit and add missing parentheses around macro parameters. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
c9959059161ddd7bf4670cf47367033d6b2f79c4 |
|
25-Aug-2008 |
Tejun Heo <tj@kernel.org> |
block: fix diskstats access There are two variants of stat functions - ones prefixed with double underbars which don't care about preemption and ones without which disable preemption before manipulating per-cpu counters. It's unclear whether the underbarred ones assume that preemtion is disabled on entry as some callers don't do that. This patch unifies diskstats access by implementing disk_stat_lock() and disk_stat_unlock() which take care of both RCU (for partition access) and preemption (for per-cpu counter access). diskstats access should always be enclosed between the two functions. As such, there's no need for the versions which disables preemption. They're removed and double underbars ones are renamed to drop the underbars. As an extra argument is added, there's no danger of using the old version unconverted. disk_stat_lock() uses get_cpu() and returns the cpu index and all diskstat functions which access per-cpu counters now has @cpu argument to help RT. This change adds RCU or preemption operations at some places but also collapses several preemption ops into one at others. Overall, the performance difference should be negligible as all involved ops are very lightweight per-cpu ones. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
f233ea5c9e0d8b95e4283bf6a3436b88f6fd3586 |
|
21-Jul-2008 |
Andre Noll <maan@systemlinux.org> |
md: Make mddev->array_size sector-based. This patch renames the array_size field of struct mddev_s to array_sectors and converts all instances to use units of 512 byte sectors instead of 1k blocks. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
|
cc371e66e340f35eed8dc4651c7c18e754c7fb26 |
|
03-Jul-2008 |
Alasdair G Kergon <agk@redhat.com> |
Add bvec_merge_data to handle stacked devices and ->merge_bvec() When devices are stacked, one device's merge_bvec_fn may need to perform the mapping and then call one or more functions for its underlying devices. The following bio fields are used: bio->bi_sector bio->bi_bdev bio->bi_size bio->bi_rw using bio_data_dir() This patch creates a new struct bvec_merge_data holding a copy of those fields to avoid having to change them directly in the struct bio when going down the stack only to have to change them back again on the way back up. (And then when the bio gets mapped for real, the whole exercise gets repeated, but that's a problem for another day...) Signed-off-by: Alasdair G Kergon <agk@redhat.com> Cc: Neil Brown <neilb@suse.de> Cc: Milan Broz <mbroz@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
e7e72bf641b1fc7b9df6f40bd2c36dfccd8d647c |
|
15-May-2008 |
Neil Brown <neilb@suse.de> |
Remove blkdev warning triggered by using md As setting and clearing queue flags now requires that we hold a spinlock on the queue, and as blk_queue_stack_limits is called without that lock, get the lock inside blk_queue_stack_limits. For blk_queue_stack_limits to be able to find the right lock, each md personality needs to set q->queue_lock to point to the appropriate lock. Those personalities which didn't previously use a spin_lock, us q->__queue_lock. So always initialise that lock when allocated. With this in place, setting/clearing of the QUEUE_FLAG_PLUGGED bit will no longer cause warnings as it will be clear that the proper lock is held. Thanks to Dan Williams for review and fixing the silly bugs. Signed-off-by: NeilBrown <neilb@suse.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Alistair John Strachan <alistair@devzero.co.uk> Cc: Nick Piggin <npiggin@suse.de> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Jacek Luczak <difrost.kernel@gmail.com> Cc: Prakash Punnoor <prakash@punnoor.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
d089c6af10c2be5988f03667d6d22fe6085fbe5e |
|
06-Feb-2008 |
NeilBrown <neilb@suse.de> |
md: change ITERATE_RDEV to rdev_for_each As this is more in line with common practice in the kernel. Also swap the args around to be more like list_for_each. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
2ad8b1ef11c98c5603580878aebf9f1bc74129e4 |
|
07-Nov-2007 |
Alan D. Brunelle <Alan.Brunelle@hp.com> |
Add UNPLUG traces to all appropriate places Added blk_unplug interface, allowing all invocations of unplugs to result in a generated blktrace UNPLUG. Signed-off-by: Alan D. Brunelle <Alan.Brunelle@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
8299d7f7c067a30a67ad359d416128c4ff57dcd1 |
|
17-Oct-2007 |
NeilBrown <neilb@suse.de> |
md: fix a bug in some never-used code. http://bugzilla.kernel.org/show_bug.cgi?id=3277 There is a seq_printf here that isn't being passed a 'seq'. Howeve as the code is inside #ifdef MD_DEBUG, nobody noticed. Also remove some extra spaces. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
fd5d806266935179deda1502101624832eacd01f |
|
16-Oct-2007 |
Jens Axboe <jens.axboe@oracle.com> |
block: convert blkdev_issue_flush() to use empty barriers Then we can get rid of ->issue_flush_fn() and all the driver private implementations of that. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
6712ecf8f648118c3363c142196418f89a510b90 |
|
27-Sep-2007 |
NeilBrown <neilb@suse.de> |
Drop 'size' argument from bio_endio and bi_end_io As bi_end_io is only called once when the reqeust is complete, the 'size' argument is now redundant. Remove it. Now there is no need for bio_endio to subtract the size completed from bi_size. So don't do that either. While we are at it, change bi_end_io to return void. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
165125e1e480f9510a5ffcfbfee4e3ee38c05f23 |
|
24-Jul-2007 |
Jens Axboe <jens.axboe@oracle.com> |
[BLOCK] Get rid of request_queue_t typedef Some of the code has been gradually transitioned to using the proper struct request_queue, but there's lots left. So do a full sweet of the kernel and get rid of this typedef and replace its uses with the proper type. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
787f17feb204ed1c6331892fb8124b80dc9fe288 |
|
23-May-2007 |
NeilBrown <neilb@suse.de> |
md: avoid overflow in raid0 calculation with large components If a raid0 has a component device larger than 4TB, and is accessed on a 32bit machines, then as 'chunk' is unsigned long, chunk << chunksize_bits can overflow (this can be as high as the size of the device in KB). chunk itself will not overflow (without triggering a BUG). So change 'chunk' to be 'sector_t, and get rid of the 'BUG' as it becomes impossible to hit. Cc: "Jeff Zheng" <Jeff.Zheng@endace.com> Signed-off-by: Neil Brown <neilb@suse.de> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
26be34dc3a46be983352dd89683db374b0cb73fa |
|
03-Oct-2006 |
NeilBrown <neilb@suse.de> |
[PATCH] md: define backing_dev_info.congested_fn for raid0 and linear Each backing_dev needs to be able to report whether it is congested, either by modulating BDI_*_congested in ->state, or by defining a ->congested_fn. md/raid did neither of these. This patch add a congested_fn which simply checks all component devices to see if they are congested. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
5c4c33318d26620fa552f15bbb6d0f9775a1b4df |
|
23-May-2006 |
NeilBrown <neilb@suse.de> |
[PATCH] md: fix possible oops when starting a raid0 array This loop that sets up the hash_table has problems. Careful examination will show that the last time through, everything but the first line is pointless. This is because all it does is change 'cur' and 'size' and neither of these are used after the loop. This should ring warning bells... That last time through the loop, size += conf->strip_zone[cur].size can index off the end of the strip_zone array. Depending on what it finds there, it might exit the loop cleanly, or it might spin going further and further beyond the array until it hits an unmapped address. This patch rearranges the code so that the last, pointless, iteration of the loop never happens. i.e. the one statement of the last loop that is needed is moved the the end of the previous loop - or to before the loop starts - and the loop counter starts from 1 instead of 0. Cc: "Don Dupuis" <dondster@gmail.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
29fc7e3e70a05e9eea28afb6707a39c1a53e2f66 |
|
03-Feb-2006 |
NeilBrown <neilb@suse.de> |
[PATCH] md: Assorted little md fixes - version-1 superblock + The default_bitmap_offset is in sectors, not bytes. + the 'size' field in the superblock is in sectors, not KB - raid0_run should return a negative number on error, not '1' - raid10_read_balance should not return a valid 'disk' number if ->rdev turned out to be NULL - kmem_cache_destroy doesn't like being passed a NULL. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
a1365647022eb05a5993f270a78e9bef3bf554eb |
|
08-Jan-2006 |
Andrew Morton <akpm@osdl.org> |
[PATCH] remove gcc-2 checks Remove various things which were checking for gcc-1.x and gcc-2.x compilers. From: Adrian Bunk <bunk@stusta.de> Some documentation updates and removes some code paths for gcc < 3.2. Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
d9d166c2a9d5d01af34396793950aa695883eed4 |
|
06-Jan-2006 |
NeilBrown <neilb@suse.de> |
[PATCH] md: allow array level to be set textually via sysfs Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
2604b703b6b3db80e3c75ce472a54dfd0b7bf9f4 |
|
06-Jan-2006 |
NeilBrown <neilb@suse.de> |
[PATCH] md: remove personality numbering from md md supports multiple different RAID level, each being implemented by a 'personality' (which is often in a separate module). These personalities have fairly artificial 'numbers'. The numbers are use to: 1- provide an index into an array where the various personalities are recorded 2- identify the module (via an alias) which implements are particular personality. Neither of these uses really justify the existence of personality numbers. The array can be replaced by a linked list which is searched (array lookup only happens very rarely). Module identification can be done using an alias based on level rather than 'personality' number. The current 'raid5' modules support two level (4 and 5) but only one personality. This slight awkwardness (which was handled in the mapping from level to personality) can be better handled by allowing raid5 to register 2 personalities. With this change in place, the core md module does not need to have an exhaustive list of all possible personalities, so other personalities can be added independently. This patch also moves the check for chunksize being non-zero into the ->run routines for the personalities that need it, rather than having it in core-md. This has a side effect of allowing 'faulty' and 'linear' not to have a chunk-size set. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
9ffae0cf3ea02f75d163922accfd3e592d87adde |
|
06-Jan-2006 |
NeilBrown <neilb@suse.de> |
[PATCH] md: convert md to use kzalloc throughout Replace multiple kmalloc/memset pairs with kzalloc calls. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
2d1f3b5d1b2cd11a162eb29645df749ec0036413 |
|
06-Jan-2006 |
NeilBrown <neilb@suse.de> |
[PATCH] md: clean up 'page' related names in md Substitute: page_cache_get -> get_page page_cache_release -> put_page PAGE_CACHE_SHIFT -> PAGE_SHIFT PAGE_CACHE_SIZE -> PAGE_SIZE PAGE_CACHE_MASK -> PAGE_MASK __free_page -> put_page because we aren't using the page cache, we are just using pages. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
a362357b6cd62643d4dda3b152639303d78473da |
|
01-Nov-2005 |
Jens Axboe <axboe@suse.de> |
[BLOCK] Unify the seperate read/write io stat fields into arrays Instead of having ->read_sectors and ->write_sectors, combine the two into ->sectors[2] and similar for the other fields. This saves a branch several places in the io path, since we don't have to care for what the actual io direction is. On my x86-64 box, that's 200 bytes less text in just the core (not counting the various drivers). Signed-off-by: Jens Axboe <axboe@suse.de>
|
e5dcdd80a60627371f40797426273048630dc8ca |
|
10-Sep-2005 |
NeilBrown <neilb@cse.unsw.edu.au> |
[PATCH] md: fail IO request to md that require a barrier. md does not yet support BIO_RW_BARRIER, so be honest about it and fail (-EOPNOTSUPP) any such requests. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
1eb29128c644581fa51f822545921394ad4f719f |
|
15-Jul-2005 |
Neil Brown <neilb@cse.unsw.edu.au> |
[PATCH] Fix raid0's attempt to divide by 64bit numbers Apparently sector_div is only guaranteed to work with a 32bit divisor, even on 64bit architectures. So allow for this in raid0. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
990a8baf568ca1d0ae65e59783ff821794118d07 |
|
22-Jun-2005 |
Jesper Juhl <juhl-lkml@dif.dk> |
[PATCH] md: remove unneeded NULL checks before kfree This patch removes some unneeded checks of pointers being NULL before calling kfree() on them. kfree() handles NULL pointers just fine, checking first is pointless. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 |
|
17-Apr-2005 |
Linus Torvalds <torvalds@ppc970.osdl.org> |
Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!
|