History log of /drivers/block/drbd/drbd_worker.c
Revision Date Author Comments
729e8b87bac63dee09302ddffc05a7ba0e50c9ad 11-Sep-2014 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: reduce lock contention in drbd_worker

The worker may now dequeue work items in batches.
This should reduce lock contention during busy periods.
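
The batching idea can be sketched in plain userspace C. This is only an
illustration, not the DRBD code: the struct names, the singly linked list,
and the counter standing in for a spinlock are all assumptions made for
the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the DRBD structures. */
struct work_item {
    struct work_item *next;
    int id;
};

struct work_queue {
    struct work_item *head;
    struct work_item *tail;
    unsigned lock_taken;   /* counts simulated lock acquisitions */
};

/* Simulated lock: a real implementation would use a spinlock here. */
static void queue_lock(struct work_queue *q)   { q->lock_taken++; }
static void queue_unlock(struct work_queue *q) { (void)q; }

static void enqueue(struct work_queue *q, struct work_item *w)
{
    assert(w != NULL);
    queue_lock(q);
    w->next = NULL;
    if (q->tail)
        q->tail->next = w;
    else
        q->head = w;
    q->tail = w;
    queue_unlock(q);
}

/* Hand every pending item to the caller in one critical section. */
static struct work_item *dequeue_batch(struct work_queue *q)
{
    struct work_item *batch;

    queue_lock(q);
    batch = q->head;
    q->head = q->tail = NULL;
    queue_unlock(q);
    return batch;
}

/* Walk the batch without holding the lock; returns the item count. */
static unsigned run_batch(struct work_item *batch)
{
    unsigned n = 0;

    for (; batch; batch = batch->next)
        n++;   /* a real worker would invoke the item's callback here */
    return n;
}
```

With three queued items, dequeue_batch() takes the simulated lock once
instead of three times, which is where the reduction in contention would
come from.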

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
abde9cc6a59cb7f07fda4c77fee2150314e423fa 11-Sep-2014 Lars Ellenberg <lars@linbit.com> drbd: Improve asender performance

Shorten receive path in the asender thread. Reduces CPU utilisation
of asender when receiving packets, and with that increases IOPs.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
b47a06d10561bfe7317b1355b4b8e4168fc6b4b7 11-Sep-2014 Andreas Gruenbacher <andreas.gruenbacher@gmail.com> drbd: Get rid of the WORK_PENDING macro

This macro doesn't add any value; just use test_bit() instead.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
d1b8085356391d1d5151670ab96baae6234d1e20 11-Sep-2014 Andreas Gruenbacher <andreas.gruenbacher@gmail.com> drbd: Get rid of the __no_warn and __cond_lock macros

These macros can easily be replaced with their definitions.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
ed15b795092637f61c08fd21dc011b5334d7974c 11-Sep-2014 Andreas Gruenbacher <andreas.gruenbacher@gmail.com> drbd: Use consistent names for all the bi_end_io callbacks

Now they follow the _endio naming scheme.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
944410e97cfcec38369eeb5f77d0e8da91d68afb 06-May-2014 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: debugfs: add callback_history

Add a per-connection worker thread callback_history
with timing details, call site and callback function.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
21ae5d7f95aa1a64f35b03c91f8714ced3ab61a9 05-May-2014 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: track timing details of peer_requests

To be able to present timing details in debugfs,
we need to track preparation/submit times of peer requests.

Track peer request flags early,
before they are put on the epoch_entry lists.

Waiting for activity log transactions may be a major latency factor.
We want to be able to present the peer_request state accurately in
debugfs, and what it is waiting for.

Consistently mark/unmark peer requests with EE_CALL_AL_COMPLETE_IO.
Set it only *after* calling drbd_al_begin_io(),
clear it as soon as we call drbd_al_complete_io().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
ad3fee790088d36ad862e31535b5b99c25adeef4 20-Dec-2013 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: improve throttling decisions of background resynchronisation

Background resynchronisation does some "side-stepping", or throttles
itself, if it detects application IO activity, and the current resync
rate estimate is above the configured "cmin-rate".

What was not detected: if there is no application IO,
because it blocks on activity log transactions.

Introduce a new atomic_t ap_actlog_cnt, tracking such blocked requests,
and count non-zero as application IO activity.
This counter is exposed at proc_details level 2 and above.

Also make sure to release the currently locked resync extent
if we side-step due to such voluntary throttling.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
e5f891b2234dbab8c8797111a61519d0728ef855 22-Nov-2013 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: gather detailed timing statistics for drbd_requests

Record (in jiffies) how much time a request spends in which stages.
Followup commits will use and present this additional timing information
so we can better locate and tackle the root causes of latency spikes,
or present the backlog for asynchronous replication.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
e37d2438d8e5e4c1225cf94d45347fa207835447 01-Apr-2014 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: track meta data IO intent, start and submit time

For diagnostic purposes, track intent, start time
and latest submit time of meta data IO.

Move separate members from struct drbd_device
into the embedded struct drbd_md_io.
s/md_io_(page|in_use)/md_io.\1/

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
b9ed7080d7d29112c898c64bad778b84eec0ed2d 23-Apr-2014 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: consistently use list_add_tail for peer_request tracking

Keep the epoch entry lists (active_ee, read_ee, sync_ee, ...)
consistently "oldest first". That way finding the oldest not yet
successfully processed request is simply list_first_entry_or_null.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
15e26f6a3c6de2c665b4a30b9a70a902111f281f 28-Apr-2014 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: add drbd_queue_work_if_unqueued helper

We sometimes do
    if (list_empty(&w.list))
        drbd_queue_work(&q, &w.list);

Removal (list_del_init) may happen outside all locks, after all
pending work entries have been moved to an on-stack local work list.

For work structs that are embedded rather than dynamically allocated,
we must avoid re-adding one until it really has been removed.

Move that list_empty check inside the spin_lock(&q->q_lock)
within the helper function, and change to list_empty_careful().

This may have been the reason for a list_add corruption
inside drbd_queue_work().
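
A minimal userspace sketch of "check and add inside the same critical
section" follows. It is not the kernel helper: the list type, the flag
standing in for q_lock, and all function names are hypothetical, and the
emptiness test is a plain one rather than list_empty_careful().

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list, modelled loosely on list_head. */
struct list_node { struct list_node *next, *prev; };

static void list_init(struct list_node *n) { n->next = n->prev = n; }
static bool list_is_empty(const struct list_node *n) { return n->next == n; }

static void list_add_last(struct list_node *item, struct list_node *head)
{
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

struct queue {
    struct list_node items;
    int locked;              /* simulated lock, for illustration only */
};

static void q_lock(struct queue *q)   { assert(!q->locked); q->locked = 1; }
static void q_unlock(struct queue *q) { assert(q->locked);  q->locked = 0; }

/*
 * Queue w only if it is not on any list yet. The check happens while
 * the queue lock is held, so it cannot race with another enqueue.
 * Returns true if w was queued by this call.
 */
static bool queue_work_if_unqueued(struct queue *q, struct list_node *w)
{
    bool queued = false;

    q_lock(q);
    if (list_is_empty(w)) {
        list_add_last(w, &q->items);
        queued = true;
    }
    q_unlock(q);
    return queued;
}
```

Calling the helper twice for the same embedded work struct adds it once;
the second call sees a non-empty node and leaves the list untouched.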

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
7f34f61490ee87a470cf229069d59b0987f42a59 22-Apr-2014 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: drbd_rs_number_requests: fix unit mismatch in comparison

We try to limit the number of "in-flight" resync requests.
One condition for that is the amount of requested data should not exceed
half of what can be covered by our "max-buffers" setting.

However we compared number of 4k pages with number of in-flight 512 Byte
sectors, and this extra throttle triggered much earlier than intended.
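
The arithmetic behind the mismatch, as a small sketch. The function and
constant names are made up for this example, and the actual kernel
expression is not reproduced here; only the 4k page and 512 byte sector
sizes come from the text above.

```c
#include <assert.h>
#include <stdbool.h>

enum {
    SECTOR_BYTES     = 512,
    PAGE_BYTES       = 4096,                        /* assumed 4k pages */
    SECTORS_PER_PAGE = PAGE_BYTES / SECTOR_BYTES    /* 8 */
};

/* Mixed units: a sector count compared against a page count. */
static bool throttled_mixed_units(unsigned in_flight_sectors,
                                  unsigned max_buffers_pages)
{
    return in_flight_sectors > max_buffers_pages / 2;
}

/* Same units: the page limit is converted to sectors first. */
static bool throttled_same_units(unsigned in_flight_sectors,
                                 unsigned max_buffers_pages)
{
    assert(max_buffers_pages <= 0xffffffffu / SECTORS_PER_PAGE);
    return in_flight_sectors > (max_buffers_pages / 2) * SECTORS_PER_PAGE;
}
```

For a hypothetical max-buffers of 2048 pages, half the budget is 1024
pages or 8192 sectors. With 2000 sectors in flight, the mixed-unit
comparison already throttles, while the same-unit comparison does not.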

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
506afb6248af577eb702c73f3da52a12f4c56a38 31-Jan-2014 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: improve resync request throttling due to sendbuf size

If we throttle resync because the socket sendbuffer is filling up,
tell TCP about it, so it may expand the sendbuffer for us.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
aaaba34576407857f6146ff6c330f06e63fb2bf2 18-Mar-2014 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: implement csums-after-crash-only

Checksum based resync trades CPU cycles for network bandwidth,
in situations where we expect much of the to-be-resynced blocks
to be actually identical on both sides already.

In a "network hiccup" scenario, it won't help:
all to-be-resynced blocks will typically be different.

The use case is for the resync of *potentially* different blocks
after crash recovery -- the crash recovery had marked larger areas
(those covered by the activity log) as need-to-be-resynced,
just in case. Most of those blocks will be identical.

This option makes it possible to configure checksum based resync,
but only actually use it for the first resync after primary crash.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
4dd726f02928ded116f6c9aaf6392a400ef0d9f7 11-Feb-2014 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: get rid of drbd_queue_work_front

The last user was al_write_transaction, if called with "delegate",
and the last user to call it with "delegate = true" was the receiver
thread, which has no need to delegate, but can call it itself.

Finally drop the delegate parameter, drop the extra
w_al_write_transaction callback, and drop drbd_queue_work_front.

Do not (yet) change dequeue_work_item to dequeue_work_batch, though.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
ac0acb9e39ac41575cc6a344d04295436fd4eb4e 11-Feb-2014 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: use drbd_device_post_work() in more places

This replaces the md_sync_work member of struct drbd_device
by a new MD_SYNC "work bit" in device->flags.

This replaces the resync_start_work member of struct drbd_device
by a new RS_START "work bit" in device->flags.
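
The "work bit" pattern can be sketched with C11 atomics. This is an
assumption-laden illustration, not drbd_device_post_work(): the bit
positions, the struct, and both function names are hypothetical, and the
flags word here holds nothing but work bits, which is not true of the
real device->flags.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Bit names taken from the text above; the positions are assumed. */
enum work_bit { MD_SYNC, RS_START };

struct device {
    atomic_ulong flags;   /* in this sketch: work bits only */
    unsigned wakeups;     /* stands in for waking the worker thread */
};

/*
 * Post a piece of work by setting its bit. Returns true only if this
 * call newly set the bit, so repeated posts collapse into one.
 */
static bool post_work(struct device *dev, enum work_bit bit)
{
    unsigned long mask, old;

    assert((unsigned)bit < 8 * sizeof(unsigned long));
    mask = 1UL << bit;
    old = atomic_fetch_or(&dev->flags, mask);
    if (old & mask)
        return false;
    dev->wakeups++;
    return true;
}

/* Worker side: atomically collect and clear all pending work bits. */
static unsigned long take_pending_work(struct device *dev)
{
    return atomic_exchange(&dev->flags, 0UL);
}
```

Compared with an embedded work struct, a bit cannot be queued twice and
needs no list manipulation, at the cost of carrying no per-request data.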

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
e334f55095b908f12c8bad991433f5d609e919d1 11-Feb-2014 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: make sure disk cleanup happens in worker context

The recent fix to put_ldev() (correct ordering of access to local_cnt
and state.disk; memory barrier in __drbd_set_state) guarantees
that the cleanup happens exactly once.

However it does not yet guarantee that the cleanup happens from worker
context: the last put_ldev() may still happen from atomic context,
which must not happen, because blkdev_put() may sleep.

Fix this by scheduling the cleanup to the worker instead,
using a couple more bits in device->flags and a new helper,
drbd_device_post_work().

Generalized the "resync progress" work to cover these new work bits.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
5ab7d2c005135849cf0bb1485d954c98f2cca57c 27-Jan-2014 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: fix resync finished detection

This fixes one recent regression,
and one long existing bug.

The bug:
drbd_try_clear_on_disk_bm() assumed that all "count" bits have to be
accounted in the resync extent corresponding to the start sector.

Since we allow application requests to cross our "extent" boundaries,
this assumption is no longer true, resulting in possible misaccounting,
scary messages
("BAD! sector=12345s enr=6 rs_left=-7 rs_failed=0 count=58 cstate=..."),
and potentially, if the last bit to be cleared during resync would
reside in previously misaccounted resync extent, the resync would never
be recognized as finished, but would be "stalled" forever, even though
all blocks are in sync again and all bits have been cleared...

The regression was introduced by
drbd: get rid of atomic update on disk bitmap works

For an "empty" resync (rs_total == 0), we must not "finish" the
resync on the SyncSource before the SyncTarget knows all relevant
information (sync uuid). We need to wait for the full round-trip,
the SyncTarget will then explicitly notify us.

Also for normal, non-empty resyncs (rs_total > 0), the resync-finished
condition needs to be tested before the schedule() in wait_for_work, or
it is likely to be missed.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
a80ca1ae81fc52e304e753f6de4ef248df364f9e 27-Dec-2013 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: fix a race stopping the worker thread

We may implicitly call drbd_send() from inside wait_for_work(),
via maybe_send_barrier().

If the "stop" signal was sent just before that, drbd_send() would call
flush_signals(), and we would run an unbounded schedule() afterwards.

Fix: check for thread_state == RUNNING before we schedule()

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
c7a58db4e9dc523b18bbfbc3aa311d8308acc293 20-Dec-2013 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: get rid of atomic update on disk bitmap works

Just trigger the occasional lazy bitmap write-out during resync
from the central wait_for_work() helper.

Previously, during resync, bitmap pages would be written out separately,
synchronously, one at a time, at least 8 times each (every 512 bytes
worth of bitmap cleared).

Now we trigger "merge friendly" bulk write out of all cleared pages
every two seconds during resync, and once the resync is finished.
Most pages will be written out only once.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
44a4d551846b8c61aa430b9432c1fcdf88444708 22-Nov-2013 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: refactor use of first_peer_device()

Reduce the number of calls to first_peer_device(). Instead, call
first_peer_device() just once to assign a local variable peer_device.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
f9c78128f833ae3057884ca219259c8ae5db8898 28-Apr-2014 Lars Ellenberg <lars@linbit.com> drbd: always implicitly close last epoch when idle

Once our sender thread needs to wait_for_work(),
and actually needs to schedule(), just before we do that,
we already check if it is useful to implicitly close the last epoch.

The condition was too strict: only implicitly close the epoch,
if there have been no new (write) requests at all.

The assumption was that if there were new requests, they would
always be communicated one way or another, and would send necessary
epoch separating barriers explicitly.

This is not always true, e.g. when becoming diskless,
or while explicitly starting a full resync.

The last communicated epoch could stay open for a long time,
locking down corresponding activity log extents.

It is safe to always implicitly send that last barrier, as soon as we
determine that there cannot be more requests in the last communicated
epoch, even if there have been (uncommunicated) new requests in new
epochs meanwhile.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
074f4afeb2277bd5ecb9fa7f91eaffa55e262126 28-Apr-2014 Lars Ellenberg <lars@linbit.com> drbd: fix a race between start_resync and send_and_submit

In the drbd make request function, specifically in
drbd_send_and_submit(), we decide whether we want to send the actual
write request, or only a "set this block out of sync" information.

We do so based on the current connection state, while holding the req_lock.
The connection state is not supposed to change while holding the req_lock.

But in drbd_start_resync, we did change that state anyways,
while only holding the global_state_lock, which is enough to change
sync-after dependencies (paused vs active resync), but
not good enough to change the connection state.

Fix: in drbd_start_resync, first grab the req_lock to serialize with
drbd_send_and_submit(), before grabbing the global_state_lock
to be able to evaluate the sync-after dependencies.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2f632aeb5302da93f760d965e970600b35907026 28-Apr-2014 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: prepare sending side for REQ_DISCARD

Note that I do NOT call __drbd_chk_io_error for failed REQ_DISCARD.
That may be wrong, though, or needs to differ between EOPNOTSUPP and
other errors...

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
a0fb3c47a1aae5d38a88ea858f14d6d088d05e07 28-Apr-2014 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: prepare receiving side for REQ_DISCARD

If the receiver needs to serve a discard request on a queue that does
not announce itself as discard capable, it falls back to a synchronous
blkdev_issue_zeroout().

We expect only "reasonably" large (up to one activity log extent?)
discard requests.

We do this to not block the receiver for too long in this
fallback code path, and to not set/clear too many bits inside one
spinlock_irq_save() in drbd_set_in_sync/drbd_set_out_of_sync.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
0e49d7b014c5d591a053d08888a455bd74a88646 28-Apr-2014 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: fix potential distributed deadlock during verify or resync

If max-buffers and socket buffer sizes are "too small" for the chosen
resync rate, this could potentially lead to a distributed deadlock,
which may or may not resolve itself via the "ko-count" and request
timeout mechanism, or could be resolved by forced disconnect.

One option to deal with this is proper configuration:
use larger max-buffers and socket buffer settings,
or reduce the resync rate.

But even with bad configuration we should not deadlock,
but "gracefully" recover.

The issue is avoided by using only up to max-buffers/2 for resync
requests, and by using max-buffers not as a hard limit for data buffer
allocations, but as a throttle threshold only.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
6377b9235056452cd5d592c3739baa379a8735fe 28-Apr-2014 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: resync: fix too large bursts for very slow rates

While merging adjacent dirty blocks into resync requests,
the resync rate throttle was disregarded.
For very low resync rates, the effective rate may have exceeded
the intended rate by a larger margin.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
6780139c0ab96fc9c605bea33db30fc9378016b7 13-Sep-2011 Andreas Gruenbacher <agruen@kernel.org> drbd: Use the right peer device

in w_e_ (peer request) callbacks and in peer request I/O completion handlers

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
6db7e50a8a40d2210544b4a09f3d4988127c20ad 26-Aug-2011 Andreas Gruenbacher <agruen@kernel.org> drbd: In the worker thread, process drbd_work instead of drbd_device_work items

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
d448a2e1e3d02f8f19111191d490b7e0a5eb70ea 25-Aug-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Turn w_make_ov_request and make_resync_request into "normal" functions

These functions are not used as drbd_work callbacks.

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
4d010392f416829005e85c337310b8feb65f877b 25-Aug-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Make w_make_resync_request() static

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
a8cd15ba7919eaf1f416857f983a502cc261af26 25-Aug-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: struct drbd_peer_request: Use drbd_work instead of drbd_device_work

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
84b8c06b6591e73250e6ab4834a02a86c8994b91 28-Jul-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Create a dedicated struct drbd_device_work

drbd_device_work is a work item that has a reference to a device,
while drbd_work is a more generic work item that does not carry
a reference to a device.

All callbacks get a pointer to a drbd_work instance, those callbacks
that expect a drbd_device_work use the container_of macro to get it.
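
The embedding pattern described above, as a userspace sketch. The struct
and function names are simplified stand-ins rather than the DRBD
definitions, and container_of is written out with offsetof because the
kernel macro is not available outside the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of macro. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct device { int minor; };

/* Generic work item: carries a callback but no device reference. */
struct work {
    int (*cb)(struct work *w);
};

/* Device work item: embeds the generic item and adds the device. */
struct device_work {
    struct work w;
    struct device *device;
};

/* The callback receives the generic item and recovers its container. */
static int report_minor(struct work *w)
{
    struct device_work *dw = container_of(w, struct device_work, w);

    assert(dw->device != NULL);
    return dw->device->minor;
}
```

A queue that only knows about struct work can store and invoke both
kinds of items; only callbacks that need the device pay for the lookup.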

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
8682eae9b4b26d54b9eeac8e17c534197e6d8744 25-Jul-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Rename w_prev_work_done -> w_complete

Also move it to drbd_receiver.c and make it static.

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
69a227731a378f34bc5a8192158bd94d1581ae3d 09-Aug-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Pass a peer device to a number of functions

These functions actually operate on a peer device, or
need a peer device.

drbd_prepare_command(), drbd_send_command(), drbd_send_sync_param()
drbd_send_uuids(), drbd_gen_and_send_sync_uuid(), drbd_send_sizes()
drbd_send_state(), drbd_send_current_state(), and drbd_send_state_req()
drbd_send_sr_reply(), drbd_send_ack(), drbd_send_drequest(),
drbd_send_drequest_csum(), drbd_send_ov_request(), drbd_send_dblock()
drbd_send_block(), drbd_send_out_of_sync(), recv_dless_read()
drbd_drain_block(), receive_bitmap_plain(), recv_resync_read()
read_in_block(), read_for_csum(), drbd_alloc_pages(), drbd_alloc_peer_req()
need_peer_seq(), update_peer_seq(), wait_for_and_update_peer_seq()
drbd_sync_handshake(), drbd_asb_recover_{0,1,2}p(), drbd_connected()
drbd_disconnected(), decode_bitmap_c() and recv_bm_rle_bits()

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
79a3c8d38cabd1a900340852e527b0a4ce8a459d 09-Aug-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: drbd_csum_bio(), drbd_csum_ee(): Remove unused device argument

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
0500813fe0c9a617ace86d91344e36839050dad6 07-Jul-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Move conf_mutex from connection to resource

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
0b0ba1efc7b887bc2bd767ef822979fe2dae620e 27-Jun-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Add explicit device parameter to D_ASSERT

The implicit dependency on a variable inside the macro is problematic.

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
1ec861ebd0889263841b822ee3f3eb49caf23656 06-Jul-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Replace and remove the obsolete conn_() macros

With the polymorphic drbd_() macros, we no longer need the connection
specific variants.

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
d01801710265cfb7bd8928ae7c3be4d9d15ceeb0 03-Jul-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Remove the terrible DEV hack

DRBD was using dev_err() and similar all over the code; instead of having to
write dev_err(disk_to_dev(device->vdisk), ...) to convert a drbd_device into a
kernel device, a DEV macro was used which implicitly references the device
variable. This is terrible; introduce separate drbd_err() and similar macros
with an explicit device parameter instead.
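
The difference between the two macro styles can be shown in a few lines.
This sketch uses snprintf() as a stand-in for dev_err(), and both macro
names and the message format are invented for the example.

```c
#include <assert.h>
#include <stdio.h>

struct device { int minor; };

/* Implicit style: compiles only where a variable named `device` exists. */
#define ERR_IMPLICIT(buf, len, msg) \
    snprintf((buf), (len), "drbd%d: %s", device->minor, (msg))

/* Explicit style: the device is an ordinary macro argument. */
#define err_explicit(dev, buf, len, msg) \
    snprintf((buf), (len), "drbd%d: %s", (dev)->minor, (msg))

/* Works only because the parameter happens to be called `device`. */
static int report_implicit(struct device *device, char *buf, size_t len)
{
    return ERR_IMPLICIT(buf, len, "io error");
}

/* Works with any variable name, here `peer`. */
static int report_explicit(struct device *peer, char *buf, size_t len)
{
    assert(peer != NULL);
    return err_explicit(peer, buf, len, "io error");
}
```

Renaming the parameter of report_implicit() would break the build at the
macro call site, which is the hidden dependency the explicit form avoids.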

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
c06ece6ba6f1bb2e01616e111303c3ae5f80fdbe 21-Jun-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Turn connection->volumes into connection->peer_devices

Let connection->peer_devices point to peer devices; connection->volumes was
pointing to devices.

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
05a10ec7900dbdba008a24bf56b3490c4b568d2c 07-Jun-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Improve some function and variable naming

Rename functions
conn_destroy() -> drbd_destroy_connection(),
drbd_minor_destroy() -> drbd_destroy_device()
drbd_adm_add_minor() -> drbd_adm_add_minor()
drbd_adm_delete_minor() -> drbd_adm_del_minor()

Rename global variable minors to drbd_devices

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
a6b32bc3cebd3fb6848c526763733b9dbc389c02 31-May-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Introduce "peer_device" object between "device" and "connection"

In a setup where a device (aka volume) can replicate to multiple peers and one
connection can be shared between multiple devices, we need separate objects to
represent devices on peer nodes and network connections.

As a first step to introduce multiple connections per device, give each
drbd_device object a single drbd_peer_device object which connects it to a
drbd_connection object.

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
bde89a9e151b482765ed40e04307a6190236b387 30-May-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Rename drbd_tconn -> drbd_connection

sed -i -e 's:all_tconn:connections:g' -e 's:tconn:connection:g'

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
b30ab7913b0a7b1d3b1091c8cb3abb1a9f1e0824 03-Jul-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Rename "mdev" to "device"

sed -i -e 's:mdev:device:g'

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
547616979372b65646d691e8dab90e850be582fe 30-May-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Rename struct drbd_conf -> struct drbd_device

sed -i -e 's:\<drbd_conf\>:drbd_device:g'

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
a3603a6e3b4f2f0fb5529821134424e2eeec88fd 30-May-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Split off on-the-wire protocol definitions

Keep the protocol definitions separate from the kernel code; they are useful in
their own right.

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
a186e47856c7877fa684d75901436c1d35ec35e0 19-Dec-2013 Rashika Kheria <rashika.kheria@gmail.com> drivers: block: Mark the function as static in drbd_worker.c

Mark functions drbd_endio_read_sec_final(), drbd_send_barrier(),
need_to_send_barrier(), dequeue_work_batch(), dequeue_work_item() and
wait_for_work() as static in drbd/drbd_worker.c because they are not
used outside this file.

This eliminates the following warnings in drbd/drbd_worker.c:
drivers/block/drbd/drbd_worker.c:99:6: warning: no previous prototype for ‘drbd_endio_read_sec_final’ [-Wmissing-prototypes]
drivers/block/drbd/drbd_worker.c:1276:5: warning: no previous prototype for ‘drbd_send_barrier’ [-Wmissing-prototypes]
drivers/block/drbd/drbd_worker.c:1774:6: warning: no previous prototype for ‘need_to_send_barrier’ [-Wmissing-prototypes]
drivers/block/drbd/drbd_worker.c:1798:6: warning: no previous prototype for ‘dequeue_work_batch’ [-Wmissing-prototypes]
drivers/block/drbd/drbd_worker.c:1806:6: warning: no previous prototype for ‘dequeue_work_item’ [-Wmissing-prototypes]
drivers/block/drbd/drbd_worker.c:1815:6: warning: no previous prototype for ‘wait_for_work’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
7988613b0e5b2638caf6cd493cc78e9595eba19c 24-Nov-2013 Kent Overstreet <kmo@daterainc.com> block: Convert bio_for_each_segment() to bvec_iter

More prep work for immutable biovecs - with immutable bvecs drivers
won't be able to use the biovec directly, they'll need to use helpers
that take into account bio->bi_iter.bi_bvec_done.

This updates callers for the new usage without changing the
implementation yet.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Cc: Jim Paris <jim@jtan.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Yehuda Sadeh <yehuda@inktank.com>
Cc: Sage Weil <sage@inktank.com>
Cc: Alex Elder <elder@inktank.com>
Cc: ceph-devel@vger.kernel.org
Cc: Joshua Morris <josh.h.morris@us.ibm.com>
Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: Nagalakshmi Nandigama <Nagalakshmi.Nandigama@lsi.com>
Cc: Sreekanth Reddy <Sreekanth.Reddy@lsi.com>
Cc: support@lsi.com
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Quoc-Son Anh <quoc-sonx.anh@intel.com>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Jan Kara <jack@suse.cz>
Cc: linux-m68k@lists.linux-m68k.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: drbd-user@lists.linbit.com
Cc: nbd-general@lists.sourceforge.net
Cc: cbe-oss-dev@lists.ozlabs.org
Cc: xen-devel@lists.xensource.com
Cc: virtualization@lists.linux-foundation.org
Cc: linux-raid@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Cc: DL-MPTFusionLinux@lsi.com
Cc: linux-scsi@vger.kernel.org
Cc: devel@driverdev.osuosl.org
Cc: linux-fsdevel@vger.kernel.org
Cc: cluster-devel@redhat.com
Cc: linux-mm@kvack.org
Acked-by: Geoff Levand <geoff@infradead.org>
a3f8f7dc7ad652cd84c12cb5efa0f7722dff4786 27-Mar-2013 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: validate resync_after dependency on attach already

We validated resync_after dependencies, if changed via disk-options.
But we did not validate them when first created via attach.
We also did not check or cleanup dependencies that used to be correct,
but now point to meanwhile removed minor devices.

If the drbd_resync_after_valid() validation in disk-options tried to
follow a dependency chain in this way, this could lead to NULL pointer
dereference.

Validate resync_after settings in drbd_adm_attach() already, as well as
in drbd_adm_disk_opts(), and only reject dependency loops.
Depending on non-existing disks is allowed and equivalent to no dependency.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
a700471bf335965e7603273fd51034415553246a 27-Mar-2013 Philipp Reisner <philipp.reisner@linbit.com> drbd: abort start of resync early, if it raced with connection breakage

We've seen a spurious full resync, because a connection breakage
raced with drbd_start_resync(, C_SYNC_TARGET),
and the resulting state change request intended to start the resync
ended up looking like a local invalidate.

Fix:
Double check the state inside the lock,
and don't even request that state change,
if we had connection or IO problems.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
56392d2f40aac4b520fc50bc356f40e07f7e1c7d 19-Mar-2013 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: Clarify when activity log I/O is delegated to the worker thread

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
c04ccaa669e147ffb66e4e74d82c7dbfc100ec5e 19-Mar-2013 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: read meta data early, base on-disk offsets on super block

We used to calculate all on-disk meta data offsets, and then compare
the stored offsets, basically treating them as magic numbers.

Now with the activity log striping, the activity log size is no longer
fixed. We need to first read the super block, then base the activity
log and bitmap offsets on the stored offsets/al stripe settings.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
328e0f125bf41f4f33f684db22015f92cb44fe56 19-Oct-2012 Philipp Reisner <philipp.reisner@linbit.com> drbd: Broadcast sync progress no more often than once per second

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
edc9f5eb7afa3d832f540fcfe10e3e1087e6f527 27-Sep-2012 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: always write bitmap on detach

If we detach due to local read-error (which sets a bit in the bitmap),
stay Primary, and then re-attach (which re-reads the bitmap from disk),
we potentially lost the "out-of-sync" (or, "bad block") information in
the bitmap.

Always (try to) write out the changed bitmap pages before going diskless.

That way, we don't lose the bit for the bad block,
the next resync will fetch it from the peer, and rewrite
it locally, which may result in block reallocation in some
lower layer (or the hardware), and thereby "heal" the bad blocks.

If the bitmap writeout errors out as well, we will (again: try to)
mark the "we need a full sync" bit in our super block,
if it was a READ error; writes are covered by the activity log already.

If that superblock does not make it to disk either, we are sorry.

Maybe we just lost an entire disk or controller (or iSCSI connection),
and there actually are no bad blocks at all, so we don't need to
re-fetch from the peer, there is no "auto-healing" necessary.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
8747d30af97232f9ff4cde78b8d259cc715a9b7a 26-Sep-2012 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: a few more GFP_KERNEL -> GFP_NOIO

This has not yet been observed, but conceivably, when using GFP_KERNEL
allocations from drbd_md_sync(), drbd_flush_after_epoch() or
receive_SyncParam(), we could trigger additional IO to our own device,
or another device in a criss-cross setup, and end up in a local
deadlock, or potentially a distributed deadlock in a criss-cross setup
involving the peer blocked in a similar way waiting for us to make
progress.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
a506c13a4d1ec5e1f2f9bc0123dacb5d123004d3 26-Sep-2012 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: use list_move_tail instead of list_del/list_add_tail

Using list_move_tail() instead of list_del() + list_add_tail().

spatch with a semantic match was used to find this problem.
(http://coccinelle.lip6.fr/)

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
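
The substitution made by the list_move_tail commit above can be illustrated
with a small userspace model of a circular doubly linked list. The toy_*
names are hypothetical and only mimic the shape of the kernel list helpers;
this is a sketch, not the kernel implementation.

```c
/* Minimal userspace stand-in for a circular doubly linked list. */
struct list_node {
	struct list_node *next, *prev;
};

static void toy_list_init(struct list_node *head)
{
	head->next = head->prev = head;
}

static void toy_list_del(struct list_node *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

static void toy_list_add_tail(struct list_node *entry, struct list_node *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* One call in place of the del + add_tail pair it replaces. */
static void toy_list_move_tail(struct list_node *entry, struct list_node *head)
{
	toy_list_del(entry);
	toy_list_add_tail(entry, head);
}
```

The behavior is unchanged; the single call only states the intent (move this
entry to the end of that list) more directly.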
1b6dd252e6c631322372c018ed546a108d9869d3 04-Sep-2012 Philipp Reisner <philipp.reisner@linbit.com> drbd: panic on delayed completion of aborted requests

"aborting" requests, or force-detaching the disk, is intended for
completely blocked/hung local backing devices which no longer
complete requests at all, not even with error completions. In this
situation, usually a hard-reset and failover is the only way out.

By "aborting", basically faking a local error-completion,
we allow for a more graceful switchover by cleanly migrating services.
Still the affected node has to be rebooted "soon".

By completing these requests, we allow the upper layers to re-use
the associated data pages.

If later the local backing device "recovers", and now DMAs some data
from disk into the original request pages, in the best case it will
just put random data into unused pages; but typically it will corrupt
meanwhile completely unrelated data, causing all sorts of damage.

Which means delayed successful completion,
especially for READ requests,
is a reason to panic().

We assume that a delayed *error* completion is OK,
though we still will complain noisily about it.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
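
The policy stated in the commit above (delayed success on an aborted request
is fatal, delayed error is tolerated but logged) can be written as a small
decision function. This is only a sketch: enum toy_action and
toy_on_local_completion() are hypothetical names, and the function returns a
verdict instead of actually panicking.

```c
enum toy_action {
	TOY_COMPLETE_NORMALLY,  /* request was never aborted */
	TOY_COMPLAIN,           /* aborted, backing device later reported an error */
	TOY_PANIC               /* aborted, backing device later reported success */
};

/*
 * was_aborted: non-zero if the request was already completed to upper
 *              layers by faking a local error.
 * error:       non-zero if the backing device finally reported an error.
 */
static enum toy_action toy_on_local_completion(int was_aborted, int error)
{
	if (!was_aborted)
		return TOY_COMPLETE_NORMALLY;
	if (error)
		return TOY_COMPLAIN;  /* no data was transferred into reused pages */
	return TOY_PANIC;             /* device may have written into reused pages */
}
```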
bc317a9ecd641b78a4b237cb22b30ecf11443c77 22-Aug-2012 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: dequeue single work items in wait_for_work()

As long as we still use drbd_queue_work_front(),
we must only dequeue the single first item during normal operation.

The comment in drbd_worker() even says so,
but bc8a5a1 drbd: remove struct drbd_tl_epoch objects (barrier works)
introduced the batch dequeueing again via list_splice_init() in
wait_for_work().

Change back to list_move() of the first item, if any.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
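
One reading of the constraint in the commit above, offered as an assumption
rather than the driver's documented rationale: an item queued at the front
is meant to run before everything still waiting, which only holds if the
waiting items have not already been pulled off the shared queue in a batch.
The array-based queue below is a hypothetical userspace model (the toy_*
names are invented for this example).

```c
#include <string.h>

#define TOY_QUEUE_CAP 16   /* hypothetical capacity for this sketch */

struct toy_queue {
	int items[TOY_QUEUE_CAP];
	int count;
};

static int toy_queue_tail(struct toy_queue *q, int item)
{
	if (q->count == TOY_QUEUE_CAP)
		return -1;
	q->items[q->count++] = item;
	return 0;
}

static int toy_queue_front(struct toy_queue *q, int item)
{
	if (q->count == TOY_QUEUE_CAP)
		return -1;
	/* Shift existing items up by one slot to make room at the front. */
	memmove(&q->items[1], &q->items[0],
		(size_t)q->count * sizeof(q->items[0]));
	q->items[0] = item;
	q->count++;
	return 0;
}

/* Take only the first item; everything else stays on the shared queue. */
static int toy_dequeue_one(struct toy_queue *q, int *item)
{
	if (q->count == 0)
		return -1;
	*item = q->items[0];
	q->count--;
	memmove(&q->items[0], &q->items[1],
		(size_t)q->count * sizeof(q->items[0]));
	return 0;
}
```

With items 1, 2, 3 queued, taking one item and then queueing 9 at the front
yields 9 next. Had 2 and 3 been spliced to a private list together with 1,
the front-queued 9 would run only after them.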
4eb9b3cba00471a01699cceb0f4b1f0cb8111ee2 20-Aug-2012 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: don't send out P_BARRIER with stale information

We must only send P_BARRIER for epochs we actually sent P_DATA in.

If we (re-)establish a connection, we reinitialized the
send.current_epoch_nr, but forgot to reset send.current_epoch_writes.

This could result in a spurious P_BARRIER with stale epoch information,
and a disconnect/reconnect cycle once the then "unexpected"
P_BARRIER_ACK is received:
BAD! BarrierAck #28823 received, expected #28829!

Introduce re_init_if_first_write() and maybe_send_barrier() helpers,
and call them appropriately for read/write/set-out-of-sync requests.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
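
The epoch bookkeeping described in the commit above can be modelled in
userspace as follows. This is a sketch under assumptions, not the driver's
code: struct toy_send_state, the seen_any_write_yet flag, the barriers_sent
counter (standing in for actually sending P_BARRIER) and all toy_* functions
are hypothetical; only current_epoch_nr, current_epoch_writes and the two
helper names are taken from the commit text.

```c
#include <stdbool.h>

struct toy_send_state {
	bool seen_any_write_yet;           /* assumption: not named in the commit */
	unsigned int current_epoch_nr;
	unsigned int current_epoch_writes;
	unsigned int barriers_sent;        /* stands in for sending P_BARRIER */
};

/* On (re)connect, reset the write count as well as the epoch number. */
static void toy_reset_on_connect(struct toy_send_state *s)
{
	s->seen_any_write_yet = false;
	s->current_epoch_nr = 0;
	s->current_epoch_writes = 0;
}

static void toy_re_init_if_first_write(struct toy_send_state *s,
				       unsigned int epoch)
{
	if (!s->seen_any_write_yet) {
		s->seen_any_write_yet = true;
		s->current_epoch_nr = epoch;
		s->current_epoch_writes = 0;
	}
}

static void toy_maybe_send_barrier(struct toy_send_state *s,
				   unsigned int epoch)
{
	if (!s->seen_any_write_yet)
		return;                    /* nothing sent yet, nothing to close */
	if (s->current_epoch_nr != epoch) {
		if (s->current_epoch_writes)
			s->barriers_sent++; /* only for epochs that carried writes */
		s->current_epoch_nr = epoch;
		s->current_epoch_writes = 0;
	}
}

/* A write first initializes state if needed, then closes a stale epoch. */
static void toy_send_write(struct toy_send_state *s, unsigned int epoch)
{
	toy_re_init_if_first_write(s, epoch);
	toy_maybe_send_barrier(s, epoch);
	s->current_epoch_writes++;
}
```

In this model a reconnect followed by a request from a newer epoch sends no
barrier, because no write has been seen on the new connection yet.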
58ffa580a748dd16b1e5ab260bea39cdbd1e94ef 26-Jul-2012 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: introduce stop-sector to online verify

We now can schedule only a specific range of sectors for online verify,
or interrupt a running verify without interrupting the connection.

Had to bump the protocol version differently, we are now 101.
Added verify_can_do_stop_sector() { protocol >= 97 && protocol != 100; }

Also, the return value convention for worker callbacks has changed,
we returned "true/false" for "keep the connection up" in 8.3,
we return 0 for success and an error code for failure in 8.4.
Affected: receive_state()

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
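
The protocol check quoted in the commit above, written out as a standalone
function. The condition is taken from the commit text; the toy_ prefix and
the plain int parameter are assumptions, since the real helper's signature
is not shown here.

```c
#include <stdbool.h>

/* True if a peer speaking `protocol` understands the verify stop sector. */
static bool toy_verify_can_do_stop_sector(int protocol)
{
	return protocol >= 97 && protocol != 100;
}
```

Version 100 is excluded explicitly, which matches the remark that the
protocol version had to be bumped "differently" to 101.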
a324896b173e569fb831c5caa04ccd02ec0bc9ca 30-Jul-2012 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: do not reset rs_pending_cnt too early

Fix asserts like
block drbd0: in got_BlockAck:4634: rs_pending_cnt = -35 < 0 !

We reset the resync lru cache and related information (rs_pending_cnt),
once we successfully finished a resync or online verify, or if the
replication connection is lost.

We also need to reset it if a resync or online verify is aborted
because a lower level disk failed.

In that case the replication link is still established,
and we may still have packets queued in the network buffers
which want to touch rs_pending_cnt.

We do not have any synchronization mechanism to know for sure when all
such pending resync related packets have been drained.

To keep this counter from going negative (and violating the ASSERT that it
will always be >= 0), just do not reset it when we lose a disk.

It is good enough to make sure it is re-initialized before the next
resync can start: reset it when we re-attach a disk.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
0c849666016cbf541c1030eec55f5f8dd1fba513 30-Jul-2012 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: differentiate between normal and forced detach

Aborting local requests (not waiting for completion from the lower level
disk) is dangerous: if the master bio has been completed to upper
layers, data pages may be re-used for other things already.
If local IO is still pending and later completes,
this may cause crashes or corrupt unrelated data.

Only abort local IO if explicitly requested.
Intended use case is a lower level device that turned into a tarpit,
not completing io requests, not even doing error completion.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
b6dd1a89767bc33e9c98b3195f8925b46c5c95f3 28-Nov-2011 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: remove struct drbd_tl_epoch objects (barrier works)

cherry-picked and adapted from drbd 9 devel branch

DRBD requests (struct drbd_request) are already on the per resource
transfer log list, and carry their epoch number. We do not need to
additionally link them on other ring lists in other structs.

The drbd sender thread can itself recognize when to send a P_BARRIER,
by tracking the currently processed epoch, and how many writes
have been processed for that epoch.

If the epoch of the request to be processed does not match the currently
processed epoch, and any writes have been processed in it, a P_BARRIER for
this last processed epoch is sent out first.
The new epoch then becomes the currently processed epoch.

To not get stuck in drbd_al_begin_io() waiting for P_BARRIER_ACK,
the sender thread also needs to handle the case when the current
epoch was closed already, but no new requests are queued yet,
and send out P_BARRIER as soon as possible.

This is done by comparing the per resource "current transfer log epoch"
(tconn->current_tle_nr) with the per connection "currently processed
epoch number" (tconn->send.current_epoch_nr), while waiting for
new requests to be processed in wait_for_work().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
d5b27b01f17ef1f0badc45f9eea521be3457c9cb 14-Nov-2011 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: move the drbd_work_queue from drbd_socket to drbd_connection

cherry-picked and adapted from drbd 9 devel branch
In 8.4, we don't distinguish between "resource work" and "connection
work" yet, we have one worker for both, as we still have only one connection.

We only ever used the "data.work",
no need to keep the "meta.work" around.

Move tconn->data.work to tconn->sender_work.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
8c0785a5c9a0f2472aff68dc32247be01728c416 19-Oct-2011 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: allow to dequeue batches of work at a time

cherry-picked and adapted from drbd 9 devel branch

In 8.4, we still use drbd_queue_work_front(),
so in normal operation, we can not dequeue batches,
but only single items.

Still, followup commits will wake the worker
without explicitly queueing a work item,
so up() is replaced by a simple wake_up().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
4439c400ab278378a82efb543bb3bb91b184d8db 26-Mar-2012 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: simplify retry path of failed READ requests

If a local or remote READ request fails, just push it back to the retry
workqueue. It will re-enter __drbd_make_request, and be re-assigned to
a suitable local or remote path, or failed, if we do not have access to
good data anymore.

This obsoletes w_read_retry_remote(),
and eliminates two goto...retry blocks in __req_mod()

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2415308eb94e7bddf9c9a0f210374600210274d7 26-Mar-2012 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: move put_ldev from __req_mod() to the endio callback

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
4b8514ee288dede5013d23c3d6a285052d8392ab 26-Mar-2012 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: fix potential data corruption and protocol error

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
36baf6117b1deee37b9467224a0a14f1bb0863e2 10-Nov-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Fixed an obvious copy-n-paste mistake

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
0cfac5dd904ec8b376beb27f6ad265b12d71bf9e 10-Nov-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Fixes from the drbd-8.3 branch

* drbd-8.3:
drbd: fix spurious meta data IO "error"
drbd: Fixed a race condition between detach and start of resync
drbd: fix harmless race to not trigger an ASSERT
drbd: Derive sync-UUIDs only from the bitmap-uuid if it is non-zero
drbd: Fixed current UUID generation (regression introduced recently, after 8.3.11)

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
a209b4aec31d4b672b7a70f5de272ebf6ce40e1b 17-Aug-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Update some outdated comments to match the code

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
1b7ab15b11716d075b3dca34cf41e8d7aba3cba2 15-Jul-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Fixed w_restart_disk_io() to handle non active AL-extents

Since we now apply the AL in user space onto the bitmap, the AL
is not active for the requests we want to replay.

For that, an al_write_transaction() that might be called from
worker context became necessary.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
9b743da96c8640dbfc864cb5d79c51547c3fadb4 15-Jul-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Missing assignment of mdev before drbd_queue_work()

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
cdfda633d235028e9b27381dedb65416409e8729 05-Jul-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: detach from frozen backing device

* drbd-8.3:
documentation: Documented detach's --force and disk's --disk-timeout
drbd: Implemented the disk-timeout option
drbd: Force flag for the detach operation
drbd: Allow new IOs while the local disk is in FAILED state
drbd: Bitmap IO functions can not return prematurely if the disk breaks
drbd: Added a kref to bm_aio_ctx
drbd: Hold a reference to ldev while doing meta-data IO
drbd: Keep a reference to the bio until the completion handler finished
drbd: Implemented wait_until_done_or_disk_failure()
drbd: Replaced md_io_mutex by an atomic: md_io_in_use
drbd: moved md_io into mdev
drbd: Immediately allow completion of IOs, that wait for IO completions on a failed disk
drbd: Keep a reference to barrier acked requests

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
95f8efd08bcce65df994049a292b94e56c7ada67 12-May-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Fix the upper limit of resync-after

The 32-bit resync_after netlink field takes a device minor number as
parameter, which is no longer limited to 255. We cannot statically
verify which device numbers are valid, so set the upper limit to the
highest possible signed 32-bit integer.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
c141ebda031a0550d75634f7c94f7c85c2d5c9f5 05-May-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Removing drbd_cfg_rwsem

* Updates to all configuration items are done under genl_lock().
Including removal of mdevs or tconns.
* All non-sleeping read sides are protected by RCU
* All sleeping read sides keep reference counts to keep the
objects alive

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
bb77d34ecc6fe6cdc3f4f0841a516695c2eacc04 04-May-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Turn no-tcp-cork into tcp-cork={yes|no}

Change the --no-tcp-cork drbdsetup command line option as well as
the no_cork netlink packet.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
813472ced7fac734157fe5be1137ce2bac942902 03-May-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: RCU for rs_plan_s

This removes the issue with using peer_seq_lock out of different
contexts.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
9958c857c760eec76f4fdf288b6f33a1c3b41833 03-May-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Made the fifo object a self contained object (preparing for RCU)

* Moved rs_planed into it, named total
* When having a pointer to the object the values can
be embedded into the fifo object.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
daeda1cca91d58bb6c8e45f6734f021bab9c28b7 03-May-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: RCU for disk_conf

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
dc97b70801667ea8b1432b37f5c122405c8d6f96 03-May-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Split drbd_alter_sa() into drbd_sync_after_valid() and drbd_sync_after_changed()

Preparing RCU for disk_conf

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
992d6e91d3654c11c2e4d8d5933ffbf82a0440f0 02-May-2011 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: fix thread stop deadlock

There are races where the receiver may be exiting,
but still need the worker to process some stuff.

Do not wait for the receiver to die from an exiting worker.
The receiver must already be dead in case the worker decides to exit.
If the receiver was still alive, it may still want to queue work, and do
drbd_flush_workqueue() from its disconnect cleanup code,
which would no longer be processed by an exiting worker.

This also would deadlock,
if the worker were to synchronously wait for the receiver to die.

Do not implicitly stop the worker.
The worker will only be stopped from configuration context, from
conn_reconfig_done(), drbd_adm_down() or drbd_adm_delete_connection(),
after making sure the receiver is already stopped.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
a18e9d1eb0660621eb9911e59a9b4d664cbad4d9 24-Apr-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Removed the OBJECT_DYING and the CONFIG_PENDING bits

superseded by refcounting

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
f5e2b8b3b6bed8c60103b4ed5341af072129d7c0 24-Apr-2011 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: move comment about stopping the receiver thread to where it belongs

When the last volume of a replication group is unconfigured,
the worker thread exits. To not interfere with cleanup
of other threads, before the last cleanups run,
we need to make sure the receiver has already exited.

The comment explaining that clearly belongs above
drbd_thread_stop(&tconn->receiver), not in the cleanup loop below.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
44ed167da74825bfb7950d45a4f83bce3e84921c 19-Apr-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: rcu_read_lock() and rcu_dereference() for tconn->net_conf

Removing the get_net_conf()/put_net_conf() calls

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
d3fcb4908d8cce7f29cff16bbef3b08933148003 13-Apr-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: protect all idr accesses that might sleep with drbd_cfg_rwsem

With this commit the locking for all accesses to IDRs is complete:

* Non sleeping read accesses are protected by RCU
* sleeping read accesses are protected by a read lock on drbd_cfg_rwsem
* accesses that add anything are protected by a write lock
* accesses that remove an object are protected by a write lock
and a call to synchronize_rcu() after it is removed from the IDR
and before the object is actually free()ed.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
695d08fa94ce5bb8d9880e260445fbcf50fa41b4 12-Apr-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: rcu_read_[un]lock() for all idr accesses that do not sleep

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
c37c8ecfee685fa42de8fd418ad8ca1e66408bd8 07-Apr-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Rename drbd_pp_alloc() to drbd_alloc_pages() and make it non-static

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
045417f75c718a4ac97fd44106b8aafcbca5a6da 07-Apr-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Rename drbd_{ ee -> peer_req }_has_active_page

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
3967deb192e147328e1a6085a443ea6afef54dbb 06-Apr-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Rename drbd_free_ee() and variants to *_peer_req()

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
0db55363cb1e6cfe2bedecb7e47c05f8992c612e 06-Apr-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Rename drbd_alloc_ee() to drbd_alloc_peer_req()

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
e658983af6e62304be785cd6b0ae756723057395 30-Mar-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Remove headers from on-the-wire data structures (struct p_*)

Prepare the introduction of the protocol 100 headers. The actual protocol
header is removed from the packet declarations, which allows us to use the
packets with different headers.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
9f5bdc339e3becd85aa8add305d794b0b1ec8996 28-Mar-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Replace and remove old primitives

Centralize sock->mutex locking and unlocking in [drbd|conn]_prepare_command()
and [drbd|conn]_send_command().

Therefore all *_send_* functions are touched to use these primitives instead
of drbd_get_data_sock()/drbd_put_data_sock() and former helper functions.

That change makes the *_send_* functions more standardized.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
78bae59b1b7bc06c84e292e9ecf42c013723e057 28-Mar-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Introduced drbd_read_state()

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
181286ad22bf9bfb85de625e8501285de5261b35 31-Mar-2011 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: preparation commit, pass drbd_interval to drbd_al_begin/complete_io

We want to avoid bio_split for bios crossing activity log boundaries.
So we may need to activate two activity log extents "atomically".
drbd_al_begin_io() needs to know more than just the start sector.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
7c96715aa8ef1b5375c0d2a2d3bb1da99d95a39e 22-Mar-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: _conn_send_cmd(), _drbd_send_cmd(): Pass a struct drbd_socket instead of a plain socket

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
438c8374ae3e87f44d945a2ac2901e3b14aec1a8 28-Mar-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Do not segfault if a sync dependency reaches a diskless device

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
5a87d920f38fcafb790ddd03f0d8d1db56b268a8 24-Mar-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Preallocate one page per drbd_socket as a send buffer

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
8f7bed77740c7418074e6ba82c646a7dd035e6cf 19-Dec-2010 Andreas Gruenbacher <agruen@linbit.com> drbd: Rename various functions from *_oos_* to *_out_of_sync_* for clarity

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
99920dc5c5fe52182fe922aa70330861e2b6418b 16-Mar-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Make all worker callbacks return 0 upon success and an error code otherwise

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
b2f0ab62ecfe8711fefb82223b40430f8141a949 16-Mar-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Temporarily change the return type of all worker callbacks

This helps to ensure that we don't miss one of them when changing their
return value semantics.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
a896527c0658f9073413d46c2401448cdc0427ff 16-Mar-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: drbd_send_short_cmd(): Return 0 upon success and an error code otherwise

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
6bdb9b0e230aae94b084d8a375363ada056653b5 16-Mar-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: drbd_send_dblock(): Return 0 upon success and an error code otherwise

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
7b57b89d624cfdefc91d0a8b015c494c25a49292 16-Mar-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: drbd_send_block(): Return 0 upon success and an error code otherwise

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
73218a3c4c7ae87014b8fc258f8a16a75aad2870 16-Mar-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: drbd_send_oos(): Return 0 upon success and an error code otherwise

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
db1b0b724e56f34608b76197191ef0577a1ddd45 16-Mar-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: drbd_send_drequest_csum(): Return 0 upon success and an error code otherwise

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
6c1005e74d4142511a165edae72cb6648aa308c5 16-Mar-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: drbd_send_drequest(): Return 0 upon success and an error code otherwise

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
5b9f499c664efc1a72a0fe2538b39db7e75ecd2b 16-Mar-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: drbd_send_ov_request(): Return 0 upon success and an error code otherwise

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
fa79abd893f21f458c74af8bca015aa2ef7486a5 16-Mar-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: drbd_send_ack_ex(): Return 0 upon success and an error code otherwise

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
dd5161218bc514a29e1d8670fe1f3753d5e0f813 16-Mar-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: drbd_send_ack(): Return 0 upon success and an error code otherwise

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
04dfa137881efc890544c5cd3af94e54cfe0c480 15-Mar-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: _drbd_send_cmd(): Return 0 upon success and an error code otherwise

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
11b0be28e57fabeb75edfe81a17eddfc484cd9df 15-Mar-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: drbd_get_data_sock(): Return 0 upon success and an error code otherwise

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
f399002e68e626e7bc443e6fcab1772704cc197f 23-Mar-2011 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: distribute former syncer_conf settings to disk, connection, and resource level

This commit breaks the API again.

Move per-volume former syncer options into disk_conf.
Move per-connection former syncer options into net_conf.
Renamed the remaining sync_conf to res_opts

Syncer settings have been changeable at runtime, so we need to prepare
for these settings to be runtime-changeable in their new home as well.

Introduce new configuration operations, and share the netlink attribute
between "attach" (create new disk) and "disk-opts" (change options).
Same for "connect" and "net-opts".

Some fields cannot be changed at runtime, however.
Introduce a new flag GENLA_F_INVARIANT to be able to trigger on that in
the generated validation and assignment functions.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
38fa9988fa838324a0cce6e2f9d3c674230659d5 15-Mar-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Do not modify the connection state with something else than conn_request_state()

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
a2a3c74f243d5d1793f89ccdceaa6918851f7fce 22-Sep-2012 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: always write bitmap on detach

If we detach due to local read-error (which sets a bit in the bitmap),
stay Primary, and then re-attach (which re-reads the bitmap from disk),
we potentially lost the "out-of-sync" (or, "bad block") information in
the bitmap.

Always (try to) write out the changed bitmap pages before going diskless.

That way, we don't lose the bit for the bad block,
the next resync will fetch it from the peer, and rewrite
it locally, which may result in block reallocation in some
lower layer (or the hardware), and thereby "heal" the bad blocks.

If the bitmap writeout errors out as well, we will (again: try to)
mark the "we need a full sync" bit in our super block,
if it was a READ error; writes are covered by the activity log already.

If that superblock does not make it to disk either, we are sorry.

Maybe we just lost an entire disk or controller (or iSCSI connection),
and there actually are no bad blocks at all, so we don't need to
re-fetch from the peer, there is no "auto-healing" necessary.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
06f10adbdb027b225fd51584a218fa8344169514 22-Sep-2012 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: prepare for more than 32 bit flags

- struct drbd_conf { ... unsigned long flags; ... }
+ struct drbd_conf { ... unsigned long drbd_flags[N]; ... }

And introduce wrapper functions for test/set/clear bit operations
on this member.
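
A minimal userspace sketch of what such wrappers might look like. The
demo_* names and the flag count are assumptions for illustration, and the
kernel versions would use the atomic set_bit()/clear_bit()/test_bit()
helpers rather than plain bit arithmetic.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define DEMO_WORD_BITS	(sizeof(unsigned long) * CHAR_BIT)
#define DEMO_NR_FLAGS	40	/* assumed: more flags than fit in 32 bits */
#define DEMO_FLAG_WORDS	((DEMO_NR_FLAGS + DEMO_WORD_BITS - 1) / DEMO_WORD_BITS)

struct demo_conf {
	unsigned long drbd_flags[DEMO_FLAG_WORDS];
};

/* Non-atomic stand-ins; callers never index the array directly. */
static void demo_set_flag(struct demo_conf *c, unsigned int f)
{
	assert(f < DEMO_NR_FLAGS);
	c->drbd_flags[f / DEMO_WORD_BITS] |= 1UL << (f % DEMO_WORD_BITS);
}

static void demo_clear_flag(struct demo_conf *c, unsigned int f)
{
	assert(f < DEMO_NR_FLAGS);
	c->drbd_flags[f / DEMO_WORD_BITS] &= ~(1UL << (f % DEMO_WORD_BITS));
}

static bool demo_test_flag(const struct demo_conf *c, unsigned int f)
{
	assert(f < DEMO_NR_FLAGS);
	return (c->drbd_flags[f / DEMO_WORD_BITS] >> (f % DEMO_WORD_BITS)) & 1UL;
}
```

With the wrappers in place, growing N later only touches the struct and
the helpers, not every call site.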

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
7fb907c15fb8d0e10e72c8566a13f6defab3f484 03-Sep-2012 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: panic on delayed completion of aborted requests

"aborting" requests, or force-detaching the disk, is intended for
completely blocked/hung local backing devices which no longer
complete requests at all, not even with error completions. In this
situation, usually a hard-reset and failover is the only way out.

By "aborting", basically faking a local error-completion,
we allow for a more graceful switchover by cleanly migrating services.
Still the affected node has to be rebooted "soon".

By completing these requests, we allow the upper layers to re-use
the associated data pages.

If later the local backing device "recovers", and now DMAs some data
from disk into the original request pages, in the best case it will
just put random data into unused pages; but typically it will corrupt
meanwhile completely unrelated data, causing all sorts of damage.

Which means delayed successful completion,
especially for READ requests,
is a reason to panic().

We assume that a delayed *error* completion is OK,
though we still will complain noisily about it.
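
The policy described above can be sketched as a small decision function.
The enum and function names are hypothetical, and the sketch assumes the
kernel convention that error is 0 on success and negative on failure.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_action {
	DEMO_COMPLETE,	/* normal completion path */
	DEMO_COMPLAIN,	/* late error completion: log noisily, carry on */
	DEMO_PANIC,	/* late success: the pages may already be re-used */
};

/* Decide what to do when the backing device finally completes a request. */
static enum demo_action demo_on_local_completion(bool was_aborted, int error)
{
	assert(error <= 0);	/* assumed: 0 or a negative errno */

	if (!was_aborted)
		return DEMO_COMPLETE;
	if (error)
		return DEMO_COMPLAIN;
	return DEMO_PANIC;
}
```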

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
02b91b55260f7a1bdc8da25866cf27f726f5788f 28-Jun-2012 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: introduce stop-sector to online verify

We now can schedule only a specific range of sectors for online verify,
or interrupt a running verify without interrupting the connection.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
0029d62434d9045bc3e8b2eb48ae696e30336e92 14-Jun-2012 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: do not reset rs_pending_cnt too early

Fix asserts like
block drbd0: in got_BlockAck:4634: rs_pending_cnt = -35 < 0 !

We reset the resync lru cache and related information (rs_pending_cnt),
once we successfully finished a resync or online verify, or if the
replication connection is lost.

We also need to reset it if a resync or online verify is aborted
because a lower level disk failed.

In that case the replication link is still established,
and we may still have packets queued in the network buffers
which want to touch rs_pending_cnt.

We do not have any synchronization mechanism to know for sure when all
such pending resync related packets have been drained.

To keep this counter from going negative (and violating the ASSERT that it
will always be >= 0), just do not reset it when we lose a disk.

It is good enough to make sure it is re-initialized before the next
resync can start: reset it when we re-attach a disk.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
383606e0dea6a380097dbcb0c319b09ca372f36b 14-Jun-2012 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: differentiate between normal and forced detach

Aborting local requests (not waiting for completion from the lower level
disk) is dangerous: if the master bio has been completed to upper
layers, data pages may be re-used for other things already.
If local IO is still pending and later completes,
this may cause crashes or corrupt unrelated data.

Only abort local IO if explicitly requested.
Intended use case is a lower level device that turned into a tarpit,
not completing io requests, not even doing error completion.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
46385c84acd6654d3a38c9c7af1921dbded74aa2 16-Jan-2012 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: move put_ldev from __req_mod() to the endio callback

One invocation in the endio handler is good enough,
we don't need to mention it for each of the different ways
it calls __req_mod().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
001a88687aff26d62f8b61d55c6973618cf0f72f 08-Mar-2012 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: fix potential data corruption and protocol error

We assumed only bios with bi_idx == 0 would end up
in drbd_make_request().

That is wrong.

At least device mapper, in __clone_and_map(), may submit
clones only covering a partial bio, but sharing
the original bvec, by adjusting bi_idx and relevant
other bio members of the clone.

We used __bio_for_each_segment() in various places,
even though that is documented as
* drivers should not use the __ version unless they _really_ want to
* run through the entire bio and not just pending pieces

Impact: we would send the full bio bvec, even for the clone
with bi_idx > 0, which will cause data corruption on the
peer (because we submit wrong data at the clone offset),
and will cause a DRBD protocol error, disconnect/reconnect
and resync (thus fixing the corruption),
because the next packet header would be expected right
in the middle of the sent data, causing DRBD magic mismatch.

Fix: drop the assert, and use bio_for_each_segment()
instead of the __ version.
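
A toy model of the difference, assuming simplified stand-ins for struct bio
and its bvec array (the demo_* types are not the kernel structures, and a
real clone adjusts more members than just the index):

```c
#include <assert.h>
#include <stddef.h>

struct demo_bvec {
	size_t len;			/* bytes in this segment */
};

struct demo_bio {
	const struct demo_bvec *vec;	/* may be shared with the original bio */
	unsigned int vcnt;		/* number of segments in vec */
	unsigned int idx;		/* first segment this bio covers */
};

/* Like the __ variant started at 0: walks the entire shared vector. */
static size_t demo_bytes_all(const struct demo_bio *bio)
{
	size_t sum = 0;

	for (unsigned int i = 0; i < bio->vcnt; i++)
		sum += bio->vec[i].len;
	return sum;
}

/* Like bio_for_each_segment(): walks only the pending part. */
static size_t demo_bytes_pending(const struct demo_bio *bio)
{
	size_t sum = 0;

	assert(bio->idx <= bio->vcnt);
	for (unsigned int i = bio->idx; i < bio->vcnt; i++)
		sum += bio->vec[i].len;
	return sum;
}
```

For a clone whose index is greater than zero, the first helper reports
more data than the clone actually covers, which mirrors the oversized
send described above.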

Conflicts:

drbd/drbd_tracing.c

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
e89868a0927cfb8a3f535c938e5d6dd7edc6353c 09-Nov-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Fixed an obvious copy-n-paste mistake

This bug might have caused trouble if disk-barriers and the ahead-behind
mode are enabled at the same time.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
aaae506d545bb9d06f4d8362f670f406f12e4b58 06-Oct-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Fixed a race condition between detach and start of resync

drbd_state_lock() is only there to serialize cluster wide state
changes. Testing the local disk state needs to happen while
holding the global_state_lock.

Otherwise you might see something like this (Oct 6 on kugel)
14:20:24 drbd0: conn( WFSyncUUID -> Connected ) disk( Inconsistent -> Failed )
14:20:24 drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)
14:20:24 drbd0: conn( Connected -> SyncTarget ) disk( Failed -> Inconsistent )

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
6a9a92f4ef05bb3e94bbfe123c21482fa5da9866 06-Oct-2011 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: fix harmless race to not trigger an ASSERT

We have one pre-allocated page to do certain synchronous meta data IO with,
using it is serialized like so:
drbd_md_get_buffer();
drbd_md_sync_page_io();
drbd_md_sync_page_io();
...
drbd_md_put_buffer();

In drbd_md_sync_page_io() there is an
ASSERT(atomic_read(&mdev->md_io_in_use) == 1);

We want to be able to timeout on unresponsive lower level devices, so we
can "detach" in that case. Inside drbd_md_sync_page_io() we grab an extra
reference, to not have a dangling pointer in case a delayed IO eventually
does still complete, even after we "detached" already.

We need to put the extra reference before we signal completion from the
completion handler, or the second drbd_md_sync_page_io() above may
trigger the assert (reference count still 2).
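
The ordering requirement can be sketched in userspace with C11 atomics.
The demo_* names are hypothetical, the "IO" completes synchronously, and
the real get_buffer would wait for the buffer to become free:

```c
#include <assert.h>
#include <stdatomic.h>

struct demo_md {
	atomic_int md_io_in_use;	/* 0: free, 1: buffer held, 2: IO in flight */
	atomic_int io_done;		/* completion signal for the submitter */
};

static void demo_md_init(struct demo_md *md)
{
	atomic_init(&md->md_io_in_use, 0);
	atomic_init(&md->io_done, 0);
}

static void demo_md_get_buffer(struct demo_md *md)
{
	atomic_store(&md->md_io_in_use, 1);
}

static void demo_md_put_buffer(struct demo_md *md)
{
	atomic_store(&md->md_io_in_use, 0);
}

static void demo_md_submit(struct demo_md *md)
{
	/* Mirrors the ASSERT: exactly one holder when IO is submitted. */
	assert(atomic_load(&md->md_io_in_use) == 1);
	atomic_fetch_add(&md->md_io_in_use, 1);	/* extra ref for the IO */
	atomic_store(&md->io_done, 0);
}

static void demo_md_endio(struct demo_md *md)
{
	atomic_fetch_sub(&md->md_io_in_use, 1);	/* drop the extra ref first */
	atomic_store(&md->io_done, 1);		/* only then wake the waiter */
}
```

If the two statements in demo_md_endio() were swapped, a waiter woken by
io_done could call demo_md_submit() again while the count is still 2.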

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
b2057629ea96c33e4ae38102ecd0f27ed9a3c3ef 27-Jun-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Hold a reference to ldev while doing meta-data IO

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
4a2fe568b5428abc56d7d172e3571e33d8ab7265 04-Jul-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Keep a reference to the bio until the completion handler finished

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
0c464425158482647226fb30708c68fffc061585 26-Jun-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Implemented wait_until_done_or_disk_failure()

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
e17117310b73ce6d2340ad46a539d3896a2d6de8 27-Jun-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Replaced md_io_mutex by an atomic: md_io_in_use

The new function drbd_md_get_buffer() aborts waiting for the buffer
in case the disk fails in the meantime.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
cc94c65015022e7329e80e057e20848581d3f2a5 26-Jun-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: moved md_io into mdev

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
81a5d60ecfe1d94627abb54810445f0fd5892f42 23-Feb-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Replaced the minor_table array by an idr

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
0e29d163f7ec8369b3f1fb70900d29b1c4a1dc8b 18-Feb-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Reworked the unconfiguring and thread stopping code

* Moved CONFIG_PENDING and DEVICE_DYING from mdev to tconn.
* Renamed drbd_reconfig_start() and drbd_reconfig_done() to
conn_reconfig_start() and conn_reconfig_done().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
9676c760979371701ea5a6f8adb7ce8125c22c7d 22-Feb-2011 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: fix a wrong likely(), updated comments

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
fcefa62e4c26e70c70b9e8252a4bc9b9031a4182 17-Feb-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Rename drbd_endio_{pri,sec} -> drbd_{,peer_}request_endio

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
fbe29dec98622369c106ba72279500fb2f5aba99 17-Feb-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Rename drbd_submit_ee -> drbd_submit_peer_request

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
8410da8f0e3ff5c97bce1b10627316be509ce476 11-Feb-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Introduced tconn->cstate_mutex

In compatibility mode with old DRBDs, use that as the state_mutex
as well.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
dad20554812e73a2bfbe45d1b161d5d3c249e597 11-Feb-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Removed drbd_state_lock() and drbd_state_unlock()

The lock they constructed is only taken when the state_mutex
was already taken. It is superfluous.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
bbeb641c3e4982d6bba21188545a7fd44ab0a715 10-Feb-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Killed volume0; last step of multi-volume-enablement

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2a67d8b93b3363d4a5608d16d510a4bf6b3863fb 09-Feb-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Converted drbd_send_ping() and related functions from mdev to tconn

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
00d56944ff086f895e9ad184a7785ca1eece4a3b 09-Feb-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Generalized the work callbacks

Work callbacks no longer have to operate on an mdev. From now on they
can also operate on a tconn.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
392c8801922f51466045ece2f1f2884b8c9cd9a2 09-Feb-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: drbd_thread has now a pointer to a tconn instead of to a mdev

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
19393e105f9702a014d3ce08bce92b3ad9cf96b5 09-Feb-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Converted drbd_worker() from mdev to tconn

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
f1b3a6ec7d2b3033b18c6ad125f5694c85599c4a 08-Feb-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Consolidated the setup of the thread name into the framework

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
a21e9298275a0145e43c2413725549112d99ba01 08-Feb-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Moved the mdev member into drbd_work (from drbd_request and drbd_peer_request)

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
808222845d62e551630699a1381bbf8a1fd4a286 08-Feb-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Converted drbd_calc_cpu_mask() and drbd_thread_current_set_cpu() from mdev to tconn

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
611208706f28c502c8c01791ac4f0b14cde395b2 08-Feb-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Converted drbd_(get|put)_data_sock() and drbd_send_cmd2() to tconn

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
0625ac190d222fd0855bad79e93f1556fc45dd20 07-Feb-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Converted wake_asender() and request_ping() from mdev to tconn

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
e64a32945902a178c9de9b38e0ea3290981605bc 05-Feb-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Do no sleep long in drbd_start_resync

Work items that sleep too long can cause requests to take as
long as the longest sleeping work item.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
1f04af33fe7db542d75a487b8381b5a3402b7896 07-Feb-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Moved code

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
bc31fe3352f9cd76195ce6eb638dfc2dac17dc2e 07-Feb-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Eliminated the user of drbd_task_to_thread()

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
db830c464b69e26ea4d371e38bb2320c99c82f41 04-Feb-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Local variable renames: e -> peer_req

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
f6ffca9f42902556bcf72426d2d0714bdbfdbe09 04-Feb-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Rename struct drbd_epoch_entry to struct drbd_peer_request

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
5e4722645afb27ee749ea65988544450f08f78ba 27-Jan-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: _req_conflicts(): Get rid of the epoch_entries tree

Instead of keeping a separate tree for local and remote write requests
for finding requests and for conflict detection, use the same tree for
both purposes. Introduce a flag to allow distinguishing the two
possible types of entries in this tree.
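
A simplified sketch of the idea, using a flat array where the driver uses
a tree. The demo_* names and the local flag are illustrative assumptions;
sizes are in bytes and sectors are 512 bytes, as is usual for the block
layer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_interval {
	uint64_t sector;	/* start, in 512-byte sectors */
	unsigned int size;	/* length in bytes */
	bool local;		/* true: local request, false: peer request */
};

static bool demo_overlaps(const struct demo_interval *a,
			  const struct demo_interval *b)
{
	uint64_t a_end = a->sector + (a->size >> 9);
	uint64_t b_end = b->sector + (b->size >> 9);

	return a->sector < b_end && b->sector < a_end;
}

/* One container serves both lookup and conflict detection. */
static const struct demo_interval *
demo_find_conflict(const struct demo_interval *set, size_t n,
		   const struct demo_interval *req)
{
	assert(req->size > 0);
	for (size_t i = 0; i < n; i++)
		if (demo_overlaps(&set[i], req))
			return &set[i];
	return NULL;
}
```

The caller inspects the local flag of the returned entry to tell which
kind of request it collided with.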

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
c012949a4084a9f91654121d28f199ef408cb9d7 19-Jan-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Replaced all p_header80 with a generic p_header

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
87eeee41f8740451b61a1e7d37a494333a906861 19-Jan-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: moved req_lock and transfer log from mdev to tconn

sed -i \
-e 's/mdev->req_lock/mdev->tconn->req_lock/g' \
-e 's/mdev->unused_spare_tle/mdev->tconn->unused_spare_tle/g' \
-e 's/mdev->newest_tle/mdev->tconn->newest_tle/g' \
-e 's/mdev->oldest_tle/mdev->tconn->oldest_tle/g' \
-e 's/mdev->out_of_sequence_requests/mdev->tconn->out_of_sequence_requests/g' \
*.[ch]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
31890f4ab299c4116cf0a104ca9ce4f9ca2c5da0 19-Jan-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: moved agreed_pro_version, last_received and ko_count to tconn

sed -i \
-e 's/mdev->agreed_pro_version/mdev->tconn->agreed_pro_version/g' \
-e 's/mdev->last_received/mdev->tconn->last_received/g' \
-e 's/mdev->ko_count/mdev->tconn->ko_count/g' \
*.[ch]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
e6b3ea83bc72e126247b241c1164794a644d6fdc 19-Jan-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: moved receiver, worker and asender from mdev to tconn

Patch mostly:
sed -i -e 's/mdev->receiver/mdev->tconn->receiver/g' \
-e 's/mdev->worker/mdev->tconn->worker/g' \
-e 's/mdev->asender/mdev->tconn->asender/g' \
*.[ch]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
e42325a57606396539807ff55c24febda39f8d01 19-Jan-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: moved data and meta from mdev to tconn

Patch mostly:

sed -i -e 's/mdev->data/mdev->tconn->data/g' \
-e 's/mdev->meta/mdev->tconn->meta/g' \
*.[ch]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
89e58e755e37137135c28a90c93be1b28faff485 19-Jan-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: moved net_conf from mdev to tconn

Besides moving the struct member, everything else is generated by:

sed -i -e 's/mdev->net_conf/mdev->tconn->net_conf/g' \
-e 's/odev->net_conf/odev->tconn->net_conf/g' \
*.[ch]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
841ce241fa355048f66172a47e356bb6e9159c9d 15-Dec-2010 Andreas Gruenbacher <agruen@linbit.com> drbd: Replace the ERR_IF macro with an assert-like macro

Remove the file name and line number from the syslog messages generated:
we have no duplicate function names, and no function contains the same
assertion more than once.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
e77a0a5cc1e6961f485b5623ef42f3b910969675 25-Jan-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Convert all constants in enum drbd_thread_state to upper case

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
8554df1c6d3bb7686b39ed775772f507fa857c19 25-Jan-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Convert all constants in enum drbd_req_event to upper case

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
bb3bfe96144a4535d47ccfea444bc1ef8e02f4e3 21-Jan-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Remove the unused hash tables

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
010f6e678ffddbf3134863038c5b2f6509f1eed3 14-Jan-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Put sector and size in struct drbd_epoch_entry into struct drbd_interval

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
ace652acf2d7e564dac48c615d9184e7ed575f9c 03-Jan-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Put sector and size in struct drbd_request into struct drbd_interval

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
579b57ed730819970a3542b4bbcc2d4176f25c72 13-Jan-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Magic reserved block_id value cleanup

The ID_VACANT definition has become entirely irrelevant by now.

The is_syncer_block_id() macro does not improve the code, so eliminate
it.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
9a8e77530fa7059044114bcf1a897a470ec21bc9 11-Jan-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Consistently use block_id == ID_SYNCER for checksum based resync and online verify

DRBD_MAGIC has nothing to do with block ids and the funny values
computed were not actually used, anyway.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
0cfdd247d1779d5ffc8f685b172a526ecdc6773f 25-May-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Use the correct max_bio_size when creating resync requests

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
24c4830c8ec3cbc904d84c213126a35f41a4e455 21-May-2011 Bart Van Assche <bvanassche@acm.org> drbd: Fix spelling

Found these with the help of ispell -l.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
53ea433145d9a56c7ad5e69f21f5662053e00e84 08-Mar-2011 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: fix potential distributed deadlock

We limit ourselves to a configurable maximum number of pages used as
temporary bio pages.

If the configured "max_buffers" is not big enough to match the bandwidth
of the respective deployment, a distributed deadlock could be triggered
by e.g. fast online verify and heavy application IO.

TCP connections would block on congestion, because both receivers
would wait on pages to become available.

Fortunately the respective senders in this case would be able to give
back some pages already. So do that.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
7961243b7bdd62d72b47eb2c0bee776c51a8a8e2 02-Mar-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Fixed handling of read errors on a 'VerifyS' node

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
8f21420ebd5ca5a751e2f606b49b0acd2a2af314 01-Mar-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Fixed handling of read errors on a 'VerifyT' node

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
c5a91619793d444e5103ec5841045bf878718398 25-Jan-2011 Andreas Gruenbacher <agruen@linbit.com> drbd: Remove unused function atodb_endio()

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
10f6d9926cd17afff9dc03c967706419798b4929 24-Jan-2011 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: don't BUG_ON, if bio_add_page of a single page to an empty bio fails

Just deal with it more gracefully, if we fail to add even a single page
to an empty bio. We used to BUG_ON() there, but it has been observed in
some Xen deployment, so we need to handle that case more robustly now.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
62b0da3a244ac33d25a77861ef1cc0080103f2ff 20-Jan-2011 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: log UUIDs whenever they change

All decisions about sync, sync direction, and whether or not to
allow a connect or attach are based on our set of UUIDs to tag a
data generation.

Log changes to the UUIDs whenever they occur,
logging "new current UUID P:Q:R:S" is more useful
than "Creating new current UUID".

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
6c922ed543bee1bc6685ade07be59f3fa49a7288 12-Jan-2011 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: only generate and send a new sync uuid after a successful state change

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
20ee639024e3d33111df0e343050b218c656bf16 18-Jan-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: cleaned up __set_current_state() followed by schedule_timeout() calls

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
370a43e7982dd497822097e0ae6022947ac2e7d4 14-Jan-2011 Philipp Reisner <philipp.reisner@linbit.com> drbd: Work on the Ahead -> SyncSource transition

The test for rs_pending_cnt == 0 was too weak. Test for
unacked_cnt == 0 instead, and move that test into the worker.

This works because unacked_cnt is already increased when a P_RS_DATA_REQ
comes in.

Also using a timer to make Ahead -> SyncSource -> Ahead cycles
slower...

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
794abb753e29e85949b3719dbc2ab6a98711a47e 27-Dec-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: Cleaned up the resync timer logic

Besides removing a few lines of code, this moves the inspection
of the state from before the queuing process to after the queuing,
i.e. closer to the actual invocation of the work.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
d612d309e4c8401ad94c531678b59c4a8b7c41ce 27-Dec-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: No longer answer P_RS_DATA_REQUEST packets when in C_AHEAD mode

If the sync source node replies to a P_RS_DATA_REQUEST packet
when it is already in ahead mode, i.e. the two packets
crossed each other on the wire, that may lead to diverging
bitmaps.

This never happens in a well-tuned system. In a well-tuned
system the resync controller has reduced the resync speed
to zero long before we got into ahead mode.

But of course we have to be prepared for the not-well-tuned
system as well, because diverging bitmaps mean a
non-terminating resync.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
071942727824bab03b1a3f6b6eeb5b269697b333 20-Dec-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: ratelimit io error messages

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
5a22db8968a69bec835d1ed9a96ab3381719e0c0 17-Dec-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: serialize sending of resync uuid with pending w_send_oos

To improve the latency of IO requests during bitmap exchange,
we recently allowed writes while waiting for the bitmap, sending "set
out-of-sync" information packets for any newly dirtied bits.

We have to make sure that the new resync-uuid does not overtake
these "set oos" packets. Once the resync-uuid is received, the
sync target starts the resync process, and expects the bitmap to
only be cleared, not re-set.

If we use this protocol extension, we queue the generation and sending
of the resync-uuid on the worker, which naturally serializes with all
previously queued packets.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
02851e9f00d78dbc8ded0aacbf9bf3b631d627b3 16-Dec-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: move bitmap write from resync_finished to after_state_change

We must not call it directly from resync_finished,
as we may be in either receiver or worker context there.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
4b0715f09655e76ca24c35a9e25e7c464c2f7346 14-Dec-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: allow petabyte storage on 64bit arch

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
19f843aa08e2d8f87a09b4c2edc43b00638423a8 15-Dec-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: bitmap keep track of changes vs on-disk bitmap

When we set or clear bits in a bitmap page,
also set a flag in the page->private pointer.

This allows us to skip writes of unchanged pages.
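
A userspace sketch of the bookkeeping, with a 64-bit word standing in for
a bitmap page and a plain bool standing in for the flag kept in
page->private (all demo_* names are assumptions for illustration):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_bm_page {
	uint64_t bits;		/* stand-in for one page of bitmap bits */
	bool changed;		/* differs from the on-disk copy? */
};

static void demo_bm_set_bit(struct demo_bm_page *p, unsigned int bit)
{
	assert(bit < 64);
	p->bits |= UINT64_C(1) << bit;
	p->changed = true;
}

static void demo_bm_clear_bit(struct demo_bm_page *p, unsigned int bit)
{
	assert(bit < 64);
	p->bits &= ~(UINT64_C(1) << bit);
	p->changed = true;
}

/* "Writes" only changed pages; returns how many pages were written. */
static size_t demo_bm_writeout(struct demo_bm_page *pages, size_t n)
{
	size_t written = 0;

	for (size_t i = 0; i < n; i++) {
		if (!pages[i].changed)
			continue;	/* on-disk copy is still current */
		pages[i].changed = false;
		written++;
	}
	return written;
}
```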

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
81e84650c200de0695372461964dd960365696db 09-Dec-2010 Andreas Gruenbacher <agruen@linbit.com> drbd: Use the standard bool, true, and false keywords

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
09b9e7979378fe070784de20e50bb1d42aa643ab 03-Dec-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: Implemented the before-resync-source handler

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
9d77a5fee9d2a1ea4cd9a841d27b107df5913b33 07-Nov-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: Make some functions static

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
e3555d8545976703938d1b59e2db509426dbe02c 07-Nov-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: Implemented priority inheritance for resync requests

We only issue resync requests if there is no significant application IO
going on, i.e. application IO has higher priority than resync IO.

If application IO can not be started because the resync process locked
a resync_lru entry, start the IO operations necessary to release the
lock ASAP.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
59817f4fab6a165ba83ce399464ba38432db8233 29-Oct-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: Do not cleanup resync LRU for the Ahead/Behind SyncSource/SyncTarget transitions

This one should be replaced with moving this cleanup to the
'right' position.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
c4752ef1284519c3baa1c3b19df34a80b4905245 27-Oct-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: When proxy's buffer drained off go into regular resync mode

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
73a01a18b9c28a0fab1131ece5b0a9bc00a879b8 27-Oct-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: New packet for Ahead/Behind mode: P_OUT_OF_SYNC

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
1816a2b47afae838e53a177d5d166cc7be97d6b5 11-Nov-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: properly use max_hw_sectors to limit the our bio size

To ease tracking of bios in some hash tables, we want each bio to
not cross certain boundaries (128k, used to be 32k).
We limit the maximum bio size using queue parameters.
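
The boundary constraint can be illustrated with a small userspace sketch.
The names crosses_boundary and BOUNDARY_BYTES are hypothetical and not taken
from the DRBD source; the sketch only assumes 512-byte sectors and the 128k
boundary mentioned above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constant: the 128k boundary named in the commit message. */
#define BOUNDARY_BYTES (128u * 1024u)
#define SECTOR_SHIFT 9 /* assumes 512-byte sectors */

/*
 * Returns true if a request starting at `sector` and covering `size`
 * bytes (size > 0) touches more than one 128k-aligned region.
 */
bool crosses_boundary(uint64_t sector, uint32_t size)
{
    uint64_t first = sector << SECTOR_SHIFT; /* first byte of the request */
    uint64_t last = first + size - 1;        /* last byte of the request */

    return first / BOUNDARY_BYTES != last / BOUNDARY_BYTES;
}
```

A request that starts on a boundary and is exactly 128k long stays within one
region; one extra byte, or a start shortly before a boundary, makes it cross.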

Historically some defines and variables we use there have been named
max_segment_size, which was misguided. Rename them to max_bio_size,
and use [blk_]queue_max_hw_sectors where appropriate.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2649f0809f55e4df98c333a2b85c6fc8fee04804 05-Nov-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: use the resync controller for online-verify requests as well

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
e65f440d474d7d6a6fd8a2c844e851d8c96ed9c5 05-Nov-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: factor out drbd_rs_number_requests

Preparation patch to be able to use the auto-throttling resync controller
for online-verify requests as well.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
9bd28d3c90c80c7ec46085de281b38f67331da41 05-Nov-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: factor out drbd_rs_controller_reset

Preparation patch to be able to use the auto-throttling resync controller
for online-verify requests as well.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
ea5442aff68c559c951373739201721185191748 05-Nov-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: advance progress step marks for online-verify

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
26525618863afcc4aab8b2a83451d37c6f513460 05-Nov-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: only reset online-verify start sector if verify completed

For network hiccups during online-verify, on the next verify
triggered, we by default want to resume where it left off.

After any replication link interruption, there will be a (possibly
empty) resync. Do not reset online-verify start sector if some resync
completed; that would defeat the purpose.

Only reset the start sector once a verify run is completed.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
7eaceaccab5f40bbfda044629a6298616aeaed50 10-Mar-2011 Jens Axboe <jaxboe@fusionio.com> block: remove per-queue plugging

Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So let's kill off the old plugging along with aops->sync_page().

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
a115413de13ae6beb0cbfc198afe385a261ab284 13-Nov-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: fix for spin_lock_irqsave in endio callback

In commit 9b7f76dc37919ea36caa9680a3f765e5b19b25fb,
Author: Lars Ellenberg <lars.ellenberg@linbit.com>
Date: Wed Aug 11 23:40:24 2010 +0200

drbd: new configuration parameter c-min-rate

a bad chunk slipped through, which is now reverted as well,
restoring the correct irqsave for the endio callback.

This patch also adds comments at both req_mod()
and in the endio callback so it should not happen again.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
451a3c24b0135bce54542009b5fde43846c7cf67 17-Nov-2010 Arnd Bergmann <arnd@arndb.de> BKL: remove extraneous #include <smp_lock.h>

The big kernel lock has been removed from all these files at some point,
leaving only the #include.

Remove this too as a cleanup.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2451fc3b2bd3a7205270da75a21dde0d5d7c96a2 24-Aug-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: Removed the BIO_RW_BARRIER support form the receiver/epoch code

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
bc571b8cb930ea78207851dd38b5a435fcb8891c 21-Oct-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: fix a misleading printk

This codepath used to be called only for failed kmalloc GFP_ATOMIC,
but is now also triggered by other things.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
13d42685bec1f012dcbc5d187490eb1d15ec8219 13-Oct-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: add explicit drbd_md_sync to drbd_resync_finished

As we usually update the generation UUIDs here, we should explicitly
sync them to disk. So far this has been done only implicitly by related
code paths.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
22cc37a943832c948808884604ec6f5ff2594c1d 14-Sep-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: fix unlikely access after free and list corruption

Various cleanup paths have been incomplete, for the very unlikely case
that we cannot allocate enough bios from process context when submitting
on behalf of the peer or resync process.

Never observed.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
af85e8e83d160f72a10e4467852646ac08614260 07-Oct-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: fix for spurious fullsync (uuids rotated too fast)

If it was an "empty" resync, the SyncSource may have already "finished"
the resync and rotated the UUIDs, before noticing the connection loss
(and generating a new uuid, if Primary, rotating again), while the
SyncTarget did not change its uuids at all, or only got to the previous
sync-uuid.
This would then again lead to a full sync on next handshake
(see also Bug #251).

Fix:
Use explicit resync finished notification even for empty resyncs,
do not finish an empty resync implicitly on the SyncSource.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
f10f262349762c96ab247b6108af3a30b52b6f5a 05-Oct-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: Fixed a stupid copy and paste error

This caused rs_planed to be out of sync with the content of the fifo.
That in turn could cause the resync to come to a complete halt.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
78db89287ce0f146a1f2a019a0b243ea4557caac 13-Sep-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: DIV_ROUND_UP not needed here

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
5a75cc7cfbb98e896232902214432dae30653dfe 09-Sep-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: Fixed compatibility with protocol versions smaller than 95

Forgot to consider the max size for the resync requests.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
1d53f09e170e477de67babd7a10e277479260d51 05-Sep-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: fix potential kernel BUG (NULL deref)

BUG trace would look like:
lc_find
drbd_rs_complete_io
got_OVResult
drbd_asender

Could be triggered by explicit, or IO-error policy based,
detach during online-verify.

We may only dereference mdev->resync, if we first get_ldev(), as the
disk may break any time, causing mdev->resync to disappear once all
ldev references have been returned.
Already in flight online-verify requests or replies may still come in,
which we then need to ignore.
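
The "take a reference before dereferencing" rule can be sketched in plain
userspace C. Everything prefixed with sketch_ is hypothetical and only mimics
the idea; the real get_ldev()/put_ldev() are kernel helpers with proper atomic
reference counting, which this single-threaded sketch does not attempt.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device; not the real struct. */
struct sketch_dev {
    bool disk_attached; /* false once the local disk has been detached */
    int ldev_refs;      /* outstanding local-disk references */
    int *resync;        /* stands in for mdev->resync; NULL once gone */
};

/* Try to take a local-disk reference; fails if the disk is gone. */
bool sketch_get_ldev(struct sketch_dev *d)
{
    if (!d->disk_attached)
        return false;
    d->ldev_refs++;
    return true;
}

/* Return a reference taken with sketch_get_ldev(). */
void sketch_put_ldev(struct sketch_dev *d)
{
    d->ldev_refs--;
}

/* Returns true if the reply was processed, false if it was ignored. */
bool sketch_handle_ov_result(struct sketch_dev *d)
{
    if (!sketch_get_ldev(d))
        return false;   /* disk detached: ignore the late reply */

    (*d->resync)++;     /* safe only while we hold the reference */
    sketch_put_ldev(d);
    return true;
}
```

A late reply that arrives after detach is dropped instead of touching the
pointer that has already disappeared.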

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
435f07402b3165b90592073bc0f8c6f8fa160ff9 06-Sep-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: don't count sendpage()d pages only referenced by tcp as in use

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
63106d3c6c769b6219bd04edde513b12abae3f61 01-Sep-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: Removed a race that could cause unexpected execution of w_make_resync_request()

The actual race happened in the drbd_start_resync() function, where
drbd_resync_finished() -> __drbd_set_state() set STOP_SYNC_TIMER and
armed the timer.

If the timer fired before execution reaches the mod_timer statement
at the end of drbd_start_resync() the latter would cause an
unexpected call to w_make_resync_request().

Removed the STOP_SYNC_TIMER bit, and base it on the connection state.

The STOP_SYNC_TIMER bit probably originates from the time before
the state engine.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
0778286a133d2d3f81861a4e5db308e359583006 31-Aug-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: Disable activity log updates when the whole device is out of sync

When the complete device is marked as out of sync, we can disable
updates of the on disk AL. Currently AL updates are only disabled
if one uses the "invalidate-remote" command on an unconnected,
primary device, or when at attach time all bits in the bitmap are
set.

As of now, AL updates do not get disabled when all bits become
set due to application writes to an unconnected DRBD device.
While this is a missing feature, it is not considered important,
and might get added later.

BTW, after initializing a "one legged" DRBD device
drbdadm create-md resX
drbdadm -- --force primary resX
AL updates also get disabled, until the first connect.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
0b70a13dac014ec9274640b9e945bde493ba365e 20-Aug-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: Sending of big packets, for payloads from 64KByte to 4GByte

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
204bba9965c4cc175bf5bc65ddd19889e9085c72 23-Aug-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: Bugfix for regression introduced with f9bc8913c06022e

If we intend to use the block_id member of an epoch entry,
we may not use the digest member.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
0f0601f4ea2f53cfd8bcae060fb03d9bbde070ec 11-Aug-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: new configuration parameter c-min-rate

We now track the data rate of locally submitted resync related requests,
and can thus detect non-resync activity on the lower level device.

If the current sync rate is above c-min-rate, and the lower level device
appears to be busy, we throttle the resyncer.
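
The decision described above can be sketched as follows. All function and
parameter names are hypothetical, and busy_threshold is an assumed tuning
value that the commit message does not specify; the sketch only shows how
subtracting resync traffic from total device traffic exposes other activity.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Sectors moved on the lower level device that were NOT caused by
 * resync: total device traffic minus what the resyncer submitted.
 */
uint64_t non_resync_sectors(uint64_t total_sectors, uint64_t resync_sectors)
{
    return total_sectors - resync_sectors;
}

/*
 * Throttle only if both conditions from the commit message hold:
 * the device looks busy with non-resync IO, and the current sync
 * rate is above c-min-rate (both rates in the same unit).
 */
bool should_throttle_resync(uint64_t activity_now, uint64_t activity_before,
                            uint64_t busy_threshold,
                            uint64_t current_rate, uint64_t c_min_rate)
{
    /* Assumed definition of "busy": activity grew by more than the threshold. */
    bool busy = activity_now - activity_before > busy_threshold;

    return busy && current_rate > c_min_rate;
}
```

With this shape, a resync already running at or below c-min-rate is never
slowed down further, no matter how busy the device appears.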

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
80a40e439e5a3f30b0a6210a1add6d7c33392e54 11-Aug-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: reduce code duplication when receiving data requests

also canonicalize the return values of read_for_csum
and drbd_rs_begin_io to return -ESOMETHING, or 0 for success.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
1d7734a0df02ff5068ff8baa1447c7baee601db1 11-Aug-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: use rolling marks for resync speed calculation

The current resync speed as displayed in /proc/drbd fluctuates a lot.
Using an array of rolling marks makes this calculation much more stable.
We used to have this (a long time ago with 0.7), but it got lost somehow.
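
A minimal userspace sketch of the rolling-marks idea follows. The struct,
the function names, and the choice of 8 marks are assumptions made for
illustration and are not taken from the patch; times are in seconds and
amounts in KiB.

```c
#include <stdint.h>

#define SYNC_MARKS 8 /* hypothetical number of marks kept */

struct sync_marks {
    uint64_t time_s[SYNC_MARKS];   /* when each mark was taken */
    uint64_t left_kib[SYNC_MARKS]; /* KiB still out of sync at that time */
    unsigned int oldest;           /* index of the oldest mark */
};

/* Start of resync: all marks describe the same starting point. */
void sync_marks_init(struct sync_marks *m, uint64_t now_s, uint64_t left_kib)
{
    unsigned int i;

    for (i = 0; i < SYNC_MARKS; i++) {
        m->time_s[i] = now_s;
        m->left_kib[i] = left_kib;
    }
    m->oldest = 0;
}

/* Overwrite the oldest mark; the next-oldest becomes the reference. */
void sync_marks_advance(struct sync_marks *m, uint64_t now_s, uint64_t left_kib)
{
    m->time_s[m->oldest] = now_s;
    m->left_kib[m->oldest] = left_kib;
    m->oldest = (m->oldest + 1) % SYNC_MARKS;
}

/* Average speed in KiB/s measured against the oldest mark. */
uint64_t sync_speed_kib_s(const struct sync_marks *m,
                          uint64_t now_s, uint64_t left_kib)
{
    uint64_t dt = now_s - m->time_s[m->oldest];

    /* No time elapsed, or nothing got done since the mark: report 0. */
    if (dt == 0 || left_kib >= m->left_kib[m->oldest])
        return 0;
    return (m->left_kib[m->oldest] - left_kib) / dt;
}
```

Because the reference point is always several steps in the past, a single
slow or fast interval changes the displayed speed only slightly.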

If "stalled", do not discard the rest of the information, just add a
" (stalled)" tag to the progress line.

This patch also shortens a spinlock critical section somewhat, and
reduces the number of atomic operations in put_ldev.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
c36c3ced692b38d0cf90a5e6f875be2f9ebbc037 11-Aug-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: let drbd_free_ee implicitly free any digest

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
85719573dd716bc2ac3e098b44adfed884250bab 21-Jul-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: Replaced some casts by an union. Improved comments

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
d207450cf2731c6a2afa8c78fb31c7206cd35eba 22-Jul-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: Bugfix: rs_in_flight could become wrong if read_for_csum() requested reschedule later

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
778f271dfe7a7173c0bae2d6cde8d9bd1533e668 06-Jul-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: The new, smarter resync speed controller

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
8e26f9ccb9be00fdb33551a34c8f6029e89ab79f 06-Jul-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: New sync_param packet, that includes the parameters of the new controller

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
265be2d09853d425ad14a61cda0ca63345613d0c 31-May-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: Finished the "on-no-data-accessible suspend-io;" functionality

When no data is accessible (no connection to the peer, nor a local disk)
allow the user to select to freeze all IO operations instead of getting
IO errors.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
e7f52dfb4f378ea1bbfd4476f4e8ba42f5fb332c 03-Aug-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: revert "delay probes", feature is being re-implemented differently

It was a now abandoned attempt to throttle resync bandwidth
based on the delay it causes on the bulk data socket.
It has no userbase yet, and has been disabled by
9173465ccb51c09cc3102a10af93e9f469a0af6f already.
This removes the now unused code.

The basic feature, namely using up "idle" bandwidth
of network and disk IO subsystem, with minimal impact
to application IO, is being reimplemented differently.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2a0ab2cd73c26835e635ed4e3868f983519048fb 26-May-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: Reduce verbosity

The "Local READ/WRITE failed" messages are too verbose.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
d255e5ff5fc6cc6c60dd014d1261448a7bbc8134 27-May-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: fix hang on local read errors while disconnected

"canceled" w_read_retry_remote never completed, if they have been
canceled after drbd_disconnect connection teardown cleanup has already
run (or we are currently not connected anyway).

Fixed by not queueing a remote retry if we already know it won't work
(pdsk not uptodate), and cleanup ourselves on "cancel", in case we hit a
race with drbd_disconnect.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
32fa7e91f923d8b2578c42016ff3a94efc9968a2 26-May-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: Removed the now empty w_io_error() function

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
fc8ce1941d668c70e57a07f13f5a63e73e5dbff3 20-May-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: Fix: Do not detach, if a bio with a barrier fails

Introduced a few days ago:
commit 45bb912bd5ea4d2b3a270a93cbdf767a0e2df6f5
Author: Lars Ellenberg <lars.ellenberg@linbit.com>
Date: Fri May 14 17:10:48 2010 +0200

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
bb3d000cb99aa0924b78c1ae5f5943484527868a 14-May-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: allow resync requests to be larger than max_segment_size

this should allow for better background resync performance.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
45bb912bd5ea4d2b3a270a93cbdf767a0e2df6f5 14-May-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: Allow drbd_epoch_entries to use multiple bios.
This should allow for better performance if the lower level IO stack
of the peers differs in limits exposed either via the queue,
or via some merge_bvec_fn.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
cdd67a74603d0453ddffc24c572aed2ddd1795b8 04-May-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: Control the actual resync rate based on the queuing delay of data packets

In a setup with a high bandwidth and high latency network, possibly
involving deep queues in routers, it is beneficial to only fill those
queues up to a limited extent with resync data.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
5c3c7e64bbcb60db39d0d55c8fab39ef41d41c30 10-Apr-2010 Lars Ellenberg <lars.ellenberg@linbit.com> drbd: don't expose failed local READ to upper layers

fix regression introduced in 8.3.3:
commit a9b17323f2875f5d9b132c2b476a750bf44b10c7
Author: Lars Ellenberg <lars.ellenberg@linbit.com>
Date: Wed Aug 12 15:18:33 2009 +0200

out-of-spinlock completion of master bio

: (bio_rw(bio) == READA)
? read_completed_with_error
: read_ahead_completed_with_error;

is obviously not what was intended.

No one noticed because of
* page-cache at work,
* local RAIDs

Impact:
Failed local READs are not retried remotely,
but errored to upper layers, causing filesystems
to remount read-only, or worse.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
d0c3f60f3611ceac9b1e4fdffd1497337568e7cb 02-Mar-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: Make sure we do not send state updates during an empty resync [Bugz 271]

This is a race condition that existed for ages.
The previous commit reduces the window, this one closes it.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
309d1608cce32903d67d47e7545e232c400b6aa0 02-Mar-2010 Philipp Reisner <philipp.reisner@linbit.com> drbd: Reduce the time an empty resync takes usually

This mitigates changes introduced with commit:
http://git.drbd.org/?p=drbd-8.3.git;a=commit;h=4b6803a3276652da3737

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
676396d545350a70d922605ec23c2ed26124334a 03-Mar-2010 Lars Ellenberg <lars.ellenberg@linbit.com> fix unit of rs_same_csums accounting

Depending on resync request size,
we need to account for more than one bit.

Impact: cosmetic

If SyncTarget reported correctly 100% equal checksums,
the SyncSource usually reported 12% equal checksums instead,
because it only counted requests; we typically do 32k resync requests,
and the bitmap granularity is still 4k.
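
The arithmetic behind the 12% figure can be checked with a short sketch.
The helper names are hypothetical; the 4k bitmap granularity and the 32k
request size are the values stated above.

```c
#include <stdint.h>

#define BM_BLOCK_SHIFT 12 /* 4k bitmap granularity: 1 bit per 4096 bytes */

/* Number of bitmap bits covered by a resync request of `size` bytes. */
uint32_t request_to_bits(uint32_t size)
{
    return size >> BM_BLOCK_SHIFT;
}

/* Integer percentage of `same` out of `total_bits` (0 if total is 0). */
uint32_t same_csum_percent(uint64_t same, uint64_t total_bits)
{
    return total_bits ? (uint32_t)(same * 100 / total_bits) : 0;
}
```

For 100 requests of 32k each, the total is 800 bits. Counting one unit per
request gives 100 * 100 / 800, which truncates to 12, while counting 8 bits
per request gives the expected 100.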

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
820cd61a28503598f4262c544082ccb33678b9fc 13-Dec-2009 Huang Weiyi <weiyi.huang@gmail.com> drbd: remove unused #include <linux/version.h>

Remove unused #include <linux/version.h>('s) in
drivers/block/drbd/drbd_main.c
drivers/block/drbd/drbd_receiver.c
drivers/block/drbd/drbd_worker.c

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
7b886f4f7a051dc88165684cbcddd98e22bd0203 09-Dec-2009 Huang Weiyi <weiyi.huang@gmail.com> drbd: remove duplicated #include

Remove duplicated #include('s) in
drivers/block/drbd/drbd_worker.c

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
6a0afdf58d40200abd0c717261d1bc4c49195c2f 01-Oct-2009 Jens Axboe <jens.axboe@oracle.com> drbd: remove tracing bits

They should be reimplemented in the current scheme.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
ab8fafc2e1ecc0090f2c78902d3b992eec8b11f8 28-Sep-2009 Lars Ellenberg <lars.ellenberg@linbit.com> dropping unneeded include autoconf.h

It is force-included on the gcc command line since at least 2.6.15.
Explicit include lines seem to break compilation now in certain configurations.

Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
b411b3637fa71fce9cf2acf0639009500f5892fe 26-Sep-2009 Philipp Reisner <philipp.reisner@linbit.com> The DRBD driver

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>