History log of /external/autotest/scheduler/rdb_testing_utils.py
Revision Date Author Comments
6818633834ad52c3de153235639ea9299a6e9a6d 28-Apr-2015 Matthew Sartori <msartori@chromium.org> [autotest] Require lock reason to lock device

When locking a device, a lock reason must now be provided.
This applies both to adding a new device and to modifying an
existing device, from either the web frontend or the atest
command-line tool.

BUG=chromium:336805
DEPLOY=migrate
TEST=Tested adding locked/unlocked devices and locking/unlocking devices
from both the web frontend and the 'atest host ...' command-line tools.

Change-Id: I3a8cd8891a2999f026dd709ae8a79e2b8cbc251a
Reviewed-on: https://chromium-review.googlesource.com/267595
Tested-by: Matthew Sartori <msartori@chromium.org>
Reviewed-by: Dan Shi <dshi@chromium.org>
Commit-Queue: Matthew Sartori <msartori@chromium.org>
/external/autotest/scheduler/rdb_testing_utils.py
52a239316b829106b540e57a0100496fed1fe5aa 21-Nov-2014 Fang Deng <fdeng@chromium.org> [autotest] RDB respects min_duts requirement.

This is part II of making host scheduler support a min_dut
requirement per suite.

With this CL, rdb will do a two-round host acquisition.

In the first round, it will try to allocate at most |suite_min_duts|
number of duts to jobs that belong to a suite.

If there are still available duts, in the second round,
it will try to allocate the rest of the duts to the jobs that have
not been satisfied.
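
A minimal sketch of the two-round acquisition described above, assuming
illustrative request objects with parent_job_id and suite_min_duts
attributes; this is not the actual rdb_lib API:

    import collections

    def batch_acquire(requests, available_hosts):
        """Assign hosts in two rounds, honoring per-suite minimum DUT counts."""
        assignments = {}
        duts_per_suite = collections.Counter()

        # Round 1: give each suite at most |suite_min_duts| hosts.
        for request in requests:
            suite = request.parent_job_id
            if (suite is not None and available_hosts
                    and duts_per_suite[suite] < request.suite_min_duts):
                assignments[request] = available_hosts.pop()
                duts_per_suite[suite] += 1

        # Round 2: hand the remaining hosts to requests still without one.
        for request in requests:
            if request not in assignments and available_hosts:
                assignments[request] = available_hosts.pop()
        return assignments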

BUG=chromium:432648
TEST=add unit tests to rdb_integration_tests;
ran rdb_cache_unittests; rdb_host_unittests; rdb_unittests;
TEST=Integration test with CL:231139. Run two bvt-cq suites with
different priority and suite_min_duts. Set testing_mode=True
and testing_exceptions=test_suites. Confirm the host allocation.
TEST=Test inline host acquisition still works

Change-Id: I7b39bd8eaa5b6966f3ed267667919bae07d5665a
Reviewed-on: https://chromium-review.googlesource.com/231210
Reviewed-by: Prashanth B <beeps@chromium.org>
Commit-Queue: Fang Deng <fdeng@chromium.org>
Tested-by: Fang Deng <fdeng@chromium.org>
/external/autotest/scheduler/rdb_testing_utils.py
22dd226625255110c079e979113dcda1f4fa5ea8 29-Nov-2014 Prashanth Balasubramanian <beeps@google.com> [autotest] Shard client resyncs conflicting jobs.

There is currently a race in the shard_client, described thus:

* scheduler marks job 1 complete
* shard heartbeat doesn't include job 1 in packet
* scheduler marks job 1 with shard_id = NULL (for upload)
* master replies with 'Hey job 1 is Queued but you say it
isn't running'
* shard serializes the foreign keys of job 1 sent by the master,
overwriting the NULL shard_id
=> Job 1 is never synced back to the master

This CL fixes this race by:
1. The shard client continues to declare a job as incomplete until
the scheduler has updated the shard_id to NULL. In the above
example, the second step won't happen.
2. If the shard_client notices any disagreement between the complete
bits of jobs in the local db and the 'new' jobs sent from the
master, it re-marks the job for upload by setting shard_id=NULL.
In the above example this would happen after the last step,
forcing the next heartbeat to pick up the job.

The CL also adds some important stats and logging to help debug
issues in production.
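
A rough sketch of rule (2), assuming hypothetical ORM-style job objects;
the real shard_client code differs in detail:

    def resync_conflicting_jobs(local_jobs_by_id, jobs_from_master):
        """Re-mark jobs for upload when the master disagrees about completeness."""
        for master_job in jobs_from_master:
            local_job = local_jobs_by_id.get(master_job['id'])
            if local_job is None:
                continue
            # The master still thinks the job is queued, but it is already
            # complete locally: clear shard_id so the next heartbeat uploads it.
            if local_job.complete and not master_job['complete']:
                local_job.shard_id = None
                local_job.save()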

TEST=Ran suites via puppylab, unittests.
BUG=chromium:423225,chromium:425347
DEPLOY=scheduler

Change-Id: Ib35b193681b187e3745a4678778dbeba77fe83e5
Reviewed-on: https://chromium-review.googlesource.com/232193
Tested-by: Prashanth B <beeps@chromium.org>
Reviewed-by: Fang Deng <fdeng@chromium.org>
Commit-Queue: Prashanth B <beeps@chromium.org>
/external/autotest/scheduler/rdb_testing_utils.py
4ec9867f46deb969c154bebf2e64729d56c3a1d3 15-May-2014 Prashanth B <beeps@google.com> [autotest] Split host acquisition and job scheduling II.

This cl creates a stand-alone service capable of acquiring hosts for
new jobs. The host scheduler will be responsible for assigning a host to
a job and scheduling its first special tasks (to reset and provision the host).
Thereafter, the special tasks will either change the state of a host or
schedule more tasks against it (e.g. repair), until the host is ready to
run the job associated with the Host Queue Entry to which it was
assigned. The job scheduler (monitor_db) will only run jobs, including the
special tasks created by the host scheduler.

Note that the host scheduler won't go live till we flip the
inline_host_acquisition flag in the shadow config, and restart both
services. The host scheduler is dead, long live the host scheduler.

TEST=Ran the schedulers, created suites. Unittests.
BUG=chromium:344613, chromium:366141, chromium:343945, chromium:343937
CQ-DEPEND=CL:199383
DEPLOY=scheduler, host-scheduler

Change-Id: I59a1e0f0d59f369e00750abec627b772e0419e06
Reviewed-on: https://chromium-review.googlesource.com/200029
Reviewed-by: Prashanth B <beeps@chromium.org>
Tested-by: Prashanth B <beeps@chromium.org>
Commit-Queue: Prashanth B <beeps@chromium.org>
/external/autotest/scheduler/rdb_testing_utils.py
f66d51b5caa96995b91e7c155ff4378cdef4baaf 06-May-2014 Prashanth B <beeps@google.com> [autotest] Split host acquisition and job scheduling.

This is phase one of two in the plan to split host acquisition out of the
scheduler's tick. The idea is to have the host scheduler use a job query
manager to query the database for new jobs without hosts and assign
hosts to them, while the main scheduler uses the same query managers to
look for hostless jobs.

Currently the main scheduler uses the class to acquire hosts inline,
like it always has, and will continue to do so till the
inline_host_acquisition feature flag is turned on via the shadow_config.
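
A hedged sketch of how the feature flag gates the two code paths; only
the inline_host_acquisition name comes from this message, every other
identifier is illustrative:

    def schedule_new_jobs(query_manager, inline_host_acquisition,
                          assign_host_inline, run_entry):
        """Dispatch pending host queue entries under either scheduling mode."""
        for hqe in query_manager.get_pending_queue_entries():
            if inline_host_acquisition:
                # Legacy path: the main scheduler also pairs the job with a host.
                assign_host_inline(hqe)
                run_entry(hqe)
            elif hqe.host is not None:
                # Split path: the standalone host scheduler has already assigned
                # a host; the job scheduler only has to run the entry.
                run_entry(hqe)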

TEST=Ran the scheduler, suites, unittests.
BUG=chromium:344613
DEPLOY=Scheduler

Change-Id: I542e4d1e509c16cac7354810416ee18ac940a7cf
Reviewed-on: https://chromium-review.googlesource.com/199383
Reviewed-by: Prashanth B <beeps@chromium.org>
Commit-Queue: Prashanth B <beeps@chromium.org>
Tested-by: Prashanth B <beeps@chromium.org>
/external/autotest/scheduler/rdb_testing_utils.py
0e960285b022fad77f0b087a2007867363bf6ab9 14-May-2014 Prashanth B <beeps@google.com> [autotest] Consolidate methods required to setup a scheduler.

Move methods/classes that will be helpful in setting up another scheduler
process into scheduler_lib:
1. Make a connection manager capable of managing connections.
Create, access, and close the database connection through this manager.
2. Clean up setup_logging so it's usable by multiple schedulers if they
just change the name of the logfile.
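
A minimal sketch of a connection manager along the lines of point 1; the
names are illustrative, not the actual scheduler_lib interface:

    class ConnectionManager(object):
        """Creates, hands out, and closes one scheduler database connection."""

        def __init__(self, connect_fn):
            # connect_fn is any zero-argument callable returning a DB connection.
            self._connect_fn = connect_fn
            self._connection = None

        def get_connection(self):
            if self._connection is None:
                self._connection = self._connect_fn()
            return self._connection

        def disconnect(self):
            if self._connection is not None:
                self._connection.close()
                self._connection = None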

TEST=Ran suites, unittests.
BUG=chromium:344613
DEPLOY=Scheduler

Change-Id: Id0031df96948d386416ce7cfc754f80456930b95
Reviewed-on: https://chromium-review.googlesource.com/199957
Reviewed-by: Prashanth B <beeps@chromium.org>
Tested-by: Prashanth B <beeps@chromium.org>
Commit-Queue: Prashanth B <beeps@chromium.org>
/external/autotest/scheduler/rdb_testing_utils.py
2d8047e8b2d901bec66d483664d8b6322501d245 28-Apr-2014 Prashanth B <beeps@google.com> [autotest] In process request/host caching for the rdb.

This cl implements an in process host cache manager for the rdb. The
following considerations were taken into account while designing it:
1. The number of requests outweighs the number of leased hosts
2. The number of net hosts outweighs the number of leased hosts
3. The 'same' request can consult the cache within the span of a single
batched request. These will only be the same in terms of host labels/acls
required, not in terms of priority or parent_job_id.

Resulting ramifications:
1. We can't afford to consult the database for each request.
2. We can afford to refresh our in memory representation of a host
just before leasing it.
3. Leasing a host can fail, as we might be using a stale cached host.
4. We can't load a map of all hosts <-> labels each request.
5. Invalidation is hard for most sane, straightforward choices of
keying hosts against requests.
6. Lower priority requests will starve if they try to lease the same
hosts taken by higher priority requests.

Main design tenets:
1. We can tolerate some staleness in the cache, since we're going
to make sure the host is unleased just before using it.
2. If a job hits a stale cache line it tries again next tick.
3. Trying to invalidate the cache within a single batched request will
be unnecessarily complicated and error prone. Instead, to prevent
starvation, each request only invalidates its cache line, by removing
the hosts it has just leased.
4. The same host may be present in 2 different cache lines but this won't
matter because each request will check the leased bit in real time before
acquiring it.
5. The entire cache is invalidated at the end of a batched request.
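
A condensed sketch of the behavior described by these tenets, assuming
requests expose deps and acls attributes; the key scheme and method names
are illustrative, not the actual rdb cache manager:

    class RDBHostCache(object):
        """Caches candidate hosts per (labels, acls) line for one batched request."""

        def __init__(self):
            self._lines = {}

        @staticmethod
        def _key(request):
            # Requests are 'the same' only in terms of labels/acls, so that is
            # all the key contains (consideration 3 above).
            return (frozenset(request.deps), frozenset(request.acls))

        def get_line(self, request):
            return list(self._lines.get(self._key(request), []))

        def set_line(self, request, hosts):
            self._lines[self._key(request)] = list(hosts)

        def remove_leased(self, request, leased_hosts):
            # Tenet 3: a request invalidates only its own line, by dropping the
            # hosts it just leased, so lower-priority requests are not starved.
            line = self._lines.get(self._key(request), [])
            self._lines[self._key(request)] = [
                    host for host in line if host not in leased_hosts]

        def clear(self):
            # Tenet 5: the whole cache is dropped at the end of a batched request.
            self._lines = {}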

TEST=Ran suites, unittests.
BUG=chromium:366141
DEPLOY=Scheduler

Change-Id: Iafc3ffa876537da628c52260ae692bc2d5d3d063
Reviewed-on: https://chromium-review.googlesource.com/197788
Reviewed-by: Dan Shi <dshi@chromium.org>
Tested-by: Prashanth B <beeps@chromium.org>
Commit-Queue: Prashanth B <beeps@chromium.org>
/external/autotest/scheduler/rdb_testing_utils.py