History log of /external/autotest/scheduler/rdb.py
Revision Date Author Comments
1e1c41b1b4a1b97c0b7086b8430856ed45e064d3 05-Feb-2015 Gabe Black <gabeblack@chromium.org> graphite: Separate out configuration from the statsd classes.

The new version of the statsd classes should be created using an instance of
the new Statsd class which sets up some defaults without having to specify
them over and over. This makes it essentially compatible with the existing
usage in autotest, but will allow chromite to configure things differently and
avoid having side effects from importing the module or global state.

BUG=chromium:446291
TEST=Ran unit tests, ran stats_es_functionaltest.py, ran the
stats_mock_unittest, ran a butterfly-paladin tryjob with --hwtest, testing by
fdeng.
DEPLOY=apache,scheduler,host-scheduler

Change-Id: I1071813db197c0e5e035b4d8db615030386f1c1c
Reviewed-on: https://chromium-review.googlesource.com/246428
Reviewed-by: Fang Deng <fdeng@chromium.org>
Reviewed-by: Dan Shi <dshi@chromium.org>
Commit-Queue: Gabe Black <gabeblack@chromium.org>
Tested-by: Gabe Black <gabeblack@chromium.org>
/external/autotest/scheduler/rdb.py
a9bc9592e64c5fc6088e07ab07826a0d5762ba54 28-Jan-2015 Fang Deng <fdeng@chromium.org> [autotest] RDB prefers to choose a host with the same cros-version

Once we share a pool between, say, canary and pfq, it is possible that
two suites can exchange duts, which would lead to the unnecessary cost
of provisioning duts back and forth between two builds.

This can be solved by making host-scheduler always prefer a host
with the same cros-version as the job if such a host is available.

The idea is generalized to allow the request to specify a list
of preferred deps, so that it won't be limited to only
cros-version.
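
The preference described above can be sketched as a stable sort over the candidate hosts. This is only an illustration, not the code from this CL: the function name rank_hosts and the (hostname, labels) host shape are assumptions made for the example.

```python
def rank_hosts(hosts, preferred_deps):
    """Return hosts ordered so that better matches for preferred_deps come first.

    hosts: list of (hostname, set_of_labels) pairs (hypothetical shape).
    preferred_deps: labels the request would like, but does not require.
    """
    preferred = set(preferred_deps)
    # sorted() is stable, so hosts with the same number of matching
    # preferred deps keep their original relative order.
    return sorted(hosts, key=lambda host: len(preferred & host[1]), reverse=True)


if __name__ == "__main__":
    duts = [
        ("fake1", {"board:lumpy"}),
        ("fake2", {"board:lumpy", "cros-version:R41"}),
        ("fake3", {"board:lumpy"}),
    ]
    for hostname, _ in rank_hosts(duts, ["cros-version:R41"]):
        print(hostname)
```

Because the preferred deps only affect ordering, a request can still be satisfied by a host without them when no better match is free.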

DEPLOY=host_scheduler
BUG=chromium:452752
TEST=Test as follows:
- Create 10 fake duts: fake1, fake2...fake10
- Randomly select 3 and assign them with a specific cros-version
- Run a dummy suite with the cros-version
- Observe the 3 random duts are selected.
Log the sorted list of hosts, observe the 3 hosts are ranked
in the front.
- unittest and integration test.

Change-Id: I0d5def604523173cf59b60d802ca981b6d68bb5f
Reviewed-on: https://chromium-review.googlesource.com/243750
Reviewed-by: Mungyung Ryu <mkryu@google.com>
Commit-Queue: Fang Deng <fdeng@chromium.org>
Tested-by: Fang Deng <fdeng@chromium.org>
8c98ac10beaa08bfb975c412b0b3bda23178763a 23-Dec-2014 Prashanth Balasubramanian <beeps@google.com> [autotest] Send frontend jobs to shards.

Frontend jobs on hosts that are on the shard are disallowed
currently, because the host-scheduler on master currently
ignores jobs based on meta-host, but frontend jobs have no
meta-host. This CL has the following changes:
- Make host-scheduler ignore frontend jobs that are supposed
to be picked by shard.
- Send such frontend jobs in heartbeat.
- Allow creation of frontend jobs in rpc.

TEST=Test as follows:
- Create a job on a host on shard from AFE frontend.
Observe it runs on shards and completes on master.
- Create a job on two hosts (one host on shard, the other on master)
from AFE frontend. Make sure exception is raised with correct
message.
- Run a normal dummy suite on shard, make sure normal flow still
works. Heartbeat contains the right information.
- Run a normal dummy suite on master, make sure it works.
BUG=chromium:444790
DEPLOY=apache, host-scheduler

Change-Id: Ibca3d36cb59fed695233ffdc89506364c402cc37
Reviewed-on: https://chromium-review.googlesource.com/240396
Reviewed-by: Mungyung Ryu <mkryu@google.com>
Reviewed-by: Dan Shi <dshi@chromium.org>
Commit-Queue: Fang Deng <fdeng@chromium.org>
Tested-by: Fang Deng <fdeng@chromium.org>
52a239316b829106b540e57a0100496fed1fe5aa 21-Nov-2014 Fang Deng <fdeng@chromium.org> [autotest] RDB respects min_duts requirement.

This is part II of making host scheduler support a min_dut
requirement per suite.

With this CL, rdb will do a two-round host acquisition.

In the first round, it will try to allocate at most |suite_min_duts|
number of duts to jobs that belong to a suite.

If there are still available duts, in the second round,
it will try to allocate the rest of the duts to the jobs that have
not been satisfied.
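
A minimal sketch of the two rounds, assuming requests arrive already ordered by priority. The function name allocate and the dict-based request shape are hypothetical and are not taken from rdb.py.

```python
from collections import defaultdict


def allocate(requests, free_duts):
    """Assign DUTs to requests in two rounds.

    requests: list of dicts with 'id', 'suite' and 'suite_min_duts',
        already in priority order (hypothetical shape).
    free_duts: list of available DUT names.
    Returns a dict mapping request id to the DUT it received.
    """
    free = list(free_duts)
    assigned = {}
    per_suite = defaultdict(int)

    # Round 1: give each suite at most suite_min_duts DUTs.
    for req in requests:
        if not free:
            break
        suite = req["suite"]
        if suite is not None and per_suite[suite] < req["suite_min_duts"]:
            assigned[req["id"]] = free.pop(0)
            per_suite[suite] += 1

    # Round 2: hand any remaining DUTs to requests still waiting.
    for req in requests:
        if not free:
            break
        if req["id"] not in assigned:
            assigned[req["id"]] = free.pop(0)

    return assigned
```

With two DUTs and two suites that each ask for a minimum of one, the first round gives each suite one DUT even if all of the higher priority suite's jobs come first in the list.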

BUG=chromium:432648
TEST=add unit tests to rdb_integration_tests;
ran rdb_cache_unittests; rdb_host_unittests;rdb_unittests;
TEST=Integration test with CL:231139. Run two bvt-cq suites with
different priority and suite_min_duts. Set testing_mode=True
and testing_exceptions=test_suites. Confirm the host allocation.
TEST=Test inline host acquisition still works

Change-Id: I7b39bd8eaa5b6966f3ed267667919bae07d5665a
Reviewed-on: https://chromium-review.googlesource.com/231210
Reviewed-by: Prashanth B <beeps@chromium.org>
Commit-Queue: Fang Deng <fdeng@chromium.org>
Tested-by: Fang Deng <fdeng@chromium.org>
da8c60af1e1e3ee97170c700d0b72991687e35a2 03-Jun-2014 Michael Liang <michaelliang@chromium.org> [autotest] Migrate graphite directory to client/common_lib/cros

This change allows us to report stats in client tests.
1. Change import paths for all files that import modules from graphite
2. Clean up some unused modules
Related CL: https://chromium-review.googlesource.com/#/c/202467/
BUG=chromium:237255
TEST=Ran scheduler locally, scheduled reboot jobs, verified stats such as monitor_db_cleanup.user_cleanup._cleanup were reported on chromeos-stats.
DEPLOY = apache, scheduler, host_scheduler
Change-Id: Iebfe3b8acc1c363a0b70ea555744e85d1367cb67
Reviewed-on: https://chromium-review.googlesource.com/202727
Reviewed-by: Dan Shi <dshi@chromium.org>
Commit-Queue: Michael Liang <michaelliang@chromium.org>
Tested-by: Michael Liang <michaelliang@chromium.org>
86934c86a72509c6aede78591817edcedbddc268 09-May-2014 Prashanth B <beeps@google.com> [autotest] Calculate database cache staleness.

TEST=Unittests, calculated staleness.
BUG=None
DEPLOY=Scheduler

Change-Id: I82084c1d412a0a9bbda911159a156c5436e5e6c6
Reviewed-on: https://chromium-review.googlesource.com/199114
Commit-Queue: Prashanth B <beeps@chromium.org>
Tested-by: Prashanth B <beeps@chromium.org>
Reviewed-by: Dan Shi <dshi@chromium.org>
2d8047e8b2d901bec66d483664d8b6322501d245 28-Apr-2014 Prashanth B <beeps@google.com> [autotest] In process request/host caching for the rdb.

This cl implements an in process host cache manager for the rdb. The
following considerations were taken into account while designing it:
1. The number of requests outweighs the number of leased hosts
2. The number of net hosts outweighs the number of leased hosts
3. The 'same' request can consult the cache within the span of a single
batched request. These will only be same in terms of host labels/acls
required, not in terms of priority or parent_job_id.

Resulting ramifications:
1. We can't afford to consult the database for each request.
2. We can afford to refresh our in memory representation of a host
just before leasing it.
3. Leasing a host can fail, as we might be using a stale cached host.
4. We can't load a map of all hosts <-> labels each request.
5. Invalidation is hard for most sane, straight-forward choices of
keying hosts against requests.
6. Lower priority requests will starve if they try to lease the same
hosts taken by higher priority requests.

Main design tenets:
1. We can tolerate some staleness in the cache, since we're going
to make sure the host is unleased just before using it.
2. If a job hits a stale cache line it tries again next tick.
3. Trying to invalidate the cache within a single batched request will
be unnecessarily complicated and error prone. Instead, to prevent
starvation, each request only invalidates its cache line, by removing
the hosts it has just leased.
4. The same host may be present in 2 different cache lines but this won't
matter because each request will check the leased bit in real time before
acquiring it.
5. The entire cache is invalidated at the end of a batched request.
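
The tenets above can be sketched as a small in-memory cache keyed on the labels and acls of a request. This is an illustration under assumptions, not the cache manager from this cl: the class name HostCache and all of its methods are hypothetical.

```python
class HostCache:
    """Per-batch host cache; one cache line per (deps, acls) combination."""

    def __init__(self):
        self._lines = {}

    @staticmethod
    def _key(deps, acls):
        # Priority and parent_job_id are deliberately left out of the key,
        # so requests that differ only in those fields share a cache line.
        return (frozenset(deps), frozenset(acls))

    def get(self, deps, acls):
        """Return the cached hosts for this request shape, or None on a miss."""
        return self._lines.get(self._key(deps, acls))

    def set(self, deps, acls, hosts):
        self._lines[self._key(deps, acls)] = list(hosts)

    def remove_leased(self, deps, acls, leased):
        """Invalidate only this request's line by dropping hosts it leased."""
        key = self._key(deps, acls)
        if key in self._lines:
            self._lines[key] = [h for h in self._lines[key] if h not in leased]

    def clear(self):
        """Drop every line; intended for the end of a batched request."""
        self._lines.clear()
```

A host that also sits in another cache line is left there on purpose. The sketch relies on the caller checking the leased bit against the database just before acquiring a host, as tenet 4 describes.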

TEST=Ran suites, unittests.
BUG=chromium:366141
DEPLOY=Scheduler

Change-Id: Iafc3ffa876537da628c52260ae692bc2d5d3d063
Reviewed-on: https://chromium-review.googlesource.com/197788
Reviewed-by: Dan Shi <dshi@chromium.org>
Tested-by: Prashanth B <beeps@chromium.org>
Commit-Queue: Prashanth B <beeps@chromium.org>
2c1a22a9f93bf50147cd4e6b10487d02768f8919 03-Apr-2014 Prashanth B <beeps@google.com> [autotest] Include parent_job_id as a tie breaker in the host request.

Teach the rdb to give hosts to requests from the same suite by including
parent_job_id in the acquire hosts request, and using it to order
requests of equal priority. This means that we will only group requests
from the same suite, which will result in an O(number of concurrent suites)
increase in database queries to find hosts, but it should lead to an
overall win since suites will timeout less often if they're allowed to
run till completion before hosts in the same pool are re-allocated to another
suite.
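
One way to picture the tie breaker is as a compound sort key followed by grouping. This is a sketch under assumptions: the function name group_requests and the dict-based request shape are hypothetical, and treating a missing parent_job_id as 0 is a choice made only for the example.

```python
from itertools import groupby


def group_requests(requests):
    """Order requests by priority, then parent_job_id, and group equal keys.

    requests: list of dicts with 'id', 'priority' and 'parent_job_id'.
    Returns a list of (key, [request ids]) tuples, one per group.
    """
    def sort_key(req):
        # Higher priority first; parent_job_id only breaks ties.
        # Requests with no parent (None) are treated as 0 here.
        return (-req["priority"], req["parent_job_id"] or 0)

    ordered = sorted(requests, key=sort_key)
    return [
        (key, [req["id"] for req in group])
        for key, group in groupby(ordered, key=sort_key)
    ]
```

Each resulting group would need its own host lookup, which is where the extra queries per concurrent suite come from.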

TEST=Ran suites, added unittests that failed before and pass now.
BUG=chromium:351861

Change-Id: Ia6e01f92484fb7bfd2e2e1e780cc4a16832fe901
Reviewed-on: https://chromium-review.googlesource.com/192965
Tested-by: Prashanth B <beeps@chromium.org>
Reviewed-by: Alex Miller <milleral@chromium.org>
Commit-Queue: Prashanth B <beeps@chromium.org>
b474fdfd353cdb0888191f4b80e47e6b5343d891 04-Apr-2014 Prashanth B <beeps@google.com> [autotest] Lease hosts according to frontend job priorities.

This cl modifies the way we lease hosts by teaching the
RDBServerHostWrapper to handle host leasing. Though this involves
a separate query for each host it leads to a design we can later
build atomicity into, because we can check the leased bit on a single
host before setting it. This model of leasing also has the following benefits:
1. It doesn't abuse the response map.
2. It gives us more clarity into which requests are acquiring
hosts by setting the leased bit in step with host validation.
3. It is more tolerant to db errors because exceptions raised while
leasing one host will not fail the entire batched request.
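
The per-host leasing model might look roughly like the sketch below. It is not the RDBServerHostWrapper itself: the class HostWrapper, the LeaseError exception and the dict standing in for the hosts table are all assumptions made for the example.

```python
class LeaseError(Exception):
    """Raised when a host turns out to be leased already."""


class HostWrapper:
    """Wraps one host; 'table' is a dict standing in for the database."""

    def __init__(self, hostname, table):
        self.hostname = hostname
        self._table = table

    def lease(self):
        # Check the leased bit on this single host before setting it.
        # This check-then-set is not atomic; it only marks the place
        # where atomicity could later be added.
        if self._table[self.hostname]:
            raise LeaseError("%s is already leased" % self.hostname)
        self._table[self.hostname] = True


def lease_first_available(candidates):
    """Return the first host that can be leased, or None if none can."""
    for host in candidates:
        try:
            host.lease()
        except Exception:
            # A failure on one host must not fail the whole batch.
            continue
        return host
    return None
```

Catching every exception is deliberately broad here, to mirror the point that an error while leasing one host should not abort the batched request.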

This cl also adds an rdb_unittest module.

TEST=Unittests, ran suites.
BUG=chromium:353183
DEPLOY=scheduler

Change-Id: I35c04bcb37eee0191a211c133a35824cc78b5d71
Reviewed-on: https://chromium-review.googlesource.com/193182
Reviewed-by: Prashanth B <beeps@chromium.org>
Commit-Queue: Prashanth B <beeps@chromium.org>
Tested-by: Prashanth B <beeps@chromium.org>
489b91d72cd225e902081dbd3f9e47448fe867f6 15-Mar-2014 Prashanth B <beeps@google.com> [autotest] Establish a common interface for host representation.

This cl has the work needed to ensure that schema changes made on
the server trickle down into the client. If the same changes aren't
reflected on the client, creating or saving a client host wrapper for
a given host will fail deterministically on the client side until
modules using the rdb_host are modified to reflect the changes.

1. rdb_hosts: A module containing the host hierarchy needed to
establish a dependence between the creation of the RDBServerHostWrapper
(which is serialized and returned to the client, which converts it
into an RDBClientHostWrapper) and the saving of the RDBClientHostWrapper
through an rdb update request.
2. rdb_requests: Contains the requests/request managers that were in
rdb_utils, because I plan to expand them in subsequent cls.
3. rdb_model_extensions: Contains model classes common
to both server and client that help in establishing
the common host model interface.
4. rdb integration tests.

TEST=Ran suites, unittests
BUG=chromium:348176
DEPLOY=scheduler

Change-Id: I0bbab1dd184e505b1130ee73714e45ceb7bf4189
Reviewed-on: https://chromium-review.googlesource.com/191357
Commit-Queue: Prashanth B <beeps@chromium.org>
Tested-by: Prashanth B <beeps@chromium.org>
Reviewed-by: Dan Shi <dshi@chromium.org>
9bc32fa584a253142f75a48a063cd6074ea468b9 20-Feb-2014 Prashanth B <beeps@google.com> [autotest] Include priority in AcquireHostRequest.

Before this change we would allocate hosts to the competing request
group with the most demand, i.e., if we had 10 requests with
board:lumpy and 5 with board:lumpy, bluetooth, we would favor
satisfying the 10 request group even if the 5 request group had
a higher priority job in it. With this change requests are grouped by
priority, and then sorted, so higher priority jobs get to ask for
hosts first.
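
A rough sketch of the new grouping, using the lumpy/bluetooth example from the paragraph above. The Request tuple and the function name batch_by_priority are hypothetical, and the priority numbers are made up for the illustration.

```python
from collections import Counter, namedtuple

# deps and acls are frozensets so that identical requests hash equally.
Request = namedtuple("Request", ["deps", "acls", "priority"])


def batch_by_priority(requests):
    """Count identical requests, then order the groups by priority.

    Returns a list of (request, count) pairs, highest priority first,
    regardless of how many requests each group holds.
    """
    counts = Counter(requests)
    return sorted(counts.items(), key=lambda item: item[0].priority, reverse=True)


if __name__ == "__main__":
    lumpy = Request(frozenset(["board:lumpy"]), frozenset(), 10)
    lumpy_bt = Request(frozenset(["board:lumpy", "bluetooth"]), frozenset(), 20)
    for request, count in batch_by_priority([lumpy] * 10 + [lumpy_bt] * 5):
        print(sorted(request.deps), request.priority, count)
```

Because priority is part of the grouping key, two requests with the same deps and acls but different priorities land in different groups, which is the first downside listed below.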

Downsides (that are obviously less important than correctness):
1. Same deps/acls requests will sometimes need > 1 request for hosts,
though the priority is not a part of the actual (django) request.
2. Before, we would lock a maximum number of devices upfront with the
highest demand request. This change leads to a slight fragmentation
of this model.

BUG=chromium:345308
TEST=Ran 2 suites with different priorities and forced inversion without
the change, then made sure they scheduled correctly with it.
DEPLOY=scheduler

Change-Id: Icb12cebd0c874529ccc0d199d8fe377a953da7f1
Reviewed-on: https://chromium-review.googlesource.com/187317
Tested-by: Prashanth B <beeps@chromium.org>
Reviewed-by: Alex Miller <milleral@chromium.org>
Commit-Queue: Prashanth B <beeps@chromium.org>
cc9fc70587d37775673e47b3dcb4d6ded0c6dcb4 02-Dec-2013 beeps <beeps@chromium.org> [autotest] RDB Refactor II + Request/Response API.

Scheduler Refactor:
1. Batched processing of jobs.
2. Rdb hits the database instead of going through host_scheduler.
3. Migration to add a leased column. The scheduler releases hosts
every tick, back to the rdb.
4. Client rdb host that queue_entries use to track a host, instead
of a database model.

Establishes a basic request/response api for the rdb:
rdb_utils:
1. Requests: Assert the format and fields of some basic request types.
2. Helper client/server modules to communicate with the rdb.
rdb_lib:
1. Request managers for rdb methods:
a. Match request-response
b. Abstract the batching of requests.
2. JobQueryManager: Regulates database access for job information.
rdb:
1. QueryManagers: Regulate database access
2. RequestHandlers: Use query managers to get things done.
3. Dispatchers: Send incoming requests to the appropriate handlers.
Ignores wire formats.
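
The three rdb layers listed above could be pictured as in the sketch below. None of these names come from rdb.py: QueryManager, AcquireHandler, dispatch and the list of dicts standing in for host rows are all assumptions for illustration.

```python
class QueryManager:
    """The only layer that touches the 'database' (a list of dicts here)."""

    def __init__(self, host_rows):
        self._rows = host_rows

    def find_unleased(self, label):
        return [
            row["hostname"]
            for row in self._rows
            if label in row["labels"] and not row["leased"]
        ]


class AcquireHandler:
    """Uses a query manager to answer one kind of request."""

    def __init__(self, query_manager):
        self._query_manager = query_manager

    def handle(self, request):
        matches = self._query_manager.find_unleased(request["label"])
        return matches[:1]


def dispatch(handlers, method, requests):
    """Send each incoming request to the handler registered for method."""
    handler = handlers[method]
    return [handler.handle(request) for request in requests]
```

The dispatcher only maps a method name to a handler and works on plain Python objects, which matches the note that it ignores wire formats.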

TEST=unittests, functional verification.
BUG=chromium:314081, chromium:314083, chromium:314084
DEPLOY=scheduler, migrate

Change-Id: Id174c663c6e78295d365142751053eae4023116d
Reviewed-on: https://chromium-review.googlesource.com/183385
Reviewed-by: Prashanth B <beeps@chromium.org>
Commit-Queue: Prashanth B <beeps@chromium.org>
Tested-by: Prashanth B <beeps@chromium.org>
7d8273bad1318c13698a162a6e5910bea060d167 06-Nov-2013 beeps <beeps@chromium.org> [autotest] RDB refactor I

Initial refactor for the rdb, implements 1 in this schematic:
https://x20web.corp.google.com/~beeps/rdb_v1_midway.jpg

Also achieves the following:
- Don't process an hqe more than once, after having assigned a host to it.
- Don't assign a host to a queued, aborted hqe.
- Drop the metahost concept.
- Stop using labelmetahostscheduler to find hosts for non-metahost jobs.
- Include a database migration script for jobs that were still queued during
the scheduler restart, since they will now need a meta_host dependency.

This cl also doesn't support the scheduler's ability to:
- Schedule an atomic group
* Consequently, also the ability to block a host even when the hqe using it is
no longer active.
- Schedule a metahost differently from a non-metahost
* Both metahosts and non-metahosts are just labels now
* Jobs which are already assigned hosts are still given precedence, though
- Schedule based on only_if_needed.

And fixes the unittests appropriately.

TEST=Ran suites, unittests. Restarted scheduler after applying these changes
and tested migration. Ran suite scheduler.
BUG=chromium:314082,chromium:314219,chromium:313680,chromium:315824,chromium:312333
DEPLOY=scheduler, migrate

Change-Id: I70c3c3c740e51581db88fe3ce5879c53d6e6511e
Reviewed-on: https://chromium-review.googlesource.com/175957
Reviewed-by: Alex Miller <milleral@chromium.org>
Commit-Queue: Prashanth B <beeps@chromium.org>
Tested-by: Prashanth B <beeps@chromium.org>