5de10171860a2ef3a30aa2ba6a67431673120649 |
|
05-Jun-2015 |
Dan Shi <dshi@chromium.org> |
[autotest] Fix a couple of issues in bulk post of metadata.

The metadb had an outage in the last couple of hours, which led to all reported data getting the same timestamp. This CL fixes the bug and increases the buffer size:
1. Increase the buffer size, and also limit the size of a single upload.
2. Add time_recorded to the host history metadata.

BUG=None
TEST=local
Change-Id: I157eb25ab0b8aeb227080aca47e42d3834ae1337
Reviewed-on: https://chromium-review.googlesource.com/275641
Trybot-Ready: Dan Shi <dshi@chromium.org>
Tested-by: Dan Shi <dshi@chromium.org>
Reviewed-by: Fang Deng <fdeng@chromium.org>
Commit-Queue: Dan Shi <dshi@chromium.org>
/external/autotest/scheduler/rdb_hosts.py
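The time_recorded fix above can be sketched as follows. This is a hypothetical illustration, not the actual autotest code: the function name and record shape are invented, but it shows why the timestamp must be captured when the event happens rather than when a buffered batch is finally uploaded.

```python
import time


def make_host_history_record(hostname, status, now=None):
    """Build a host_history record stamped at event time.

    Hypothetical sketch: stamping time_recorded at record creation
    (rather than at upload time) keeps an upload outage from collapsing
    every buffered record onto the same timestamp.
    """
    return {
        '_type': 'host_history',
        'hostname': hostname,
        'status': status,
        # Captured now, so a later bulk upload preserves event ordering.
        'time_recorded': now if now is not None else time.time(),
    }
```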
|
cf2e8dd3f81d5eb4c9720db396ebbf64fd7b9ae4 |
|
08-May-2015 |
Dan Shi <dshi@chromium.org> |
[autotest] Add a new thread to upload metadata reported by the scheduler.

Currently a host state change is reported to metadb before the change is committed to the database, and each change makes a separate ES post call to send data. To avoid performance overhead in the scheduler, UDP is used, but UDP can lose data, especially now that the ES server lives in GCE while the scheduler runs in a different network.

This CL fixes the issue by reporting metadata in bulk from a separate thread. The ES bulk API performs much better than individual calls: a single index request over HTTP might take 80ms, while the bulk API can index 1000 records in less than 0.5 seconds.

BUG=chromium:471015
TEST=Ran a local scheduler and made sure all metadata was uploaded; also confirmed the scheduler can be shut down properly.
Change-Id: I38991b9e647bb7a6fcaade8e8ef9eea27d9aa035
Reviewed-on: https://chromium-review.googlesource.com/270074
Reviewed-by: Dan Shi <dshi@chromium.org>
Commit-Queue: Dan Shi <dshi@chromium.org>
Trybot-Ready: Dan Shi <dshi@chromium.org>
Tested-by: Dan Shi <dshi@chromium.org>
Reviewed-by: Keith Haddow <haddowk@chromium.org>
/external/autotest/scheduler/rdb_hosts.py
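The buffered, threaded bulk-upload scheme described in this CL could be sketched as below. All names are illustrative, not the actual autotest API; the real code posts batches to the ES bulk endpoint, which is stubbed out here as a caller-supplied flush function.

```python
import queue
import threading
import time


class BulkUploader(object):
    """Buffer metadata records and flush them in batches from a worker thread.

    Illustrative sketch: the scheduler thread only enqueues records, so
    metadata reporting adds no ES round-trip to the scheduling loop.
    """

    def __init__(self, flush_fn, max_batch=1000):
        self._queue = queue.Queue()
        self._flush_fn = flush_fn    # e.g. a wrapper around the ES _bulk API
        self._max_batch = max_batch  # cap on records in a single upload
        self._stop = threading.Event()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def post(self, record):
        """Called by the scheduler; never blocks on the network."""
        self._queue.put(record)

    def _run(self):
        # Keep draining until asked to stop AND the buffer is empty.
        while not self._stop.is_set() or not self._queue.empty():
            batch = []
            while len(batch) < self._max_batch:
                try:
                    batch.append(self._queue.get(timeout=0.05))
                except queue.Empty:
                    break
            if batch:
                self._flush_fn(batch)  # one bulk call per batch
            else:
                time.sleep(0.01)

    def close(self):
        """Flush remaining records and stop the worker (clean shutdown)."""
        self._stop.set()
        self._worker.join()
```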
|
b72f4fbcf1583da27f09f4abb9d8162530bf4559 |
|
21-Jan-2015 |
Gabe Black <gabeblack@chromium.org> |
graphite: Reorganize the elastic search code so we can put it in chromite.

This change separates the elastic search integration code from the code that, for instance, reads config information from the autotest global config. That way, it can be moved into chromite without breaking any dependencies.

BUG=chromium:446291
TEST=Ran stats_es_functionaltest.py. Ran unit tests. Ran a butterfly-paladin tryjob with --hwtest.
Change-Id: I0dbf135c4f1732d633e5fc9d5edb9e1f4f7199d5
Reviewed-on: https://chromium-review.googlesource.com/242701
Reviewed-by: Dan Shi <dshi@chromium.org>
Tested-by: Gabe Black <gabeblack@chromium.org>
Commit-Queue: Gabe Black <gabeblack@chromium.org>
/external/autotest/scheduler/rdb_hosts.py
|
8c98ac10beaa08bfb975c412b0b3bda23178763a |
|
23-Dec-2014 |
Prashanth Balasubramanian <beeps@google.com> |
[autotest] Send frontend jobs to shards.

Frontend jobs on hosts that are on a shard are currently disallowed, because the host-scheduler on the master ignores jobs based on meta-host, and frontend jobs have no meta-host. This CL makes the following changes:
- Make host-scheduler ignore frontend jobs that should be picked up by a shard.
- Send such frontend jobs in the heartbeat.
- Allow creation of frontend jobs in the rpc.

TEST=Tested the following:
- Created a job on a host on the shard from the AFE frontend. Observed that it runs on the shard and completes on the master.
- Created a job on two hosts (one on the shard, the other on the master) from the AFE frontend. Made sure an exception is raised with the correct message.
- Ran a normal dummy suite on the shard; the normal flow still works, and the heartbeat contains the right information.
- Ran a normal dummy suite on the master; it works.

BUG=chromium:444790
DEPLOY=apache, host-scheduler
Change-Id: Ibca3d36cb59fed695233ffdc89506364c402cc37
Reviewed-on: https://chromium-review.googlesource.com/240396
Reviewed-by: Mungyung Ryu <mkryu@google.com>
Reviewed-by: Dan Shi <dshi@chromium.org>
Commit-Queue: Fang Deng <fdeng@chromium.org>
Tested-by: Fang Deng <fdeng@chromium.org>
/external/autotest/scheduler/rdb_hosts.py
|
0e96b046c053e8b9e85c6512e75b850ffbbd4358 |
|
30-Sep-2014 |
Dan Shi <dshi@chromium.org> |
[autotest] Record host's platform and pool info in host status metadata.

BUG=chromium:419043
TEST=Ran tests locally and queried the test ES server at http://172.25.61.45:9200/_plugin/elastic-hammer/, searching for:
{"query": {"bool": {"minimum_should_match": 4, "should": [{"term": {"_type": "host_history"}}, {"term": {"hostname": "172.27.213.193"}}, {"term": {"pools": "bvt"}}, {"range": {"time_recorded": {"gte": 1407196142.893756, "lte": 1507282542.893756}}}]}}, "size": 10000, "sort": [{"time_recorded": "asc"}]}
Confirmed the results have metadata like:
platform: "peppy"
pools: ["bvt", "suites"]

Change-Id: I068427510fc983b9ef4cc448e8a5bb1dace71c52
Reviewed-on: https://chromium-review.googlesource.com/220551
Commit-Queue: Dan Shi <dshi@chromium.org>
Tested-by: Dan Shi <dshi@chromium.org>
Reviewed-by: Fang Deng <fdeng@chromium.org>
/external/autotest/scheduler/rdb_hosts.py
|
e4cb9e23709425fcbb5faec40bfe824b818bd106 |
|
29-Aug-2014 |
Dan Shi <dshi@chromium.org> |
[autotest] Change the devserver stats call to log data only for staging artifacts.

File names differ because they include build information; recording stats for each file would create too many counters in graphite and lead to a disk space issue.

BUG=chromium:404475
TEST=Local setup. To verify metadata, visit http://172.25.61.45:9200/_plugin/elastic-hammer/, update the search url with the index of the local setup (dshi.mtv/_search), and search for:
{ "query": {"bool": {"minimum_should_match": 1, "should": [{"term": {"_type": "devserver"}} ]}}, "size": 10000, "sort": [{"time_recorded": "asc"}]}
Confirm the data.

Change-Id: Ic11f76c3ef6fbf8cf5d25312d3285e1e9b87a178
Reviewed-on: https://chromium-review.googlesource.com/215640
Tested-by: Dan Shi <dshi@chromium.org>
Reviewed-by: Fang Deng <fdeng@chromium.org>
Commit-Queue: Dan Shi <dshi@chromium.org>
/external/autotest/scheduler/rdb_hosts.py
|
7cf3d84fda609f6402543bb7e0bf3e3b7f93d539 |
|
13-Aug-2014 |
Dan Shi <dshi@chromium.org> |
[autotest] Record more metadata to include more job related information.

So that the host history rpc can return more info, like job name/owner, we added a new attribute, metadata_info, to the host object. metadata_info is a dictionary containing information such as task_id, task_name, job_id, job_name, and parent_job_id. When the host status changes, metadata_info is reported to metaDB. Examples of metadata_info:
{"hostname": "192.96.48.88", "task_name": "Verify", "task_id": 6551}
{"job_name": "dummy_pass", "hostname": "192.96.48.88", "task_name": "Reset", "task_id": 6552, "job_id": 4133, "parent_job_id": 4132}
{'owner': 'debug_user', 'parent_job_id': null, 'job_id': 4140, 'job_name': 'dummy_pass'}

BUG=chromium:394451
TEST=Local setup; ran site_utils/host_history.py --hosts 100.96.48.196 and tested the output by visiting http://172.25.61.45:9200/_plugin/elastic-hammer/, entering the following in the filter, then clicking search to see the results:
{"query": {"bool": {"minimum_should_match": 3, "should": [{"term": {"_type": "host_history"}}, {"term": {"hostname": "100.96.48.196"}}, {"range": {"time_recorded": {"gte": 1408121317, "lte": 1409007215}}}]}}, "size": 10000, "sort": [{"time_recorded": "asc"}]}

Change-Id: Icddc27fb39529924d0030dbec97176a2d8b683fc
Reviewed-on: https://chromium-review.googlesource.com/212304
Reviewed-by: Dan Shi <dshi@chromium.org>
Tested-by: Dan Shi <dshi@chromium.org>
Commit-Queue: Dan Shi <dshi@chromium.org>
/external/autotest/scheduler/rdb_hosts.py
|
0d7474640b6a6e4334d16921fe0df79418007af9 |
|
17-Jul-2014 |
Michael Liang <michaelliang@chromium.org> |
[autotest] Log host history and categorize metadata in es by _type.

Log the hostname, status, and timestamp, along with a debug string carrying the job_id and hqe id, to the metadata db under _type='host_history'. Also modified scheduler_models to log with _type='hqe_status'. The es_utils index now defaults to the autotest instance, i.e. 'cautotest'.

BUG=None
TEST=Ran generic_RebootTest locally and verified the status is logged in esdb.
TEST=python stats_es_functionaltest.py --all --es_port=prod
Change-Id: I64223ed12c45c5e2adeca7630cd9d2ffd28dd2c2
DEPLOY=scheduler
Reviewed-on: https://chromium-review.googlesource.com/208711
Reviewed-by: Michael Liang <michaelliang@chromium.org>
Tested-by: Michael Liang <michaelliang@chromium.org>
Reviewed-by: Dan Shi <dshi@chromium.org>
Commit-Queue: Michael Liang <michaelliang@chromium.org>
/external/autotest/scheduler/rdb_hosts.py
|
2d8047e8b2d901bec66d483664d8b6322501d245 |
|
28-Apr-2014 |
Prashanth B <beeps@google.com> |
[autotest] In-process request/host caching for the rdb.

This CL implements an in-process host cache manager for the rdb. The following considerations were taken into account while designing it:
1. The number of requests outweighs the number of leased hosts.
2. The total number of hosts outweighs the number of leased hosts.
3. The 'same' request can consult the cache within the span of a single batched request. Requests are only the same in terms of the host labels/acls required, not in terms of priority or parent_job_id.

Resulting ramifications:
1. We can't afford to consult the database for each request.
2. We can afford to refresh our in-memory representation of a host just before leasing it.
3. Leasing a host can fail, as we might be using a stale cached host.
4. We can't load a map of all hosts <-> labels on each request.
5. Invalidation is hard for most sane, straightforward choices of keying hosts against requests.
6. Lower priority requests will starve if they try to lease the same hosts taken by higher priority requests.

Main design tenets:
1. We can tolerate some staleness in the cache, since we make sure the host is unleased just before using it.
2. If a job hits a stale cache line, it tries again next tick.
3. Trying to invalidate the cache within a single batched request would be unnecessarily complicated and error prone. Instead, to prevent starvation, each request invalidates only its own cache line, by removing the hosts it has just leased.
4. The same host may be present in 2 different cache lines, but this won't matter because each request checks the leased bit in real time before acquiring it.
5. The entire cache is invalidated at the end of a batched request.

TEST=Ran suites, unittests.
BUG=chromium:366141
DEPLOY=Scheduler
Change-Id: Iafc3ffa876537da628c52260ae692bc2d5d3d063
Reviewed-on: https://chromium-review.googlesource.com/197788
Reviewed-by: Dan Shi <dshi@chromium.org>
Tested-by: Prashanth B <beeps@chromium.org>
Commit-Queue: Prashanth B <beeps@chromium.org>
/external/autotest/scheduler/rdb_hosts.py
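The cache-line scheme laid out in the tenets above could be sketched like this. The class and method names are hypothetical, not the actual rdb code: lines are keyed by the labels/acls a request needs, a request drops only the hosts it just leased from its own line, and the whole cache is cleared at the end of a batched request.

```python
class RDBHostCache(object):
    """In-process cache of hosts, keyed by a request's labels/acls.

    Hypothetical sketch of the tenets above: staleness is tolerated
    because the leased bit is re-checked before a host is actually used.
    """

    def __init__(self):
        self._lines = {}

    @staticmethod
    def _key(deps, acls):
        # Requests are 'the same' if they need the same labels/acls,
        # regardless of priority or parent_job_id.
        return (frozenset(deps), frozenset(acls))

    def get(self, deps, acls):
        """Return the cached host list for this request shape, or None."""
        return self._lines.get(self._key(deps, acls))

    def set(self, deps, acls, hosts):
        self._lines[self._key(deps, acls)] = list(hosts)

    def remove_leased(self, deps, acls, leased):
        # Invalidate only this request's cache line by dropping the hosts
        # it just leased. Other lines may still mention those hosts, but
        # the real leased bit is checked before acquisition anyway.
        key = self._key(deps, acls)
        line = self._lines.get(key)
        if line is not None:
            gone = set(leased)
            self._lines[key] = [h for h in line if h not in gone]

    def clear(self):
        # The entire cache is invalidated at the end of a batched request.
        self._lines.clear()
```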
|
b474fdfd353cdb0888191f4b80e47e6b5343d891 |
|
04-Apr-2014 |
Prashanth B <beeps@google.com> |
[autotest] Lease hosts according to frontend job priorities.

This CL modifies the way we lease hosts by teaching the RDBServerHostWrapper to handle host leasing. Though this involves a separate query for each host, it leads to a design we can later build atomicity into, because we can check the leased bit on a single host before setting it. This model of leasing also has the following benefits:
1. It doesn't abuse the response map.
2. It gives us more clarity into which requests are acquiring hosts, by setting the leased bit in step with host validation.
3. It is more tolerant of db errors, because exceptions raised while leasing one host will not fail the entire batched request.
This CL also adds an rdb_unittest module.

TEST=Unittests, ran suites.
BUG=chromium:353183
DEPLOY=scheduler
Change-Id: I35c04bcb37eee0191a211c133a35824cc78b5d71
Reviewed-on: https://chromium-review.googlesource.com/193182
Reviewed-by: Prashanth B <beeps@chromium.org>
Commit-Queue: Prashanth B <beeps@chromium.org>
Tested-by: Prashanth B <beeps@chromium.org>
/external/autotest/scheduler/rdb_hosts.py
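The per-host leasing model above can be sketched as follows, under the assumption that hosts are plain records with a 'leased' flag; in the real rdb each lease is its own db update on the host row, which is what makes per-host error tolerance possible.

```python
def lease_hosts(hosts, max_needed):
    """Lease up to max_needed hosts, checking the leased bit per host.

    Illustrative sketch: each host is acquired with its own operation,
    so a failure on one host doesn't fail the whole batched request.
    """
    leased = []
    for host in hosts:
        if len(leased) >= max_needed:
            break
        try:
            if host.get('leased'):
                # Someone else got it, or our cached view was stale;
                # skip it and try again next tick if needed.
                continue
            host['leased'] = True  # in the real rdb, a per-host db update
            leased.append(host)
        except Exception:
            # A db error while leasing one host must not fail the
            # entire batched request; move on to the next candidate.
            continue
    return leased
```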
|
489b91d72cd225e902081dbd3f9e47448fe867f6 |
|
15-Mar-2014 |
Prashanth B <beeps@google.com> |
[autotest] Establish a common interface for host representation.

This CL has the work needed to ensure that schema changes made on the server trickle down to the client. If the same changes aren't reflected on the client, creating or saving a client host wrapper for a given host will fail deterministically on the client side, until the modules using rdb_host are modified to reflect the changes.
1. rdb_hosts: a module containing the host hierarchy needed to establish a dependence between the creation of the RDBServerHostWrapper (which is serialized and returned to the client, which converts it into an RDBClientHostWrapper) and the saving of the RDBClientHostWrapper through an rdb update request.
2. rdb_requests: contains the requests/request managers that were in rdb_utils, because I plan to expand them in subsequent CLs.
3. rdb_model_extensions: contains model classes common to both server and client that help establish the common host model interface.
4. rdb integration tests.

TEST=Ran suites, unittests
BUG=chromium:348176
DEPLOY=scheduler
Change-Id: I0bbab1dd184e505b1130ee73714e45ceb7bf4189
Reviewed-on: https://chromium-review.googlesource.com/191357
Commit-Queue: Prashanth B <beeps@chromium.org>
Tested-by: Prashanth B <beeps@chromium.org>
Reviewed-by: Dan Shi <dshi@chromium.org>
/external/autotest/scheduler/rdb_hosts.py
|