-
Notifications
You must be signed in to change notification settings - Fork 230
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Contributing isoscope scheduler #1126
base: master
Are you sure you want to change the base?
Conversation
69fdbee
to
0c95b78
Compare
Hi @vitaly-krugl thanks for the PR! Although this PR has very high quality, to be honest I'm hesitant of accepting this PR, because it is somewhat complex and pytest-xdist does not have much maintenance bandwidth and it adds to that. But this is my opinion only, @RonnyPfannschmidt @bluetech what do you think? Also consider that this might be published as a separate plugin, it does not need to be included in pytest-xdist: the package would implement |
first i want to recognize the effort and creativity that went into this pr, those few thousand lines are conceptually very dense, and currently come without additional tests, we have a very bad match up here for the maintenance situation - the new contributed code is without tests and of necessarily high complexity most volunteer managed opensource projects are unable to take in such a contribution easily, in particular if it comes in without any deep prior engagement/discussion with the maintainer team if the goal is to make this quickly available, then i strongly recommend publishing this as a "plugin" to xdist, if the goal is to land this in xdist, then its likely to take a long time, as the contribution size and complexity of the solved problem is immense i agree that the code itself is high quality, but the high complexity of the solved problem, the sheer size of this contribution, the need for tests to be written and our own awareness of our very limited maintenance capabilities its going to be a while before im in a place to give it the due diligence it needs, and i suspect this will be a problem for the initial contribution i also suspect that multi host aware architecture of the coordinator will look different than initially imagined my current recommendation would be to begin this as own pypi package, adding a own testsuite and to coordinate with xdist to ensure xdist changes dont break it in addition if this is not a direct part of xdist, its more easy to declare multi host testing not being supported in the feature and engage as we eventually enable to pull this in i absolutely want to enable coordinated fixture/resource creation/destruction |
Hi @nicoddemus and @RonnyPfannschmidt, thank you for following up with you recommendations. I immensely appreciate the effort and personal sacrifice of the volunteer maintainers, having previously contributed extensively to an open source project myself (Pika). I developed this scheduler to facilitate a CI use case at my workplace. This scheduler has been in constant - and successful - use for quite some time now. Just to be clear - I am working on tests. I integrated the isoscope scheduler into existing acceptance tests. I will add isoscope-specific tests next. UPDATE: tests are done. Regarding complexity: The code of this scheduler could be simplified quite a bit if there is a supported (and efficient) way to shut down and restart all worker nodes between scopes (without re-triggering costly test collection; e.g., we have over 30,000 test cases - and growing rapidly - so cannot afford additional costly collection cycles between scopes. Please advise whether it's possible to shut down and restart all worker nodes efficiently (without triggering test collection), and how (a link to an example would be appreciated). Regarding code size: The code is very well commented, which contributes significantly to line count, but not to complexity of the code. Also, I will include design documentation. I would be happy to engage in any further discussion with the maintainer team. Since I already had the code that solved a useful problem, I expected that the creation of the pull request would naturally facilitate/trigger a discussion. There is not a huge amount of urgency from my side, as the company is using its own copy of the scheduler. That said, I would like to get it integrated into xdist out of practical considerations, if ever possible. As a fallback, I will consider open-sourcing it as an independent PyPi project, in which case your offer of CI integration to ensure compatibility with future xdist changes would be immensely helpful. Is there an example of such CI integration in existence today? Finally, the feedback about multi-host support/testing is noted. Are you referring to the |
…dist. Not tested yet.
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
…file and test_multi_file.
for more information, see https://pre-commit.ci
…ripts to work around "TypeError: 'type' object is not subscriptable" in py38
for more information, see https://pre-commit.ci
…List vs list in type hings, favoring list and tuple.
for more information, see https://pre-commit.ci
…d test_module_single_start.
for more information, see https://pre-commit.ci
…workers_utilized.
for more information, see https://pre-commit.ci
…TestIsoScope.test_single_scope_subset_of_workers_utilized.
…ngle_scope_subset_of_workers_utilized.
for more information, see https://pre-commit.ci
… TestIsoScope.test_multi_scope_with_insufficient_fence.
…st_multi_scope_with_insufficient_fence.
….test_multi_scope_with_insufficient_fence.
…r_rule_with_two_workers.
for more information, see https://pre-commit.ci
…up_teardown_coordination.
for more information, see https://pre-commit.ci
…up_teardown_coordination.
…up_teardown_coordination.
…up_teardown_coordination.
…up_teardown_coordination.
…up_teardown_coordination.
…up_teardown_coordination.
for more information, see https://pre-commit.ci
…up_teardown_coordination.
for more information, see https://pre-commit.ci
…up_teardown_coordination.
…st_distributed_setup_teardown_coordination.
… in TestIsoScope.test_distributed_setup_teardown_coordination.
… bytes in position 2-3: truncated \UXXXXXXXX escape in TestIsoScope.test_distributed_setup_teardown_coordination.
…ybe_call_setup to work correctly on Windows.
e4a1e6f
to
95b91c8
Compare
isoscope
scheduler
isoscope
scheduler
there is no way to "restart" a worker without collection - when a new worker starts, it has no collection and need to run it to get into "shape" |
and yes - multi-host is about |
READY FOR REVIEW (tests implemented and passing) ...
Implementation of the Distributed Scope Isolation scheduler
isoscope
for pytest-xdist.Background
I developed this scheduler to facilitate a CI use case at my workplace. This scheduler has been in constant - and successful - use for quite some time now.
NOTE: In the following discussion, the meaning of "Scope" is the same as in the
loadscope
scheduler - it's the set of tests associated with a single test class or the set of module-level test functions associated with a single python test module.Introduction
The proposed scheduler facilitates distributed execution of unit tests on a one-Scope-at-a-time basis with coordinated Scope-level resource setup and teardown.
We say that the test Scopes are isolated, since each test Scope - along with its Scope-level resource Setup/Teardown - is executed entirely before the subsequent test Scope. Hence we name this scheduler
isoscope
.Alternatives: While it's possible to achieve this type of isolation by using the
loadscope
scheduler with a single xdist worker (--dist=loadscope -n1
) which serializes executions of all tests, theisoscope
scheduler avails more performant distributed execution of each Scope's tests.Tests Scopes that meet the following constraints may benefit from the
isoscope
scheduler:Properties of this scheduler
Foundational Design Concepts of this Scheduler
Pytest/xdist Runtest Protocol: Each xdist worker always holds one test item in the queue and won't execute it until it receives either an additional test item from xdist scheduler or the "shutdown" signal in order to satisfy the
nextitem
requirements ofpytest_runtest_protocol
which requires a valid "nextitem" during the test session and treatsnextitem=None
as the end of the test session.Our design takes advantage of the above-described pytest/xdist Runtest Protocol, including the requirement to always retain the last test item in each worker's input queue, to implement an orderly isolation of all tests within a given test Scope. We call it the "At least two active-Scope tests per worker" rule, described as follows:
When
isoscope
scheduler distributes tests for the Active Scope - the Scope it intends to execute in the current cycle - it distributes at least two tests from this Scope into each available worker queue. The only exception is when the Active Scope has only one test - in this case only this single test is distributed to an available worker. Here is the detailed workflow:isoscope
provides theiso_scheduling
fixture of typeIsoSchedulingFixture
that facilitates coordination of Scope-level resource Setup/Teardown such that Setup and Teardown are performed exactly once per Scope versus per worker.isoscope
enqueues "Fence" test items from the subsequent Scope(s) - one Fence test per xdist Worker queue containing its last test from the current Active Scope. If there are not enough Fence items available in the remaining test Scopes, thenisoscope
enqueues the "Shutdown" messages into the remaining workers instead of Fence tests. This causes each such xdist Worker to process its last remaining test from the current Active Scope and execute all of the Scope's class or module Teardown fixtures.NOTE: The distribution of the Fence tests also takes into account the DSI "At least two active-Scope tests per worker" Rule: DSI limits the number of Fence items it takes from a subsequent Test Scope, making sure that the Rule won't be compromised when the Fence Scope becomes the Active Scope.
Thanks for submitting a PR, your contribution is really appreciated!
Here's a quick checklist that should be present in PRs:
Make sure to include reasonable tests for your change if necessary
We use towncrier for changelog management, so please add a news file into the
changelog
folder following these guidelines:Name it
$issue_id.$type
for example588.bugfix
;If you don't have an issue_id change it to the PR id after creating it
Ensure type is one of
removal
,feature
,bugfix
,vendor
,doc
ortrivial
Make sure to use full sentences with correct case and punctuation, for example: