Waiting for CI is painful
Does this sound familiar?
- You push a commit with minor changes
- CI is slow, so you start working on something else
- A couple of hours later, you remember to check CI. It failed, and you no longer remember why you made the changes
For the mysql charm, our replication integration tests took around 30 minutes. By running the tests in parallel, we cut that time nearly in half.
[Figure: CI duration when running tests sequentially vs. running tests in parallel]
Why are integration tests slow?
For the mysql charm, the vast majority of the test duration is spent:
- Building the charm (see the Faster integration tests with GitHub caching post)
- Waiting for a charm to deploy and reach active status
- Waiting for the mysql charm to scale up or scale down
Take this test function as an example. For one CI run, the total test duration (excluding the charm build) was 13 minutes 48 seconds. Eighty-five percent of that time (11 minutes 41 seconds) was spent waiting for the charm to deploy, scale up, or scale down.
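As a rough illustration of why the waits dominate, a deploy-and-wait test generally has this shape. This is a minimal sketch assuming the pytest-operator plugin's ops_test fixture; the function name, application name, and timeout are illustrative, not the charm's actual test:
async def test_build_and_deploy(ops_test):
    charm = await ops_test.build_charm(".")
    await ops_test.model.deploy(charm, application_name="mysql", num_units=3)
    # Nearly all of the wall-clock time is spent in waits like this one
    await ops_test.model.wait_for_idle(apps=["mysql"], status="active", timeout=20 * 60)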
Running tests in parallel
In the mysql charm, our integration tests are organized into multiple Python files. Each of these files already runs in parallel in our CI. To further shorten our total test duration, we can run test functions from the same Python file in parallel.
Step 1: Identify when to run tests in parallel
Running tests in parallel has tradeoffs. Parallel tests decrease the time spent waiting for your CI suite to pass, but increase the total machine time used. If your CI runners cost money or have a concurrent job limit, it may be better to run tests sequentially.
Defining “slow test”
For the mysql charm, all of our integration tests have a slow setup (deploy 3 mysql units & wait for active status). After the setup, some of our integration tests are fast (they do not deploy more charms or scale the mysql charm) and some of our integration tests are slow (they do deploy more charms or scale the mysql charm).
If one of our Python test files contains 2 or more tests that are slow after the initial setup, we run the slow tests in parallel.
Step 2: Split the tests into groups
In our Python test file, we add @pytest.mark.group(#) to each of our test functions (replace # with an integer).
Each slow test should have its own group. Fast tests should share a group with a slow test (unless there are no slow tests—then there is only one group).
Exception: Sometimes, a slow test will depend on another slow test (e.g. the first test deploys charm A and the second test deploys charm B & relates it to charm A). If only one slow test depends on the first slow test, they should share a group.
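For example, a test file with two slow tests and one fast test might be marked like this. The test names and bodies are hypothetical, not the mysql charm's actual tests:
import pytest

@pytest.mark.group(1)
async def test_deploy_and_relate_app(ops_test):
    # Slow: deploys another charm and relates it to mysql
    ...

@pytest.mark.group(1)
async def test_read_write_data(ops_test):
    # Fast: no extra deploys or scaling, so it shares group 1 with a slow test
    ...

@pytest.mark.group(2)
async def test_scale_up(ops_test):
    # Slow: scales the mysql application, so it gets its own group
    ...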
Remember to register the custom marker in pyproject.toml
[tool.pytest.ini_options]
markers = ["group"]
Step 3: Provision a runner for each group in GitHub CI
In tests/conftest.py, add a pytest command-line option to collect groups
def pytest_addoption(parser):
    parser.addoption(
        "--collect-groups",
        action="store_true",
        help="Collect test groups (used by GitHub Actions)",
    )


# Enables `--collect-only` mode if `--collect-groups` is passed
def pytest_configure(config):
    if config.option.collect_groups:
        config.option.collectonly = True
In tests/integration/conftest.py, collect the group numbers for each Python file and append a JSON-encoded string to the GITHUB_OUTPUT file
import dataclasses
import json
import os
from typing import Optional


def _get_group_number(function) -> Optional[int]:
    """Gets group number from test function marker.

    This example has a group number of 1:

    @pytest.mark.group(1)
    def test_build_and_deploy():
        pass
    """
    group_markers = [marker for marker in function.own_markers if marker.name == "group"]
    if not group_markers:
        return
    assert len(group_markers) == 1
    marker_args = group_markers[0].args
    assert len(marker_args) == 1
    group_number = marker_args[0]
    assert isinstance(group_number, int)
    return group_number


def _collect_groups(items):
    """Collects unique group numbers for each test module."""

    @dataclasses.dataclass(eq=True, order=True, frozen=True)
    class Group:
        path_to_test_file: str
        group_number: int
        job_name: str

    groups: set[Group] = set()
    for function in items:
        if not (group_number := _get_group_number(function)):
            continue
        # Example: "integration.relations.test_database"
        name = function.module.__name__
        assert name.split(".")[0] == "integration"
        # Example: "tests/integration/relations/test_database.py"
        path_to_test_file = f"tests/{name.replace('.', '/')}.py"
        # Example: "relations/test_database.py | group 1"
        job_name = f"{'/'.join(path_to_test_file.split('/')[2:])} | group {group_number}"
        groups.add(Group(path_to_test_file, group_number, job_name))
    sorted_groups: list[dict] = [dataclasses.asdict(group) for group in sorted(groups)]
    output = f"groups={json.dumps(sorted_groups)}"
    print(f"\n\n{output}\n")
    output_file = os.environ["GITHUB_OUTPUT"]
    with open(output_file, "a") as file:
        file.write(output)


def pytest_collection_modifyitems(config, items):
    if config.option.collect_groups:
        _collect_groups(items)
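To illustrate, following the hypothetical example paths in the comments above, a module with two groups would append a line roughly like this to the GITHUB_OUTPUT file:
groups=[{"path_to_test_file": "tests/integration/relations/test_database.py", "group_number": 1, "job_name": "relations/test_database.py | group 1"}, {"path_to_test_file": "tests/integration/relations/test_database.py", "group_number": 2, "job_name": "relations/test_database.py | group 2"}]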
In your GitHub workflow, add a job to collect the group numbers
jobs:
  collect-integration-tests:
    name: Collect integration test groups
    steps:
      - name: Checkout
      - name: Collect test groups
        id: collect-groups
        run: tox run -e integration -- tests/integration --collect-groups
    outputs:
      groups: ${{ steps.collect-groups.outputs.groups }}
In tests/conftest.py, add an option to specify the group number
def pytest_addoption(parser):
    …
    parser.addoption("--group", type=int, help="Integration test group number")
Then, add a matrix to your integration test job for each group
jobs:
  collect-integration-tests:
    ...
  integration-test:
    strategy:
      matrix:
        groups: ${{ fromJSON(needs.collect-integration-tests.outputs.groups) }}
    name: ${{ matrix.groups.job_name }}
    needs:
      - collect-integration-tests
    steps:
      - name: Checkout
      - name: Run integration tests
        run: tox run -e integration -- ${{ matrix.groups.path_to_test_file }} --group ${{ matrix.groups.group_number }}
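Each element of the JSON groups output from the collect job becomes one matrix entry, so GitHub Actions runs one job (on its own runner) per group and names it after the group's job_name (for example, "relations/test_database.py | group 1").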
Finally, update tests/integration/conftest.py to only run tests that match the --group number
def pytest_collection_modifyitems(config, items):
    …
    elif selected_group_number := config.option.group:
        # Remove tests that do not match the selected group number
        filtered_items = []
        for function in items:
            group_number = _get_group_number(function)
            if not group_number:
                function.add_marker(pytest.mark.skip("Missing group number"))
                filtered_items.append(function)
            elif group_number == selected_group_number:
                filtered_items.append(function)
        assert (
            len({function.module.__name__ for function in filtered_items}) == 1
        ), "Only 1 test module can be run if --group is specified"
        items[:] = filtered_items
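You can also run a single group locally with the same invocation the matrix job uses. The path and group number here are the hypothetical examples from above:
tox run -e integration -- tests/integration/relations/test_database.py --group 1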
For a complete example, look at this pull request: mysql-operator#109.