Classify tests with pytest custom markers for quick integration testing iterations

This started as a note-to-self for an integration testing development flow that I keep rediscovering and forgetting over and over again.

The typical integration testing module with pytest-operator looks something like:

import pytest

# setup
@pytest.mark.abort_on_fail
async def test_build(ops_test): ...

@pytest.mark.abort_on_fail
async def test_deploy(ops_test): ...

@pytest.mark.abort_on_fail
async def test_integrate(ops_test): ...

# actual tests
async def test_1(ops_test): ...

async def test_2(ops_test): ...

# teardown
async def test_disintegrate(ops_test): ...

Note the three sections. The 'pack - deploy - integrate' setup steps are, in all charms I've worked with so far, not very difficult to write. Most of the work is in iterating over the 'business logic' tests in the central section. In those tests you might want to, for example, run actions, read relation data, curl API endpoints, verify connectivity, test the workloads and so on.

However, a disproportionate amount of time is spent in the setup/teardown steps because of all the packing and waiting involved.
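
Concretely, the setup steps typically boil down to something like the following sketch (folding build and deploy into one test for brevity; "my-charm" and "other-app" are placeholder names), and it is exactly the build_charm and wait_for_idle calls that eat up the time:

import pytest

# a sketch of typical setup tests; "my-charm" and "other-app" are placeholders
@pytest.mark.abort_on_fail
async def test_build_and_deploy(ops_test):
    # pack the charm under test and deploy it alongside another app
    charm = await ops_test.build_charm(".")
    await ops_test.model.deploy(charm, application_name="my-charm")
    await ops_test.model.deploy("other-app")
    await ops_test.model.wait_for_idle(apps=["my-charm", "other-app"], timeout=1000)

@pytest.mark.abort_on_fail
async def test_integrate(ops_test):
    # relate the two apps and wait for everything to settle
    await ops_test.model.add_relation("my-charm:some-endpoint", "other-app")
    await ops_test.model.wait_for_idle(status="active")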

Wouldn't it be nice if we could skip them, when we're iterating on the tests?

It helps to think of the test functions in the module above as falling into one of two categories:

  • tests that mutate the model topology (e.g. add or remove an app / unit / relation)
  • tests that only observe and verify (e.g. run actions, send TCP requests, …)
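
A test in the second category might, for instance, run a charm action and assert on its output without touching the topology at all. A minimal sketch, assuming an app called "myapp" that exposes a "get-admin-password" action (both names are made up):

# an observe-and-verify test: runs an action, mutates no topology
async def test_admin_password_action(ops_test):
    unit = ops_test.model.applications["myapp"].units[0]
    action = await unit.run_action("get-admin-password")
    action = await action.wait()
    assert action.status == "completed"
    assert action.results.get("password")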

Assuming you can keep the two cleanly separated, you can do the following:

Using custom markers to (de)select topology-changing tests

Add a custom pytest mark to all of your topology-changing tests:

# in test_my_something_integration.py
import pytest

# setup
@pytest.mark.setup  # <<<
@pytest.mark.abort_on_fail
async def test_build(ops_test): ...

@pytest.mark.setup  # <<<
@pytest.mark.abort_on_fail
async def test_deploy(ops_test): ...

@pytest.mark.setup  # <<<
@pytest.mark.abort_on_fail
async def test_integrate(ops_test): ...

# business logic tests
async def test_1(ops_test): ...

async def test_2(ops_test): ...

# teardown
@pytest.mark.teardown  # <<<
async def test_disintegrate(ops_test): ...
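
One detail worth mentioning: pytest warns about unknown marks (and errors out under --strict-markers), so you may want to register the custom setup/teardown markers. A minimal way to do that, assuming you keep a conftest.py next to the tests:

# conftest.py
def pytest_configure(config):
    # register the custom marks so pytest doesn't warn about them
    config.addinivalue_line("markers", "setup: tests that set up the model topology")
    config.addinivalue_line("markers", "teardown: tests that tear down the model topology")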

If you have whole modules or classes dedicated to performing setup/teardown tasks, you can also collectively mark those. See the documentation for how you can do that.
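
For example, a module dedicated entirely to setup tests (the filename here is hypothetical) can be marked in one go with a module-level pytestmark:

# in test_setup.py (a hypothetical module containing only setup tests)
import pytest

pytestmark = pytest.mark.setup  # applied to every test in this module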

Now you can:

tox -e integration -- -k test_my_something_integration -m setup --keep-models

This will:

  • set up your integration testing environment with tox, so you have all dependencies nicely encapsulated
  • only run this test module
  • only run the tests marked with @pytest.mark.setup
  • keep the juju model that ops_test provisioned for you once the setup is done

When the command returns, assuming your charms pack/deploy/relate without issues, you'll have a test-something model in your currently selected juju controller.

Alternatively, you can create that model manually, prior to running the setup tests, and pass its name to pytest-operator using the --model flag. The command then becomes:

tox -e integration -- -k test_my_something_integration -m setup --model <some-model-name> --keep-models

In that model, you'll find your charms ready to be tested. Grab that model name; you'll need it next. Now comes the fun part. Run:

tox -e integration -- -k test_my_something_integration -m "not setup and not teardown" --model <some-model-name>

This will:

  • set up your integration testing environment with tox, so you have all dependencies nicely encapsulated
  • only run this test module
  • select the juju model you already created (with all the charms/apps you already deployed and integrated)
  • only run the tests not marked with either setup or teardown, that is, only your business logic tests

The end

In conclusion: with a couple of carefully chosen pytest markers, separating the test concerns lets you iterate quickly on parts of the test suite (or on the charm code that the suite is testing!).

Pro tip: say you have a failing integration test for an actual charm bug. For example, a config file isn't where it's supposed to be. There's an issue with the charm's _push_config_to_container() function.

$ # prepare the testing env
$ tox -e integration -- -k my_test_module -m setup --keep-models  

$ # switch to that model
$ juju switch test-model  # name will be random

$ # go to the charm repo 
$ cd /path/to/charm/repo

$ # set up code syncing
$ jhack sync -S mycharm

$ # fix your code 
$ vim ./src/charm.py  # do what needs doing
$ # once you save, `jhack sync` will push your changed charm.py file to the unit

$ # invoke the charm method that's pushing the config to the container
$ jhack eval myapp/0 "self._push_config_to_container()"

$ # run only the integration test that was failing
$ tox -e integration -- -k test_config_is_valid --model test-model  

et voilà


I've done a poor man's version of this for a while and it's really helpful. I might do tox -e integration -- --model my-model once, and if it fails on a test I'd rerun it with tox -e integration -- -k "not build_and_deploy" or tox -e integration -- -k "test_that_failed". But using markers with a standard nomenclature is a much nicer way, especially when setup occurs across several tests. If I could find the time, I'd love to put these markers in all the Kubeflow charms.

I've wanted to do this too for cases where a bundle can benefit from a charm's tests. It would be nice for a bundle's tests to be able to do:

  • deploy bundle
  • git clone charm-in-bundle
  • cd charm-in-bundle; tox -e integration -- -k "not setup" # ← test the bundle using a charm's tests

That could help spot configuration errors that bundles introduce and that break charms, without literally duplicating the tests unnecessarily.

Very nice!

This also reminds me that we rarely include teardown in our itests!


To help standardization, I could see if I can contribute this to pytest-operator, so we could run tox -e integration -- --skip-teardown | --skip-setup

Yeah, having a few standard keywords that pytest-operator takes as first-class would be good. If nothing else, it documents a convention people can opt into.

I just took a look at the codebase and found out there's already a @pytest.mark.skip_if_deployed you can toggle with --no-deploy. So half the functionality is there already.
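
For reference, using that existing marker would look something like this (a sketch; check the pytest-operator docs for the exact marker name and semantics):

# deploy test that pytest-operator skips when --no-deploy is passed
@pytest.mark.skip_if_deployed
@pytest.mark.abort_on_fail
async def test_deploy(ops_test): ...

# then, to reuse an already-deployed model:
# tox -e integration -- --model <some-model-name> --no-deploy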