In Charm Tech, we’ve been experimenting with using GitHub Copilot to automate the migration of charm integration tests from pytest-operator to Jubilant, and feel that the results are good enough that you should give this a go yourself. The joy of running your integration tests on Juju 4 is within (much easier) reach!
If you’re reading this, I imagine you already know, but for the record: we feel that getting your integration tests to use Jubilant is a critical step towards ensuring that your charm runs on Juju 4. If there are issues with your charm on Juju 4.0.3, then we really want to know if those are issues with Juju (please report them!).
TL;DR
With a single prompt and the right model, Copilot can migrate your integration tests to Jubilant in minutes, producing code that’s ready to merge with light review. We tested this on seven charms across five different teams and got scores of 21–25 out of 25 (against our internal rubric) on every single run. Some migrations took only three minutes, and all were done in under twenty (although you’ll need a bit of human time afterwards for validation, of course). That may be less time than the tests themselves take to run – in some cases it might even be faster than bootstrapping the test environment ;).
What We Did
We ran 22 experiments testing different strategies for AI-assisted migration:
- 6 different prompt strategies — from a bare one-line instruction to a detailed step-by-step recipe
- 2 models — Claude Sonnet 4.6 and Claude Opus 4.6
- 7 charms — from simple (2 tests) to large (18 files, 1,450 lines removed)
We scored each result on correctness, completeness, code quality, minimality of changes, and how much human review would be needed.
What We Found
The best approach is simple: tell the model to install Jubilant and read its source code before migrating.
```shell
copilot -p "Migrate this charm's integration tests from pytest-operator \
to jubilant and pytest-jubilant. Update all test files, conftest.py, \
helpers, and dependencies.
Before starting, install jubilant and pytest-jubilant from PyPI \
(pip install jubilant pytest-jubilant) and read the source code to \
understand the API." --model claude-sonnet-4.6 --allow-all-paths --allow-all-urls
```
That’s it. No detailed recipe needed. No example charm to point to. Just “read the source, then migrate.”
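For flavour, here’s roughly what the end result looks like – a hedged sketch, with the charm path invented: an async pytest-operator test becomes a plain synchronous Jubilant one.

```python
import jubilant

# Before (pytest-operator): async, driven by the ops_test fixture.
#
#     async def test_deploy(ops_test):
#         charm = await ops_test.build_charm(".")
#         await ops_test.model.deploy(charm)
#         await ops_test.model.wait_for_idle(status="active")

# After (Jubilant): synchronous, using the juju fixture that
# pytest-jubilant provides.
def test_deploy(juju: jubilant.Juju):
    juju.deploy("./my-charm.charm")  # hypothetical charm file path
    juju.wait(jubilant.all_active)
```

No event loop, no `await`, and the model/fixture plumbing comes from pytest-jubilant rather than your conftest.py.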
Depending on how much trust you have in your agent, or how straightforward it is to run the integration tests locally, it’s also worth telling the agent how to get the test results (for example, via `gh` to fetch them from CI, or by running `charmcraft test` locally).
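For example, something like the following (command shapes illustrative – the `gh` commands assume an open PR, and the placeholders are for you or the agent to fill in):

```shell
# Locally, run the charm's declared test suites (recent charmcraft):
charmcraft test

# Or, for CI feedback on an open PR, via the GitHub CLI:
gh pr checks <pr-number>           # see which checks failed
gh run view <run-id> --log-failed  # pull the failing job's logs
```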
Some things we found interesting:
- Sonnet beats Opus — for this task, every single time. Opus over-engineers; Sonnet keeps it simple.
- A bare prompt scores 24/25 — the model already knows enough about Jubilant to do a decent job with zero guidance. This was definitely not the case even 6 months ago.
- Detailed recipes can hurt — we borrowed a carefully-crafted migration recipe, but it actually scored lower than the bare prompt (feel free to argue in the comments that we were unfair here).
- Pointing to docs is counterproductive — the model reads the low-level API docs and rolls its own fixtures instead of using pytest-jubilant’s built-in ones. This is almost certainly because Charm Tech doesn’t (yet!) promote pytest-jubilant, and that’s changing (really soon). It’ll be interesting to see how much this changes, and how quickly.
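To illustrate the fixture point, here’s a sketch of the kind of hand-rolled fixture the model tends to produce when pointed at the low-level docs (it works, but it duplicates what the plugin already provides):

```python
import pytest
import jubilant

# Hand-rolled model fixture — unnecessary when pytest-jubilant
# is installed:
@pytest.fixture(scope="module")
def juju():
    with jubilant.temp_model() as juju:
        yield juju
```

With pytest-jubilant as a dependency, all of the above can be deleted: the plugin ships a `juju` fixture (and registers options like `--charm-file`), so tests just accept `juju` as an argument.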
Example Migrations
We asked Copilot to pick 5 charms to migrate, instructing it to select a diverse set across teams. Although the tests are somewhat diverse in style and complexity, most of the chosen charms are actually from the same team — sorry about that! These have been submitted as real PRs upstream, although they’re in draft:
| Charm | PR | Time | Score | AI Commit | Fix Commits | CI |
|---|---|---|---|---|---|---|
| content-cache-k8s | #167 | 7 min | 25/25 | 1 | 3 | All green |
| nginx-ingress-integrator | #324 | 10 min | 22/25 | 1 | 6 | Pass |
| indico | #723 | 10 min | 21/25 | 1 | 8 | Pass |
| loki-k8s | #572 | 17 min | 21/25 | 1 | 5 | Pass |
| hockeypuck-k8s | #201 | 9 min | 21/25 | 1 | 4 | All green |
(For some of these, we think the remaining failures are unrelated to the migration, but we’re not 100% sure, and would appreciate input from those who know the charms and tests.)
In each PR, the first commit is the direct output of the AI migration process. The remaining commits are fixes to get the tests actually passing — and the AI was able to make those too, with a bit of help. The key was having the CI infrastructure hooked up so the model could see test failures and iterate. Being able to run `charmcraft test` and the integration tests locally would let this feedback loop happen earlier; alternatively, open the PR early (or let the agent do that) and have it use CI for feedback. Either way, the faster your tests run, the better.
I reviewed each of the PRs before submitting them upstream. They look reasonable from the perspective of someone who knows Jubilant, but they need review from someone familiar with the specific charm and its tests.
What the fix commits tell us
Looking across the fix commits, the same patterns come up repeatedly:
- Linting/formatting — almost every PR needed a ruff, black, or flake8 fix. The migration is structurally correct but doesn’t always match the project’s exact formatter config.
- Lock files — `pyproject.toml` was updated but `uv.lock` wasn’t always regenerated.
- Wait/status subtleties — `jubilant.all_active` checks all apps in the model, which breaks if a related app is in “waiting” status. Several PRs needed scoped waits.
- Duplicate registrations — pytest-jubilant already registers `--charm-file`, so re-adding it in `conftest.py` causes errors.
- CI configuration — Jubilant needs Juju 3+, so workflows defaulting to 2.9 needed channel updates.
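The scoped-wait point is worth a closer look. `juju.wait()` polls until its predicate holds, and `jubilant.all_active(status)` with no app names is true only when every app in the model is active — so one perpetually-“waiting” related app stalls the whole test. The real fix is something like `juju.wait(lambda status: jubilant.all_active(status, "my-app"))`; the sketch below mimics the check with a plain dict standing in for Jubilant’s `Status`:

```python
def all_active(status, *apps):
    # Stand-in for jubilant.all_active: true when the named apps
    # (or every app, if none are named) report "active".
    names = apps or status.keys()
    return all(status[name] == "active" for name in names)

# A related app is legitimately still settling:
status = {"web": "active", "db": "waiting"}

print(all_active(status))         # False — a bare wait never finishes
print(all_active(status, "web"))  # True — a scoped wait succeeds
```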
None of these are hard to fix, and the AI handled them all once it could see the CI output. All code checks pass on every PR, and the integration test failures that remain seem environmental (timeouts, DNS, missing credentials) — not migration-related.
What to Watch For
The output is good but not perfect, as you’d expect when using AI. AI does the migration, but charming humans do the full review! When reviewing, check:
- conftest.py: Does it use the built-in `juju` fixture, or did it create a custom one? (Built-in is better.)
- `juju.wait()` calls: Watch for `successes=3` — this parameter doesn’t exist.
- Dependencies: Make sure both `jubilant` and `pytest-jubilant` are added, and `pytest-operator` is removed.
- Linting: Run the project’s linter — the AI migration may not match your exact config.
- Lock files: Regenerate `uv.lock` (or equivalent) if the AI didn’t.
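On the dependency and lock-file points, the end state looks something like this (the fragment is illustrative — the exact table depends on how your project declares its test dependencies, e.g. `[project.optional-dependencies]` or a tox config instead):

```toml
# Hypothetical test-dependency group in pyproject.toml:
[dependency-groups]
integration = [
    "jubilant",
    "pytest-jubilant",
    # "pytest-operator",  # removed by the migration
]
```

Then regenerate the lock file with `uv lock` (or your project’s equivalent).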
Full Details
The complete experiment write-up, including all 22 run transcripts, evaluation scores, methodology, and a detailed practical guide, is available at:
charming-with-claude/experiments/2026-02-17-jubilant-migration-experiment
If you try this on your charm, let us know how it goes! We know that a few people have already boldly walked this path — we’d love to hear more about how things went for you, too, and what approach(es) you took.