Hello! Here is the report for the 8th pulse of the IS DevOps team in 2024:
High level
- 12-factor - documentation updates
- cloud mirror - finalize approach and begin implementation
- discourse - charm upgrades, COS to prod (after staging is stable)
- dns bind charm - basic charm, library and snap, handle conflicts between sources at relation level
- documentation - reference documentation and redeployment documentation
- github runners - OpenStack integration, webhook router deployment
- https lego provider - minor updates on API (enabling automations for REST API)
- indico - reconcile tf for staging Indico version 3.3.1, prod rollout to 3.3.1, support for S3 integrator (deploy on staging and prod) - Done
- jenkins - fix integration with traefik, set up ps6 staging environment, begin IS migration (haven't got confirmation yet) - Done
- netbox - finish publishing/promoting and Redis integration
- synapse - S3 media, IRC bridge testing on staging, horizontal scaling (continue with implementation), stability fixes, add a dashboard for Synapse statistics (stats_exporter)
- wordpress - plugins implementation and tests
12-factor
Architecture
Bugs
- GitHub Actions Dashboard: repositories were missing the webhook secret. Fixed by adding it to the GitHub Terraform plan. (@amandahla)
- result: done.
- Synapse PostgreSQL database monitoring not working. (@amandahla)
- result: in progress. Waiting on the Managed Solutions team.
- HTTPRequest lego provider create-user action not working. (@arturo-seijas)
- result: done
Charming improvements
- Update managing changelog spec according to reviews (@aliaw)
- result: In review. Updated according to one reviewer. Pending one more review.
- Spec ISD142: CharmHub Token Refresh in GitHub Charm repositories. (@amandahla)
- result: in review.
Cloud mirror
- finalize approach and begin implementation
Discourse
Documentation
- Add disaster recovery doc for github runners (@bartz)
- result: done
- Synapse external access documentation. (@amandahla)
- result: done.
DNS charm
GitHub Self-hosted runners
- Deploy webhook router charm to staging environment (@bartz)
- result: deployed to a new environment but still waiting for IS to make it accessible from outside
- Add webhook forwarding to the message queue in the router (@bartz)
- result: postponed to next cycle
- Align upgrade with install hook (@bartz)
- result: deployed to edge runners
- Add support for different Ubuntu base images (@charlie4284)
- result: Pending (Docker issues awaiting a fix) - supporting github-runner-image-builder first
- Refactor charm to support multiple clouds (@charlie4284)
- result: Pending - supporting github-runner-image-builder first
- Implementation of removal of OpenStack runners (@charlie4284, @aliaw)
- result: Runner removal implemented for production. It might need to be improved. The integration test is not passing.
- Implement the test for spawning runner on OpenStack (@aliaw)
- result: Runner spawning should be working. Working on runner removal in the integration test.
- Set up production env for OpenStack integration (@aliaw)
- result: Done. The arm64 runners are available.
- Work with data-platform team on testing the ARM OpenStack runners. (@aliaw)
- result: Rejected.
Indico
- update to v3.3.1
- Result: Done
- update saml_groups plugin to support v3.3.1
- Result: Done
- Deploy on staging and production
- Result: Done
Jenkins
- Replan pebble on ingress ready
- Result: Done
- Correctly handle integration w/ traefik in path-routing mode
- Result: Done
- Add RAM percentage JVM option and documentation
- Result: Done
- Update to 2.440.2 LTS
- Result: Done
Matrix
- add a dashboard for Synapse statistics (@amandahla)
- result: Done. It needs to be deployed in production.
- performance improvements: alert for SynapseProcessNewPulledEventHighCPU, libjemalloc, NGINX. (@amandahla)
- result: Done. It needs to be deployed in production.
- horizontal scaling: deployed in staging. (@amandahla)
- result: In review. Running tests.
Netbox
- publishing and promoting (@javierdelapuente)
- result: Done
- Upgrade Redis from config to integration (@javierdelapuente)
- result: Done