IS Charms Team Updates - Pulse #8 2024

Hello! Here is the report for the 8th pulse of the IS DevOps team in 2024:

High level :high_brightness:

  • 12-factor - documentation updates
  • cloud mirror - finalize approach and begin implementation
  • discourse - charm upgrades :tada:, COS to prod (after staging is stable)
  • dns bind charm - basic charm, library and snap, handle conflicts between sources at relation level
  • documentation - reference documentation and redeployment documentation
  • github runners - OpenStack integration, webhook router deployment
  • https lego provider - minor updates on API (enabling automations for REST API)
  • indico - reconcile tf for staging indico version 3.3.1, prod rollout to 3.3.1, support for S3 integrator (deploy on staging and prod) - Done :tada:
  • jenkins - fix integration with traefik, set up ps6 staging environment, beginning of IS migration (haven’t got confirmation yet) - Done :tada:
  • netbox - finish publishing/promoting and Redis integration
  • synapse - S3 media, irc bridge testing on staging, horizontal scaling (continue with implementation), stability fixes, add a dashboard for Synapse statistics (stats_exporter)
  • wordpress - plugins implementation and tests

12-factor :factory:

Architecture :bridge_at_night:

Bugs :bug:

  • GitHub Actions Dashboard: repositories were missing the webhook secret. Fixed by adding it to the GitHub Terraform plan. (@amandahla)
    • result: done.
  • Synapse PostgreSQL database monitoring not working. (@amandahla)
    • result: in progress. Waiting on the Managed Solutions team.
  • HTTPRequest lego provider create-user action not working (@arturo-seijas)
    • result: done
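
For context on the webhook secret item: when a webhook has a secret configured, GitHub signs each delivery with HMAC-SHA256 and sends the result in the `X-Hub-Signature-256` header, so a receiver that enforces this check will reject deliveries that are not signed with the expected secret. A minimal sketch of that check, assuming a Python receiver; the function name and the sample values are hypothetical and are not taken from the dashboard's code:

```python
import hashlib
import hmac


def is_valid_signature(secret: str, payload: bytes, signature_header: str) -> bool:
    """Check a GitHub-style 'sha256=<hexdigest>' signature against the raw body.

    Hypothetical helper for illustration only.
    """
    digest = hmac.new(secret.encode("utf-8"), payload, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # Compare as bytes in constant time to avoid leaking timing information.
    return hmac.compare_digest(expected.encode("utf-8"), signature_header.encode("utf-8"))


# Example with made-up values: sign a payload, then verify it.
sample_secret = "example-secret"
sample_body = b'{"action": "completed"}'
sample_header = "sha256=" + hmac.new(
    sample_secret.encode("utf-8"), sample_body, hashlib.sha256
).hexdigest()
```

The signature must be computed over the raw request body, before any JSON parsing or re-serialization.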

Charming improvements :wrench:

  • Update the spec on managing changelogs according to reviews (@aliaw)
    • result: In review. Updated according to one reviewer's feedback; one more review pending.
  • Spec ISD142: CharmHub Token Refresh in GitHub Charm repositories. (@amandahla)
    • result: in review.

Cloud mirror :cloud:

  • finalize approach and begin implementation

Discourse :flying_disc:

Documentation :books:

  • Add disaster recovery doc for GitHub runners (@bartz)
    • result: done
  • Synapse external access documentation. (@amandahla)
    • result: done.

DNS charm :beans:

GitHub Self-hosted runners :running_man:

  • Deploy webhook router charm to staging environment (@bartz)
    • result: deployed to a new environment but still waiting for IS to make it accessible from outside
  • Add webhook forwarding to the message queue in the router (@bartz)
    • result: postponed to next cycle
  • Align upgrade with install hook (@bartz)
    • result: deployed to edge runners
  • Add support for different Ubuntu base images (@charlie4284)
    • result: Pending (Docker issues awaiting a fix) - supporting github-runner-image-builder first
  • Refactor charm to support multiple clouds (@charlie4284)
    • result: Pending - supporting github-runner-image-builder first
  • Implement removal of OpenStack runners (@charlie4284, @aliaw)
    • result: Runner removal is implemented for production but might need to be improved. The integration test is not passing.
  • Implement the test for spawning a runner on OpenStack (@aliaw)
    • result: Runner spawning should be working. Working on runner removal in the integration test.
  • Set up production env for OpenStack integration (@aliaw)
    • result: Done. The arm64 runners are available.
  • Work with data-platform team on testing the ARM OpenStack runners. (@aliaw)
    • result: Rejected.

Indico :calendar:

  • update to v3.3.1
    • Result: Done
  • update saml_groups plugin to support v3.3.1
    • Result: Done
  • Deploy on staging and production
    • Result: Done

Jenkins :man_artist:

  • Replan pebble on ingress ready
    • Result: Done
  • Correctly handle integration w/ traefik in path-routing mode
    • Result: Done
  • Add RAM percentage JVM option and documentation
    • Result: Done
  • Update to 2.440.2 LTS
    • Result: Done

Matrix :spider_web:

  • add a dashboard for Synapse statistics (@amandahla)
    • result: Done. It needs to be deployed in production.
  • performance improvements: alert for SynapseProcessNewPulledEventHighCPU, libjemalloc, NGINX. (@amandahla)
    • result: Done. It needs to be deployed in production.
  • horizontal scaling: deployed in staging. (@amandahla)
    • result: In review. Running tests.

Netbox :package:
