I would like to add some heat, if possible, to the AWS security group cleanup issue that has been plaguing us for years now.
At a quick glance I can find multiple bugs for this issue dating back to the early Juju 2.x days. I'm hoping to bring this bug to the forefront before it sneaks its way into 3.x.
Here are three bugs that seem to all be for this same issue:
The issue manifests as failed deployments with a user-facing error indicating that the security group quota has been hit.
juju status shows:
Model            Controller  Cloud/Region   Version  SLA          Timestamp
02a3164-centos7  osl-aws     aws/us-west-2  2.8.6    unsupported  12:09:28-07:00

App                 Version  Status   Scale  Charm               Store       Rev  OS      Notes
percona-cluster              waiting    0/1  percona-cluster     jujucharms  293  ubuntu
slurm-configurator           waiting    0/1  slurm-configurator  local         0  centos
slurmctld                    waiting    0/1  slurmctld           local         0  centos
slurmd                       waiting    0/1  slurmd              local         0  centos
slurmdbd                     waiting    0/1  slurmdbd            local         0  centos
slurmrestd                   waiting    0/1  slurmrestd          local         0  centos

Unit                  Workload  Agent       Machine  Public address  Ports  Message
percona-cluster/0     waiting   allocating  0                               waiting for machine
slurm-configurator/0  waiting   allocating  1                               waiting for machine
slurmctld/0           waiting   allocating  2                               waiting for machine
slurmd/0              waiting   allocating  3                               waiting for machine
slurmdbd/0            waiting   allocating  4                               waiting for machine
slurmrestd/0          waiting   allocating  5                               waiting for machine

Machine  State    DNS  Inst id  Series   AZ  Message
0        pending       pending  bionic       failed to start machine 0 (cannot set up groups: creating security group "juju-dad7c9ed-e8a5-49e9-82ce-fe6cd7a3043c": The maximum number of security groups has been reached. (SecurityGroupLimitExceeded)), retrying in 10s (5 more attempts)
1        pending       pending  centos7
2        pending       pending  centos7
3        pending       pending  centos7
4        pending       pending  centos7
5        pending       pending  centos7
Looking at the AWS console, I see thousands of security groups that Juju has created but never cleaned up.
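For anyone who wants to check their own account without clicking through the console, here is a rough sketch of how the leftover groups could be counted. It assumes the Juju-created groups can be recognised by a "juju-" name prefix, as in the error message above; that prefix and the helper names juju_groups and fetch_all_groups are my own assumptions for illustration, not anything Juju provides. The fetch step needs boto3 and working AWS credentials, and nothing here deletes anything.

```python
from typing import Dict, Iterable, List


def juju_groups(groups: Iterable[Dict], prefix: str = "juju-") -> List[Dict]:
    """Return the security groups whose GroupName starts with the given prefix.

    The "juju-" default is an assumption based on the group name in the error
    message above; adjust it if your groups are named differently.
    """
    return [g for g in groups if g.get("GroupName", "").startswith(prefix)]


def fetch_all_groups(region: str) -> List[Dict]:
    """Fetch every security group in a region (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so juju_groups() works without boto3 installed

    ec2 = boto3.client("ec2", region_name=region)
    paginator = ec2.get_paginator("describe_security_groups")
    groups: List[Dict] = []
    for page in paginator.paginate():
        groups.extend(page["SecurityGroups"])
    return groups
```

Calling juju_groups(fetch_all_groups("us-west-2")) and taking len() of the result should give the number of groups matching the prefix, which can then be compared against the account's security group quota.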