Introduction
In this post I detail the analysis, design and progress of the ongoing work on Juju networking spaces. Consider it a work in progress; additions and comments are welcome.
The intent is to:
- Make the analysis done so far useful by sharing it with the team.
- Explore deficiencies, potential improvements and design decisions.
- Report development progress.
Current Behaviour
Storing Addresses
Juju stores addresses in the following locations:
- The machines collection has fields for addresses, sourced from the provider, and machineaddresses, sourced from the local agents.
- The ip.addresses collection has addresses related to entries in the linklayerdevices collection.
- CAAS addresses reside in the collections cloudservices and cloudcontainers.
- The controllers collection contains two documents with host/port entries for controller connection endpoints:
  - One with all available endpoints, suitable for use by clients.
  - Another with endpoints suitable for use by agents, which may be a proper subset of those for clients if a controller management space has been configured.
Machine Addresses
The machineaddresses field in the machines collection is populated by the machiner worker. The worker can be configured to clear addresses on start-up, which causes machineaddresses to be nil. Otherwise, addresses are updated from the results of a call to the standard library’s net.InterfaceAddrs function. These are only ever set upon worker start-up.
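As an illustration of the kind of data net.InterfaceAddrs yields, here is a minimal sketch. The localAddresses helper is hypothetical and is not the machiner's actual code; it only shows the standard library call and the shape of its results.

```go
package main

import (
	"fmt"
	"net"
)

// localAddresses is a hypothetical helper that returns the string form of
// every unicast interface address the standard library reports for this host.
func localAddresses() ([]string, error) {
	addrs, err := net.InterfaceAddrs()
	if err != nil {
		return nil, err
	}
	out := make([]string, 0, len(addrs))
	for _, a := range addrs {
		// Entries are usually in CIDR form, for example "127.0.0.1/8".
		out = append(out, a.String())
	}
	return out, nil
}

func main() {
	addrs, err := localAddresses()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, a := range addrs {
		fmt.Println(a)
	}
}
```

Note that the result carries no device names, provider IDs or space information, which is part of why these addresses are less useful than the link-layer device data described later.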
The addresses field is kept up-to-date by the instancepoller worker, which uses the provider implementation of instance.Addresses to source them. On MAAS these addresses are also decorated with a space name and provider space ID where known.
Whenever machine addresses are updated, the PreferredPublicAddress and/or PreferredPrivateAddress fields may be updated.
Link-Layer Device Addresses
The machiner worker also populates link-layer devices and addresses. Each time it runs, it interrogates all network devices on the machine, gathering detailed information (see params.NetworkConfig). It then calls SetObservedNetworkConfig on the provisioner API, where the provider network config is obtained and merged with the machine configuration before linklayerdevices and ip.addresses are populated.
CAAS Addresses
Addresses in the cloudservices collection are updated by the caasunitprovisioner’s application worker, by asking k8s about the service (application) directly.
Addresses in the cloudcontainers collection are updated by the same worker when there is a cluster change event. The addresses are sourced from each unit’s pod.
Controller Endpoints
API endpoints are set at bootstrap from the initial machine’s provider-sourced instance addresses.
The peergrouper worker maintains these entries.
Using Addresses
Machine Addresses
TBC
Link-Layer Device Addresses
These addresses are used by the network/containerizer package to reason about container spaces, host devices and bridges when configuring networking for containers.
Controller Endpoints
These are watched by machine agents that maintain a local configuration file with endpoints that can be used to communicate with controllers.
CAAS Addresses
TBC
Current Deficiencies
Identification of Spaces and Subnets
Spaces are identified by name and subnets by CIDR. This means:
- Issues associated with renaming a space.
- The inability to work with subnets in different networks that have the same CIDR.
Identification of spaces and subnets by unique IDs is the first task to be undertaken as part of the remodelling work.
Address filtering by space will then be changed to work via space IDs rather than names.
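The CIDR collision problem can be shown with a small sketch. The Subnet type, its fields and the network names below are hypothetical and are not Juju's state types; the point is only that a CIDR-keyed index silently loses one of two subnets that share a CIDR, while an ID-keyed index keeps both.

```go
package main

import "fmt"

// Subnet is a hypothetical, simplified subnet record.
type Subnet struct {
	ID      string // unique ID, independent of the CIDR
	CIDR    string
	Network string // provider network that the subnet belongs to
}

// indexByCIDR keys subnets by CIDR; a later subnet with the same CIDR
// overwrites an earlier one.
func indexByCIDR(subnets []Subnet) map[string]Subnet {
	m := make(map[string]Subnet)
	for _, s := range subnets {
		m[s.CIDR] = s
	}
	return m
}

// indexByID keys subnets by unique ID, so no entry is lost.
func indexByID(subnets []Subnet) map[string]Subnet {
	m := make(map[string]Subnet)
	for _, s := range subnets {
		m[s.ID] = s
	}
	return m
}

func main() {
	subnets := []Subnet{
		{ID: "0", CIDR: "10.0.0.0/24", Network: "net-a"},
		{ID: "1", CIDR: "10.0.0.0/24", Network: "net-b"},
	}
	fmt.Println(len(indexByCIDR(subnets))) // prints 1
	fmt.Println(len(indexByID(subnets)))   // prints 2
}
```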
Incomplete Device Address Information
When the provisioner API server receives network configuration gathered by the machiner, it gathers provider configuration. This is collected by the provider as network.InterfaceInfo, converted to params.NetworkConfig, merged with the machine-sourced data, then converted into state types for persistence.
At each of the three conversions some of the fidelity is lost.
In order to reconcile incoming link-layer device addresses with the correct subnets, we need to maintain and transport the provider IDs for subnet (and probably network) so that these can be used to relate addresses to the new subnet IDs.
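One way to picture the reconciliation is the sketch below. It assumes that the provider subnet ID survives all the way to the point where addresses are persisted; the Subnet and DeviceAddress types and the subnetIDFor function are hypothetical and do not mirror the real state types.

```go
package main

import "fmt"

// Subnet is a hypothetical record pairing a Juju subnet ID with the
// identifier the provider uses for the same subnet.
type Subnet struct {
	ID         string
	ProviderID string
	CIDR       string
}

// DeviceAddress is a hypothetical incoming link-layer device address that
// still carries the provider's subnet ID.
type DeviceAddress struct {
	Value            string
	ProviderSubnetID string
}

// subnetIDFor resolves the Juju subnet ID for an address by matching on the
// provider subnet ID. It reports false when the provider ID was lost or is
// unknown, which is the situation the remodelling aims to avoid.
func subnetIDFor(addr DeviceAddress, subnets []Subnet) (string, bool) {
	for _, s := range subnets {
		if s.ProviderID != "" && s.ProviderID == addr.ProviderSubnetID {
			return s.ID, true
		}
	}
	return "", false
}

func main() {
	subnets := []Subnet{
		{ID: "0", ProviderID: "subnet-aaa", CIDR: "10.0.0.0/24"},
		{ID: "1", ProviderID: "subnet-bbb", CIDR: "10.0.0.0/24"},
	}
	addr := DeviceAddress{Value: "10.0.0.5", ProviderSubnetID: "subnet-bbb"}
	id, ok := subnetIDFor(addr, subnets)
	fmt.Println(id, ok) // prints "1 true"
}
```

Matching on CIDR alone could not distinguish the two subnets in this example, which is why the provider ID needs to be preserved through each conversion.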
Space Support
Network spaces are supported by the MAAS and AWS providers.
Only MAAS currently supports controller configuration for juju-ha-space (used for Mongo replica-set communication in HA) and juju-mgmt-space (the management plane on which agents connect to controllers). This is because it is the only provider that decorates addresses in the machines collection with a space name.
After remodelling spaces and addresses, the intent is to:
- Make space support available to other providers.
- Detect and decorate provider-sourced machine addresses with known space IDs, so that the controller configuration options for spaces become generally available.
Adding Subnets
The API server logic for adding subnets pre-dates the auto-loading of spaces and subnets from the provider.
When adding a subnet, network info is queried to create a cache of spaces and subnets. Subnet data is gathered from the provider. Space data is gathered from state. The incoming request is compared to the cached data to ensure that the entities referred to exist according to the provider.
We should:
- Remove the cache logic. It over-complicates add-subnet by assuming that a user adds more than one subnet at a time.
- Replace the add-subnet command with new commands that allow linking and unlinking of subnets to spaces. Part of the add-subnet functionality is currently replaced by reload-spaces.
- Possibly, in future work, investigate how we might allow manual addition of subnets for, say, the manual provider, or make auto-loading work there.
Development Progress
Spaces are Identified by ID
Spaces are now stored with a monotonically increasing ID, in a similar fashion to machines. Migration and upgrade steps are in place to handle this change.
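A monotonically increasing ID scheme can be sketched as follows. This is an in-memory stand-in only: the sequence type is hypothetical, the starting value of 0 is an assumption for illustration, and a real implementation would persist the counter rather than hold it in memory.

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
)

// sequence is a hypothetical in-memory counter that hands out
// monotonically increasing IDs as strings.
type sequence struct {
	mu   sync.Mutex
	next int
}

// Next returns the current value as a string and advances the counter.
// IDs are never reused, so renaming an entity does not change its identity.
func (s *sequence) Next() string {
	s.mu.Lock()
	defer s.mu.Unlock()
	id := strconv.Itoa(s.next)
	s.next++
	return id
}

func main() {
	var seq sequence
	first := seq.Next()
	second := seq.Next()
	third := seq.Next()
	fmt.Println(first, second, third) // prints "0 1 2"
}
```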
Subnets are Identified by ID
As with spaces, subnets have a numeric ID. Migration and upgrade steps are in place, and the names package no longer validates subnet tags as a CIDR.
Machine/Container Addresses Use Space IDs
Behind the API boundary, address spaces are identified by ID instead of name. This includes updates to logic for filtering addresses for the HA and management spaces.
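Filtering by space ID might look like the sketch below. The Address type, the inSpace function and the example space IDs are hypothetical; the real filtering logic for the HA and management spaces may apply additional rules.

```go
package main

import "fmt"

// Address is a hypothetical machine address decorated with a space ID.
type Address struct {
	Value   string
	SpaceID string
}

// inSpace returns only the addresses that belong to the given space ID.
func inSpace(addrs []Address, spaceID string) []Address {
	var out []Address
	for _, a := range addrs {
		if a.SpaceID == spaceID {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	addrs := []Address{
		{Value: "10.0.0.5", SpaceID: "1"},
		{Value: "192.168.0.5", SpaceID: "2"},
		{Value: "10.0.0.6", SpaceID: "1"},
	}
	// Suppose space "1" is configured as the management space.
	for _, a := range inSpace(addrs, "1") {
		fmt.Println(a.Value)
	}
}
```

Because the comparison is on the ID, renaming the space leaves the filter result unchanged.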
Endpoint Bindings Use Space IDs
As for addresses above, endpoint bindings now use space IDs internally instead of space names.
Moving Towards Ubiquitous Spaces
Juju models now always have an “alpha” space. By default, all subnets are in this space. The ID of this space (0) is also the default space ID used for endpoint bindings.
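The defaulting behaviour for endpoint bindings can be sketched as below. The resolveBindings function and the endpoint names are hypothetical; only the alpha space ID of 0 comes from the description above.

```go
package main

import "fmt"

// alphaSpaceID is the ID of the default "alpha" space.
const alphaSpaceID = "0"

// resolveBindings is a hypothetical helper that maps every endpoint to a
// space ID, falling back to the alpha space when no explicit binding exists.
func resolveBindings(endpoints []string, explicit map[string]string) map[string]string {
	out := make(map[string]string, len(endpoints))
	for _, ep := range endpoints {
		if id, ok := explicit[ep]; ok && id != "" {
			out[ep] = id
		} else {
			out[ep] = alphaSpaceID
		}
	}
	return out
}

func main() {
	endpoints := []string{"db", "website"}
	explicit := map[string]string{"db": "3"}
	bindings := resolveBindings(endpoints, explicit)
	fmt.Println(bindings["db"], bindings["website"]) // prints "3 0"
}
```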
Preserving Interface Detail and Enhancing AWS Spaces
The loss of network information mentioned above has been rectified in a significant overhaul of the instance poller undertaken by @achilleasa. When combined with other development in the cycle, this means that the spaces capability on AWS now includes the ability to set configuration for the management space (for agent communication with controllers) and the HA space (for Mongo peer communication).