Juju and IP Addresses

Introduction

This post details the analysis, design and progress of the ongoing work on Juju network spaces. Consider it a work in progress; additions and comments are welcome.

The intent is to:

  • Materialise value from analysis done so far, by disseminating it to the team.
  • Explore deficiencies, potential improvements and design decisions.
  • Report development progress.

Current Behaviour

Storing Addresses

Juju stores addresses in the following locations:

  • The machines collection has fields for addresses, sourced from the provider, and machineaddresses, sourced from the local agents.
  • The ip.addresses collection has addresses related to entries in the linklayerdevices collection.
  • CAAS addresses reside in the collections cloudservices and cloudcontainers.
  • The controllers collection contains two documents with host/port entries for controller connection endpoints:
    • One with all available endpoints, suitable for use by clients.
    • Another with endpoints suitable for use by agents, which may be a proper subset of those for clients if a controller management space has been configured.
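As a rough illustration of how the agent endpoints can end up as a subset of the client endpoints, here is a minimal sketch. The HostPort type, the agentEndpoints function and the fall-back to the full list when nothing matches are all assumptions made for the example, not Juju's actual types or behaviour.

```go
package main

import "fmt"

// HostPort is a hypothetical, simplified controller endpoint.
type HostPort struct {
	Host    string
	Port    int
	SpaceID string // empty when the space is unknown
}

// agentEndpoints returns the endpoints that agents should use.
// With no management space configured, agents see everything.
func agentEndpoints(all []HostPort, mgmtSpaceID string) []HostPort {
	if mgmtSpaceID == "" {
		return all
	}
	var inSpace []HostPort
	for _, hp := range all {
		if hp.SpaceID == mgmtSpaceID {
			inSpace = append(inSpace, hp)
		}
	}
	if len(inSpace) == 0 {
		// Assumption for this sketch: fall back to all endpoints
		// rather than leave agents with nothing to connect to.
		return all
	}
	return inSpace
}

func main() {
	all := []HostPort{
		{Host: "10.0.0.5", Port: 17070, SpaceID: "1"},
		{Host: "192.168.1.5", Port: 17070, SpaceID: "2"},
	}
	fmt.Println("client endpoints:", len(all))
	fmt.Println("agent endpoints: ", len(agentEndpoints(all, "1")))
}
```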

Machine Addresses

The machineaddresses field in the machines collection is populated via the machiner worker. It can be configured to clear addresses on start-up, which will cause machineaddresses to be nil. Otherwise addresses are updated from the results of a call to the standard library’s net.InterfaceAddrs function. These are only ever set upon worker start-up.
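For reference, this is roughly what a call to net.InterfaceAddrs yields. The localAddresses helper is hypothetical, and skipping loopback addresses is a choice made for this sketch; it is not a claim about what the machiner filters.

```go
package main

import (
	"fmt"
	"net"
)

// localAddresses returns the IP addresses that the standard library
// reports for this host's interfaces, skipping loopback addresses.
func localAddresses() ([]net.IP, error) {
	addrs, err := net.InterfaceAddrs()
	if err != nil {
		return nil, err
	}
	var ips []net.IP
	for _, a := range addrs {
		// Only *net.IPNet values are handled; other net.Addr
		// implementations are skipped in this sketch.
		ipNet, ok := a.(*net.IPNet)
		if !ok || ipNet.IP.IsLoopback() {
			continue
		}
		ips = append(ips, ipNet.IP)
	}
	return ips, nil
}

func main() {
	ips, err := localAddresses()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, ip := range ips {
		fmt.Println(ip)
	}
}
```

Note that the result carries no device, subnet or space information, which is part of why the link-layer device data described below is gathered separately.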

The addresses field is kept up-to-date via the instancepoller worker. It uses the provider implementation of instance.Addresses in order to source them. On MAAS these addresses are also decorated with a space name and provider space ID where known.

Whenever machine addresses are updated, the PreferredPublicAddress and/or PreferredPrivateAddress fields may be updated.

Link-Layer Device Addresses

The machiner worker also populates link-layer devices and addresses. Each time it runs, it interrogates all network devices on the machine, gathering detailed information (see params.NetworkConfig). It then calls SetObservedNetworkConfig on the provisioner API, where the provider network config is obtained and merged with the machine configuration before linklayerdevices and ip.addresses are populated.

CAAS Addresses

Addresses in the cloudservices collection are updated by the caasunitprovisioner’s application worker, by asking k8s about the service (application) directly.

Addresses in the cloudcontainers collection are updated by the same worker when there is a cluster change event. The addresses are sourced from each unit’s pod.

Controller Endpoints

API endpoints are set at bootstrap from the initial machine’s provider-sourced instance addresses.

The peergrouper worker maintains these entries.

Using Addresses

Machine Addresses

TBC

Link-Layer Device Addresses

These addresses are used by the network/containerizer package to reason about container spaces, host devices and bridges when configuring networking for containers.

Controller Endpoints

These are watched by machine agents that maintain a local configuration file with endpoints that can be used to communicate with controllers.

CAAS Addresses

TBC

Current Deficiencies

Identification of Spaces and Subnets

Spaces are identified by name and subnets by CIDR. This leads to:

  • Issues associated with renaming a space.
  • The inability to work with subnets in different networks that have the same CIDR.

Identification of spaces and subnets by unique IDs is the first task to be undertaken as part of the remodelling work.

Address filtering by space will then be changed to work via space IDs rather than names.
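The CIDR problem can be shown with a small sketch. The Subnet type and both index functions are hypothetical and only illustrate why a CIDR is not a safe key when two provider networks reuse the same range.

```go
package main

import "fmt"

// Subnet is a hypothetical, simplified subnet record.
type Subnet struct {
	ID        string // unique within the model
	CIDR      string
	NetworkID string // provider network the subnet belongs to
	SpaceID   string
}

// indexByCIDR keys subnets by CIDR. A later subnet silently
// replaces an earlier one that has the same CIDR.
func indexByCIDR(subnets []Subnet) map[string]Subnet {
	out := make(map[string]Subnet, len(subnets))
	for _, s := range subnets {
		out[s.CIDR] = s
	}
	return out
}

// indexByID keys subnets by their unique ID, so nothing is lost.
func indexByID(subnets []Subnet) map[string]Subnet {
	out := make(map[string]Subnet, len(subnets))
	for _, s := range subnets {
		out[s.ID] = s
	}
	return out
}

func main() {
	subnets := []Subnet{
		{ID: "0", CIDR: "10.0.0.0/24", NetworkID: "net-a", SpaceID: "1"},
		{ID: "1", CIDR: "10.0.0.0/24", NetworkID: "net-b", SpaceID: "2"},
	}
	fmt.Println("by CIDR:", len(indexByCIDR(subnets)))
	fmt.Println("by ID:  ", len(indexByID(subnets)))
}
```

The same reasoning applies to spaces: anything that refers to a space by ID is unaffected when the space's name changes.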

Incomplete Device Address Information

When the provisioner API server receives network configuration gathered by the machiner, it gathers provider configuration. This is collected by the provider as network.InterfaceInfo, converted to params.NetworkConfig, merged with the machine-sourced data, then converted into state types for persistence.

At each of the three conversions some of the fidelity is lost.

In order to reconcile incoming link-layer device addresses with the correct subnets, we need to maintain and transport the provider IDs for subnet (and probably network) so that these can be used to relate addresses to the new subnet IDs.
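A minimal sketch of the idea follows, using hypothetical stand-ins for the provider and wire types (they are not network.InterfaceInfo or params.NetworkConfig). The point is only that the provider subnet ID must survive each conversion so that it can be looked up at the end.

```go
package main

import "fmt"

// ProviderInterface is a hypothetical stand-in for what a provider reports.
type ProviderInterface struct {
	Name             string
	Address          string
	ProviderSubnetID string
}

// WireConfig is a hypothetical stand-in for the API transport type.
type WireConfig struct {
	InterfaceName    string
	Address          string
	ProviderSubnetID string // carried through, not dropped
}

// toWire converts provider data to the wire type without losing
// the provider subnet ID.
func toWire(p ProviderInterface) WireConfig {
	return WireConfig{
		InterfaceName:    p.Name,
		Address:          p.Address,
		ProviderSubnetID: p.ProviderSubnetID,
	}
}

// subnetIDFor resolves the model's subnet ID for an incoming address
// using a lookup from provider subnet ID to subnet ID.
func subnetIDFor(cfg WireConfig, byProviderID map[string]string) (string, bool) {
	id, ok := byProviderID[cfg.ProviderSubnetID]
	return id, ok
}

func main() {
	byProviderID := map[string]string{"subnet-abc": "3"}
	cfg := toWire(ProviderInterface{
		Name:             "eth0",
		Address:          "10.0.0.7",
		ProviderSubnetID: "subnet-abc",
	})
	id, ok := subnetIDFor(cfg, byProviderID)
	fmt.Println(id, ok)
}
```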

Space Support

Network spaces are supported by the MAAS and AWS providers.

Only MAAS currently supports controller configuration for juju-ha-space (used for Mongo replica-set communication in HA) and juju-mgmt-space (the management plane on which agents connect to controllers). This is because it is the only provider that decorates addresses in the machines collection with a space name.

After remodelling spaces and addresses, the intent is to:

  • Make space support available to other providers.
  • Detect and decorate provider-sourced machine addresses with known space IDs, so that the controller configuration options for spaces become generally available.

Adding Subnets

The API server logic for adding subnets pre-dates the auto-loading of spaces and subnets from the provider.

When adding a subnet, network info is queried to create a cache of spaces and subnets. Subnet data is gathered from the provider. Space data is gathered from state. The incoming request is compared to the cached data to ensure that the entities referred to exist according to the provider.

We should:

  • Remove the cache logic. It overcomplicates add-subnet by assuming that a user adds more than one subnet at a time.
  • Replace the add-subnet command with new commands that allow linking and unlinking of subnets to spaces. Part of the add-subnet functionality is currently replaced by reload-spaces.
  • Possibly in future work, investigate how we might allow manual addition of subnets for, say, the manual provider, or make auto-loading work there.

Development Progress

Spaces are Identified by ID

Spaces are now stored with a monotonically increasing ID in a similar fashion to machines. Migration and upgrade steps are in place to handle this change.

Subnets are Identified by ID

As with spaces, subnets have a numeric ID. Migration and upgrade steps are in place, and the names package no longer validates subnet tags as a CIDR.

Machine/Container Addresses Use Space IDs

Behind the API boundary, address spaces are identified by ID instead of name. This includes updates to logic for filtering addresses for the HA and management spaces.

Endpoint Bindings Use Space IDs

As for addresses above, endpoint bindings now use space IDs internally instead of space names.

Moving Towards Ubiquitous Spaces

Juju models now always have an “alpha” space. By default, all subnets are in this space. The ID (0) of this space is also the default space used for endpoint bindings.
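To make the defaulting concrete, here is a small sketch. The resolveBindings function is hypothetical, and representing the ID as the string "0" is an assumption; only the default itself (the alpha space, ID 0) comes from the text above.

```go
package main

import "fmt"

// alphaSpaceID is the ID of the default "alpha" space.
// Using a string here is an assumption made for this sketch.
const alphaSpaceID = "0"

// resolveBindings maps every endpoint to a space ID, using the
// explicit binding when one is given and the alpha space otherwise.
func resolveBindings(endpoints []string, explicit map[string]string) map[string]string {
	out := make(map[string]string, len(endpoints))
	for _, ep := range endpoints {
		if id, ok := explicit[ep]; ok && id != "" {
			out[ep] = id
			continue
		}
		out[ep] = alphaSpaceID
	}
	return out
}

func main() {
	bindings := resolveBindings(
		[]string{"db", "website"},
		map[string]string{"db": "2"},
	)
	fmt.Println(bindings["db"], bindings["website"])
}
```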

Preserving Interface Detail and Enhancing AWS Spaces

The loss of network information mentioned above has been rectified in a significant overhaul of the instance poller undertaken by @achilleasa. When combined with other development in the cycle, this means that the spaces capability on AWS now includes the ability to set configuration for the management space (for agent communication with controllers) and the HA space (for Mongo peer communication).

@manadart

Since there is the data model work going on, I just wanted to make a comment about similarities in MAAS and OpenStack data models as it might be worth noting for the future development of spaces for different providers:

  • Fabric (MAAS) ~ physnet (OpenStack);
    • each fabric has a separate range of VLAN IDs (1 - 4094);
  • L2 broadcast domain ~ VLAN (MAAS) ~ segment (OpenStack)
    • physical segments in OpenStack are associated with physnet labels (non-physical ones are, for example, VXLAN or GRE).
      • So when you create a physical segment you need to specify:
        • a type: flat (no tagging) or vlan
        • a physnet name to identify a fabric;
        • a segment ID (vlan ID) for vlan segments;
      • each segment has its own subnet (or multiple subnets) associated;
        • the legacy OpenStack behavior is 1 segment per network - nowadays a segment is created implicitly and during subnet creation you specify a network and, optionally, a segment on a network;
  • a collection of L2 segments ~ (somewhat) Space (MAAS) ~ network (OpenStack);
    • one network in OpenStack can contain many segments that belong to different physnets in the same way one space can contain multiple VLANs that belong to different fabrics in MAAS;
  • routing domain/VRF/L3VPN ~ (somewhat) Space (MAAS) ~ address scope (OpenStack)
    • address scopes are used to support overlapping address spaces by several tenants. The concept from network engineering is called VRF and you could make an analogy by calling them “L3 VLANs”;
    • For a given network space in MAAS there is a notion that hosts which have interfaces on VLANs belonging to that space also rely on a routing setup in the data center for sending packets between hosts in different subnets on those VLANs.
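The parameters needed for a physical segment, as listed above, can be written down as a small type. This is only a sketch of that description: the Segment type and its Validate method are hypothetical and are not the OpenStack API or any Juju type.

```go
package main

import (
	"errors"
	"fmt"
)

// SegmentType is the type of a physical segment.
type SegmentType string

const (
	Flat SegmentType = "flat" // no tagging
	VLAN SegmentType = "vlan"
)

// Segment is a hypothetical model of a physical segment.
type Segment struct {
	Type    SegmentType
	Physnet string // identifies the fabric
	VLANID  int    // segment ID, only used for vlan segments
}

// Validate checks the segment against the rules described above.
func (s Segment) Validate() error {
	if s.Physnet == "" {
		return errors.New("a physnet name is required")
	}
	switch s.Type {
	case Flat:
		if s.VLANID != 0 {
			return errors.New("flat segments carry no VLAN ID")
		}
	case VLAN:
		if s.VLANID < 1 || s.VLANID > 4094 {
			return fmt.Errorf("VLAN ID %d is outside 1-4094", s.VLANID)
		}
	default:
		return fmt.Errorf("unsupported physical segment type %q", s.Type)
	}
	return nil
}

func main() {
	fmt.Println(Segment{Type: VLAN, Physnet: "physnet1", VLANID: 100}.Validate())
	fmt.Println(Segment{Type: VLAN, Physnet: "physnet1", VLANID: 5000}.Validate())
}
```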

I think we will find similar parallels for other providers as they evolve their complex networking support, so making the Juju data model suitable for that would be useful.

Thanks @Dmitrii, this is great info to have.

Bumping this to the top in case any community members have any further input.

I have added some more information to the original post.

I still need to complete some of the address usage sections, but I think the Development Progress section reflects the work up to the 2.7.0 release.

It is probably best to add new replies to this post as we land fixes/enhancements rather than continuing to edit the original, as that indicates a timeline better. I will do so as time permits.

New features landing: https://discourse.jujucharms.com/t/openstack-multi-space-support/2672