I started playing around with some of the bundles to get my bearings and figure out how Juju works with each piece of my network. I have seen a number of posts about similar issues, and I think I have grasped the main problems with bundles and multiple spaces, which eliminated the undetermined space error other people have gotten.
Background:
I have a 4-node server with 4 NICs per node. I’m only using one 1 GbE NIC for the public space and one of the 10 GbE NICs for the internal network, which is not exposed to the rest of the network and certainly not to the internet.
This all seems to work OK at this level. The Juju controller is running on an LXD VM within MAAS.
Issues:
Using bundles such as kubernetes-core, I run into an issue where an LXD container (0/lxd/0) is created on what I believe is the first node (0). Part of the problem, I found, was that the bundle asks for a space named “alpha”. I tried changing it all to internal, as recommended in the thread Where are Bindings “alpha” coming from? The issue with the LXD container persisted, so I changed the space name to “alpha” just to streamline testing. 0/lxd/0 still stayed in Pending.
When I looked through LXD I found that the br0 bridge (internal) had numerous containers attached; however, none of them are associated with br1 (public). This is somewhat interesting, but I wasn’t sure whether it was definitely a problem. On one hand, an LXD container not being exposed is probably fine, since ultimately I only need one container, or possibly the host node, acting as the ingress into the network. But if I read correctly, the container should have access to both. The documentation I can find is a bit sketchy on how to implement this.
I think my next step is to remote into the node and see why it’s not starting, but if someone is able to shine a light in the right direction that would be great.
Update:
Interestingly, after leaving the machine alone instead of being my usual impatient self, I noticed the status page reports alpha-2 as a space:
no obvious space for container "0/lxd/0", host machine has spaces: "alpha-2", "public"
A side question:
Do I actually need LXD involved in making this cluster? Do I really need that one virtual node?
Long story short, I want to create a Kubernetes cluster that uses Ceph. The example that is currently documented is quite old, as it still calls out kubernetes-master among other things. The 3 nodes also create multiple LXD containers for etcd and ceph-mon. Could I instead cut out the middleman, as it were, and just run everything on the hardware? Am I actually gaining anything by using LXD?
After stumbling around trying to push forward, I came across this page: Using Multiple Host Networks | Ubuntu
The nugget here was that it got me to look at what Juju sees for spaces.
~ juju spaces
Name       Space ID  Subnets
alpha      0
public     1         192.168.0.0/23
alpha-2    2         10.0.0.0/24
undefined  3         10.0.1.0/24
                     10.1.77.128/32
                     172.17.0.0/16
This makes it a bit strange, because what Juju is calling alpha-2 is what I had named alpha. It is actually referring to fabric-0 as alpha.
alpha-2, as Juju calls it, is my internal network, which started off being called fabric-3 in MAAS when I first started deploying machines.
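To double-check which space Juju has assigned to a given subnet, a small lookup over the listing can help. This is only a sketch: the function name space_for_subnet is made up for illustration, and it assumes the column layout of the `juju spaces` output pasted above (a header row, then one row per space, with extra subnets on continuation lines).

```shell
# Sample listing copied from the output above. In practice the output of
# `juju spaces` could be piped into the function instead.
spaces_listing='Name       Space ID  Subnets
alpha      0
public     1         192.168.0.0/23
alpha-2    2         10.0.0.0/24
undefined  3         10.0.1.0/24
                     10.1.77.128/32
                     172.17.0.0/16'

# space_for_subnet CIDR  (hypothetical helper, reads the listing on stdin)
space_for_subnet() {
  awk -v want="$1" '
    NR == 1 { next }                                      # skip the header row
    NF >= 2 { space = $1; subnet = (NF >= 3 ? $3 : "") }  # row that starts a space
    NF == 1 { subnet = $1 }                               # continuation line: extra subnet
    subnet == want { print space; exit }
  '
}

# Which space does Juju think the internal 10.0.0.0/24 network belongs to?
printf '%s\n' "$spaces_listing" | space_for_subnet 10.0.0.0/24   # prints alpha-2
```

Whatever name comes back is the one that bindings and constraints have to use, regardless of what the space is called in MAAS.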
So now I am trying to move everything over to fabric-0 for naming simplicity; I can look at renaming them in the future. Hopefully this resolves the issue for anyone else. The downside is that I had to delete the Juju controller to do this in MAAS, but since there isn’t much invested here it’s a minor inconvenience.
OK, so after redeploying the controller, I’m stumped.
~ juju spaces
Name       Space ID  Subnets
alpha      0
public     1         192.168.0.0/23
alpha-2    2         10.0.0.0/24
undefined  3         10.0.1.0/24
                     10.1.77.128/32
                     172.17.0.0/16
I even ran juju reload-spaces and nothing changed, unlike when I had done so during my attempts to manipulate it while the controller was operational.
Here is what MAAS has set for reference:
I tried renaming it in MAAS and then running reload-spaces again, and apparently Juju doesn’t care what I do in MAAS. Apparently I have to delete the controller again, which I consider a bug.
Also, in case anyone is wondering: with bundles like kubernetes-core, --bind is flat out refused.
~ juju deploy kubernetes-core --bind alpha-2
ERROR options provided but not supported when deploying a bundle: --bind
The answer is: forget the prebuilt bundles unless you’re following a user guide and you fit its criteria exactly. Most likely that means you are exclusively on LXD, or you have MAAS with several machines that only have a single network space defined.
If you have anything outside of that narrow definition, just build it out from scratch and define the default space in the model ahead of time, unless someone provides you a bundle file that you can edit to get things going.
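For anyone who does want to keep a bundle, one option is an overlay file that sets the bindings, since --bind is refused on the command line. The sketch below only writes such a file. The application names are my assumption of what a kubernetes-core style bundle contains and the space name is the one from my listing, so both need to be matched against the actual bundle and against `juju spaces`.

```shell
# Sketch only: generate a bundle overlay that binds every endpoint of each
# listed application to a single space.
SPACE="alpha-2"                  # assumption: the space name Juju reports
OVERLAY="spaces-overlay.yaml"    # hypothetical file name

{
  echo "applications:"
  # Assumed application names; an overlay entry has to match an application
  # that already exists in the bundle.
  for app in easyrsa etcd kubernetes-control-plane kubernetes-worker; do
    echo "  ${app}:"
    echo "    bindings:"
    echo "      \"\": ${SPACE}"  # the empty key sets the default for all endpoints
  done
} > "$OVERLAY"

cat "$OVERLAY"
```

With the file written, something along the lines of `juju model-config default-space=alpha-2` followed by `juju deploy kubernetes-core --overlay ./spaces-overlay.yaml` should pick it up, but check `juju help deploy` for the Juju version in use, as I have not verified this against every release.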
I’m running the OpenStack setup from scratch and the LXD machines are deploying to the hardware just fine. The documentation is woefully out of date, so be prepared to substitute commands to fit whatever version of Juju you are currently using.
There is still an underlying problem, and I’m not sure whether I caused it. The resulting LXD containers cannot reach the internet. Looking at the resulting /etc/netplan/50-… file, I see it points to whatever gateway I set in MAAS. The tricky part is that there really isn’t one; I had to set it because MAAS would not let me leave it blank. I suppose I could make that address a NAT, but given the configuration of the node that seems redundant. Looking at OpenStack’s documentation, this seems to be indicated in one of the infographics they provide.
or is it…
egress-subnets:
  type: string
  description: Source address(es) for traffic originating from this model
The constraints are also shown where the machine declarations are made. And best of all, I don’t see a complaint about not knowing which space to use, nor do I see any hook errors from being unable to download software.
I ssh’d into 0/lxd/0 just to double-check things, and I could ping my router and Google’s DNS server.
Update: every time you deploy an app to a new LXD container, you do in fact have to add the constraint for spaces. I inadvertently missed a line and that machine could never complete its installation. If you miss it, just remove the app and the machine with --force and then redeploy the app with the spaces constraint.
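The recovery steps can be sketched roughly as follows. The helper names run and redeploy_in_container are invented for illustration, and ceph-mon, machine 0 and alpha-2 are just example values from my setup. By default the script only prints the commands (DRY_RUN=1) so they can be reviewed before anything is removed.

```shell
# DRY_RUN=1 (the default here) prints the juju commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# redeploy_in_container APP HOST CONTAINER SPACE  (hypothetical helper)
redeploy_in_container() {
  app="$1"; host="$2"; container="$3"; space="$4"
  # Remove the stuck application and its container...
  run juju remove-application "$app" --force
  run juju remove-machine "$container" --force
  # ...then deploy again into a new container, this time with the spaces constraint.
  run juju deploy "$app" --to "lxd:${host}" --constraints "spaces=${space}"
}

# Example values only: ceph-mon stuck in container 0/lxd/0 on machine 0.
redeploy_in_container ceph-mon 0 0/lxd/0 alpha-2
```

Removal is not instant, so when running this for real (DRY_RUN=0) it is probably worth waiting until `juju status` no longer shows the old application before the deploy step, and some Juju versions may ask for confirmation on the remove commands.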