Deployment Tips for Proper Networking

I recently discovered a couple of odd quirks that will affect the way you deploy either Calico or Canal on MaaS.

Node Naming

It turns out that Calico doesn’t handle capital letters in the server name. I came across this while attempting to deploy a Kubernetes model directly to metal using this old guide. If you try to deploy that bundle you will find it needs a lot of fixes to make it work, but that’s not the point of this post. What I discovered after deploying several revisions of the bundle was that one node kept being unable to resolve its own name. Renaming the node manually while it was running made no difference either. Long story short, I had inadvertently capitalized that node’s name. All of the other nodes were “node-n” and my little troublemaker was “Node-3”. For whatever reason, the name cannot be resolved with a capital letter. The debug output from the stuck calico unit showed:

INFO juju-log b'null\nresource does not exist: Node(Node-3) with error: <nil>\n'

The status for the unit stayed at “Waiting to retry Calico node configuration”.
I actually thought it might have been related to this bug or something similar. I tried manually running calicoctl on that node and on others, referencing all of the node names, and that’s when it eventually caught my eye that this one node had a capital letter. I even tried to ping the node by name from another node and got nothing back, just unreachable.

juju run --unit calico/0 'ping -c1 node-2'
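A check along these lines could catch a capitalized name before deploying. This is only a minimal sketch: the function names are my own invention, not part of MaaS, Juju, or Calico, and it looks for uppercase letters only, since that is the one problem I actually hit.

```shell
# Hypothetical helpers (not part of any MaaS/Juju/Calico tooling).

# Return 0 if the given hostname contains at least one uppercase letter.
has_uppercase() {
  case "$1" in
    *[[:upper:]]*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print every hostname passed as an argument that contains an uppercase letter.
find_bad_names() {
  for name in "$@"; do
    if has_uppercase "$name"; then
      printf '%s\n' "$name"
    fi
  done
}
```

Calling find_bad_names with the hostnames you gave your machines in MaaS prints only the ones that need renaming, and prints nothing when every name is lowercase.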

After renaming the node in MaaS and redeploying the model I finally had a proper Calico networking experience… except for one other thing I noticed…

PXE Boot Device

Another thing to consider in a multi-NIC network environment is that MaaS resolves node names to the address on the PXE boot device. You can discover this by pinging one of your nodes from another by name.

juju run --unit calico/0 'ping -c1 node-2'

What you find may surprise you if you planned to route all of your traffic across a single network and only expose certain processes/events across the other. The resolved IP will be from the PXE device, which means that despite your bindings, traffic addressed by node name will still be routed across the public network instead of the internal one you had designated. This may be tolerable in certain environments, such as when the public network is firewalled behind a router and the router lacks the configuration to forward the traffic, but it’s still not ideal and should be avoided.
In my case it needs to be avoided 100% because this will have an impact on Ceph, and I need that traffic routed across the SFP+ 10G network and not the 1G public network. Of course, since this is PXE, you would think I would have known better and adjusted the configuration in each node’s BIOS settings so that loading the OS would have been streamlined. That represents a challenge, though, since the SFP+ is an add-on card in a SuperMicro system (you can’t select an add-on card to PXE boot with these)…
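To make the ping test less of an eyeball exercise, a sketch like the one below could report whether a name resolves inside the network you intended. The subnet 10.0.0.0/24 and all of the function names are assumptions for illustration; it handles IPv4 only and relies on getent and awk being present on the node.

```shell
# Hypothetical helpers; IPv4 only, octets must not have leading zeros.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  IFS=. read -r o1 o2 o3 o4 <<EOF
$1
EOF
  echo $(( (o1 << 24) | (o2 << 16) | (o3 << 8) | o4 ))
}

# Return 0 if IPv4 address $1 lies inside CIDR block $2 (e.g. 10.0.0.0/24).
ip_in_cidr() {
  cidr_net=${2%/*}
  cidr_bits=${2#*/}
  cidr_mask=$(( (0xFFFFFFFF << (32 - cidr_bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$1") & cidr_mask )) -eq $(( $(ip_to_int "$cidr_net") & cidr_mask )) ]
}

# Resolve a node name and report whether the answer is on the intended network.
# Returns 0 if it is, 1 if it is not, 2 if the name did not resolve at all.
check_name_on_network() {
  resolved=$(getent hosts "$1" | awk '{ print $1; exit }')
  if [ -z "$resolved" ]; then
    echo "$1: did not resolve"
    return 2
  fi
  if ip_in_cidr "$resolved" "$2"; then
    echo "$1 -> $resolved (inside $2)"
  else
    echo "$1 -> $resolved (OUTSIDE $2)"
    return 1
  fi
}
```

On a node, something like check_name_on_network node-2 10.0.0.0/24 would then flag a name that still resolves to the PXE interface, assuming 10.0.0.0/24 is your internal network.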

Bandaids

A quick fix, though not 100% tenable, is to log into each node while it is still in the process of installing charms and edit /etc/hosts to reflect each node’s intended address on the desired NIC. I went a step further to make this more predictable and set each address to a known static address in MaaS. If I had dozens or hundreds of nodes this would be a nightmare, though I could probably write a script to automate this in a deployment scenario. The obvious downside is speed of installation, since the image is still pulled from MaaS over the 1G network, but that is better than the former issue. Unfortunately, MaaS doesn’t give you the ability to shift to the other network in a PXE environment, nor does it give you the ability to correct the DNS entries on its end. In fact, after reading through their Discourse I discovered the recommended fix was to point each node to your own DNS server after deployment. I suppose I could try to find a way to create a DNS server on the Juju controller or perhaps another system within the internal network. More thinking will need to be done here.
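As a starting point for automating the /etc/hosts edit, here is a rough sketch of an idempotent helper. The function name and the example addresses are made up for illustration, and it deliberately leaves a name alone if the file already has an entry for it, so it will not correct a wrong entry that is already there.

```shell
# Hypothetical helper: append "ip name" to a hosts file unless some
# non-comment line in that file already lists the name.
add_host_entry() {
  hosts_file=$1
  host_ip=$2
  host_name=$3
  # Make sure the file exists so awk has something to read.
  [ -f "$hosts_file" ] || : > "$hosts_file"
  # Look for the name as a whole field after the address column.
  if awk -v n="$host_name" '
      !/^[ \t]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
      END { exit !found }
  ' "$hosts_file"; then
    return 0
  fi
  printf '%s %s\n' "$host_ip" "$host_name" >> "$hosts_file"
}
```

Run as root on each node, a line such as add_host_entry /etc/hosts 10.0.0.12 node-2 per peer would pin the names to the internal NIC, and running it a second time changes nothing.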