I am trying to create a cross-model relation from an application in a vSphere model to an application in a MAAS model. This is the result when I try to create the offer from the vSphere model:
17:55:56 INFO juju.cmd supercommand.go:91 running juju [2.8.1 0 16439b3d1c528b7a0e019a16c2122ccfcf6aa41f gc go1.14.4]
17:55:56 DEBUG juju.cmd supercommand.go:92 args: []string{"/snap/juju/13324/bin/juju", "offer", "slurmctld:slurmd", "--debug"}
17:55:56 INFO juju.juju api.go:67 connecting to API addresses: [jimm.bdxbdx.com:443]
17:55:56 DEBUG juju.api apiclient.go:1105 successfully dialed "wss://jimm.bdxbdx.com:443/api"
17:55:56 INFO juju.api apiclient.go:1007 cannot resolve "jimm.bdxbdx.com": operation was canceled
17:55:56 INFO juju.api apiclient.go:637 connection established to "wss://jimm.bdxbdx.com:443/api"
17:55:56 DEBUG juju.api apiclient.go:1105 successfully dialed "wss://jimm.bdxbdx.com:443/api"
17:55:56 DEBUG juju.api monitor.go:35 RPC connection died
ERROR unknown object type "ApplicationOffers" (not implemented)
17:55:56 DEBUG cmd supercommand.go:537 error stack:
unknown object type "ApplicationOffers" (not implemented)
/build/snapcraft-juju-03af7d/parts/juju/src/rpc/client.go:178:
/build/snapcraft-juju-03af7d/parts/juju/src/api/apiclient.go:1200:
/build/snapcraft-juju-03af7d/parts/juju/src/api/applicationoffers/client.go:52:
It seems that the offer functionality is not implemented in vSphere …
We have architected a solution that uses cross-model relations to connect applications running in vSphere models to applications running in MAAS models. I didn’t previously understand that cross-model relations only work on certain clouds (not vSphere). Is there anything that can be done to get cross-model relations hooked up for MAAS and vSphere clouds?
Furthermore, I can’t seem to create a cross-model relation offer on a MAAS model either.
~$ juju status
Model Controller Cloud/Region Version SLA Timestamp
slurm-nash-1 jimm.bdxbdx.com bdxbdx-maas-1 2.8.1 unsupported 18:06:29Z
App Version Status Scale Charm Store Rev OS Notes
slurmd waiting 0/1 slurmd jujucharms 1 ubuntu
Unit Workload Agent Machine Public address Ports Message
slurmd/0 waiting allocating 0 10.104.195.134 waiting for machine
Machine State DNS Inst id Series AZ Message
0 down 10.104.195.134 n-c125 focal se-west-1a Failed deployment: Failed to power on node - Power on for the node failed: Could not contact node's BMC: Connection timed out while performing power action. Check BMC configuration and connectivity and try again.
$ juju offer slurmd:slurmd
ERROR unknown object type "ApplicationOffers" (not implemented)
Sorry about this. It’s a frustrating place to bump into the issue.
That error message is generated by the RPC infrastructure between Juju agents. It’s possible that this is a JIMM issue rather than a vSphere issue. I’ll check and get back to you.
CMR is not cloud specific - any standalone Juju controller on any cloud has the service facades needed to support CMR.
Ah, I just noticed your controller is “jimm.xxx” - have you set up a bespoke JAAS deployment? The Juju controllers will be fully CMR aware, but the front-end JIMM proxy may need work done to participate in a CMR deployment. A check of the JIMM project source suggests that the CMR facades are missing. We’d need to ask the folks on that team to address that omission.
@wallyworld Yes, we have a bespoke JAAS deployment. Moving in that direction sounds like a great start. Is there anything I can do to help relay the information and/or get the correct eyes on this?
We have an Ubuntu Advantage deal going on, and I think @erik-lonroth may have carried this bug over to the Ubuntu Advantage ticketing system as well. Not sure if that will help or not, but it’s over there also.
Not having cross-model relations functionality in Juju when it is combined with JIMM/candid is a major problem.
Let me elaborate on why:
So, we are running SLURM, which is a system that normally has many thousands of machines picked up from the underlying cloud. This can be MAAS or perhaps some other cloud with virtual machines. Each SLURM principal charm needs one or two subordinates, which adds a multiplier to the number of units in a model.
SLURM models then grow to potentially tens of thousands of units, which is not manageable at all. To remedy this, we (wanted to) architect SLURM with Juju by splitting the SLURM clusters into smaller models and using CMR to build a larger SLURM cluster out of many models. You can look at it as sharding or partitioning our HPC SLURM environment to keep it manageable.
This also saves a lot of machine resources from our MAAS/cloud, since we would not need to replicate central components of SLURM, like databases, SLURM controllers, etc. Maintaining multiple copies of these central components adds complexity as well, so CMR is needed for that purpose too.
Missing out on CMR with JIMM+Juju+candid puts us in a bad place where we are now left to choose between JIMM xor Juju…
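To make the unit arithmetic behind the sharding argument concrete, here is a minimal Python sketch. It is only an illustration: the function name `plan_models`, the example node count, and the `max_units_per_model` threshold are all hypothetical. In particular, the threshold stands for whatever size an operator considers manageable and is not a Juju limit.

```python
import math


def plan_models(total_nodes, subordinates_per_principal, max_units_per_model):
    """Estimate how many models are needed to keep each model manageable.

    All parameters are illustrative assumptions, not Juju settings:
    - total_nodes: machines running the SLURM principal charm
    - subordinates_per_principal: subordinate units attached to each principal
    - max_units_per_model: operator-chosen ceiling on units in one model
    """
    # Each node contributes one principal unit plus its subordinates.
    units_per_node = 1 + subordinates_per_principal
    total_units = total_nodes * units_per_node

    # How many nodes fit in a model without exceeding the chosen ceiling.
    nodes_per_model = max_units_per_model // units_per_node
    if nodes_per_model < 1:
        raise ValueError("max_units_per_model is too small for a single node")

    model_count = math.ceil(total_nodes / nodes_per_model)
    return {
        "units_per_node": units_per_node,
        "total_units": total_units,
        "nodes_per_model": nodes_per_model,
        "model_count": model_count,
    }


if __name__ == "__main__":
    # Hypothetical figures: 10,000 nodes, two subordinates each,
    # and at most 3,000 units per model.
    print(plan_models(10_000, 2, 3_000))
```

With these assumed figures, a single model would hold 30,000 units, while the split gives ten models of 1,000 nodes each. Those shards would then need CMR to reach the shared SLURM controller and database.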
I’m more than happy to provide more information and help in resolving this.
The good news is that CMR in JIMM is on the ‘to-do’ list, and we are going to post here on Discourse a design that @ziheliu214 recently discussed with users and teams for displaying CMR in the Dashboard (hence having it in JIMM/JAAS).
CC @hatch if you want to add anything to this.