Kubernetes Service Interface with Operator Framework

dominik.f · 4 November 2020 14:04

Hello!

@cory_fu and I have been working on a very simple interface for passing Kubernetes Service information from one Charm to the other.

Both the interface and the charms that are written with the operator framework.

There are still several points that we would like to discuss with the charmcraft and the juju team. Cory has recollected them in this PR comment. Which I will restate here:

How do we strongly encourage (or require) interface API authors to use versioned API protocols from the start?
Where should interface API components live / how should they be published?
Does this approach seem like the best pattern to encourage; specifically, of eschewing event handling in interface API components and instead using __init__ to examine the state of the model and distill it down to a more semantically relevant view? (Check the PR comment for more details)

We would love to hear your feedback about this.

chris.sanders · 5 November 2020 21:01

Commenting only on the 3rd bullet point about doing the setup in the init function of the interface.

I like emitting an event to let charms that implement the relation know that you’re now ready. That’s a really clean and easy to understand paradigm for charms to know how to use the interface.

I’m not clear why that’s being done in __init__ though. It would seem to me this is going to cause the event to be emitted every time this interface is initialized by the charm after the relation data has been setup. That means every status, config-changed, etc will be emitting this event again.

Wouldn’t it be a better flow to register on the joined/changed relations in the interface and then call k8s_service_available.emit() only when processing one of those events? Those are the only times that the interface actually need to to evaluate the relations and presumably is the only time we would need to re-validate.

It might not make a difference in a very simple charm, but if we’re talking about a common pattern that most interfaces would follow I would think doing it even on a simple interface would be more uniform and easier for charm authors.

dominik.f · 10 November 2020 13:34

I think those are really valid points and maybe @cory_fu can also add a little more to this.

Our main concern with using relation joined/changed was depending on this specific timing when these events are launched. Using __init__ the whole execution is more predictable and we can be sure the relation data will be kept up to date with the charm.

Regarding, every status, config-changed emitting the same event, this is avoided by checking if the data has changes using a hash.

cory_fu · 10 November 2020 17:02

I’m not clear why that’s being done in __init__ though. It would seem to me this is going to cause the event to be emitted every time this interface is initialized by the charm after the relation data has been setup. That means every status, config-changed, etc will be emitting this event again.

Since the data might be used (once available) on any possible hook depending on the needs of the charm using the API class, the most natural thing to do felt like inspecting the model during __init__ and setting up attributes with the data and whether it was complete. After that, since the data was already read and setup there, it then also felt natural to go ahead and check if it had changed there as well. (Note that the event is only emitted if the data is newly complete or if it had changed since becoming complete, so it wouldn’t be emitted every time as you thought.)

Actually, the original implementation tried to only read the data during the relation joined / changed hook but that led to thinking it needed to be cached in stored state to be available at other times and then race conditions keeping it in sync, which is what originally led to the current approach.

That said, as you point out, the relation joined / changed events are really the only point at which the data could possibly have changed, so it would be reasonable to do a hybrid approach of inspecting the model during __init__ (or even making the attributes into properties which inspect the model, to avoid asking for relation data when it might not be needed yet) and then only doing the changed check in the joined / changed handler and emitting the event there would work. My main argument against that approach, if you can really call it that, is that, and this mostly applies to K8s charms, you end up with a charm that consists of an __init__ with a slew of observes that all point to the same handler which then checks if preconditions are met and, if so, sets the pod spec. With that, a common source of error that I’ve already seen is not thinking of one or more hook events that you should include in the list and thus end up with the charm not doing the right thing at odd times. Just doing the checks and pod spec directly in __init__ leads to easier to follow code and avoids that source of error. But, of course, it also doesn’t encourage you to think about and understand the underlying life-cycle events which are happening, so once you do move on to more complex use-cases, you might get blindsided by the additional learning curve, and you can end up doing unnecessary work if you’re not careful, as well.

It’s also true that the expectation that events emitted in a component not be dropped seems like it will be a common and understandable one, so if we don’t want to actually allow for that in the framework, we might want to at least document it well and possibly even have it emit some sort of warning or even a fatal error (maybe just having the testing harness consider it fatal but not the runtime would be better).

Anyway, I think that the hybrid approach is the best solution for this, and just doing the data collection in __init__ is sufficient to avoid the timing issues that @dominik.f ran into in the first pass.

cory_fu · 10 November 2020 17:16

I should note that it might be better to use @property methods rather than actual attributes read during the component’s __init__ to avoid reading and deserializing data from the relation during hooks where it’s never actually used, but depending on how complex the data / deserialization of a given interface is, that could skew the other way (or require additional caching, but there are ways to make that easy, as well).

chris.sanders · 10 November 2020 17:24

So what you’re saying is, you’ve seen a race condition where storing the service_name and service_port in the state db during a relation-joined and relation-changed event caused nondeterministic results?

If that’s the case, I think there’s a bug we need to track down. It would seem you should be able to store those values on realtion-joined and relation-changed, updating the class attribute at that time, and in __init__ simply populate the values from the statedb.

This is all completely aside from the question of what’s the better approach, either should work and if you’re saying it doesn’t this is a simple charm we should fix that.

As for which is preferred, I lean toward storing in the statedb for one primary reason. Storing things in the statedb lets you address one specific relation at a time, based on an event. You don’t have to loop over all relations and throw out the ones you don’t want. Having to understand how relation data is stored to loop over and evaluate which ones you need seems unintuitve when Juju will give you the relation you care about during the realtion-* events. However, I don’t feel strongly about this I’m more concerned if it’s actually nondeterministic which shouldn’t be the case.

cory_fu · 10 November 2020 17:45

Well, there’s some inherent non-determinism involved since the timing and ordering of hooks and in which hook data is visible is only partially deterministic (e.g., you’re guaranteed to get a relation-joined hook before a relation-changed for a given relation ID, but you could have other hooks in between, and data might be visible during relation-joined or it might not be), but I don’t think any of that was insurmountable, just more prone to mistakes.

The other problem with caching data during the relation events is that any time you’re caching data, there’s the chance of it getting out of sync. That’s particularly true in the case of relation data where you then start having to worry about removing things at the appropriate time, and it makes it harder to present things in a consistent order, etc. You also end up duplicating all of the data, doubling the storage requirements on the controller. And personally, I actually find it more intuitive to think about it in terms of “I want to present a filtered view of the current set of data” which leads naturally to just looping over the relations.

cory_fu · 10 November 2020 18:20

I created PR #2 based on this discussion to have something concrete to compare against and discuss further.

chris.sanders · 10 November 2020 19:00

I like the new PR. That looks good to me now.

The difference of tracking relations in a statedb and updating on relevant events or filtering the relations is honestly an implementation detail and doesn’t affect someone consuming the relation. It could honestly even change over the lifecycle of an interface and consumers shouldn’t care.

facundo · 13 November 2020 19:30

Commenting on the first two items in the bullets points…

We’re in the process of adding to charmcraft a simple mechanism to publish and fetch libraries to/from Charmhub. You can think about these libraries like interfaces, components, or any piece of code that you as charm author want to share with other people, because other people using that library will interact easily with your own charm.

Regarding versioning, these libraries will be shared using an API version number (that will change only if the library mutates enough that break interfaces/signatures) and a PATCH version number to track the regular non-breaking evolution of the library.

I’ll be posting in this forum more information (and a tutorial!) about this library mechanism when we have it ready, but please let me know if you have any question or how I can help you!

Regards,

cory_fu · 16 November 2020 20:49

That all sounds great!

Regarding versioning, I was specifically thinking about versioning of the interface API protocol that the charms use to communicate, rather than the code itself. Because different charms may be using different versions of a given interface API library, there really needs to be some version negotiation step in case the protocol needs to evolve over time to add new features or clean up the communication protocol. This has happened quite a few times in the past with interfaces and has led to interface layer code which becomes a mess of logic to try to handle all different protocol versions. I’d really like to see something that provides protocol version negotiation akin to Juju’s websocket API protocol versioning in a way that makes it easy for interface authors to use and that we strongly encourage all interfaces to use.

The downsides or difficulties with this is that it does force at least one round of back-and-forth communication to negotiate the common version, which with relation data and hooks is a lot more time expensive than it is with a websocket connection, and it can be difficult to retrofit an API with version negotiation when it wasn’t using it before and one side just assumes it can start sending data without doing a version check first.

facundo · 11 January 2021 16:20

This functionality already landed in Charmcraft (you need the 0.6.1+76.gcebe5a1 versions, from the edge channel).

I also just published a tutorial to use this new Charm Libraries feature.