Where do relations live?

magicaltrout · 13 January 2021 16:58

Hello folks,

Continuing on from this thread:

and possibly rather dumb but stuff changes all the time and I can’t keep up.

Where do relations live these days then? Charms, charm libraries, git repos with links in layer-index, or somewhere else?

Thanks!

sabdfl · 14 January 2021 06:59

I can speak to what was discussed in our discovery and design sprint which led to the current framework.

At a low level, the framework should have some sort of data structure which represents the whole relation data set - the application-level key/values (set by the leaders) and the unit-level key/values too. We took to describing that as a sort of scoreboard. To use cricket as an analogy, you might have team-level scores and player-level scores. That’s where you would get the raw keys and values being set with relation-set (and the leader equivalent for application-level keys/values).
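To make the scoreboard idea concrete, here is a minimal sketch in plain Python. It is not the framework's actual data structure: the class name `RelationScoreboard`, its methods, and the sample keys are all made up for illustration. It only shows the two levels of data, with application-level writes restricted to the leader.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RelationScoreboard:
    """Hypothetical stand-in for one relation's data; not a framework class."""

    # "Team-level" scores: application name -> key/values set by its leader.
    app_data: Dict[str, Dict[str, str]] = field(default_factory=dict)
    # "Player-level" scores: unit name -> key/values set by that unit.
    unit_data: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def set_app(self, app: str, key: str, value: str, *, is_leader: bool) -> None:
        # Only the leader unit may write application-level data.
        if not is_leader:
            raise PermissionError("only the leader may write application data")
        self.app_data.setdefault(app, {})[key] = value

    def set_unit(self, unit: str, key: str, value: str) -> None:
        # Every unit may write its own unit-level data.
        self.unit_data.setdefault(unit, {})[key] = value


board = RelationScoreboard()
board.set_app("mysql", "database", "wordpress", is_leader=True)
board.set_unit("mysql/0", "addr", "10.0.0.5")
```

The real framework exposes this through its own relation objects; the sketch only mirrors the shape described above.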

A charm integration library would typically wrap this in a higher-level, branded object which offers a more application-specific view. Under the hood, you have an ‘endpoint’ and ‘relation data’, but a MySQL charm integration library might turn that endpoint into a ‘SingleDatabase’ object (which handles errors if you try to give the app two databases at the same time for example, which the generic model allows but the app might not have a meaning for). Inside that SingleDatabase object would be a MySQL-sensible set of attributes. If you dug into the code of that MySQL integration library and class, you would see the framework relation data object being parsed and transformed into the MySQL view.

The point of this is to have clean Python for the low-level juju-generic relation data, and equally clean Python for the high-level “I just want to connect a MySQL database” view.

Will ask those closer to the framework to weigh in.

niemeyer · 14 January 2021 10:00

“Where do relations live” is a great question because it’s hard to answer it without actually explaining the different layers involved:

The relation data lives inside the juju controller, and is associated with the particular units that are part of the relation. Events will be sent to the units when that data changes or the lifecycle of a related unit changes.
The relation interface is the “protocol” — the sequence of actions that units will perform using that data and its events. This is defined conventionally at the moment by the person or team that is responsible for that specific relation. We may eventually transform this into a more formal process, but the goal here is to enable people to solve their own problems without being constrained by an artificial process before we have a good understanding of best practices.
The implementation of the relation interface and protocol lives inside the charms themselves, in the form of code that handles that data and its events according to the established conventions. Charms are free to define their own implementation.
While charms are free to define their own implementation, redoing it every time is neither practical nor convenient. So it’s natural that a library of implementations emerges and gets reused. While people are free to create their own libraries, we’ve been slowly making progress on proposing good implementations for reuse or highlighting the best ones from the community.
Reusing parts from one another extensively is somewhat hard unless we sit on a common base, which is the reason why we’re also proposing a common framework that manages the data and events associated with relations, as well as the general interaction of a Python application with the juju agent and controller.

So those are the several layers of the problem space that surround the question “where do relations live”. It may feel like things change all the time, but actually if you have a charm that has a relation and respective interface working in juju, the same exact charm should continue to work unchanged for as long as its interface remains unchanged.

What we’ve introduced recently is a framework that makes it much more convenient to interact with those ideas from Python, and we’re also improving tooling so that it’s easy to share code that is based on this framework. With that we’re not replacing GitHub, or PyPI, or pip. We’re just making it easy to collaborate on this specific problem space.

Does that make sense? The question and the answer are both widely scoped, so feel free to pick something more specific for us to dive in.

sabdfl · 14 January 2021 11:02

At the Juju level, the only change to relation-data in the Juju controller and Juju agent protocol that I am aware of, in nearly 10 years, was the addition of application-level data which the leader units can write. relation-set and relation-get are the agent-level commands that handle that data, iirc.

All the work in the charm code space, with reactive and now the Python operator framework, uses those agent-level primitives to set and get these key-value relation data structures, and handles the events that come from a counterpart unit updating its data in the structure.

I think the remaining pieces of info I would like to see in this thread are:

can someone reply with an example of Python using the Python operator framework that gets or sets a value in the relation data, as exposed at the Python operator framework level?
can someone reply with an example of a higher-level application-specific application integration class, where the same sort of relation data is exposed, but in an application-centric way?

I would guess at something like this for the framework level expression of the relation-data:

# Raw relation data in the framework
value = self.endpoints['db'][n].remote[m]['addr']

In this rather pedantic guess-a-thon, self is the application that the charm is driving, endpoints['db'] is the endpoint for the relation indexed by declared name, [n] is to accommodate the general Juju application graph that allows multiple relations to a particular endpoint (you can connect multiple apps to the same logging service for example), remote[m] is a way of expressing that I want the unit data set by the unit m on the other side of the relation rather than the data from a unit of my peers, and ['addr'] is which data I want from that unit’s key-value pairs.

And I would guess at something like this for what a MySQL charm might offer up in an integration library:

# Endpoint that is only ever allowed to have a single database related
import charm.mysql.v3.mysqldb as mysql

db = mysql.SingleDB('db')
if db.connected:
    ip_addr = db.addr

Waving hands furiously, in this example I am importing the latest minor revision of the mysqldb integration library, major version 3, and telling it to give me an instance for any database attached to the endpoint ‘db’. I am ignoring units because those have been handled in the integration library class internals that I don’t want to care about (failover etc are all handled internally to the library, ideally).
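For what it’s worth, the inside of such a wrapper could look roughly like the sketch below. Everything here is an assumption made for illustration: the raw data shape (one dict per relation, mapping remote unit name to its key/values), the `TooManyDatabases` exception, and the rule of picking the first unit that has published an `addr`. It is not the API of any published MySQL library.

```python
from typing import Dict, List, Optional

# Hypothetical raw shape: one dict per relation on the endpoint,
# mapping remote unit name -> that unit's key/value pairs.
RawRelation = Dict[str, Dict[str, str]]


class TooManyDatabases(Exception):
    """Raised when more than one database is related to the endpoint."""


class SingleDB:
    """Illustrative wrapper; not the API of any published library."""

    def __init__(self, endpoint: str, relations: List[RawRelation]) -> None:
        # The generic model allows several relations; this view refuses them.
        if len(relations) > 1:
            raise TooManyDatabases(
                f"endpoint {endpoint!r} has {len(relations)} relations"
            )
        self.endpoint = endpoint
        self._relation: Optional[RawRelation] = relations[0] if relations else None

    @property
    def addr(self) -> Optional[str]:
        """Address of the first remote unit (by name) that has published one."""
        if self._relation is None:
            return None
        for unit in sorted(self._relation):
            value = self._relation[unit].get("addr")
            if value:
                return value
        return None

    @property
    def connected(self) -> bool:
        return self.addr is not None


# mysql/0 has joined but not yet published an address; mysql/1 has.
db = SingleDB("db", [{"mysql/0": {}, "mysql/1": {"addr": "10.0.0.7"}}])
if db.connected:
    ip_addr = db.addr
```

The charm author only sees `connected` and `addr`; the loop over units and the multiple-relation check stay inside the library.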

magicaltrout · 14 January 2021 11:51

Thanks folks, I think we’re heading down the right path. I’m just trying to ensure I do interfaces and relations in the operator framework in the manner you folks envisage it.

The dual layer interface stuff is certainly interesting and some concrete examples of how this should work would be much appreciated.

In pre-operator juju, we’d create a git repo, commit our interface code to this repo and submit a PR to the layer-index git repo, where they would all live so others could reuse our interfaces. Is this still the case for relation/interface sharing?

This worked alright, and I don’t really have any complaints about the process, but the annotation stuff will no doubt have changed from pre to post operator framework, so does the relation interface codebase change with regards to handling an operator charm?

I’ve been looking at a number of examples to try and get a handle on it:

Firstly James’s examples he wrote when asking about relations last year: https://github.com/jamesbeedy/operator-foo-requirer/blob/master/src/charm.py and https://github.com/jamesbeedy/operator-foo-provider/blob/master/src/charm.py

In this instance the relation code is in the charm itself, which of course doesn’t lend itself to reusability.

I’ve also been looking at:

which make use of a number of relations both on the provides side and the peer relation side.

In the postgres charm you have

provides:
  db:
    interface: pgsql
  db-admin:
    interface: pgsql

I’m making the current assumption that the relation information still comes from layer-index?

In James’s relation example he makes use of operator framework callbacks inside his interface. If you’re doing this and wanted to make this code reusable, is dumping it in the separate git repo and submitting a layer-index PR still the way to go?

It is unclear where postgres-k8s pulls its relation code from, so I’m assuming it’s using the old interface here: provides.py in interface-pgsql?

Sorry, I know it’s a bunch of random questions stuck together in a post. I’m just trying to not have to rewrite mine too much by following the most current processes!

sabdfl · 14 January 2021 12:35

Ah, is it possible that layer-index is a reactive construct? Reactive had the idea of sharing code, expressed as layers, with some awkwardness for multi-language situations where you could potentially have a layer in language A but still use it in a charm written in language B.

In the new Python operator framework, we have (controversially) gone the other way. We are optimising for charms written in Python, integrated with more charms written in Python. That way, we replace layers with simple Python libraries - literally foo.py can be exported from one charm and imported into another charm.

This is the ‘charm library’ capability that Facundo is mentioning in charmcraft 0.7.

Essentially, Charmhub becomes a super-simple PyPI. If you are the publisher of charm A you can publish a library foo.py. You publish major.minor versions, starting with 0, and the latest minor version for a given major version can be fetched with charmcraft. So instead of managing layers, you are just updating Python libraries. Charmcraft sticks those in a subdirectory of your charm, hence the import charm.a.v3.foo stuff in my handwavy example. This is guesstimated for ‘import the latest minor version of foo.py v3 from the A charm’, and it will work if you are using charmcraft to maintain the charm directory tree.
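The “latest minor version for a given major version” rule can be sketched in a few lines. This is not how charmcraft is implemented; the function name `latest_minor` and the list of published versions are hypothetical, and versions are modelled simply as (major, minor) tuples.

```python
from typing import Iterable, Optional, Tuple

Version = Tuple[int, int]  # (major, minor)


def latest_minor(published: Iterable[Version], major: int) -> Optional[Version]:
    """Return the highest minor release within one major version, if any."""
    candidates = [v for v in published if v[0] == major]
    # Tuples compare element by element, so max() picks the highest minor.
    return max(candidates, default=None)


# Hypothetical publication history for foo.py from charm A.
published = [(0, 1), (3, 0), (3, 4), (3, 2), (4, 0)]
chosen = latest_minor(published, 3)
```

Pinning to a major version means a charm keeps picking up compatible minor updates without being moved to v4 behind its back.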

magicaltrout · 14 January 2021 12:41

Yeah which I think is where some of the confusion arises:

https://github.com/juju/layer-index has layers and interface layers listed, from which you could pull interfaces. That gave an index of reusability.

So we’re now saying for interfaces we define them in a library, export the library and allow charms to consume that library? Or did I miss a step?

Thanks!

sabdfl · 14 January 2021 12:44

With the Python operator framework, you can export a Python library, which other charms can import, and charmcraft facilitates the process of exporting and importing/updating.

Low-level interface handling is the obvious use case for this, but exactly the same mechanism provides for general code sharing. In other words, the libraries you share can provide any Python you want, not just interface relation-data handling. There may be other capabilities. For example, a subordinate charm might offer up classes that interact directly with the subordinate workload, not just with the relation-data (since a subordinate workload is going to be right there alongside the main workload).

jameinel · 14 January 2021 12:55

There is work being done on Charmhub to make it easier to share just the interface definitions for interacting with a charm. This is still in progress, so the current mechanisms are built around “how do you get Python code packaged together”, which is either something like git submodules, or PyPI.

In the case of Postgresql specifically, I would recommend this library:

An example use of it is in

At the low level of the framework, you can see that it is doing:

for relation in self.model.relations[self.relation_name]:
    ...
    app_data = relation.data[self.model.unit.app]
    for k in ["database", "roles", "extensions"]:
        v = app_data.get(k, "")

Which is equivalent to

for relation in self.model.relations[self.relation_name]:
  relation.data[self.model.unit.app]["database"]

However, lib-pgsql does provide the higher-level semantics of PostgreSQLClient, which has a set of events related to the lifecycle of PostgreSQL.
It defines a custom event type (PostgreSQLRelationEvent) with attributes that you specifically care about (event.master gives you the pgsql connection string), plus custom events like PostgreSQLClient.on.database_available and PostgreSQLClient.on.master_changed. That way, in your charm you don’t have to worry about the individual relation joined/changed/etc events, but can wait for the logical “is PostgreSQL ready for me to talk to it?”
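The general pattern, turning raw relation changes into a small number of logical events, can be sketched without the framework at all. The class below is a hypothetical stand-in, not lib-pgsql: the `DatabaseClient` name, the `observe` method, and the assumption that the connection string arrives under a `master` key are all invented for the example.

```python
from typing import Callable, Dict, List, Optional


class DatabaseClient:
    """Hypothetical stand-in for a higher-level client; not lib-pgsql."""

    def __init__(self) -> None:
        self._master: Optional[str] = None
        self._handlers: Dict[str, List[Callable[[str], None]]] = {
            "database_available": [],
            "master_changed": [],
        }

    def observe(self, event: str, handler: Callable[[str], None]) -> None:
        self._handlers[event].append(handler)

    def _emit(self, event: str, master: str) -> None:
        for handler in self._handlers[event]:
            handler(master)

    def on_relation_changed(self, app_data: Dict[str, str]) -> None:
        """Low-level hook: called with the raw key/values on every change."""
        master = app_data.get("master")
        if not master or master == self._master:
            return  # nothing the charm author needs to hear about
        first = self._master is None
        self._master = master
        self._emit("database_available" if first else "master_changed", master)


seen: List[str] = []
client = DatabaseClient()
client.observe("database_available", lambda m: seen.append(f"available:{m}"))
client.observe("master_changed", lambda m: seen.append(f"changed:{m}"))

client.on_relation_changed({})                           # not ready, no event
client.on_relation_changed({"master": "host=10.0.0.5"})  # first connection string
client.on_relation_changed({"master": "host=10.0.0.5"})  # unchanged, no event
client.on_relation_changed({"master": "host=10.0.0.9"})  # failover
```

Four low-level changes collapse into two logical events, which is the part the charm author actually wants to react to.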

magicaltrout · 14 January 2021 13:13

Ah yes, thanks John! I forgot about ops-lib-pgsql from my poking around a couple of weeks ago. That gives me something to prod around in.

I’m not using Postgres in this case. I’m building out a data processing backend for some tooling we use, so I’m looking to develop some effective Zookeeper and Solr charms. Solr, if run in cloud mode, will need Zookeeper, and my web crawler charm will require a Solr relation, so I’m just ensuring I have some patterns to go on rather than just guessing.

So it’s clear layer-index is consigned to the dustbin of history. I shall try and knock up some interface code and see how I get on with libraries and imports. Looking forward to getting this all going, I really enjoy the new framework and tooling.

sabdfl · 14 January 2021 14:20

Very glad to hear that!