Supporting Pydantic's MISSING sentinel in Ops

Hi charmers,

Context

Pydantic now has a way to express the concept of “field not set” or “field is missing”:

https://docs.pydantic.dev/latest/concepts/experimental/#missing-sentinel

This is how it works:

from pydantic import BaseModel
from pydantic.experimental.missing_sentinel import MISSING

class C(BaseModel):
    foo: str | MISSING = MISSING
    bar: str | MISSING = MISSING

# JSON with missing fields can be parsed
> C.model_validate({"foo": "aa"})
C(foo='aa', bar=<MISSING>)

> C.model_validate({"foo": "aa"}).bar
<MISSING>

# Objects with missing fields can be constructed, implicitly or explicitly
> C(foo="aa")
C(foo='aa', bar=<MISSING>)

> C(foo="aa", bar=MISSING)
C(foo='aa', bar=<MISSING>)

Today, the Ops API for typed relation data, namely ops.Relation.save()doesn’t handle the missing values. It’s more of a bug really, the set of fields doesn’t match the set of values.

It turns out that Charm Tech is not of a single mind about what the expectation of .save(C(), ...) is.

Erase the missing fields

One could argue that since the fields are declared in the model, Ops ought to erase the missing fields in the databag. This would also allow for a certain round-trip property: c_in = C(…); rel.save(c_in); c_out = rel.load(C); c_in == c_out.

Ignore the missing fields

When the ops.Relation.load() and .save() were discussed at the sprint before last, it was decided that .save() should act like __dict__.update(), that is leave the existing fields alone and overwrite with new data only. The logic was that two independent pieces of code could write to the same databag. Under this reading, the Pydantic semantics of MISSING indicate that .save() should do nothing with the missing fields, leaving old data in the databag, if there is any.

Your opinion

Please let us know how you would use this feature!

Are you in the Erase or Ignore camp? Somewhere in between? Somewhere else engtirely?

I vote for erase as default. Though perhaps it’d more elegant to have two clear and distinct paths for the behaviours

  • databag.update(foo) same behaviour as dict.update, i.e. overwrite anything explicitly provided, leave everything else untouched
  • databag.set(foo) wipe the databag, then update.

I think for a reconciler charm it makes sense to always use the set semantics because a reconciler charm never cares about the previous state of a databag. For a delta charm, on the other hand…

1 Like

Here’s a completely different take: shouldn’t the absence of fields declared in the model be treated as an invalid action and error out?

If the fields are in the model, they are expected. If they are not set, the bag is malformed. If the model allows for states that do not include that property, that should be modelled as a separate valid state.

1 Like

Interesting take, @0x12b .

I was thinking this is the difference between optional (not required) and nullable (sent as null).

The wider context is handling schema evolution: being able to introduce new fields and retire old ones.

There’s an additional issue that comes with that: If you are to use multiple schemas to show the different states, how do you show that in the interfaces, etc? In some cases, how do you know which schema to use (eg: multiple different schemas for relation-changed events because your communication protocol takes multiple steps ?)

2 Likes

With code like this one:

class AppData(BaseModel):
    foo: str | MISSING = MISSING
    bar: str

...
data = db_relation.load(AppData, event.app)
data.foo = MISSING
relation.save(data, event.app)

for me it would break the principle of least astonishment if the field is not deleted. Besides that, I understand that the behavior will be different in nested fields if the field is not deleted and MISSING is supported.

From other point of view, it is hard to see a use case for this feature…

2 Likes

I retract my previous statement. I’d be fine with either or, even though it seems like a change in search for a problem. The value not being there is a good enough signal to it not being missing, IMO. :stuck_out_tongue:

1 Like

Thank you for your comments, charmers!

We’ve decided that when a class declares a field and the object has the MISSING value, we’ll remove that field from the databag on .save().

1 Like

The change is included in today’s Ops release, 3.6.0 :tada: