Making Data Fabric compete in the market

Hi Tom!

First of all, happy 2024 to you as well :wink: ! And many thanks for your feedback and comment!

Indeed, many of the things you mention are spot on, and they are things we are either working on or planning to work on. Let me address them below:

  1. Jupyter charm. Right now we have a first Jupyter integration with Charmed Spark, provided by a Spark image that can be run locally. We are currently in the process of separating this feature into a dedicated image (see PR here) in order to have more minimal and segregated images. This is a prerequisite for charming it, which is something that will be addressed in the mid-term (within the next few months). However, already with the image, one can spin up a Jupyter notebook powered by Spark locally. See here for an example of how to set this up easily if you have a MicroK8s cluster running locally. The UX may change slightly (e.g. the image to be used) once the PR lands, but hopefully you get the gist.

  2. Integration with Iceberg jars. We are about to integrate the Iceberg jars into the Charmed Spark image. We have a task for this in the current sprint backlog; you can see the Jira epic here.

  3. Object storage integration (and integrations in general). We are currently designing a charm, which should land in the next few months, to centralize integrations. The idea is indeed (as you also suggest) to NOT have the developer set up the S3-compatible backend bindings manually in the configuration (e.g. as suggested here), but rather to encode these via Juju relations. We envision a “Configuration Hub Charm” that relates to other charms (e.g. s3-integrator) and builds the low-level configuration for the user. The spark-client snap then uses these configurations when running a Spark job. Note that this can also work for other sets of configurations, e.g. monitoring, and for integrations in general.

  4. Flink. Unfortunately, Flink is not on our short-term roadmap, say until April-May this year, but it is certainly on our radar, and I have discussed this with @robgibbon quite a few times.
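
To make point 3 a bit more concrete, here is a rough sketch of the kind of low-level S3 configuration that a developer has to write by hand today and that we would like to have generated from relations instead. The property names are the standard Hadoop S3A options with Spark's `spark.hadoop.` prefix; the helper functions, the endpoint and the credentials are purely hypothetical placeholders, and this is not the actual design or API of the Configuration Hub charm.

```python
from typing import Dict


def s3a_spark_conf(endpoint: str, access_key: str, secret_key: str,
                   path_style: bool = True) -> Dict[str, str]:
    """Build Spark properties for an S3-compatible backend.

    Hypothetical helper for illustration only; the keys are standard
    Hadoop S3A options exposed through Spark's `spark.hadoop.` prefix.
    """
    return {
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # Many self-hosted S3-compatible stores need path-style access.
        "spark.hadoop.fs.s3a.path.style.access": str(path_style).lower(),
        # Assumption: TLS is enabled only when the endpoint URL uses https.
        "spark.hadoop.fs.s3a.connection.ssl.enabled":
            str(endpoint.startswith("https://")).lower(),
    }


def to_properties(conf: Dict[str, str]) -> str:
    """Render the configuration as `key=value` lines, sorted by key."""
    return "\n".join(f"{key}={value}" for key, value in sorted(conf.items()))


if __name__ == "__main__":
    # Placeholder endpoint and credentials, not real values.
    conf = s3a_spark_conf("http://minio.example.local:9000",
                          "my-access-key", "my-secret-key")
    print(to_properties(conf))
```

With the relation-based approach, the user would never need to write these lines: the values would come from the related charm (e.g. s3-integrator) and be picked up when a Spark job is submitted.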

Once again, many thanks for your feedback. It is really very much appreciated, and it is indeed nice to see that the need you are outlining here is aligned (mostly :wink: ) with the roadmap that we have envisioned.

Best, Enrico