Is the Reactive Framework Making Juju Slow? ⏱ - My Experiences With Juju So Far

zicklag · 29 October 2019 21:37

Hello everybody,

In this post I want to go over a bit of my experience with Juju and ask some questions.

Brilliant!, Brilliant!, Brilliant!

I found Juju recently and have since been doing some experimentation with it and have started on writing a basic charm. The first thing that I will say is that the idea is brilliant. I have experience with Chef, Docker, Packer, Terraform, Rancher, and Docker Swarm and I’ve told my partner before that I felt like there still needed to be a tool that facilitates communication between the applications that are being provisioned. Juju is that tool. Juju allows for a level of orchestration that practically allows you to create your own “orchestrator” that manages your applications in a way that Kubernetes or Swarm would never let you. It is a GAME CHANGER.

The Problem

Anyway, as I started to test Juju a little bit, one of the first things that I noticed was that it seems to be somewhat unresponsive or slow when it comes to building out the application stacks. For instance, even with a simple app consisting of an app charm and a PostgreSQL charm, when I make a relation between that app and Postgres it can take a couple of minutes before the app even responds to the new relation. Also, spinning up a Kubernetes cluster can take about an hour.

To clarify, I understand that the charms are doing a lot: they have to download, install, and configure the applications, but what seems to be taking a lot of time is not the installation or configuration of the apps, but the interaction between the apps over the relations. I’m not 100% sure this is the case yet, but I wanted to bring it up and see what other people’s experiences have been with it.

Initially, I just accepted the fact that things were going to be slower to spin up in Juju than they would be in something like Docker Swarm. It is a different environment and a different kind of way to have the apps interact. My problem came, though, when I started to try to write my own charm, and it started taking a long time to debug the charm because I would end up just waiting for the system to do something after I made a change, and I didn’t know what it was waiting for.

The charm I was trying to make was a super simple Docker charm that would run the CodiMD Docker container and connect to the Postgresql charm for the database. As I made deployments of my charm and looked at the juju debug-log I noticed that most of the time that the Postgresql charm spent doing stuff was actually spent printing huge stacks of logs like this:

tracer: starting handler dispatch, 166 flags set
tracer: set flag apt.installed.postgresql-10
tracer: set flag apt.installed.postgresql-client-10
tracer: set flag apt.installed.postgresql-client-common
...

This seems to be where Juju is spending a lot of its time. To me it looks like the reactive framework is slowing Juju down, maybe even slowing it down drastically. My suspicion is that the hooks are firing relatively quickly, but the reactive framework is limiting the speed that the layers can actually do their work with all of the flags that it is setting.

I’m not sure if Python itself is the bottleneck, or if it is just an inefficient implementation of the Reactive framework, or if I’m missing the source of the delays entirely, but I’d love to get some feedback.

In order to test whether or not it is the reactive framework slowing things down, I’m going to do some testing with using just the normal hooks, skipping the reactive system, to see if there is a big difference in the responsiveness. I haven’t done any tests yet but I’ll post any further experience here once I do.

Idea!

As long as the Juju hooks do run in a timely fashion without delays I was entertaining the idea of creating a Rust framework for writing charms that would serve essentially the same purpose as the existing Python reactive framework except it would be written in Rust and use Cargo for package management instead of the existing charm layers.

If the reactive framework is the performance bottleneck, then writing the framework in efficient Rust code should solve that. A lot of people might not want to write Rust, which is totally understandable, but Rust is by far my favorite language and I think it could work well for me and my team. Also, because the way that you interact with relations in Juju stays the same, charms written with a Rust reactive framework would be 100% compatible with charms written with plain hooks or with the Python reactive framework. You could, if there was value in it, even create Python bindings to the Rust reactive framework to get the extra performance but still be able to use the familiar language.

What I love about Juju is the fact that I’m free to design my own charm building framework if I need/want to.

Summary

These are just my thoughts after starting out with Juju and I would like to get feedback on whether or not I’m completely misinterpreting what Juju is doing or whether or not that is just a limitation of the system somehow. My Rust idea is mostly brainstorming and may or may not make sense yet, but I figured I’d throw it out there in case anybody had any thoughts.

I’m really loving Juju so far and I think that me and my team will be able to do some pretty awesome stuff with it. Can’t wait to see what we can accomplish.

timClicks · 30 October 2019 00:25

Absolutely love this enthusiasm @zicklag.

I think that you’re right that Reactive Charming is slow. Flags being set everywhere doesn’t help. There are also some other issues. Because they’re implemented as independent packages, every layer from every charm calls apt update and apt upgrade. This can make things painful if you’re not deploying to somewhere with a fast apt mirror.

RIIR? Not sure. Don’t get me wrong, I love Rust (otherwise I wouldn’t be publishing a book on it). The problem is that what’s really needed is a simple mechanism for charming. Saying “Welcome to Juju, please learn Rust to get started” won’t fly, I think. Yet, writing charms is too difficult currently. In a sense, the space for innovation here is wide open. If you can create something that makes it easy for people to write charms and that gains adoption, then give it a go.

I’ve been wondering whether charming could be more like web frameworks. There’s no single way to write a web app, perhaps there should be multiple ways to write charms? As long as the interfaces and endpoints are well documented - the internal charming details should be irrelevant.

Once upon a time long ago—and you’ll see some of our older docs reflect this—charms were written in “any language”. That caused lots of confusion because people didn’t really know what Juju was (is it just calling shell scripts?) and spurred a generation of charms that were poorly implemented and difficult to maintain.

So there’s a real tension between creating your own charming framework for yourself vs creating a framework for a whole community.

Some others have explored this area and/or have some thoughts:

@maaudet has written several dozen charms and has built up his own framework (loosely based on reactive I believe)
@Dmitrii is exploring a new Python-based framework that should eventually replace reactive
@chris.macnaughton and several of the OpenStack charmers played around with simplifying reactive for their purposes (see charms.openstack and the OpenStack charm guide)
@erik-lonroth, @jamesbeedy, @magicaltrout, and others have all invested lots of energy training people on charms and probably have lots of advice

jamesbeedy · 30 October 2019 02:25

@zicklag thanks for the feedback. What you perceive as “slow” could just be lag caused by the way you have implemented things, such as a handler not being triggered correctly, so that it doesn’t run until the next hook invocation and, to the user, things appear to be moving slowly. I have experienced this before and I could see it giving the impression that things are slow. I have been using Juju/reactive for a long time and haven’t ever heard of or seen anything “slow” in or around the reactive framework, so I would be interested in knowing what you think is slow about it. There are no computationally intensive tasks that reactive touches, so I’m not sure the language would have anything to do with the speed of anything here. Many people appreciate the logs because they give us information about what is going on in the charm. Logs can be turned down on a per-model basis if the user wants to quiet them down.

thumper · 30 October 2019 03:17

As @timClicks mentioned, Canonical is investing in a new Python framework for writing charms. It will become our “go to way of writing charms”.

Juju isn’t changing in its handling of the hook executions, so people are always free to write other frameworks, but there should be a simple guide that new people can follow that is the “blessed” way forward.

We should have something to show in more detail in the coming weeks. This framework is still very new, and we are ironing out some initial kinks with the initial charms that are being written in the new framework.

A key here is to have a method where you can produce a simple charm very simply, and gradually introduce complexity while having something that works. The intent is to avoid the very steep learning curve of the current reactive framework.

maaudet · 30 October 2019 15:49

I actually have added a layer on top of charms.reactive in order to simplify and clean the resulting code a bit by allowing maximum reusability, so I’ve been using reactive for some time. Basically, I have separated the logic parts into “actions” that are being called from the reactive parts. Therefore I can create lists of actions to execute for particular events which allows me to simplify the reactive parts. It’s mainly because I have logical paths that cross, so it became very hard to do using flags only (doable, but not that clean).
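To make the “lists of actions per event” idea a bit more concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical: `EVENT_ACTIONS`, `run_event`, and the three example actions are names invented for illustration, not part of charms.reactive and not the actual layer described in this post. In a real charm, `run_event` would presumably be called from a reactive handler rather than directly.

```python
from typing import Callable, Dict, List

# Hypothetical types and names; nothing here comes from charms.reactive.
Action = Callable[[dict], None]


def install_packages(ctx: dict) -> None:
    # Stand-in for real work; it only records that it ran.
    ctx.setdefault("log", []).append("install_packages")


def write_config(ctx: dict) -> None:
    ctx.setdefault("log", []).append("write_config")


def restart_service(ctx: dict) -> None:
    ctx.setdefault("log", []).append("restart_service")


# Each event maps to an ordered list of actions. Logical paths that cross
# simply share entries instead of being encoded as extra flags.
EVENT_ACTIONS: Dict[str, List[Action]] = {
    "install": [install_packages, write_config, restart_service],
    "config-changed": [write_config, restart_service],
}


def run_event(event: str, ctx: dict) -> dict:
    """Run every action registered for the event, in order."""
    for action in EVENT_ACTIONS.get(event, []):
        action(ctx)
    return ctx
```

The point of the design is that the reactive part only decides which event happened, while the reusable logic lives in small functions that can appear in several lists.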

The upgrade-charm calls for apt update and apt upgrade are indeed sometimes very slow and I have been thinking about disabling those calls for my charms in favor of only doing one during the install hook. I actually created an apt-update and an apt-upgrade action that can be automated using external systems or run manually.

The only issue that I had with upgrade-charm is that it’s called whenever a file is attached. It kinda makes sense to execute apt update and apt upgrade when you actually upgrade a charm.

Although each case is unique.

zicklag · 30 October 2019 19:25

I agree that that would not be a good general approach. Rust is definitely a barrier to some people and I wouldn’t want to make everybody use it.

With Juju hooks as it is you really don’t have to use any specific framework to write charms, which is absolutely to Juju’s credit. In a sense it already is similar to web frameworks: Juju provides an execution environment and an API ( or in this case CLI, in the hook environment ) that the framework uses to interact with the world, very similar to how a browser does. I think that the biggest difference is that there just isn’t as large a community for Juju yet so there aren’t a lot of mature options for it.
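As a rough illustration of “the hook environment is a CLI”, here is a sketch of a plain hook written in Python that only shells out to Juju’s hook tools. `relation-get` and `status-set` are real hook tools, but the function name `db_relation_changed`, the injectable `run` parameter, and the assumption that the remote unit publishes a `host` key are all mine, made up for the example.

```python
import json
import subprocess
from typing import Callable, Sequence


def run_tool(argv: Sequence[str]) -> str:
    """Run a Juju hook tool and return its stdout (only works inside a hook)."""
    return subprocess.check_output(list(argv), text=True)


def db_relation_changed(run: Callable[[Sequence[str]], str] = run_tool) -> str:
    """Hypothetical handler for a database relation-changed hook."""
    # With no key given, relation-get prints all of the remote unit's settings.
    raw = run(["relation-get", "--format=json"])
    settings = json.loads(raw) or {}

    # Assumption: the database charm publishes its address under "host".
    host = settings.get("host")
    if not host:
        run(["status-set", "waiting", "waiting for database host"])
        return "waiting"

    run(["status-set", "active", f"database at {host}"])
    return "active"
```

Passing the tool runner in as a parameter is just a convenience so the logic can be exercised outside of a hook context; a real hook would call `db_relation_changed()` with the default.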

Yes, I actually just finished walking through that old doc that just used Bash and the Juju hooks, but the thing about it was that it instantly made quite a bit more sense to me than the new reactive approach.

I completely see the value in the reactive model and I actually like it quite a bit conceptually. The problem I had with the reactive framework from a developer experience standpoint was that there seemed to be a little too much “magic” that I felt like I didn’t have a good explanation of. For example, I was a little confused where the pgsql argument to this function in hello-juju came from. I also found that while bash was supported in the reactive framework, there weren’t enough examples for me to figure out how to interact with the Postgresql interface ( I figured it out later after a lot of investigation ). Still, most of that was just a need for documentation.

One thing I do want to point out is that I think that writing Juju charms using the hooks alone is still a perfectly valid way to write charms. The docs made me feel like it was a sort of “second class” way to do it, but as a DevOps engineer who has years of experience writing Docker containers and automating things with shell scripts, writing hooks in bash was a very natural way to approach writing a Docker-powered charm ( I haven’t finished it yet, so maybe I’ll run into problems with it ).

The layer-based approach of the reactive framework is still a great model and it definitely handles the code-reuse problem better, but I don’t think we should play off the hook-based model as an “old” or “out-of-date” way to do it because it can actually be easier to understand and approach, I think, regardless of your background. That isn’t to say that I don’t think we should look for a better “default” way to write charms.

I see your point. At first I was thinking that me and my team would likely just write something that would work well for us, because everybody else does have the reactive framework to use, but I do think it would be beneficial to the community if we did try to design it with other people’s use-cases in mind as well. That way we could help grow the Juju community and get some valuable comparison with existing frameworks while hopefully learning more about patterns and how to write charms well.

We were thinking of maybe writing the framework itself in Rust but providing bindings for at least Python ( something I have some experience in ) and a CLI for use in Bash scripts. Then we could possibly provide bindings for JavaScript/WASM later. You could even throw Ruby in there if it was useful to enough people.

That way we could support Python, etc. while still being able to use Rust ourselves if we wanted to.

What seems slow is that it can spend around 15 seconds ( rough estimate, I might time it later to make sure ) logging the fact that it is setting 166 flags, at least once every time any hook for the PostgreSQL charm is run. It would seem to me that the setting of those flags should take a fraction of a second. I’m still working to understand what Juju is doing at different times, but as I wait for my apps to come up and I am watching the logs, it spends a lot of time just printing the fact that it is setting flags.

I have absolutely nothing against the debug logging or how verbose it is. That makes total sense and it is great that Juju will tell you that much about what is going on. What it seemed like was that the majority of the time my app was coming up was spent while those “tracer: set flag apt.installed.postgresql-10” logs were going and it seems like it shouldn’t be taking that long to set flags.

I’m doing my testing on AWS instances with 7Ghz cores and 1GB of ram so the CPU speed should not be a problem.

I don’t have any solid evidence yet, but I could probably get the Juju logs with timestamps in them to analyze. I could be misinterpreting where the time is spent and maybe I’m a little too impatient but I’m used to Docker Swarm where things can be changed quickly with very little downtime and I’m trying to get as close to that as I can in Juju.
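For what it’s worth, here is a minimal sketch of how that analysis could look once the logs are saved to a file. It assumes that each log line carries an `HH:MM:SS` timestamp somewhere before the message and that the flag messages contain the text `tracer:`; I have not verified the exact debug-log layout, so treat both as assumptions. The function name `tracer_bursts` is made up.

```python
import re
from datetime import datetime, timedelta
from typing import Iterable, List

# Assumption: every relevant log line contains a wall-clock HH:MM:SS stamp.
TIME_RE = re.compile(r"\b(\d{2}:\d{2}:\d{2})\b")


def tracer_bursts(lines: Iterable[str]) -> List[timedelta]:
    """Return the duration of each run of consecutive 'tracer:' log lines."""
    bursts: List[timedelta] = []
    start = last = None
    for line in lines:
        match = TIME_RE.search(line)
        if match is None:
            continue  # lines without a timestamp are ignored
        stamp = datetime.strptime(match.group(1), "%H:%M:%S")
        if "tracer:" in line:
            if start is None:
                start = stamp
            last = stamp
        elif start is not None:
            # A non-tracer line ends the current burst.
            bursts.append(last - start)
            start = last = None
    if start is not None:
        bursts.append(last - start)
    # Note: bursts that cross midnight would come out negative in this sketch.
    return bursts
```

Summing the returned durations and comparing them with the total time between the first and last log line would give a rough idea of how much of the wait really coincides with the flag messages, although it still would not prove that setting flags is what takes the time.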

I’m very glad to hear both of those. Having a blessed way forward is important to uniting the community, and for giving newcomers something they can be confident in learning initially. Preserving the ability to write your own frameworks is important, too, because it means that Juju is much less likely to be insufficient for people with special needs. For example, my team will most likely use Docker heavily to power our Juju charms, so it may prove valuable to us to create a Docker-specific charm-writing framework that allows us to create our own “Docker orchestrator” through Juju.

Sounds great.

That sounds like a pretty good idea. It kind of combines the more procedural approach with the reactive approach. At this point I’m very interested in different ways to model charms and I’ll probably try that out.

Not that everybody should use Docker, but that is one case where Docker containers work very well. Each container image and each version ( tag ) of that image is isolated and the apt upgrades and updates are run during the container build phase so that when you need to run or upgrade the application all you have to do is pull the Docker image and it will have what you need in it.

I’m hoping that that will help the deployment speed a bit.

timClicks · 30 October 2019 22:14

Just picking up this thread (will respond to some of your other points when I get the time)… some “Juju driving Docker” experience has been written up:

https://discourse.jujucharms.com/t/deploy-your-docker-container-to-any-cloud-with-charms/1135

maaudet · 31 October 2019 15:17

Another cool thing about that is that I will usually create a base layer that includes all the actions, plus one or more charm layers that use that base layer and override some actions to change how it works. That way I have less maintenance to do for charms like MariaDB + Galera MariaDB or Redis + Redis w/ Sentinel. It allows me to quickly create variants and reuse most of the code. It also keeps the code a lot cleaner since the charm layers contain only the Galera- or Sentinel-specific code. I also have a few “global” actions that I usually use everywhere, so I only have to change them once for the change to apply to all my charms.

zicklag · 31 October 2019 20:16

Thanks, I’ve read through that and started experimenting with it a bit.

I talked to my partner and we decided that, for us, it will make the most sense to create a charm framework that is specific to helping you write Docker-powered charms. Docker solves some common problems that you might have when making a charming framework, such as the apt update and apt upgrade scenarios. You skip the need for most charms to install applications on the host because the software is already prepared inside of the Docker image. It also makes it easier to co-locate apps on a server without having to worry about the software for each application conflicting with each other. We realized that pretty much every charm that we will make will most likely be powered by Docker and that there will be a large overlap between the requirements of each of those charms.

What that will mean for the community is that while we will not be providing a fully general purpose charming framework, we will still be providing a charming framework that anybody can use to easily create Docker powered charms.

Everything is in flux right now, but we are thinking that we will start out with a more hook-based approach to writing the charm code and focus on letting you write the charm code in bash, or any other executable format, just like Juju does for its hooks. Bash will likely be second-nature to anybody who writes Docker containers so it will be a natural fit for our target audience.

I will probably be creating some documentation to outline our design plans as we work through research and development. I’ll post the documentation link here once it is up.

zicklag · 2 November 2019 18:42

I just finished the first draft of the design documentation for our “Lucky” charm framework. The repository is on GitHub:

https://github.com/katharostech/lucky

The documentation is here:

https://katharostech.github.io/lucky

timClicks · 2 November 2019 18:45

Love the name. Well done getting a first iteration released so quickly!

zicklag · 11 November 2019 17:06

Something worth noting here that I just found in the docs:

I didn’t realize that, and I had been running both my app charm and PostgreSQL charm on the same host in most of my tests. That still doesn’t defeat the motivation behind making the Lucky charm framework, but it is good to realize that this can affect the speed at which your charms get updated.

jamesbeedy · 12 November 2019 22:44

s/system/unit/ - I think the wording is incorrect in the docs.

timClicks · 12 November 2019 22:50

Have updated the docs.

zicklag · 13 November 2019 17:52

Ah, OK, that makes more sense to me. Cool.

timClicks · 14 November 2019 07:41

There are places where unit agents will acquire a machine lock. For example, only one charm can execute apt commands at a time.

zicklag · 8 December 2019 23:40

I’ve got an update on the progress with our Lucky tool for writing Juju charms.

We have been making steady progress with a focus on the developer experience, ease of use, and documentation. In order to make sure that developers can get started quickly and understand how to use Lucky, we have created a built-in command-line documentation viewer, of which I have a screenshot below. This will put the documentation for Lucky at the developer’s fingertips and make sure that they can access the essential help information even when offline. The viewer is cross-platform and has no external dependencies.

Additionally, we just finished creating packages for Windows, Mac, and Linux. We have a Chocolatey package for Windows, a Homebrew package for Mac, and a Snap package for Linux. For each platform you can now install/upgrade Lucky with a single command.

As far as the technical features of Lucky go, we have progressed far enough to actually create a Juju charm with Lucky and deploy it. This was an extremely minimal test equivalent to a “Hello World” charm, but it proved that the concept was sound. With our UX-targeted features in place now, we are going to focus on adding the features necessary to create a useful charm with Lucky.

Things are going very well and we are excited to continue development.

sabdfl · 29 December 2019 10:28

Hi there. This looks like great work! As I think you know, there is work ongoing on a next-generation framework to replace the reactive approach, and it would be super to get your input into that effort. While it’s fine for there to be multiple charm frameworks, it’s obviously better if we can get the most rounded set of tools with a common core approach.

matuskosut · 29 December 2019 13:25

I think this will actually be quite a lot of work to handle again for Docker. The current way of using bundles and application units in LXD containers seems very smooth to me. It solves not just the co-location of apps, but also linking them. Assuming each app has a separate charm, one can define custom relations between them. We have a range of interfaces, and charms take care of the rest.

I would then expect the same from Docker: that a container is deployed as an application (in terms of Juju abstractions), so one could model infrastructure in a similar way to what Juju already enables. Having a Docker Swarm behind it was also one of my thoughts. When I was thinking about such a Docker solution, I was considering either docker-app subordinates (a Juju app being a Docker container/service) or having it as a Juju cloud (like Kubernetes). The second one, similar to Kubernetes, feels like a cleaner solution to me, but I haven’t used it (yet) and of course I would expect a lot more work there.

Do you plan to make such an abstraction to connect more with the Juju approach? I wasn’t clear about it, but I definitely like using Docker for quite a few things…

zicklag · 29 December 2019 20:42

Yes, I was aware of that and I am eager to try it out as soon as it is ready for testing. Maybe it already is and I missed it.

A lot of the motivation for making another charming framework was that me and my team thought that we could make a framework that would be more suited to the way we work by making it specific to Docker. We know that not everybody will want to use Docker so Lucky may not be suitable for everyone, but by making it specific to Docker, we can add features that make life easier when using Docker that we would not otherwise have in a more generic framework.

That being said, it appears that Lucky may end up being perfectly suitable for non-Docker charms, though that was not originally a goal. That could make it more comparable to the next-gen charming framework.

A big difference between Lucky and the upcoming charming framework is that the new charming framework targets Python only, while Lucky is designed specifically to be very easy to create charms with shell scripts. The Lucky CLI is used to perform all charm actions and can be called from shell or any other language such as Python. It is also very possible that we will have a native Python library later so that you don’t have to spawn Lucky CLI calls when using Python to write your charms.

While I agree that having one solid way to write charms would be good, I think both frameworks take a different approach and have different strengths for different use-cases. I will definitely want to do some comparison between the frameworks and see what Lucky might be able to learn from the upcoming charming framework.

Lucky’s design will facilitate that abstraction in a rather un-magical way that doesn’t require a lot of work. I’ll explain some of how it is put together:

Firstly, almost every concept that is already established about Juju charms stays the same. You write a charm for each application and you define the interfaces that that application provides/requires so that you can relate them to each other. A Lucky charm in the charm store will be indistinguishable from a reactive charm or any other kind of charm.

The Lucky framework gives you an easy way to run Docker images as a part of your charm. For example, if you are making a CodiMD charm, the Lucky framework will make it easier to run and update the CodiMD docker container based on Juju hooks and relations.

Lucky works more like the out-of-the-box hook-based charm creation strategy than like the reactive framework. You can specify a set of scripts that will be run in direct response to Juju hooks. Lucky allows you to run these scripts either inside the container or outside the container on the host. For example, you could have a healthcheck script that runs on Juju’s update-status hook and gets executed inside of the container, where it can check the health of the application, or you could have an install hook that runs preparations on the host that are needed before running the Docker container. Lucky handles installing Docker on the host and running the Docker container according to the properties that you set using the Lucky CLI.

In addition to the hook-based charm development strategy we are also going to add a reactive feature that allows you to specify scripts to be run when certain predicates are satisfied in the Lucky key-value store.

Lucky provides a CLI for setting key-value pairs in a unit-local key-value store. Lucky will allow you to specify that container or host scripts should be run in response to keys and values in the store. For example, you could have an upgrade-database.sh script that you want to be run in the container whenever the needs-upgrade key is set to true and the installing key is set to false. This allows for a workflow similar to the reactive framework.
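To illustrate the kind of predicate matching described above, here is a small Python sketch. It is not Lucky’s implementation or configuration format, since the design is still in flux; `RULES` and `scripts_to_run` are hypothetical names, and the only details taken from the example above are the script name and the two keys.

```python
from typing import Dict, List, Tuple

# Hypothetical rule table: script name -> key/value pairs that must all match.
# This is only an illustration of the idea, not Lucky's actual format.
Rule = Tuple[str, Dict[str, str]]

RULES: List[Rule] = [
    ("upgrade-database.sh", {"needs-upgrade": "true", "installing": "false"}),
]


def scripts_to_run(store: Dict[str, str], rules: List[Rule] = RULES) -> List[str]:
    """Return the scripts whose predicates are all satisfied by the store."""
    return [
        script
        for script, wanted in rules
        if all(store.get(key) == value for key, value in wanted.items())
    ]
```

With a store of `{"needs-upgrade": "true", "installing": "false"}` the upgrade script is selected; as soon as `installing` is `"true"` or either key is missing, nothing is selected.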

Something Lucky will not have yet that the reactive framework does have is layers. Each Lucky charm will be standalone without building on other charms or layers. This may change in the future if it seems that such an addition would be worthwhile. To a certain extent the motivation for having layers is less pressing in Lucky because all of your application’s dependencies should be, for the most part, bundled into a pre-built Docker image. If you need something similar to layers for your application installation, you could instead use base Docker images to build multiple variants of a Docker image on the same base image(s).

By using Docker images for the application deployment and installation, Lucky charms should install faster than many existing charms. Lucky charms will almost never have to communicate to apt repositories to install software. Docker images with the software packages already installed will be downloaded instead.

The overhead of the Lucky charm framework should also be minimal. All of the logic in the framework itself runs almost instantaneously and we should not run into the problem that we had with the reactive framework where the framework itself was slowing down the execution of charm logic.

I hope that gives you a better picture of how the Lucky framework will work. If you have any other questions do not hesitate to ask!

We are getting close to having the framework capable of creating real-life charms and will have a getting started guide and documentation once it is ready. We’re glad to get any ideas or feedback.