Background
Juju k8s charms communicate to Juju the artifacts needed to provision their workloads.
The guiding principle is that we model everything that is generic.
The initial model was deliberately quite small in the entities that were defined.
As the Kubeflow k8s charms were developed, additional k8s functionality needed to be delivered in the pod YAML that Juju passes through to “deploy” the charms.
V1 Features
As a reminder, here’s what was done for v1.
Juju defines a substrate agnostic model which the charms use to specify what they want.
Key concepts include:
- containers
- image path, access secrets
- ports
- resource limits via constraints (mem, cpu-power supported)
- affinity via constraint tags
- config files created on the workload filesystem
- workload config via environment variables
- storage (via the standard Juju storage modelling)
- security (run as root, allow privilege escalation etc)
- k8s specific custom resources
Charms specify what they need in a YAML file and send it to the controller using the pod-spec-set hook command.
A curated subset of k8s specific sections was included in the primary YAML file, e.g. the liveness probe.
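To make the v1 model concrete, here is a minimal illustrative podspec in the v1 syntax; the container name, image, port and file values are invented for the example.

```yaml
containers:
  - name: gitlab            # hypothetical workload
    image: gitlab/latest    # image path (access secrets may also be specified)
    ports:
      - containerPort: 8080
        protocol: TCP
    config:                 # workload config via environment variables
      GITLAB_PORT: 8080
    files:                  # config files created on the workload filesystem
      - name: configurations
        mountPath: /etc/gitlab
        files:
          gitlab.cfg: |
            [global]
            log-level = info
```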
V2 Features
Firstly, we introduce a version attribute to allow us to maintain compatibility with v1 and also allow subsequent improvements (eg additions to what we model).
version: 2
k8s specific artifacts will be specified using a separate YAML file to keep a clean separation of what’s modelled and what’s k8s. What is added to the k8s specific YAML is an opinionated and curated subset of what’s possible using kubectl and native k8s YAML directly.
We add support for missing features:
- config maps
- service accounts
- workload permissions and capabilities
- secrets
- custom resources
We split the YAML into 2 files - one for core modelling concepts that map well to the Juju model, and the other for k8s specific things like CustomResourceDefinitions, Custom Resources, and Secrets.
$ pod-spec-set spec.yaml --k8s-resources resources.yaml
Most charms will not need any k8s specific resources, so that YAML file is passed as an optional argument.
Charm metadata.yaml
We added a minimum k8s version attribute, similar to minVersion for Juju. This will live with the other k8s deployment attributes.
deployment:
  min-version: x.y
  type: stateless | stateful
  service: loadbalancer | cluster | omit
Note: setting service to omit is now used instead of omitServiceFrontend in the podspec YAML.
Changes to the podspec YAML
The following sections describe v2 specific changes to the podspec YAML file passed as the first argument to pod-spec-set.
Workload permissions and capabilities
We allow a set of rules to be associated with the application to confer capabilities to the workload; a set of rules constitutes a role. If a role is required for an application, Juju will create a service account for the application with the same name as the application. Juju takes care of the internal k8s details like creating a role binding etc automatically.
Some applications may require cluster scoped roles. Use global: true if cluster scoped roles are required.
serviceAccounts:
  automountServiceAccountToken: true
  # roles are usually scoped to the model namespace, but
  # some workloads like istio require binding to cluster wide roles;
  # use global = true for cluster scoped roles
  global: true
  # these rules are based directly on role rules supported by k8s
  rules:
    - apiGroups: [""] # "" indicates the core API group
      resources: ["pods"]
      verbs: ["get", "watch", "list"]
    - nonResourceURLs: ["*"]
      verbs: ["*"]
Config Maps
These are essentially named databags.
configMaps:
  mydata:
    foo: bar
    hello: world
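For reference, a config map defined this way corresponds to a native k8s ConfigMap along the following lines (the metadata shown here is an assumption about how Juju names the generated resource, using the configMaps key):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: mydata   # assumed: named after the configMaps key
data:
  foo: bar
  hello: world
```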
Scale Policy
As well as setting annotations, it's now possible to set the scale policy for services, i.e. whether the workload pods should be started serially (one at a time) or in parallel. The default is parallel.
service:
  scalePolicy: serial
  annotations:
    foo: bar
k8s specific container attributes
k8s specific container attributes like liveness probes and security context info are now in their own section under each container definition.
containers:
  - name: gitlab
    image: gitlab/latest
    kubernetes:
      securityContext:
        runAsNonRoot: true
        privileged: true
      livenessProbe:
        initialDelaySeconds: 10
        httpGet:
          path: /ping
          port: 8080
      readinessProbe:
        initialDelaySeconds: 10
        httpGet:
          path: /pingReady
          port: www
K8s Specific YAML
This YAML includes things like custom resources and their associated custom resource definitions, as well as secrets etc. All of the following are passed to Juju by placing any required sections in the file passed via the --k8s-resources argument to pod-spec-set.
The YAML syntax is curated from the native k8s YAML to remove the boilerplate and other unnecessary cruft, leaving the business attributes. Here’s an example of defining a custom resource definition and a custom resource. These could well be done by different charms, but are shown together here for brevity.
kubernetesResources:
  customResourceDefinitions:
    tfjobs.kubeflow.org:
      group: kubeflow.org
      scope: Namespaced
      names:
        kind: TFJob
        singular: tfjob
        plural: tfjobs
      versions:
        - name: v1
          served: true
          storage: true
      subresources:
        status: {}
      validation:
        openAPIV3Schema:
          properties:
            spec:
              properties:
                tfReplicaSpecs:
                  properties:
                    # The validation works when the configuration contains
                    # `Worker`, `PS` or `Chief`. Otherwise it will not be validated.
                    Worker:
                      properties:
                        replicas:
                          type: integer
                          minimum: 1
                    PS:
                      properties:
                        replicas:
                          type: integer
                          minimum: 1
                    Chief:
                      properties:
                        replicas:
                          type: integer
                          minimum: 1
                          maximum: 1
    tfjob1s.kubeflow.org1:
      group: kubeflow.org1
      scope: Namespaced
      names:
        kind: TFJob1
        singular: tfjob1
        plural: tfjob1s
      versions:
        - name: v1
          served: true
          storage: true
      subresources:
        status: {}
      validation:
        openAPIV3Schema:
          properties:
            spec:
              properties:
                tfReplicaSpecs:
                  properties:
                    # The validation works when the configuration contains
                    # `Worker`, `PS` or `Chief`. Otherwise it will not be validated.
                    Worker:
                      properties:
                        replicas:
                          type: integer
                          minimum: 1
                    PS:
                      properties:
                        replicas:
                          type: integer
                          minimum: 1
                    Chief:
                      properties:
                        replicas:
                          type: integer
                          minimum: 1
                          maximum: 1
  customResources:
    tfjobs.kubeflow.org:
      - apiVersion: "kubeflow.org/v1"
        kind: "TFJob"
        metadata:
          name: "dist-mnist-for-e2e-test"
        spec:
          tfReplicaSpecs:
            PS:
              replicas: 2
              restartPolicy: Never
              template:
                spec:
                  containers:
                    - name: tensorflow
                      image: kubeflow/tf-dist-mnist-test:1.0
            Worker:
              replicas: 8
              restartPolicy: Never
              template:
                spec:
                  containers:
                    - name: tensorflow
                      image: kubeflow/tf-dist-mnist-test:1.0
    tfjob1s.kubeflow.org1:
      - apiVersion: "kubeflow.org1/v1"
        kind: "TFJob1"
        metadata:
          name: "dist-mnist-for-e2e-test11"
        spec:
          tfReplicaSpecs:
            PS:
              replicas: 2
              restartPolicy: Never
              template:
                spec:
                  containers:
                    - name: tensorflow
                      image: kubeflow/tf-dist-mnist-test:1.0
            Worker:
              replicas: 8
              restartPolicy: Never
              template:
                spec:
                  containers:
                    - name: tensorflow
                      image: kubeflow/tf-dist-mnist-test:1.0
      - apiVersion: "kubeflow.org1/v1"
        kind: "TFJob1"
        metadata:
          name: "dist-mnist-for-e2e-test12"
        spec:
          tfReplicaSpecs:
            PS:
              replicas: 2
              restartPolicy: Never
              template:
                spec:
                  containers:
                    - name: tensorflow
                      image: kubeflow/tf-dist-mnist-test:1.0
            Worker:
              replicas: 8
              restartPolicy: Never
              template:
                spec:
                  containers:
                    - name: tensorflow
                      image: kubeflow/tf-dist-mnist-test:1.0
Secrets
Secrets will ultimately be modelled by Juju. We're not there yet, so initially we add the secret definitions to the k8s specific YAML file. The syntax and supported attributes are tied directly to the k8s spec. Both string and base64 encoded data are supported.
secrets:
  - name: build-robot-secret
    type: Opaque
    stringData:
      config.yaml: |-
        apiUrl: "https://my.api.com/api/v1"
        username: fred
        password: shhhh
  - name: another-build-robot-secret
    type: Opaque
    data:
      username: YWRtaW4=
      password: MWYyZDFlMmU2N2Rm
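The base64 encoded values behave exactly as they do in native k8s secrets. As a quick sanity check, the data values in the second secret above decode as follows (a plain Python sketch, nothing Juju specific):

```python
import base64

# Decode the example values from the second secret above.
print(base64.b64decode("YWRtaW4=").decode())          # admin
print(base64.b64decode("MWYyZDFlMmU2N2Rm").decode())  # 1f2d1e2e67df
```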
Pod Attributes
k8s specific pod attributes are defined in their own section.
pod:
  restartPolicy: OnFailure
  activeDeadlineSeconds: 10
  terminationGracePeriodSeconds: 20
  securityContext:
    runAsNonRoot: true
    supplementalGroups: [1, 2]
  readinessGates:
    - conditionType: PodScheduled
  dnsPolicy: ClusterFirstWithHostNet