Streamlining Kubernetes deployment with Jsonnet: Part 2

Part 1 gave an overview of jsonnet syntax and semantics. I’m now going to suggest some ways to use these with kubernetes. These are not hard-and-fast rules, but some patterns you may or may not find useful.

To illustrate this, I’m going to build a complete configuration of the docker registry container. It will require a Deployment, a Service, an Ingress, and a PersistentVolumeClaim. As well as being a good example of a traditional container application, it’s a useful piece of local infrastructure to deploy (if you don’t have one already).

The Deployment

Let’s work on the Deployment, the main resource, which defines how to spin up the pod. Our starting point is this YAML resource definition:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: docker-registry
  labels:
    app: docker-registry
spec:
  revisionHistoryLimit: 1
  replicas: 1
  selector:
    matchLabels:
      app: docker-registry
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: docker-registry
    spec:
      shareProcessNamespace: true
      containers:
        - name: docker-registry
          image: registry
          ports:
            - containerPort: 5000
              protocol: TCP
          volumeMounts:
            - name: storage
              mountPath: /var/lib/registry
          env:
            - name: REGISTRY_HTTP_ADDR
              value: :5000
            - name: REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY
              value: /var/lib/registry
      volumes:
        - name: storage
          persistentVolumeClaim:
            claimName: registry-pvc

That’s fine. Indeed there’s nothing much we want to parameterise yet. But let’s turn it into jsonnet anyway, using the conversion tool:

// deployment.jsonnet
{
  apiVersion: 'apps/v1',
  kind: 'Deployment',
  metadata: {
    name: 'docker-registry',
    labels: {
      app: 'docker-registry',
    },
  },
  spec: {
    revisionHistoryLimit: 1,
    replicas: 1,
    selector: {
      matchLabels: {
        app: 'docker-registry',
      },
    },
    strategy: {
      type: 'Recreate',
    },
    template: {
      metadata: {
        labels: {
          app: 'docker-registry',
        },
      },
      spec: {
        shareProcessNamespace: true,
        containers: [
          {
            name: 'docker-registry',
            image: 'registry',
            ports: [
              {
                containerPort: 5000,
                protocol: 'TCP',
              },
            ],
            volumeMounts: [
              {
                name: 'storage',
                mountPath: '/var/lib/registry',
              },
            ],
            env: [
              {
                name: 'REGISTRY_HTTP_ADDR',
                value: ':5000',
              },
              {
                name: 'REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY',
                value: '/var/lib/registry',
              },
            ],
          },
        ],
        volumes: [
          {
            name: 'storage',
            persistentVolumeClaim: {
              claimName: 'registry-pvc',
            },
          },
        ],
      },
    },
  },
}

Now, let’s say we want to add optional registry authentication (HTTP Basic Auth). The container supports this as a feature: to use it, we need to make an htpasswd file available (as a volume and volumeMount), and set some environment variables.

From the point of view of jsonnet, we’d like to pass in a flag which says whether we want authentication, and to add the extra fields if required. OK, so let’s turn this into a function: add at the top

function(conf)

Then we need to modify some sections. First the environment:

            env: [
              {
                name: 'REGISTRY_HTTP_ADDR',
                value: ':5000',
              },
              {
                name: 'REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY',
                value: '/var/lib/registry',
              },
            ] + if conf.auth then [
              {
                name: 'REGISTRY_AUTH_HTPASSWD_REALM',
                value: 'docker-registry-realm',
              },
              {
                name: 'REGISTRY_AUTH_HTPASSWD_PATH',
                value: '/auth/htpasswd',
              },
            ] else [],

and the volumeMounts:

            volumeMounts: [
              {
                name: 'storage',
                mountPath: '/var/lib/registry',
              },
            ] + if conf.auth then [
              {
                name: 'htpasswd',
                mountPath: '/auth',
              },
            ] else [],

and the volumes:

        volumes: [
          {
            name: 'storage',
            persistentVolumeClaim: {
              claimName: 'registry-pvc',
            },
          },
        ] + if conf.auth then [
          {
            name: 'htpasswd',
            secret: {
              secretName: 'registry-htpasswd',
            },
          },
        ] else [],

Now, this works — and it’s how you’d do this in a helm template — but it’s ugly. The authentication configuration is scattered across our resource template, obscuring the overall structure, and it will get more and more complex as we add further options, such as other authentication types.

Let’s take a step back. What would the user do, if we hadn’t provided an authentication feature? They could patch the generated resource themselves. We can structure our resource in the same way: separate out the auth modifications as a patch, which we apply or not as required.

It’s doable, but it turns out to be a bit trickier than expected. The problem is that containers is an array, and modifying an element of an array is not straightforward:

local base = import 'deployment.jsonnet';

local patch = {
  spec+: {
    template+: {
      spec+: {
        containers: [            // note
          super.containers[0] {  // here
            volumeMounts+: [
              {
                name: 'htpasswd',
                mountPath: '/auth',
              },
            ],
            env+: [
              {
                name: 'REGISTRY_AUTH_HTPASSWD_REALM',
                value: 'docker-registry-realm',
              },
              {
                name: 'REGISTRY_AUTH_HTPASSWD_PATH',
                value: '/auth/htpasswd',
              },
            ],
          },
        ],
        volumes+: [
          {
            name: 'htpasswd',
            secret: {
              secretName: 'registry-htpasswd',
            },
          },
        ],
      },
    },
  },
};

base + patch

We had to create a new containers array, where the first element is a modified instance of the first element of the old array. If there were other containers, we’d have to append them too. We’ve also hard-coded that the container we’re interested in is at index 0. This is messy and brittle.

We can instead use an array comprehension to select a container by name to patch, and pass through any others, which is better but still awkward:

        ...
        containers: [
          if container.name == 'docker-registry' then container {
            volumeMounts+: [
              {
                name: 'htpasswd',
                mountPath: '/auth',
              },
            ],
            env+: [
              {
                name: 'REGISTRY_AUTH_HTPASSWD_REALM',
                value: 'docker-registry-realm',
              },
              {
                name: 'REGISTRY_AUTH_HTPASSWD_PATH',
                value: '/auth/htpasswd',
              },
            ],
          }
          else container
          for container in super.containers
        ],
        ...

There are also arrays for env, volumeMounts and volumes. In this case they are easy to append to with +:. However in general if we wanted to change the value of any env variable, it would be similarly messy. We’d have to use an array comprehension to scan through the array and modify or replace the env variable(s) of interest, leaving the others unchanged.
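To make that concrete, here is a sketch of such a patch fragment, which rewrites the value of one variable and passes the rest through unchanged (the new value ':443' is purely illustrative):

```jsonnet
// inside a patch applied to the container object:
// rebuild env, rewriting only the entry we care about
env: [
  if e.name == 'REGISTRY_HTTP_ADDR' then e { value: ':443' } else e
  for e in super.env
],
```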

We would like a better approach for dealing with kubernetes collections, and the jsonnet kubernetes page helpfully points us in the right direction.

Objects are much easier to manipulate in jsonnet than arrays. And our collections all have entries with unique names (container name, env variable name, volume name etc). So the trick is to build a hidden object which contains these items as named fields, and then build arrays from them.

Here is a helper function which does the conversion¹:

{
  // convert {"foo":"bar"} to [{"name":"foo","value":"bar"}]
  // convert {"foo":{...}} to [{"name":"foo",...}]
  namedList(tab):: [
    (
      if std.isObject(tab[k])
      then tab[k]
      else { value: tab[k] }
    ) + { name: k }
    for k in std.objectFields(tab)
  ],
}

Store this as k8s.libsonnet. But what does it do? Test it with the following examples:

local k8s = import 'k8s.libsonnet';

k8s.namedList({
  foo: "bar",
  baz: "bap",
  qux: {
    configMapRef: {
      name: 'hello',
      key: 'world',
    },
  },
})

The output is:

[
  {
    "name": "baz",
    "value": "bap"
  },
  {
    "name": "foo",
    "value": "bar"
  },
  {
    "configMapRef": {
      "key": "world",
      "name": "hello"
    },
    "name": "qux"
  }
]

That’s how you’d do env variables. Here’s another one to try, simulating containers and volumes:

local k8s = import 'k8s.libsonnet';

k8s.namedList({
  'docker-registry': {
    volumes: k8s.namedList({
      htpasswd: {
        secret: {
          secretName: 'registry-htpasswd',
        },
      },
    }),
  },
})

In short: namedList converts an object with (field, value) pairs into an array of {name: <fieldname>, ...rest of value... } objects. How does it work? The outer part is an array comprehension:

  [
    some-expression
    for k in std.objectFields(tab)
  ],

This creates an array of objects, where the value of each one is some-expression. For each iteration, k is set to one of the (visible) field names from object tab. The inner expression is:

(
  if std.isObject(tab[k])
  then tab[k]
  else {
    value: tab[k],
  }
) + {
  name: k,
}

If the value is an object already then we just use its value as-is, otherwise we create a new object { value: v }. Then we add { name: k } to that object. Neat.

But kubernetes doesn’t want to see the original object, so we create it as a hidden field, and present the list form in a visible field. The final snippet for env looks like this:

...
env: k8s.namedList(self.envObj),
envObj:: {
  REGISTRY_HTTP_ADDR: ':5000',
  REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY: '/var/lib/registry',
},
...
...

Now envObj is easily patched, to add or modify any environment variable by name, and any changes in this (hidden) field are reflected in the generated env array.
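As a sketch of what that patching looks like (the ':443' value and the REGISTRY_LOG_LEVEL variable are just for illustration):

```jsonnet
local k8s = import 'k8s.libsonnet';

local container = {
  env: k8s.namedList(self.envObj),
  envObj:: {
    REGISTRY_HTTP_ADDR: ':5000',
  },
};

// modify one variable and add another, by name
container {
  envObj+:: {
    REGISTRY_HTTP_ADDR: ':443',
    REGISTRY_LOG_LEVEL: 'debug',
  },
}
```

Because env is computed from self.envObj, the generated array picks up both changes automatically.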

The array which namedList creates is ordered alphabetically by the names of the fields. Normally this doesn’t matter, but on rare occasions it does. If a pod has multiple initContainers then they are run in the order the array gives. If one environment variable depends on another, then they must be in the right order. This example fails when used in a podSpec:

local k8s = import 'k8s.libsonnet';

{
  env: k8s.namedList(self.envObj),
  envObj:: {
    MYSQL_HOST: 'mydb',
    DB_HOST: '$(MYSQL_HOST).localdomain',
  },
}

because the env elements generated are in the wrong order:

{
  "env": [
    {
      "name": "DB_HOST",
      // Fails to expand: MYSQL_HOST is defined later
      "value": "$(MYSQL_HOST).localdomain"
    },
    {
      "name": "MYSQL_HOST",
      "value": "mydb"
    }
  ]
}

If this matters, you’ll need to modify the approach.
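One possible workaround (just a sketch, not the only option) is to keep the order-sensitive variables in an explicit array, and use namedList only for the rest (MYSQL_PORT here is illustrative):

```jsonnet
// order-sensitive entries listed explicitly, in the order we need;
// everything else stays in the patchable envObj
env: [
  { name: 'MYSQL_HOST', value: 'mydb' },
  { name: 'DB_HOST', value: '$(MYSQL_HOST).localdomain' },
] + k8s.namedList(self.envObj),
envObj:: {
  MYSQL_PORT: '3306',
},
```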

We can now rewrite the deployment using envObj, volumeMountsObj etc. Then we can write a separate patch which applies all the changes to enable authentication; and we can make the result conditional on what the user requested. Here is the result in full:

local k8s = import 'k8s.libsonnet';

function(conf)

// the base deployment
local base = {
  apiVersion: 'apps/v1',
  kind: 'Deployment',
  metadata: {
    name: 'docker-registry',
    labels: {
      app: 'docker-registry',
    },
  },
  spec: {
    revisionHistoryLimit: 1,
    replicas: 1,
    selector: {
      matchLabels: {
        app: 'docker-registry',
      },
    },
    strategy: {
      type: 'Recreate',
    },
    template: {
      metadata: {
        labels: {
          app: 'docker-registry',
        },
      },
      spec: {
        shareProcessNamespace: true,
        containers: k8s.namedList(self.containersObj),
        containersObj:: {
          'docker-registry': {
            image: 'registry',
            ports: [
              {
                containerPort: 5000,
                protocol: 'TCP',
              },
            ],
            volumeMounts: k8s.namedList(self.volumeMountsObj),
            volumeMountsObj:: {
              storage: {
                mountPath: '/var/lib/registry',
              },
            },
            env: k8s.namedList(self.envObj),
            envObj:: {
              REGISTRY_HTTP_ADDR: ':5000',
              REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY: '/var/lib/registry',
            },
          },
        },
        volumes: k8s.namedList(self.volumesObj),
        volumesObj:: {
          storage: {
            persistentVolumeClaim: {
              claimName: 'registry-pvc',
            },
          },
        },
      },
    },
  },
};

// The patch to add authentication
local auth = {
  spec+: {
    template+: {
      spec+: {
        containersObj+: {
          'docker-registry'+: {
            volumeMountsObj+: {
              htpasswd: {
                mountPath: '/auth',
              },
            },
            envObj+: {
              REGISTRY_AUTH_HTPASSWD_REALM: 'docker-registry-realm',
              REGISTRY_AUTH_HTPASSWD_PATH: '/auth/htpasswd',
            },
          },
        },
        volumesObj+: {
          htpasswd: {
            secret: {
              secretName: 'registry-htpasswd',
            },
          },
        },
      },
    },
  },
};

// the final result
base + (if conf.auth then auth else {})

It might look complex, but it’s easy to develop and test: jsonnet generates good error messages. The structure means we have kept the authentication changes completely separate from the main deployment.

Once again, the caller is free to patch the returned object further. They can use containersObj, envObj, volumeMountsObj and volumesObj as extension points to modify if they wish.
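For example, assuming the function above is saved as registry-deployment.libsonnet (the filename and the REGISTRY_STORAGE_DELETE_ENABLED variable are just for illustration), a caller could write:

```jsonnet
local deployment = import 'registry-deployment.libsonnet';

// add an env variable to the registry container, by name,
// without touching anything else in the deployment
deployment({ auth: true }) {
  spec+: {
    template+: {
      spec+: {
        containersObj+: {
          'docker-registry'+: {
            envObj+: {
              REGISTRY_STORAGE_DELETE_ENABLED: 'true',
            },
          },
        },
      },
    },
  },
}
```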

Managing a collection of resources

Our application needs four resources: Deployment, Service, Ingress, PVC. We could put them into four jsonnet files, and import them separately, and there’s nothing wrong with that approach.

Alternatively, we could generate all four objects at once, wrapped inside an outer object:

function(conf)
{
  deployment: ...
  service: ...
  ingress: ...
  pvc: ...
}

This combined object isn’t something we can send to kubectl apply though.

What we can do is to turn this into an array of resources:

function(conf)
local objects = {
  deployment: ...
  service: ...
  ingress: ...
  pvc: ...
};
[
  objects.deployment,
  objects.service,
  objects.ingress,
  objects.pvc,
]

With the -y flag, jsonnet will format this as a stream of YAML documents, and that can be applied to kubernetes. Yay! (As long as the user remembers -y.)

But there’s an even better option. There is a little-known pseudo-resource type in kubernetes called List, where the items are individual resources. Sending this wrapper resource to kubectl apply is the same as applying the individual items. So we can write:

function(conf)
{
  // hidden definitions
  deployment:: ...
  service:: ...
  ingress:: ...
  pvc:: ...

  // generated output
  apiVersion: 'v1',
  kind: 'List',
  items: [
    self.deployment,
    self.service,
    self.ingress,
    self.pvc,
  ],
}

Now we’re really rocking. If you run this through jsonnet and supply the conf argument then the output can be directly applied to kubernetes. But the value returned can still be used and composed in different ways: e.g. you could pick out just one resource of interest, or you could apply patches to any or all of the individual resources.
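Assuming the snippet above is saved as registry.jsonnet (an illustrative filename), the end-to-end invocation might look like this, supplying conf as a top-level argument:

```
# evaluate the config and apply the resulting List resource
jsonnet --tla-code conf='{ auth: true }' registry.jsonnet | kubectl apply -f -
```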

Sometimes there are cross-cutting concerns. A common case is that you have a list of resources, and want to apply a namespace to all of them.

No problem: just write a function which iterates over the collection and updates the namespace. Add this to k8s.libsonnet:

{
  ...
  // apply namespace to all objects
  replaceNamespace(data, namespace):: data {
    items: [
      item { metadata+: { namespace: namespace } }
      for item in super.items
    ],
  },
}

Then apply it to the list of items generated:

local reg = import 'registry-resources.libsonnet';
local k8s = import 'k8s.libsonnet';

local conf = {
  auth: true,
};

k8s.replaceNamespace(reg(conf), namespace='registry')

Once you are comfortable with jsonnet, making this sort of function is much easier than hunting for a kustomize module that does exactly what you want. Furthermore, since jsonnet code has no side-effects, it’s safe to run anywhere.

Conclusion

The complete application configuration, with a small amount of additional refactoring, can be found on github. Please feel free to use, copy or draw whatever inspiration you like from this.

I really like this approach. Writing jsonnet code to make new resources, and writing jsonnet code to patch someone else’s resources, uses the same skill set. And as a way of templating configurations, writing real code in jsonnet is much more robust than processing YAML files with a text templating engine.

Epilogue

If you find jsonnet doesn’t work well for you then take a look at CUE instead.

¹ It’s based on two of the helper functions from the jsonnet kubernetes page, pairList and namedObjectList, simplified and merged. My version works with env lists which mix plain string values with structured values like configMapRef or secretRef.