Plume Deployment

On Kubernetes

Well, I've previously made posts about cross-building with podman & buildah, and setting up a kubernetes cluster. This will cover getting Plume up and running.

Building

Over on my gitea, I have a plume repository with a custom script called build.sh. This script does a number of things:

  • Checks if docker is set up for cross-building, and enables cross-building if the check fails
  • Builds a custom cross-compile container with plume's build dependencies such as OpenSSL, libpq, and gettext.
  • Uses the cross tool to build aarch64-unknown-linux-gnu versions of plume and plm (this uses the custom container built in the previous step via the Cross.toml file)
  • Builds a plume image from a dockerfile
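The cross step above is driven by the Cross.toml file, which tells cross which container image to use for each target. A minimal sketch of what that file might look like (the image name here is an assumption, standing in for the custom container built in the second step):

```toml
# Cross.toml (sketch): point the aarch64 target at the custom build container
[target.aarch64-unknown-linux-gnu]
image = "asonix/plume-cross:latest"
```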

That last step is fairly complex. If we take a look at the dockerfile, we can see that there are two steps involved.

  • Build plume-front in an amd64 container. This is because it produces WebAssembly, so building on my computer's native architecture is much faster
  • Copy all the built packages into an arm64v8 ubuntu 18.04 container with a couple runtime dependencies installed
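The two steps above can be sketched as a multi-stage dockerfile. This is only an illustration of the shape of it; the stage images, package names, and paths are assumptions, and the real file lives in the repository:

```dockerfile
# Stage 1: build plume-front (WebAssembly) on amd64, where it's fast
FROM rust:1 AS front
WORKDIR /opt/plume
COPY . .
RUN cargo install wasm-pack && wasm-pack build --target web plume-front

# Stage 2: assemble the arm64v8 runtime image with the cross-built binaries
FROM arm64v8/ubuntu:18.04
RUN apt-get update && apt-get install -y libpq5 gettext \
    && rm -rf /var/lib/apt/lists/*
COPY target/aarch64-unknown-linux-gnu/release/plume /usr/local/bin/plume
COPY target/aarch64-unknown-linux-gnu/release/plm /usr/local/bin/plm
COPY --from=front /opt/plume/static /opt/plume/static
WORKDIR /opt/plume
CMD ["plume"]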

The build script pushes the finished containers to my dockerhub: one repository for the cross-building image, and one for the final product.

How I build plume:

$ ./build.sh r7 0.4.0-alpha-4
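For reference, here's a hypothetical sketch of how those two arguments (a revision and a Plume version) could map onto the final image tag; the naming scheme matches the tag used in the deployment configs later in this post:

```shell
# Hypothetical tag assembly: build.sh takes a revision and a version
rev="r7"
version="0.4.0-alpha-4"
tag="asonix/plume:${version}-${rev}-arm64v8"
echo "$tag"
```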

Deploying

I don't run my Postgres database on Kubernetes. With my setup, it would be far too slow to have a network-mounted drive for postgres to run on top of. For this example, I'm assuming a Postgres database is running somewhere and accessible via an IP Address.

Once the container is ready, we can get over to our kubernetes host.

Dependencies

Firstly, we need to enable DNS for our kubernetes cluster, so services can see each other without doing Advanced IP Address Manipulation.

$ microk8s.enable dns

In the first kubernetes post, we set up some glusterfs storage. We'll need to use that as Plume's media storage backend. It's time to write our first Kubernetes Configuration File.

# plume-deps.yml
apiVersion: v1
kind: Namespace
metadata:
  name: plume
---
apiVersion: v1
kind: Endpoints
metadata:
  name: storage
  namespace: plume
subsets:
- addresses:
  - ip: 192.168.6.32
  ports:
  - port: 1
---
apiVersion: v1
kind: Service
metadata:
  name: storage
  namespace: plume
spec:
  ports:
  - port: 1

So what have we done here?

  1. First, we've defined a Namespace that we'll be creating our configurations in. This is so we can ask Kubernetes specifically about our Plume information without getting other info crowding our response.
  2. Next, we've defined an Endpoints configuration that references the IP Address of one of our GlusterFS nodes. For GlusterFS specifically, the port on the Endpoints entry doesn't matter, so we set it to 1.
  3. Finally, we create a Service. Services, when combined with DNS, act as routers in Kubernetes. In this example, we create a service called storage that references port 1, which is what our endpoint is "listening on".

Now we have a storage backend for our Plume container, but Plume also relies on a Postgres database. How do we configure that? It turns out this is a very similar process.

# plume-deps.yml
---
apiVersion: v1
kind: Endpoints
metadata:
  name: postgres
  namespace: plume
subsets:
- addresses:
  - ip: 192.168.6.15
  ports:
  - port: 5432
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: postgres
  name: postgres
  namespace: plume
spec:
  ports:
  - port: 5432

Once again, we've created an Endpoints configuration and a Service configuration. This time, the port does matter, so we tell it to use port 5432, which is postgres' default listening port. We also applied a label to our service this time. This is just for easier querying in the future, and isn't required.
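As an aside, the DNS names these Services get follow a fixed convention: `<service>.<namespace>.svc.<cluster-domain>`. A small sketch, assuming the default cluster.local cluster domain:

```shell
# Build the fully-qualified DNS name for our postgres Service
service="postgres"
namespace="plume"
fqdn="${service}.${namespace}.svc.cluster.local"
echo "$fqdn"
```

Pods in the same namespace can use the short name (just `postgres`); the fully-qualified form works from anywhere in the cluster.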

Configuration

Now we can start configuring plume. By reading through the Plume docs, we can pick out some environment variables that we'll need to set.

Public configuration

  • BASE_URL: the URL of the plume server we'll be running

Secret Configuration

  • DATABASE_URL: how Plume talks to postgres
  • ROCKET_SECRET_KEY: a key used for crypto

I've separated the environment variables into two sections because some contain potentially private information, and some do not.

Let's create our secrets first:

# plume-secrets.yml
apiVersion: v1
kind: Secret
metadata:
  name: plume-secrets
  namespace: plume
type: Opaque
stringData:
  DATABASE_URL: postgres://plume:<password>@postgres:5432/plume
  ROCKET_SECRET_KEY: This key can be generated by running `openssl rand -base64 32` on a system with openssl installed

We've created a configuration defining our secret environment variables as a Secret kind. This lets kubernetes know to shield these values from inspection. Notice that in the DATABASE_URL, we referenced our postgres host as just postgres. This will use Kubernetes' built-in DNS function to look up the service called postgres that we created, and route traffic to the external postgres installation defined in our endpoints.
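The ROCKET_SECRET_KEY placeholder above mentions openssl; generating a value looks like this. As a sanity check, 32 random bytes base64-encode to a 44-character string:

```shell
# Generate a value suitable for ROCKET_SECRET_KEY: 32 random bytes, base64-encoded
KEY=$(openssl rand -base64 32)
echo "${#KEY}"   # length of the encoded key
```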

Now, let's set the rest of our values:

# plume-config.yml
apiVersion: v1
kind: ConfigMap
metadata:
  name: plume-config
  namespace: plume
data:
  BASE_URL: blog.asonix.dog

Migrations

Since Plume relies on Postgres, it provides its own migrations to keep the database schema up-to-date. Before Plume will run properly, the database must be migrated.

# plume-migrate-0.4.0.yml
apiVersion: batch/v1
kind: Job
metadata:
  name: plume-migrate-0.4.0
  namespace: plume
spec:
  backoffLimit: 2
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: plume-migrate-0.4.0
        image: asonix/plume:0.4.0-alpha-4-r7-arm64v8
        command: ["plm"]
        args: ["migration", "run"]
        envFrom:
        - configMapRef:
            name: plume-config
        - secretRef:
            name: plume-secrets

So here, we define a Job that runs plm migration run in the container we built earlier. It loads environment variables from the ConfigMap and Secret that we defined earlier. It has a backoffLimit of 2, meaning it will try two additional times if it fails.

Since migrations need to happen before plume can run, let's go ahead and apply this configuration.

$ kubectl apply \
    -f plume-deps.yml \
    -f plume-secrets.yml \
    -f plume-config.yml \
    -f plume-migrate-0.4.0.yml

The Plume Container

While our migration runs, and with all our dependencies out of the way, we can go ahead and get Plume itself running.

# plume.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: plume
  namespace: plume
spec:
  replicas: 1
  selector:
    matchLabels:
      app: plume
  template:
    metadata:
      labels:
        app: plume
    spec:
      containers:
      - name: plume
        image: asonix/plume:0.4.0-alpha-4-r7-arm64v8
        command: ["sh"]
        args: ["-c", "plm search init && plume"]
        ports:
        - name: web
          containerPort: 7878
        livenessProbe:
          httpGet:
            path: /
            port: 7878
          timeoutSeconds: 10
          initialDelaySeconds: 10
          periodSeconds: 60
        envFrom:
        - configMapRef:
            name: plume-config
        - secretRef:
            name: plume-secrets
        volumeMounts:
        - name: data
          mountPath: /opt/plume/static/media
          readOnly: false
        resources:
          requests:
            memory: 100Mi
          limits:
            memory: 150Mi
      volumes:
      - name: data
        glusterfs:
          endpoints: storage
          path: plume # use the `plume` glusterfs volume

That's a lot of config, but what it boils down to is

  • Deploy an app called plume
  • Use the container we built earlier
  • Run plm search init && plume when the container starts
  • Mount our GlusterFS storage at /opt/plume/static/media
  • Check every 60 seconds to see if plume's / path is responding okay
  • Use our configurations we made earlier as Environment Variables
  • Expect Plume to use 100MiB of RAM
  • Kill (and restart) Plume if it uses more than 150MiB of RAM

The Plume service

There's one last step needed to make Plume easily accessible within Kubernetes, and that's creating a Service for it.

# plume.yml
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: plume
  name: plume
  namespace: plume
spec:
  type: ClusterIP
  selector:
    app: plume
  ports:
  - port: 7878
    targetPort: 7878
    name: web

This Service creates a DNS entry in kubernetes called plume that we can use to reference the running container. It specifies that the app it delegates to is called plume and that it exposes the port 7878.

Now that we've written everything out, let's get it running.

$ kubectl apply -f plume.yml

While we wait for the deployment to happen, we can see what kubernetes is up to.

$ watch microk8s.kubectl get pods -n plume -o wide

We can keep track of which "pods" are running in Kubernetes. In our case, we're looking for a single pod containing a single container called plume. Specifying -n plume shows us only the pods in the plume namespace that we defined, and specifying -o wide will show us more information about the pod, such as which node it ended up on.

We can take a look at our jobs, deployments, and services, as well

$ kubectl get jobs -n plume
$ kubectl get deployments -n plume
$ kubectl get svc -n plume

If our pod fails to spin up for any reason, we can take a look at why with a couple commands.

If the pod fails to spin up at all, we can describe it. First, get the name of the pod from kubectl get pods -n plume, then run the following command with the pod's name:

$ kubectl describe pods -n plume plume-588b98658f-s2cmg

If the pod finished starting, but failed for other reasons (bad configuration, poorly-written application, etc), we can check the logs.

$ kubectl logs -n plume -lapp=plume

Here, we've asked for logs for all pods in the plume namespace with the label app: plume.

If you've read through this post and thought to yourself, "Why would I write all this and more just to run a single container?", it's because Kubernetes does a lot more than tools like docker-compose. In my previous post on kubernetes, I attached many nodes to my cluster. By writing out all this configuration, we've told kubernetes to "pick one node and put plume on it." Plume is accessible from any node, though, since kubernetes has created an overlay network spanning all connected nodes. Any connected node can reference plume as 'plume' through the DNS addon. If the active plume container fails a health check (the livenessProbe), or if it crashes for any reason, kubernetes will spin up a new container running Plume. It might come up on the same node it was previously running on, or kubernetes could elect to put it on a new node.

Part of how Kubernetes decides where to put a container is based on the resources configuration. I told kubernetes that I expect Plume to use 100MiB, and that it should never use more than 150MiB. Kubernetes will take a look at all connected nodes, see how much RAM is free on each one, and put Plume on a node that has enough space for it. This is incredibly important on my network, where each of my Raspberry Pis has at most 2GB of RAM. By enabling me to define RAM constraints for each container I run, Kubernetes can be smarter about which nodes run which services.

Further, I prefer deploying with Kubernetes because it allows me to centralize & back up all of my configurations. Only one node in the kubernetes cluster needs to act as the 'master node', and the rest of the nodes act as nothing more than extra hardware that kubernetes can deploy services to.

My current backup strategy is actually terrible. When I make a change to a config and apply it, I tar up my config folder and scp it to my personal computer. It works well enough that I've been able to tear down my entire cluster and build it back up in under an hour, but it's not automated, and it's not professional.

In a future post, I'll detail exposing Plume to the world, since right now it can only be accessed from inside kubernetes.