CI/CD: Automation of Everything

A great example of a complete process from push-to-git to deploy-in-production-via-canary: https://youtu.be/XNXJtxkUKeY

I highly recommend Viktor’s YouTube channel if you want to see what exists in the DevOps space: he tries out most of the available tools, at least briefly, and can therefore compare them well and (relatively) objectively.

Kustomize vs Helm

While Helm and Kustomize are often treated as alternatives, I do not see them that way:

Kustomize

Kustomize allows your K8S YAML files to have “variations”: you create a base configuration (YAML files) and layer modifications on top of it, for example a DEV and a QA variation. DEV might use a different image, a different port, and a different namespace, but otherwise match QA. Variations can have further variations too. Nice. Elegant. To install the DEV variation:

kubectl apply -k dev
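
To make the base-plus-variation idea concrete, here is a minimal sketch of such a layout, written as a script that creates the files. Only the directory name dev comes from the command above; the kustomize-demo directory, the app name hello, the nginx image and its tags, and the namespace dev are made up for illustration.

```shell
#!/bin/sh
# Sketch: a Kustomize base plus a DEV variation.
# All names (kustomize-demo, hello, nginx tags, namespace "dev") are illustrative.
set -eu

root="${KUSTOMIZE_DEMO_DIR:-kustomize-demo}"
mkdir -p "$root/base" "$root/dev"

# The base: a plain Deployment plus a kustomization.yaml that lists it.
cat > "$root/base/deployment.yaml" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello
  template:
    metadata:
      labels:
        app: hello
    spec:
      containers:
      - name: hello
        image: nginx:1.25
        ports:
        - containerPort: 80
EOF

cat > "$root/base/kustomization.yaml" <<'EOF'
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- deployment.yaml
EOF

# The DEV variation: reuse the base, change only the namespace and the image tag.
cat > "$root/dev/kustomization.yaml" <<'EOF'
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../base
namespace: dev
images:
- name: nginx
  newTag: "1.25-alpine"
EOF

echo "wrote $root/base and $root/dev"
```

From inside kustomize-demo, kubectl kustomize dev prints the rendered YAML without touching the cluster, and kubectl apply -k dev installs it (the dev namespace has to exist already). A QA variation would be one more directory with its own kustomization.yaml pointing at ../base.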

What I like about it is that it is very close to how we did OS configuration at work about 15 years ago: a base (global) configuration, then a directory for the region, then one per country, and one per data center. Differences between DEV and QA were handled in those scripts too, at the earliest possible layer. It worked then, was immediately understood by everyone (KISS), and was robust and extensible.

It’s also built into kubectl.

Helm

Helm is more similar to a package manager like apt or yum: you just ask for “I’d like WordPress to be installed” and it gets all its dependencies and installs them.

Creating a Helm chart for your own purposes is not easy though, and it seems to me to be overkill for simple projects unless you plan to distribute them as a simple-to-install software package. Microservices are my main target, and they tend to stay simple.

Which one to use?

If you are a developer, Kustomize makes much more sense to me. If you want to distribute a non-trivial K8S application which possibly has dependencies, then Helm does that well.

That said, both still force you to understand and spell out all the K8S resources the application needs. One can hope that the Open Application Model (OAM) will address this.

Metacontroller!

Controllers are great when they work and when it’s easy enough to create them. But they are not trivial at all, although several attempts are in progress to simplify this, among them kubebuilder, the Operator Framework, and Metacontroller.

I tried Metacontroller as it lets you create controllers in your language of choice, and its list of examples looks sufficient to handle plenty of typical use cases.

And it works. 2 weeks ago it did not. No idea if it was my K8S cluster or something else, but now it’s working out-of-the-box. kubebuilder and the Operator Framework need way more Go skills and time.

Deleting stuck namespaces in K8S

Somehow I get namespaces which are in “Terminating” state forever:

❯ kubectl get ns
NAME                        STATUS        AGE
default                     Active        22d
ingress-nginx               Active        11d
kube-node-lease             Active        22d
kube-public                 Active        22d
kube-system                 Active        22d
memcached-operator-system   Terminating   37m
olm                         Terminating   69m
operators                   Active        69m

Root cause is the finalizers which…don’t finalize. No idea yet why. Until then, this is how to delete those never-terminating namespaces:

❯ NS="olm"
❯ kubectl get namespace $NS -o json > $NS.json
# Edit $NS.json and delete items in spec.finalizers
❯ more $NS.json
{
    "apiVersion": "v1",
    "kind": "Namespace",
    "metadata": {
        "creationTimestamp": "2021-06-21T09:08:48Z",
        "deletionTimestamp": "2021-06-21T09:28:32Z",
        "name": "olm",
        "resourceVersion": "2779704",
        "uid": "6bf2112a-85bc-42c0-b17f-cf9010f7dab7"
    },
    "spec": {
        "finalizers": []
    },
    "status": {
...
❯ kubectl replace --raw "/api/v1/namespaces/$NS/finalize" -f ./$NS.json
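
The manual edit can be skipped by letting jq empty the list in a pipe. This is a sketch of the same steps as above and nothing more; the function names force_finalize_ns and list_ns_leftovers are made up, and both assume kubectl (and, for the first, jq) on the PATH. As with the manual version, this bypasses whatever the finalizer was waiting for, so anything it was supposed to clean up may be left behind.

```shell
# Sketch: clear spec.finalizers of a namespace stuck in "Terminating".
# force_finalize_ns is a made-up helper name.
force_finalize_ns() {
  ns="$1"
  # Export the namespace, empty the finalizer list, and send the result
  # to the namespace's finalize endpoint (same call as the manual steps).
  kubectl get namespace "$ns" -o json \
    | jq '.spec.finalizers = []' \
    | kubectl replace --raw "/api/v1/namespaces/$ns/finalize" -f -
}

# Sketch: list what is still left inside the namespace, which may hint
# at why the finalizers never complete.
list_ns_leftovers() {
  ns="$1"
  kubectl api-resources --verbs=list --namespaced -o name \
    | xargs -n 1 kubectl get --show-kind --ignore-not-found -n "$ns"
}
```

Usage would be list_ns_leftovers olm to look first, then force_finalize_ns olm if nothing in there matters.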

k3s – local persistent storage

When using k3s with the built-in local persistent storage provider, once in a while you have to edit files on a volume. Usually that works from inside the container, but sometimes you have to replace a 150kB binary file, and since containers usually don’t have scp installed, there’s a problem…

The fix is to modify the storage from outside the container. How to do that depends on the persistent storage provider: if it’s NFS, mount it via NFS from another machine; if it’s an S3 bucket, edit it directly; and so on.

k3s has a local persistent storage driver called “local-path”. But where are those files, so I can replace one of them? Turns out they are in /var/lib/rancher/k3s/storage/ on a node. But which node, and which directory inside storage/?

Finding Your PVC

To find the PVC named “grafana-lib”, run

❯ kubectl describe persistentVolumeClaim grafana-lib
Name:          grafana-lib
Namespace:     default
StorageClass:  local-path
Status:        Bound
Volume:        pvc-a89cee51-0000-47d7-a095-2d48400768e3
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: rancher.io/local-path
               volume.kubernetes.io/selected-node: knode5
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      3Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Mounted By:    grafana-deployment-669fc6d658-l78z7
Events:        <none>

The Volume field and the selected-node annotation together show where it is: knode5:/var/lib/rancher/k3s/storage/pvc-a89cee51…
A bit of jq magic gets you a complete list of all PVCs:

❯ kubectl get persistentVolumeClaims -o json | jq '[.items[] | { "Name": .metadata.name, "Volume": .spec.volumeName, "Node": .metadata.annotations."volume.kubernetes.io/selected-node" }]'
[
  {
    "Name": "influxdb-data",
    "Volume": "pvc-bbac2312-0000-450e-aee1-41a0d5517adb",
    "Node": "knode6"
  },
  {
    "Name": "grafana-log",
    "Volume": "pvc-22814c7b-0000-4b8b-99b6-ab4a4ca6c65c",
    "Node": "knode5"
  },
  {
    "Name": "grafana-lib",
    "Volume": "pvc-a89cee51-0000-47d7-a095-2d48400768e3",
    "Node": "knode5"
  }
]
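
Putting both lookups together, a small helper can print node and directory for a single PVC. A sketch under assumptions: the name pvc_location is made up, and instead of guessing the directory name under storage/ it reads the directory from the PersistentVolume itself (spec.hostPath.path or spec.local.path, whichever the provisioner filled in).

```shell
# Sketch: print "<node>:<directory>" for a PVC backed by local-path.
# pvc_location is a made-up helper name; the namespace defaults to "default".
pvc_location() {
  pvc="$1"
  ns="${2:-default}"
  # Name of the PersistentVolume bound to the claim.
  vol=$(kubectl get pvc "$pvc" -n "$ns" -o jsonpath='{.spec.volumeName}')
  # Node chosen for the volume (dots in the annotation key must be escaped).
  node=$(kubectl get pvc "$pvc" -n "$ns" \
    -o jsonpath='{.metadata.annotations.volume\.kubernetes\.io/selected-node}')
  # Directory on that node; only one of the two fields is set on a given PV.
  dir=$(kubectl get pv "$vol" -o jsonpath='{.spec.hostPath.path}{.spec.local.path}')
  printf '%s:%s\n' "$node" "$dir"
}
```

With the example above, pvc_location grafana-lib would print knode5 followed by the volume’s directory. From there the file can be replaced on the node itself, for example by copying it to knode5 with scp.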

Installing HashiCorp’s Vault


I am trying to use Vault at work to keep secrets in. Since I don’t know much about it yet, I want to test it at home first.

Installing on k8s seemed most sensible since I already have a 3-node k3s cluster. k3s supports Helm v3 and Vault can be installed via a Helm v3 chart, so that’s what I did.

Installing

See https://www.vaultproject.io/docs/platform/k8s/helm.html, but in short:

$ helm repo add hashicorp https://helm.releases.hashicorp.com
"hashicorp" has been added to your repositories
$ helm install vault hashicorp/vault
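
Before (or after) installing, it can be useful to see which K8S resources the chart actually creates. A sketch: chart_kinds is a made-up helper name; it renders the chart locally with helm template and counts the resource kinds, without installing anything.

```shell
# Sketch: count the resource kinds a chart would create, without installing it.
# chart_kinds is a made-up helper name.
chart_kinds() {
  release="$1"
  chart="$2"
  # Render the chart locally, keep only the top-level "kind:" lines,
  # and count how often each kind occurs.
  helm template "$release" "$chart" \
    | grep '^kind:' \
    | sort \
    | uniq -c
}
```

For the install above that would be chart_kinds vault hashicorp/vault.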

Initialize and Unseal

Initialization is a one-time action. Unsealing is needed every time Vault restarts. See https://learn.hashicorp.com/tutorials/vault/kubernetes-minikube. Short summary:

# Initialize and get the keys
$ kubectl exec vault-0 -- vault operator init -key-shares=1 -key-threshold=1 -format=json > cluster-keys.json

# Unseal
$ VAULT_UNSEAL_KEY=$(jq -r ".unseal_keys_b64[]" cluster-keys.json)
$ kubectl exec vault-0 -- vault operator unseal "$VAULT_UNSEAL_KEY"

# Show pod status
$ kubectl get pods -l app.kubernetes.io/name=vault
NAME      READY   STATUS    RESTARTS   AGE
vault-0   1/1     Running   2          16h

# Show Vault status
$ kubectl exec vault-0 -- vault status
Key             Value
---             -----
Seal Type       shamir
Initialized     true
Sealed          false
Total Shares    1
Threshold       1
Version         1.5.2
Cluster Name    vault-cluster-f6c361da
Cluster ID      a757fd57-3032-59ec-d03a-4ad0556536ea
HA Enabled      false

The important part: the Vault pod is running and the status shows Sealed as false.
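
Because unsealing has to be repeated after every restart, it is worth wrapping. A sketch, assuming the cluster-keys.json file written above, the pod name vault-0, and jq; the function names are made up. It feeds the keys one by one, which also covers setups with more than one key share, and stops as soon as Vault reports itself unsealed. Keep in mind that cluster-keys.json holds the unseal key and the root token, so it is fine for a home test but not something to leave lying around.

```shell
# Sketch: unseal Vault using the keys stored in cluster-keys.json.
# vault_is_sealed and vault_unseal are made-up helper names.
vault_is_sealed() {
  # "vault status" exits non-zero while sealed, so ignore the exit code
  # and look at the JSON field instead. Anything but "false" counts as sealed.
  sealed=$(kubectl exec vault-0 -- vault status -format=json 2>/dev/null | jq -r '.sealed')
  [ "$sealed" != "false" ]
}

vault_unseal() {
  keys_file="${1:-cluster-keys.json}"
  # Base64 keys contain no whitespace, so plain word splitting is safe here.
  for key in $(jq -r '.unseal_keys_b64[]' "$keys_file"); do
    vault_is_sealed || break
    kubectl exec vault-0 -- vault operator unseal "$key" >/dev/null
  done
  # Succeed only if Vault ended up unsealed.
  ! vault_is_sealed
}
```

After a restart, vault_unseal && echo "unsealed" is then all that is needed.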

k3s – Half Size k8s

Followed https://rancher.com/docs/k3s/latest/en/ and it’s great. Took me a while to be able to access the dashboard though.

How to connect to the dashboard

What made this part work:

How to connect the normal way

To connect the “normal” way via https://giga.lan:8443: after installing the dashboard (see https://rancher.com/docs/k3s/latest/en/installation/kube-dashboard/), the trick was to expose it via a LoadBalancer service:

❯ kubectl expose deployment kubernetes-dashboard --type=LoadBalancer --name=dash --namespace=kubernetes-dashboard
❯ kubectl get services --namespace=kubernetes-dashboard

NAME                        TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)          AGE
kubernetes-dashboard        ClusterIP      10.43.193.11    <none>          443/TCP          22h
dashboard-metrics-scraper   ClusterIP      10.43.122.142   <none>          8000/TCP         22h
dash                        LoadBalancer   10.43.164.78    192.168.21.39   8443:31610/TCP   3m36s

Instead of the expose command you can also do:

❯ cat dashboard-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: dash
  namespace: kubernetes-dashboard
spec:
  ports:
  - protocol: TCP
    port: 8443
    targetPort: 8443
  selector:
    k8s-app: kubernetes-dashboard
  type: LoadBalancer
❯ kubectl apply -f dashboard-service.yaml

And now I can access the dashboard from anywhere via https://giga.lan:8443
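
To avoid reading the EXTERNAL-IP column by hand, the address can be pulled out with a JSONPath query. A sketch: dash_url is a made-up helper name, and it assumes the load balancer reports an IP address rather than a hostname.

```shell
# Sketch: build the dashboard URL from the "dash" service created above.
# dash_url is a made-up helper name.
dash_url() {
  # External IP assigned by the load balancer.
  ip=$(kubectl get service dash --namespace=kubernetes-dashboard \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  # Port the service listens on (8443 in the example above).
  port=$(kubectl get service dash --namespace=kubernetes-dashboard \
    -o jsonpath='{.spec.ports[0].port}')
  printf 'https://%s:%s\n' "$ip" "$port"
}
```

A quick reachability check is then curl -k "$(dash_url)"; the -k skips certificate verification, which is needed if the dashboard serves a self-signed certificate.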