How I successfully did a major upgrade of Apache NiFi using a deprecated Helm Chart.

At $DAYJOB, we’ve been running Apache NiFi on Kubernetes using this Helm Chart from the fine folks at CETIC (thanks!).
Unfortunately, the Helm Chart was deprecated last year, but that was not an issue for us because we had no requirement to upgrade NiFi. Besides, the new major version of NiFi brings some breaking changes, which would require developer time to handle.

This week, however, I was asked to upgrade NiFi to the latest version (2.2.0 at the time of writing). While there are some newer alternatives (notably NiFiKop), I was asked to stick with the current setup.

NOTE: this is not supported by any means.
If I were setting up NiFi from scratch today, I would not use this approach.

The problem

These are the 2 main issues I faced:

  1. One of the main changes in NiFi 2 is that the XML flow configuration (flow.xml.gz) is no longer supported; the flow is now stored as a JSON Flow Definition (flow.json.gz). This means that the data in the flow.xml.gz file is not compatible with the new version, and the application will break horribly as soon as it tries to parse an XML file with a JSON parser.

  2. The NiFi TLS Toolkit has been removed, which means that the Helm Chart’s ca deployment does not work anymore.

The solution

1. Flow Definition

We have a particular way of managing the flow definition, so we don’t need to worry about the flow file at startup (basically, a different service will drop it and recreate the correct flow later). I just need to start NiFi with an empty flow without it blowing up in my face.
To do this, I tried running NiFi 2.2.0 using Docker and exporting the empty flow from the UI, but NiFi did not import the resulting JSON correctly. What did work was copying the flow.json.gz file from the conf directory inside the Docker container and decompressing it (it is a gzip file). There are some small differences between the two files, but I honestly didn’t have time to investigate further.
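The copy-and-decompress step could look roughly like the sketch below. The container name nifi-tmp, the function names, and the fixed wait time are made up for illustration, and the conf path assumes the official apache/nifi image layout, so check it against the image you actually run.

```shell
# Sketch only: container name, function names, wait time and conf path are assumptions.

# Start a throwaway NiFi container and copy out the flow it writes on first start.
fetch_empty_flow() {
  docker run -d --name nifi-tmp "${1:-apache/nifi:2.2.0}" || return 1
  # Give NiFi time to start and write its initial flow (adjust as needed).
  sleep 120
  docker cp nifi-tmp:/opt/nifi/nifi-current/conf/flow.json.gz ./flow.json.gz
  status=$?
  # Always clean up the temporary container, then report whether the copy worked.
  docker rm -f nifi-tmp
  return "$status"
}

# Decompress the gzip file into plain JSON, leaving the original .gz untouched.
decompress_flow() {
  gunzip -c "${1:-flow.json.gz}" > "${2:-flow.json}"
}
```

Running fetch_empty_flow followed by decompress_flow should leave a flow.json in the current directory, ready to be pasted into a ConfigMap.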
Then, I created a ConfigMap with the JSON flow definition. Here it is:

# basic-flow-configmap.yaml
apiVersion: v1
kind: ConfigMap
data:
  basic-flow.json: |
    {
      "encodingVersion": {
        "majorVersion": 2,
        "minorVersion": 0
      },
      "maxTimerDrivenThreadCount": 10,
      "maxEventDrivenThreadCount": 10,
      "registries": [],
      "parameterContexts": [],
      "parameterProviders": [],
      "controllerServices": [],
      "reportingTasks": [],
      "flowAnalysisRules": [],
      "rootGroup": {
        "identifier": "dddb4465-d40a-3093-a2bd-61fde52db64b",
        "instanceIdentifier": "23818972-0195-1000-4d89-9c091c449d93",
        "name": "NiFi Flow",
        "comments": "",
        "position": {
          "x": 0,
          "y": 0
        },
        "processGroups": [],
        "remoteProcessGroups": [],
        "processors": [],
        "inputPorts": [],
        "outputPorts": [],
        "connections": [],
        "labels": [],
        "funnels": [],
        "controllerServices": [],
        "defaultFlowFileExpiration": "0 sec",
        "defaultBackPressureObjectThreshold": 10000,
        "defaultBackPressureDataSizeThreshold": "1 GB",
        "scheduledState": "ENABLED",
        "executionEngine": "INHERITED",
        "maxConcurrentTasks": 1,
        "statelessFlowTimeout": "1 min",
        "flowFileConcurrency": "UNBOUNDED",
        "flowFileOutboundPolicy": "STREAM_WHEN_AVAILABLE",
        "componentType": "PROCESS_GROUP"
      }
    }
metadata:
  name: basic-flow-json
  namespace: my-namespace

Then, I just added the following to our installation script:

kubectl apply -f /path/to/basic-flow-configmap.yaml -n my-namespace

The NiFi Helm Chart mounts this ConfigMap into the NiFi container (see the configmaps and customFlow values in the Helmfile section), so NiFi starts from the JSON Flow Definition instead of the flow.xml.gz file.
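If you would rather not paste the JSON into the YAML by hand, the ConfigMap can be generated from the extracted file. This is a minimal sketch: the function name render_flow_configmap is hypothetical, and it assumes a decompressed flow.json is available locally.

```shell
# Sketch only: render_flow_configmap is a hypothetical helper, not part of the chart.
# Prints a ConfigMap manifest that embeds the given flow JSON under the key basic-flow.json.
render_flow_configmap() {
  flow_json="$1"
  name="${2:-basic-flow-json}"
  namespace="${3:-my-namespace}"
  cat <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: ${name}
  namespace: ${namespace}
data:
  basic-flow.json: |
EOF
  # Indent every JSON line by four spaces so it sits inside the YAML block scalar.
  sed 's/^/    /' "$flow_json"
}
```

For example, render_flow_configmap flow.json > basic-flow-configmap.yaml produces a file that can be applied with the kubectl command above. kubectl create configmap with --from-file, --dry-run=client and -o yaml is another way to get a similar manifest.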

2. TLS

NiFi needs a JKS keystore to work, and the Helm Chart used to create it using the NiFi TLS Toolkit. Luckily, we can use the cert-manager sidecar container to create it for us.
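To check that a keystore was actually generated, something like the sketch below may help. It assumes the keystore ends up in a Kubernetes Secret under the key keystore.jks, which is where cert-manager puts JKS keystores it creates. The function name is hypothetical, and the Secret name and keystore password depend on your setup, so they are passed in as arguments.

```shell
# Sketch only: inspect_nifi_keystore is a hypothetical helper.
# Arguments: Secret name, namespace, keystore password (all specific to your setup).
inspect_nifi_keystore() {
  secret="$1"
  namespace="$2"
  storepass="$3"
  # Secret data is base64-encoded; the dot in the key name must be escaped in jsonpath.
  kubectl get secret "$secret" -n "$namespace" -o jsonpath='{.data.keystore\.jks}' \
    | base64 -d > keystore.jks || return 1
  # List the entries in the keystore to confirm it can be opened with the password.
  keytool -list -keystore keystore.jks -storepass "$storepass"
}
```

keytool ships with the JDK, so this needs a local Java installation.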

Helmfile

We use Helmfile, so I had to make some changes to helmfile-nifi.yaml. These are the relevant parts of the values.yaml and helmfile-nifi.yaml files:

# values.yaml
domain: my-domain.com
versions:
  nifi:
    chartVersion: "1.2.1" # This is the version of the NiFi Helm Chart, which stayed the same
    imageVersion: "2.2.0" # This is the version of the NiFi image, which I had to change from 1.x.y

# helmfile-nifi.yaml
repositories:
  - name: bitnami
    url: "https://charts.bitnami.com/bitnami"
  - name: cetic
    url: "https://cetic.github.io/helm-charts"
---
releases:
  - name: nifi
    namespace: my-namespace
    chart: cetic/nifi
    version: {{ .Values.versions.nifi.chartVersion | quote }} # specify the version of the CETIC NiFi Helm Chart
    needs:
      - cert-manager/cert-manager
    values:
      - ca:
          enabled: false # disable the ca section, which used the deprecated TLS Toolkit
        certManager:
          enabled: true # enable the cert-manager sidecar container
          clusterDomain: {{ .Values.domain }}
          additionalDNSNames:
            - "nifi.{{ .Values.domain }}"
        configmaps:
          - name: basic-flow-json # mount the ConfigMap with the JSON Flow Definition
            keys:
              - basic-flow.json # specify the key in the ConfigMap
            mountPath: /opt/nifi/basic-flow-json # mount path in the NiFi container
        customFlow: /opt/nifi/basic-flow-json/basic-flow.json # specify the path to the JSON Flow Definition, so NiFi does not try to load the old and unsupported flow.xml.gz
        image:
          tag: {{ .Values.versions.nifi.imageVersion | quote }} # specify the version of the NiFi image

I left out some parts of the values.yaml and helmfile-nifi.yaml files, but I hope you get the idea. Feel free to reach out if you have any questions or suggestions.
Till next time!