Migration from previous Nexeed infrastructure versions
The migration from an older Helm-based version to a newer Helm-based one is generally non-disruptive; however, some changes in parametrization might be required (for example, if additional mandatory parameters are introduced). Migration is only supported between consecutive major releases.
The value override YAML file for the ias chart is called custom-values.yaml; to avoid confusion, the corresponding file for the nexeed-infra chart is referred to as custom-values-infra.yaml in this manual. However, the value override file names for both charts can be chosen arbitrarily.
Migration to 2025.02
RabbitMQ on 2025.02 and later is upgraded to the 4.x major version. The RabbitMQ version shipped with nexeed-infra 2025.02 is 4.1.2.
For clusters running RabbitMQ < 3.13, you must first upgrade RabbitMQ to a 3.13.x version before upgrading to the 4.0.x or 4.1.x versions.
In RabbitMQ 3.x, the classic mirrored queue type is used for data replication across the nodes of a clustered setup. RabbitMQ 4.x no longer supports this kind of replication, so mirrored queues are no longer usable in a clustered environment.
Upgrading to RabbitMQ 4.x therefore removes the mirrored queue feature available in 3.x. If any of your modules have not been migrated to quorum queues, do not update `nexeed-infra` to 2025.02 or newer. Only update RabbitMQ to 4.x when all of your modules are quorum queue ready.
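To check whether any classic (non-quorum) queues are still in use before upgrading, you can list the queue types per virtual host. A minimal sketch, assuming the default pod and namespace names used elsewhere in this manual; the vhost is a placeholder:
# List queue names and types for one vhost; repeat for every vhost reported by
# "rabbitmqctl list_vhosts". Anything not shown as "quorum" (or "stream") still
# needs migration before moving to RabbitMQ 4.x.
kubectl exec -n shared rabbitmq-statefulset-0 -- rabbitmqctl list_vhosts
kubectl exec -n shared rabbitmq-statefulset-0 -- rabbitmqctl list_queues -p <vhost> name type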
Pre-requisite
Please disable classic mirrored queue policy creation in Ansible Operator by
adding the following code to custom-values.yaml before migrating to RabbitMQ
4.
global:
  ...
# Add the following code on the same level as global
ansible-operator:
  local:
    rabbitMqClassicMirrorqueuesPolicy: disabled
Apply the ias chart again. You may also need to restart the
rabbitmq-account-operator-deployment with the following kubectl command:
kubectl rollout restart -n aops deployment/rabbitmq-account-operator-deployment
Please check the RabbitMQ policies via the web UI (Admin → Policies) and confirm
that no policy named mirror-queues remains on any virtual host.
The entire process may take 30 minutes; alternatively, you can safely remove the policies manually.
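If you prefer to remove a leftover policy manually from the command line, a minimal sketch using rabbitmqctl (assuming the default pod and namespace names used in this manual; the vhost is a placeholder):
# Show the policies of a vhost, then clear a leftover "mirror-queues" policy.
kubectl exec -n shared rabbitmq-statefulset-0 -- rabbitmqctl list_policies -p <vhost>
kubectl exec -n shared rabbitmq-statefulset-0 -- rabbitmqctl clear_policy -p <vhost> mirror-queues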
Migration to quorum queue
Nexeed IAS modules should always manage their own queues, queue types and exchanges. Please consult the respective operational manual for the migration steps of each individual module.
If the RabbitMQ administrator wants to manually migrate the queue type from classic mirrored queues to quorum queues, the principles and steps below can be followed.
There are different ways to migrate from mirrored queues to quorum queues, but ultimately they follow these principles:
- If your module re-uses the same queue name, the queue needs to be removed and re-created as a quorum queue:
  - Use UserPermissionModification to revoke the write permission of the module's RabbitMQ user (see the sketch after this list)
  - (Optional) Stop the producer pod by scaling down the deployment to 0 in Kubernetes
  - Wait until the consumer has consumed all messages in the old classic mirrored queue
  - Scale down the producer and the consumer deployments to 0, then delete the mirrored queue
  - Add back the write permission to the module user
  - (Optional) You can create the quorum queue manually in the RabbitMQ management WebUI.
  - Configure and deploy the new producer and consumer to use the new quorum queue
- If data loss is acceptable, you can recreate the entire RabbitMQ cluster with quorum queues only. See RebuildRabbitMQ for more info.
- If your module uses a different queue name for the quorum queue than before:
  - Make sure all messages are consumed before the deployment to the quorum queue; you can use UserPermissionModification before the deployment to block writes to the classic mirrored queue
  - Configure and deploy the new version of the module to use the quorum queue
  - Re-establish the write permission of the module user in RabbitMQ
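The UserPermissionModification mentioned above can be performed with rabbitmqctl. A minimal sketch, assuming the default pod and namespace names used in this manual; the vhost, the module user name and the permission patterns are placeholders to adapt:
# Block writes for the module user by setting an empty write pattern
# (the configure and read patterns below are only examples).
kubectl exec -n shared rabbitmq-statefulset-0 -- rabbitmqctl set_permissions -p <vhost> <module-user> ".*" "^$" ".*"
# Later, restore the write permission for the module user.
kubectl exec -n shared rabbitmq-statefulset-0 -- rabbitmqctl set_permissions -p <vhost> <module-user> ".*" ".*" ".*"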
Rebuild RabbitMQ from scratch
Warning: Data loss WILL happen. Only perform this where data loss is acceptable.
To re-create the RabbitMQ cluster with a clean set of disks:
- Scale down RabbitMQ to 0 nodes:
  kubectl scale -n shared --replicas=0 sts/rabbitmq-statefulset
- Remove the PersistentVolumes and PersistentVolumeClaims associated with RabbitMQ (see the sketch below):
  kubectl delete -n shared pvc/rabbitmq-rabbitmq-statefulset-X
  where X is the pod ordinal (delete one PVC per former pod).
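A minimal sketch of the PVC removal, assuming a 3-node cluster and the PVC naming shown above:
# Delete one PVC per former RabbitMQ pod (ordinals 0..2 for a 3-replica statefulset).
for i in 0 1 2; do
  kubectl delete -n shared pvc/rabbitmq-rabbitmq-statefulset-$i
done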
Migration to 2025.01.02
The RabbitMQ Helm Chart introduced a new parameter in the custom-values.yaml:
global:
  modules:
    rabbitmq:
      podManagementPolicy: "Parallel"
Please consult Pod management policy for more info.
Migration to 2025.01
The RabbitMQ Helm chart introduced the feature to configure its cloud LoadBalancer via Kubernetes annotations. This applies to cloud deployments where RabbitMQ is exposed through a Service of type LoadBalancer.
Currently only Azure Cloud (and its variants) is supported.
The chart will add the following annotations to the RabbitMQ Kubernetes Service
manifest with type: LoadBalancer:
annotations:
  service.beta.kubernetes.io/azure-disable-load-balancer-floating-ip: "true"
  service.beta.kubernetes.io/azure-load-balancer-resource-group: '<resource-group-name>'
  service.beta.kubernetes.io/azure-load-balancer-ipv4: 'x.x.x.x'
In your custom-values-infra.yaml file, under the global.modules.rabbitmq section,
please add the following azure content block:
global:
  modules:
    rabbitmq:
      loadBalancer:
        enabled: true
        ip: "x.x.x.x"
        sourceRanges:
          - "y.y.y.y/32"
        azure:
          public_ip_resource_group: <the resource group of the external IP>
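After deploying the chart, you can verify that the annotations ended up on the RabbitMQ LoadBalancer Service. A minimal sketch; the exact Service name is a placeholder and may differ in your installation:
# Find the RabbitMQ Services and inspect the annotations of the LoadBalancer one.
kubectl get svc -n shared | grep -i rabbitmq
kubectl describe svc -n shared <rabbitmq-loadbalancer-service-name>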
Migration to version 2024.02.01 from ias chart
The RabbitMQ Helm chart is no longer included in the ias chart. Instead, the nexeed-infra chart will provide this component.
You should deploy the nexeed-infra Helm chart before performing the business module migration.
This manual assumes the default namespace for the RabbitMQ cluster installation is shared.
Same namespace deployment scenario
This applies if the RabbitMQ cluster deployed by nexeed-infra stays in the same namespace as the one originally deployed by the ias chart.
Perform the following migration steps so that the existing RabbitMQ cluster can be taken over by the nexeed-infra Helm chart without downtime.
- Alter your Kubeconfig (usually <home_dir>/.kube/config) to the correct Kubernetes context.
- You can find the RabbitMQ statefulset name and its namespace via the command:
  kubectl get sts -A -l app=rabbitmq
- Fill in the placeholders in the script, commands and YAML files mentioned below.
  - You can add or modify global.modules.rabbitmq.namespaceSuffix to change the value for the default namespace.
- Change the Helm annotations on the RabbitMQ Kubernetes objects by executing the following bash script:
for resource in $(kubectl get sts,ing,svc,secret,cm,roles,rolebindings,sa -n shared | awk '{print $1}' | grep -i rabbitmq)
do
kubectl annotate --overwrite $resource -n <namespace> meta.helm.sh/release-name=<nexeed-infra-release-name>
kubectl annotate --overwrite $resource -n <namespace> meta.helm.sh/release-namespace=<nexeed-infra-release-namespace>
done
After executing the script, you:
- Move the global.modules.rabbitmq section from custom-values.yaml to the custom-values-infra.yaml file. Add the remote-rabbitmq reference in your business modules in custom-values.yaml; refer to RMQCustomValueRemote.
- Apply the new ias Helm chart and the nexeed-infra chart to your cluster. No RabbitMQ Pods will be restarted as long as the image versions used are the same. Migration is completed (a quick ownership check is sketched below).
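To spot-check that the Helm ownership metadata of the RabbitMQ objects now points to the nexeed-infra release, a minimal sketch (assumes the statefulset name and namespace used above):
# Print the Helm release-name and release-namespace annotations of the RabbitMQ statefulset.
kubectl get sts rabbitmq-statefulset -n shared \
  -o jsonpath='{.metadata.annotations.meta\.helm\.sh/release-name}{"\n"}{.metadata.annotations.meta\.helm\.sh/release-namespace}{"\n"}'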
Custom-values.yaml update
In the custom-values.yaml file, the RabbitMQ cluster installed by nexeed-infra must be mentioned in the global.serverInstances section:
# custom-values.yaml
global:
  serverInstances:
    remote-rabbitmq:
      host: rabbitmq-service.<namespace>.svc.cluster.local
      port: 5672
      adminPort: 15672
      tls: false
      default: true
      adminUser: admin
      adminPassword: <same_as_global.embeddedRabbitMQAdminPassword>
      type: RABBITMQ
Your business modules should reference this remote-rabbitmq in their messaging section if global.serverInstances.remote-rabbitmq.default is not set to true:
# custom-values.yaml
global:
  modules:
    your-module1:
      messaging:
        <messaging_name>:
          serverInstance: remote-rabbitmq
Your custom-values-infra.yaml should look like this in the global.modules.rabbitmq section:
# custom-values-infra.yaml
global:
  modules:
    rabbitmq:
      enabled: true
      # more rabbitmq optional parameters
Please note that the global.embeddedRabbitMQAdminPassword value should be defined and kept the same across the two custom-values files.
Migrate RabbitMQ to a different namespace
To migrate the existing RabbitMQ cluster to a new namespace, the overall logic is:
- Deploy the new RabbitMQ cluster with the same major.minor version (e.g. 3.13.2 vs 3.13.6) in the new namespace.
- Join the new pods to the old cluster one by one.
- Reference the new cluster in your custom-values.yaml for your applications, and apply it within the upgrade maintenance window.
  - This will also remove the old RabbitMQ pods and most of their related Kubernetes resources.
- Remove the old cluster information from the new RabbitMQ cluster, and point the old RabbitMQ management ingress controller to the new cluster.
To deploy a new RabbitMQ cluster via the nexeed-infra Helm chart, you need to temporarily change global.modules.rabbitmq.contextPath to a value different from the one used by the old ias chart. For example:
# custom-values-infra.yaml
global:
  modules:
    rabbitmq:
      contextPath: rabbitmq_new
      namespaceSuffix: rmq2
Then you can deploy nexeed-infra with this value override.
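A minimal sketch of the deployment command; the release name, chart reference and namespace are placeholders to adapt to your installation:
# Deploy (or upgrade) the nexeed-infra chart with the override file.
helm upgrade --install <nexeed-infra-release-name> <chart-repo-or-path>/nexeed-infra \
  -n <nexeed-infra-release-namespace> -f custom-values-infra.yaml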
The following steps assume the new RabbitMQ cluster is already deployed in the rmq2 namespace with 3 replicas, all pods are up and running, and no clients are connected to it.
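The join step in the list below is usually run from inside each new pod. A minimal sketch: standard RabbitMQ clustering requires stopping the node's application before join_cluster and starting it again afterwards; pod and service names follow the defaults used in this manual:
# Run for each new pod in the rmq2 namespace (repeat for -0, -1 and -2).
kubectl exec -it -n rmq2 rabbitmq-statefulset-0 -- bash -c '
  rabbitmqctl stop_app && \
  rabbitmqctl join_cluster rabbit@rabbitmq-statefulset-0.rabbitmq-service-headless.<old_namespace>.svc.cluster.local && \
  rabbitmqctl start_app
'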
- Open a shell on the new rabbitmq-statefulset-0 pod in the rmq2 namespace.
- Join the old cluster with:
  rabbitmqctl join_cluster rabbit@rabbitmq-statefulset-0.rabbitmq-service-headless.<old_namespace>.svc.cluster.local
- Perform the same commands on the other 2 new pods.
- Wait until everything syncs; confirm the state via the management portal and the pod logs.
  - In the pod log, you should see the string "Peer discovery: all known cluster nodes are up".
  - In the management portal, you should see all old and new nodes in the Nodes section.
  - Check any large queues; all nodes should have been synchronized.
- Scale down the old RabbitMQ statefulset to 1.
- On the shell of one of the new RabbitMQ pods, run the command to forget old pods 1 and 2:
  rabbitmqctl forget_cluster_node rabbit@rabbitmq-statefulset-<id>.rabbitmq-service-headless.<old_namespace>.svc.cluster.local
- Add the remote-rabbitmq information with the new namespace in your new ias Helm chart custom-values.yaml, see RMQCustomValueRemote.
- Remove the old RabbitMQ cluster by upgrading to the 2024.02.01 ias Helm chart.
  - This should remove the old RabbitMQ cluster and most of its related Kubernetes resources.
  - The maintenance window starts and the new module containers will point to the new RMQ cluster (possible downtime; they should be back as soon as the new pods are up and running).
  - Check with:
    kubectl get pods -n <old_namespace> -l app=rabbitmq
- Remove the last old pod from any of the new cluster pods:
  rabbitmqctl forget_cluster_node rabbit@rabbitmq-statefulset-0.rabbitmq-service-headless.<old_namespace>.svc.cluster.local
- Update global.modules.rabbitmq.contextPath in custom-values-infra.yaml to the original value from custom-values.yaml (or remove the override).
- Re-deploy the nexeed-infra Helm chart with the updated custom-values-infra.yaml.
- Check the RabbitMQ cluster status in a shell to make sure only the new pods form the cluster:
  rabbitmqctl cluster_status
- After confirming the content, you may delete the old PersistentVolumes and PersistentVolumeClaims of the old RabbitMQ cluster in the old namespace (see the sketch below).
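A minimal sketch of the final cleanup, assuming the PVC naming used earlier in this manual and a 3-node old cluster:
# List the leftover RabbitMQ PVCs in the old namespace, then delete them one by one.
kubectl get pvc -n <old_namespace> | grep -i rabbitmq
for i in 0 1 2; do
  kubectl delete -n <old_namespace> pvc/rabbitmq-rabbitmq-statefulset-$i
done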