
SOT Learning Portal


User credentials rotation - database and messaging secrets

Regularly rotating user credentials is an industry-wide best practice.

To support operators with this secret rotation task, the system offers an automated solution based on Ansible. It relies on a tool called Reloader to minimize the impact of a rotation on running services. Reloader is not provided as part of SOT and has to be installed manually.

This method only works for secrets managed by the database-specific or messaging-specific Ansible operators. It does not work for database or messaging instances that are managed externally.

As a result, the jobs will NOT impact secrets for externally managed instances.

Other methods can also be used to restore the service, for example manually triggering a rollout restart of the affected Deployments or StatefulSets.
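As an illustration of the manual alternative, the following sketch wraps kubectl rollout restart for a single workload and waits for the rollout to finish. The helper name restart_workload is hypothetical, and the namespace and workload names are placeholders that depend on your installation.

```shell
# Sketch: restart a workload manually so that its pods pick up rotated secrets.
# restart_workload is a hypothetical helper; namespace and names are placeholders.
restart_workload() {
  local namespace="$1" kind="$2" name="$3"
  # Only Deployments and StatefulSets are handled in this sketch.
  case "$kind" in
    deployment|statefulset) ;;
    *) echo "unsupported kind: $kind" >&2; return 1 ;;
  esac
  # Trigger the restart, then block until the rollout has finished or timed out.
  kubectl rollout restart "$kind/$name" -n "$namespace" &&
    kubectl rollout status "$kind/$name" -n "$namespace" --timeout=300s
}
```

A call could look like this: restart_workload <namespace> deployment <deployment-name>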

Reloader

Prerequisite: Reloader is installed on the system.

Reloader is a solution offered by Stakater that watches for changes in ConfigMaps and Secrets and performs rolling upgrades on Pods with their associated DeploymentConfigs, Deployments, DaemonSets, StatefulSets, and Rollouts.

Reloader installation

# Add the Stakater chart repository
helm repo add stakater https://stakater.github.io/stakater-charts
helm repo update

# Create the values file for Reloader
cat > values.yaml <<EOF
reloader:
  reloadOnCreate: true
  syncAfterRestart: true
  enableHA: true
  deployment:
    replicas: 2
EOF

# Install Reloader with the values file (release name first, then chart)
helm install reloader stakater/reloader -f values.yaml
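By default, Reloader only acts on workloads that opt in through annotations. The following sketch shows one way to opt a Deployment in, using the reloader.stakater.com/auto annotation from the Reloader documentation. Whether the SOT charts already set this annotation is not stated here, so treat this as an assumption to verify. The helper name enable_auto_reload and its arguments are hypothetical.

```shell
# Sketch: opt a Deployment in to automatic reloads by Reloader.
# enable_auto_reload is a hypothetical helper; namespace and name are placeholders.
enable_auto_reload() {
  local namespace="$1" deployment="$2"
  # reloader.stakater.com/auto is the opt-in annotation documented by Stakater Reloader.
  kubectl annotate deployment "$deployment" -n "$namespace" \
    reloader.stakater.com/auto=true --overwrite
}
```

A call could look like this: enable_auto_reload <namespace> <deployment-name>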

Ansible operator Helm chart

The Ansible operator contains two jobs for password rotation and secrets restoration (prefixed with password-rotation-job and secret-restoration-job, respectively).

To run one of these jobs, create a job manually from the corresponding CronJob:

kubectl create job --from=cronjob/<our-cronjob-name> <a-specific-name>

This runs the job, which performs the password rotation or the secret restoration.
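Because the CronJob names are only known by their prefix, a small helper can look up the full name before creating the job. The following is a minimal sketch under that assumption. The helper name run_operator_job, the namespace, and the generated job name are hypothetical and not part of SOT.

```shell
# Sketch: create a one-off Job from the CronJob whose name starts with a prefix.
# run_operator_job is a hypothetical helper; the namespace is a placeholder.
run_operator_job() {
  local namespace="$1" prefix="$2"
  # Optional third argument: job name. Defaults to a timestamped name.
  local job="${3:-$prefix-manual-$(date +%s)}"
  local cronjob
  # "-o name" prints entries such as cronjob.batch/<name>; take the first match.
  cronjob="$(kubectl get cronjobs -n "$namespace" -o name |
    grep "^cronjob.batch/$prefix" | head -n 1 || true)"
  if [ -z "$cronjob" ]; then
    echo "no CronJob with prefix $prefix in namespace $namespace" >&2
    return 1
  fi
  # Strip the resource type so that only the CronJob name remains.
  cronjob="${cronjob#*/}"
  kubectl create job --from="cronjob/$cronjob" "$job" -n "$namespace"
}
```

For a rotation, a call could look like this: run_operator_job <namespace> password-rotation-job. Passing secret-restoration-job as the prefix triggers the restoration job in the same way.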

Secret restoration

The secret restoration job makes it possible to quickly revert the changes if the password rotation fails (e.g., due to missing admin credentials). In that case, run the above command with the secret restoration CronJob to trigger the restoration.

MACMA and Keycloak password and secrets rotation

For MACMA and Keycloak, passwords and secrets are rotated by updating the corresponding values in the custom Helm values.

Helm creates an additional job called change-macma-secrets-job-*, which allows the following credentials to be updated:

  • MACMA admin user and/or password

  • Keycloak admin-cli client secret - this is used by MACMA to connect to Keycloak

  • Keycloak user and/or password

  • MACMA client (macma) secret
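Once the custom values contain the new credentials, they are applied with a regular Helm upgrade. The following sketch assumes the updated values are kept in a local file. The helper name apply_custom_values, the release name, the chart reference, and the file name are all placeholders, and the actual value keys depend on the SOT chart and are not shown here.

```shell
# Sketch: apply updated MACMA/Keycloak credentials with helm upgrade.
# apply_custom_values is a hypothetical helper; all arguments are placeholders.
apply_custom_values() {
  local release="$1" chart="$2" namespace="$3" values_file="$4"
  if [ ! -f "$values_file" ]; then
    echo "values file not found: $values_file" >&2
    return 1
  fi
  # The upgrade re-renders the chart with the new values; according to the text
  # above, this is when Helm creates the change-macma-secrets-job-* job.
  helm upgrade "$release" "$chart" -n "$namespace" -f "$values_file"
}
```

A call could look like this: apply_custom_values <release> <chart> <namespace> custom-values.yaml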

Limitations of the job:

  • The automation relies on the Helm lookup function, which only works during helm install and helm upgrade. If the output of helm template is used to integrate with GitOps tools such as Argo CD, the job succeeds but does not change anything.

  • Changing the secrets restarts the MACMA pods, which introduces an unavailability window. The window is longer when the admin-cli secret is changed, because the pods started with the new secret cannot talk to Keycloak until the job has changed the secret.

The reason for this limitation is that changing the admin-cli secret is a breaking change: the MACMA pods have to be recreated to pick up the new secret containing the updated admin-cli value.

  • Since it is difficult to know exactly when the new MACMA pods are up, the job might fail to change all secrets on the first attempt and may need one or two retries to converge. Several retries are not an issue.
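To see whether the job has converged, one option is to wait on the most recently created job with the change-macma-secrets-job prefix. The following is a sketch under that assumption. The helper name wait_for_macma_job, the namespace, and the default timeout are hypothetical.

```shell
# Sketch: wait until the newest change-macma-secrets-job-* job has completed.
# wait_for_macma_job is a hypothetical helper; the namespace is a placeholder.
wait_for_macma_job() {
  local namespace="$1" timeout="${2:-600s}"
  local job
  # Jobs are sorted by creation time, so the last match is the newest one.
  job="$(kubectl get jobs -n "$namespace" \
    --sort-by=.metadata.creationTimestamp -o name |
    grep "/change-macma-secrets-job" | tail -n 1 || true)"
  if [ -z "$job" ]; then
    echo "no change-macma-secrets-job found in namespace $namespace" >&2
    return 1
  fi
  # Returns non-zero if the job does not complete within the timeout.
  kubectl wait --for=condition=complete "$job" -n "$namespace" --timeout="$timeout"
}
```

A non-zero exit status only means that the job did not complete within the timeout; the job logs show whether a retry is still in progress.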

Troubleshooting failed runs

If the password and secret rotation job fails for any reason, investigate the logs of the job's pod to see which step is failing.

So far, the only identified cause of failure is an attempt to change the main tenant admin password to one from the password history.

To recover, change the password to a string that was not used for any of the last five passwords and re-run the helm upgrade.
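To inspect a failed run, the pods of a job can be selected through the job-name label that Kubernetes sets on them. The following sketch lists those pods and prints their recent log lines. The helper name show_job_logs and its arguments are hypothetical.

```shell
# Sketch: list the pods of a job and print their most recent log lines.
# show_job_logs is a hypothetical helper; namespace and job name are placeholders.
show_job_logs() {
  local namespace="$1" job="$2"
  # Pods created by a Job carry the job-name label set by the Job controller.
  kubectl get pods -n "$namespace" -l "job-name=$job"
  # --prefix marks each line with its pod and container name.
  kubectl logs -n "$namespace" -l "job-name=$job" --all-containers --tail=200 --prefix
}
```

A call could look like this: show_job_logs <namespace> <job-name>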


© Robert Bosch Manufacturing Solutions GmbH 2023-2026, all rights reserved
