preview 4.10.0

Required monitoring

Condition Monitoring supports OpenTelemetry for tracing and monitoring. For configuration details, see Chapter "11.5. OpenTelemetry Integration" in the central NEXEED Industrial Application System Operations Manual.

Logs

Alert rules may use 'JSONPath' to address values in structured logs. This is indicated by a leading '$'.

Table 1. Required log monitoring

| Message | Logger | Alert rule | Context | Symptoms | Solution |
|---|---|---|---|---|---|
| Service crashed | LIFE-CYCLE | (($stackTraces[*].failureType = 'liquibase.exception.LockException') > 0) in one minute | Database migration | Pod does not start up / pod restarts | Check whether the Kubernetes deployment listed in the 'LOCKEDBY' column of table 'CM_CORE_DATABASECHANGELOGLOCK' or 'RM_DATABASECHANGELOGLOCK' is running. If it is not running, delete the entry from the database. If it is running, wait for the migration to complete. |
| Health status is [DOWN]. <further details> | LIFE-CYCLE | (($status = 'UNHEALTHY') > 0) in one minute | System availability | Loss of functionality, loss of data, system unavailability | Inspect the further details of the message. They contain a detailed report of the problem. |
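
The two alert rules in Table 1 can be read as "at least one matching LIFE-CYCLE log entry within one minute". The following is a minimal sketch of that reading in plain Python, not the alerting implementation of any particular log platform. The field names `stackTraces`, `failureType` and `status` come from the alert rules above; the assumption that each log entry is available as a parsed dictionary, already filtered to the LIFE-CYCLE logger and to the last minute, is purely illustrative.

```python
from typing import Any, Callable, Iterable

LOCK_EXCEPTION = "liquibase.exception.LockException"

LogRecord = dict[str, Any]  # assumption: one parsed structured log entry


def has_lock_exception(record: LogRecord) -> bool:
    """Mirrors $stackTraces[*].failureType = 'liquibase.exception.LockException'."""
    traces = record.get("stackTraces") or []
    return any(trace.get("failureType") == LOCK_EXCEPTION for trace in traces)


def is_unhealthy(record: LogRecord) -> bool:
    """Mirrors $status = 'UNHEALTHY'."""
    return record.get("status") == "UNHEALTHY"


def alert_fires(records: Iterable[LogRecord],
                predicate: Callable[[LogRecord], bool]) -> bool:
    """True if more than zero records match.

    Assumption: `records` already holds only the LIFE-CYCLE entries
    of the last minute; windowing is left to the caller.
    """
    return sum(1 for record in records if predicate(record)) > 0


# Hypothetical sample window, for illustration only.
sample_window = [
    {"status": "HEALTHY"},
    {"stackTraces": [{"failureType": LOCK_EXCEPTION}]},
]
```

With this sample window, `alert_fires(sample_window, has_lock_exception)` is true and `alert_fires(sample_window, is_unhealthy)` is false.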

OpenTelemetry

For the OpenTelemetry configuration, see the general IAS Operations Manual. The OpenTelemetry integration is described there under the logging and monitoring concepts.

Monitoring RabbitMQ queues

Below are the recommended thresholds for monitoring RabbitMQ queues used by Condition Monitoring services.

RabbitMQ queue monitoring thresholds

| Queue | MaxLength | Alert Threshold (Upper Limit) |
|---|---|---|
| q.cm.core.opp.v09.machineEquipment.measurementTimeSeries.v09 | 1000 | 750 |
| q.cm.core.opp.v09.machineEquipment.machine.v09 | 1000 | 750 |
| q.cm.core.ppmp.v2.message.measurement | 1000 | 750 |
| q.cm.core.ruleResult.positive | 1000 | 750 |
| q.cm.rs.ppmp.machineMsg.enriched | 1000 | 750 |
| q.cm.rs.ppmp.measurement.enriched | 1000 | 750 |
| q.cm.sfe.machineRuleExecutionMsg.received | 50000 | 35000 |
| q.cm.sfe.measurementRuleExecutionMsg.received | 50000 | 35000 |
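
The thresholds above can be encoded as data and compared against observed queue depths. The sketch below assumes the current message count per queue is already available as a dictionary (for example, collected from the `messages` field of the RabbitMQ management API); how the depths are collected, and the choice to alert only when a depth strictly exceeds the threshold, are assumptions and not taken from this manual.

```python
# Queue name -> (MaxLength, alert threshold), copied from the table above.
QUEUE_THRESHOLDS: dict[str, tuple[int, int]] = {
    "q.cm.core.opp.v09.machineEquipment.measurementTimeSeries.v09": (1000, 750),
    "q.cm.core.opp.v09.machineEquipment.machine.v09": (1000, 750),
    "q.cm.core.ppmp.v2.message.measurement": (1000, 750),
    "q.cm.core.ruleResult.positive": (1000, 750),
    "q.cm.rs.ppmp.machineMsg.enriched": (1000, 750),
    "q.cm.rs.ppmp.measurement.enriched": (1000, 750),
    "q.cm.sfe.machineRuleExecutionMsg.received": (50000, 35000),
    "q.cm.sfe.measurementRuleExecutionMsg.received": (50000, 35000),
}


def queues_over_threshold(depths: dict[str, int]) -> list[str]:
    """Return the monitored queues whose depth exceeds the alert threshold.

    Assumption: `depths` maps queue name -> current number of messages.
    Queues that are not listed in QUEUE_THRESHOLDS are ignored.
    """
    return sorted(
        name
        for name, depth in depths.items()
        if name in QUEUE_THRESHOLDS and depth > QUEUE_THRESHOLDS[name][1]
    )
```

For example, a depth of 800 on `q.cm.core.ruleResult.positive` would be reported, while a depth of exactly 35000 on `q.cm.sfe.machineRuleExecutionMsg.received` would not, because the sketch uses a strict comparison.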

Horizontal pod scaling guidance

Currently, automatic pod scaling is not enabled for Condition Monitoring services. It is strongly recommended to monitor the CPU and memory usage of each pod:

  • Trigger an alert if CPU or memory usage of a pod exceeds 80%.

  • When such alerts are triggered, evaluate scaling the affected service horizontally by increasing the number of pods to ensure continued performance and reliability.
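
The 80% rule above can be sketched as a simple filter over pod usage figures. The manual does not state whether the percentage refers to the resource request or the resource limit, so this sketch assumes usage is expressed as a fraction of the limit; the pod names in the usage note are hypothetical, and collecting the metrics is outside its scope.

```python
USAGE_ALERT_RATIO = 0.80  # 80% threshold from the guidance above


def pods_exceeding_usage(usage: dict[str, tuple[float, float]]) -> list[str]:
    """Return pods whose CPU or memory usage exceeds the alert ratio.

    Assumption: `usage` maps pod name -> (cpu_ratio, memory_ratio),
    each expressed as a fraction of the pod's configured limit.
    """
    return sorted(
        pod
        for pod, (cpu_ratio, memory_ratio) in usage.items()
        if cpu_ratio > USAGE_ALERT_RATIO or memory_ratio > USAGE_ALERT_RATIO
    )
```

Called with `{"pod-a": (0.85, 0.40), "pod-b": (0.50, 0.60)}`, the function returns `["pod-a"]`. Any pod it returns is a candidate for evaluating horizontal scaling of the corresponding service.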

Based on reference customer data (scale factor 1), load tests were performed to determine appropriate scaling for each service. The following tables provide guidance for scaling services according to system load and scale factor.

Table 2. Message and Rule Amount by Scale Factor

| Measurement msg/sec (Queue: q.cm.core.opp.v09.machineEquipment.measurementTimeSeries.v09) | Machine msg/sec (Queue: q.cm.core.opp.v09.machineEquipment.machine.v09) | All Devices | Active Devices Sending Measurement | All Rules (active measurement rule, active machine rule) | Measurement Rules Considered for Rule Execution | Machine Rules Considered for Rule Execution | Scale Factor |
|---|---|---|---|---|---|---|---|
| 1500 | 10 | 870 | 433 | 1830 (209, 1009) | 133 | 5 | 1 |
| 3000 | 20 | 1740 | 866 | 3660 (418, 2018) | 266 | 10 | 2 |
| 4500 | 30 | 2610 | 1299 | 5490 (627, 3027) | 399 | 15 | 3 |
| 6000 | 40 | 3480 | 1732 | 7320 (836, 4036) | 532 | 20 | 4 |
| 7500 | 50 | 4350 | 2165 | 9150 (1045, 5045) | 665 | 25 | 5 |
| 9000 | 60 | 5220 | 2568 | 10980 (1254, 6054) | 798 | 30 | 6 |

The following tables show the recommended number of service instances for each scale factor, based on the CPU and memory requests and limits of the services. Adjust these values as needed based on observed system performance and monitoring data.

Table 3. Service Scaling Reference (Scale Factor 1 and 2)

| Service Name | Service Instances |
|---|---|
| condition-monitoring-core | 2 |
| rule-service-app | 2 |
| rule-function-executor | 2 |
| rule-value-provider | 2 |
| rule-value-aggregator | 2 |
| rule-result-aggregator | 2 |

Table 4. Service Scaling Reference (Scale Factor 3)

| Service Name | Service Instances |
|---|---|
| condition-monitoring-core | 3 |
| rule-service-app | 2 |
| rule-function-executor | 2 |
| rule-value-provider | 2 |
| rule-value-aggregator | 2 |
| rule-result-aggregator | 3 |

Table 5. Service Scaling Reference (Scale Factor 4)

| Service Name | Service Instances |
|---|---|
| condition-monitoring-core | 4 |
| rule-service-app | 2 |
| rule-function-executor | 2 |
| rule-value-provider | 2 |
| rule-value-aggregator | 2 |
| rule-result-aggregator | 4 |

Table 6. Service Scaling Reference (Scale Factor 5)

| Service Name | Service Instances |
|---|---|
| condition-monitoring-core | 5 |
| rule-service-app | 2 |
| rule-function-executor | 2 |
| rule-value-provider | 2 |
| rule-value-aggregator | 2 |
| rule-result-aggregator | 5 |

Table 7. Service Scaling Reference (Scale Factor 6)

| Service Name | Service Instances |
|---|---|
| condition-monitoring-core | 6 |
| rule-service-app | 3 |
| rule-function-executor | 2 |
| rule-value-provider | 3 |
| rule-value-aggregator | 2 |
| rule-result-aggregator | 6 |
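
Tables 2 to 7 can be combined into a small lookup: estimate the scale factor from the observed measurement message rate, then read off the recommended instance counts. The sketch below is one possible reading, not an official sizing tool. The figure of 1500 measurement msg/sec per scale factor and the instance counts are taken from the tables; rounding up to the next scale factor and rejecting rates beyond scale factor 6 are assumptions.

```python
import math

MEASUREMENT_MSG_PER_SCALE_FACTOR = 1500  # Table 2: 1500 msg/sec at scale factor 1
MAX_SCALE_FACTOR = 6                     # highest scale factor covered by the tables


def _instances(core: int, service_app: int, value_provider: int,
               result_aggregator: int) -> dict[str, int]:
    """Build one row set; executor and value aggregator stay at 2 in all tables."""
    return {
        "condition-monitoring-core": core,
        "rule-service-app": service_app,
        "rule-function-executor": 2,
        "rule-value-provider": value_provider,
        "rule-value-aggregator": 2,
        "rule-result-aggregator": result_aggregator,
    }


# Scale factor -> recommended instances, copied from Tables 3 to 7.
INSTANCES_BY_SCALE_FACTOR: dict[int, dict[str, int]] = {
    1: _instances(2, 2, 2, 2),
    2: _instances(2, 2, 2, 2),
    3: _instances(3, 2, 2, 3),
    4: _instances(4, 2, 2, 4),
    5: _instances(5, 2, 2, 5),
    6: _instances(6, 3, 3, 6),
}


def scale_factor_for(measurement_msg_per_sec: float) -> int:
    """Assumption: round up to the next scale factor listed in Table 2."""
    if measurement_msg_per_sec < 0:
        raise ValueError("message rate must not be negative")
    factor = max(1, math.ceil(measurement_msg_per_sec / MEASUREMENT_MSG_PER_SCALE_FACTOR))
    if factor > MAX_SCALE_FACTOR:
        raise ValueError("rate exceeds the load-tested range (scale factor 6)")
    return factor


def recommended_instances(measurement_msg_per_sec: float) -> dict[str, int]:
    """Look up the recommended instance counts for the given message rate."""
    return INSTANCES_BY_SCALE_FACTOR[scale_factor_for(measurement_msg_per_sec)]
```

For instance, a rate of 4000 measurement msg/sec falls between scale factors 2 and 3 and is rounded up to 3, which yields three instances each of condition-monitoring-core and rule-result-aggregator. The result is a starting point only and should still be adjusted against observed CPU and memory usage.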


© Robert Bosch Manufacturing Solutions GmbH 2023-2025, all rights reserved
