Module Health Endpoints and Kubernetes Probes
Condition Monitoring provides Kubernetes probes and health endpoints. There is a central module health endpoint, and each service also exposes its own health endpoints. Unlike the probes, health endpoints are not intended for use by the container management system. Instead, they are designed for privileged users (authorization required), such as operators or operator tools. These endpoints are exposed to provide more detailed information than a simple OK/NOK status.
Module Health Endpoint
To check the overall health status of the Condition Monitoring module or its services, users must have one of the following roles assigned:
- Operator (condition-monitoring-operator)
- Condition Monitoring Administrator (condition-monitoring-admin)
- A custom role with the static resource below:
  ResourceType: urn:com:bosch:bci:cm:all:operation
  ResourceID: health
  ResourceName: Health endpoint - Provides information about the services and their dependencies
The request:
HTTP Method: GET
URL: https://<domain>/cm/health
Possible responses:
- HTTP 200 OK - returned when the given token is valid and authorized for the resource
- HTTP 403 FORBIDDEN - returned when the given token is not authorized (e.g., the resource or permission is missing)
- HTTP 401 UNAUTHORIZED - returned when the given token is invalid
An authorized request returns a detailed response:
{
"name": "Condition Monitoring Module",
"description": "Condition Monitoring Module - Custom Health Endpoint",
"instanceId": "",
"startupTime": "2025-12-18T11:34:25.883348480Z",
"version": "master-dev-SNAPSHOT",
"ready": true,
"health": "healthy",
"onStateSince": "2025-12-18T12:56:05.899540031Z",
"dependencies": [
{
"name": "cmCore",
"available": true,
"details": {
"service": "Condition Monitoring"
}
},
{
"name": "kafka",
"available": true,
"details": {
"clusterId": "XzLspcMFRvmShsc3Pk8o-w"
}
},
{
"name": "valueAggregator",
"available": true,
"details": {
"service": "Value Aggregator"
}
},
{
"name": "mmpd",
"available": true
},
{
"name": "functionExecutor",
"available": true,
"details": {
"service": "Function Executor"
}
},
{
"name": "resultAggregator",
"available": true,
"details": {
"service": "Result Aggregator"
}
},
{
"name": "valueProvider",
"available": true,
"details": {
"service": "Value Provider"
}
},
{
"name": "macma",
"available": true
},
{
"name": "rabbit",
"available": true,
"details": {
"version": "4.1.2"
}
},
{
"name": "db",
"available": true
}
]
}
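As an illustration, a minimal sketch of how an operator tool might query this endpoint and flag unavailable dependencies is shown below. The domain, the token, and the bearer authorization scheme are assumptions for the example; substitute the values of your deployment.

```python
import requests  # third-party HTTP client (pip install requests)

# Placeholders: replace with your deployment's domain and a token of a
# user holding one of the roles listed above.
BASE_URL = "https://<domain>"
TOKEN = "<access-token>"

def check_module_health() -> None:
    resp = requests.get(
        f"{BASE_URL}/cm/health",
        headers={"Authorization": f"Bearer {TOKEN}"},  # bearer scheme assumed
        timeout=10,
    )
    # 401 means the token is invalid; 403 means it lacks the health resource.
    resp.raise_for_status()
    body = resp.json()
    print(f"ready: {body['ready']}, health: {body['health']}")
    # Report every dependency that declares itself unavailable.
    for dep in body.get("dependencies", []):
        if not dep["available"]:
            print(f"dependency down: {dep['name']}")

if __name__ == "__main__":
    check_module_health()
```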
Kubernetes Probes
- Liveness probe. A liveness probe can return:
  - "Yes, I’m alive!" - evaluated by the runtime environment; no further action is executed.
  - "No, I’m not alive!" - evaluated by the runtime environment; the container will be killed and restarted.
  - Or it simply fails because the container is unable to respond to the probe for some reason. The container will be killed and restarted by the runtime environment.
- Readiness probe. A readiness probe can return:
  - "Yes, I’m ready and can serve functionality to my clients!" - evaluated by the runtime environment; no further action is executed.
  - "No, I’m not ready!" - evaluated by the runtime environment; no traffic will be routed to the container. The runtime environment will ask again with a configurable number of retries and a configurable delay. If the probe fails every time, the pod will be marked as unready.
  - Or it simply fails because the container is unable to respond to the probe for some reason. The runtime environment will stop routing traffic to the container and ask again with a configurable number of retries and a configurable delay. If the probe fails every time, the pod will be marked as unready.
- Startup probe. A startup probe suppresses the other probes during a container’s startup or initialization phase to prevent premature restarts. Condition Monitoring uses the liveness probe as its startup probe. It can return:
  - "I’m successfully started." From now on, the runtime environment will start executing the liveness probes.
  - "I’m in the startup phase." The runtime environment will ask again with a configurable number of retries and a configurable delay. If the probe fails every time, the pod might be restarted (depending on the pod’s restart policy).
  - Or it simply fails because the container is unable to respond to the probe for some reason.

A sketch of how these probes might be declared follows this list.
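The sketch below uses the official Kubernetes Python client; the probe paths, port, image reference, and timing values are illustrative assumptions, not the product's actual configuration.

```python
from kubernetes import client  # official Kubernetes Python client

PORT = 8080  # assumed container port, for illustration only

# Liveness: consecutive failures cause a container restart.
liveness = client.V1Probe(
    http_get=client.V1HTTPGetAction(path="/health/liveness", port=PORT),
    period_seconds=10,
    failure_threshold=3,
)

# Readiness: failures stop traffic routing and mark the pod unready.
readiness = client.V1Probe(
    http_get=client.V1HTTPGetAction(path="/health/readiness", port=PORT),
    period_seconds=10,
    failure_threshold=3,
)

# Startup: reuses the liveness endpoint (as described above) and
# suppresses the other probes until it has succeeded once.
startup = client.V1Probe(
    http_get=client.V1HTTPGetAction(path="/health/liveness", port=PORT),
    period_seconds=5,
    failure_threshold=30,  # allows up to 150 s of startup time
)

container = client.V1Container(
    name="cm-core",
    image="<registry>/cm-core:<tag>",  # placeholder image reference
    liveness_probe=liveness,
    readiness_probe=readiness,
    startup_probe=startup,
)
```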
Service Health Endpoints
Below you will find details about each service's health endpoints and the impact of outages of its dependencies.
Condition Monitoring Core
| Endpoint | Description |
|---|---|
| Health endpoint | 200 when the service is available. For authenticated users, the health endpoint also shows details about RabbitMQ, InfluxDB, database, Deviation Processor (SMDP), MACMA, MDM, and Portal connection issues. |
| Liveness endpoint | 200 - the microservice is alive, its internal liveness state is valid, and it does not need to be restarted. |
| Readiness endpoint | RabbitMQ, Deviation Processor, Portal, MDM: 200. Database, MACMA, InfluxDB: 503. |
| Dependency | Use cases | Impact |
|---|---|---|
| RABBITMQ | Lost connection to RabbitMQ. Reasons: | |
| INFLUXDB | Lost connection to InfluxDB. Reasons: | |
| DATABASE | Lost connection to the database. Reasons: | |
| MACMA | Lost connection to MACMA. Reasons: | |
| PORTAL | Lost connection to Portal. Reasons: | |
| MDM | Lost connection to MDM. Reasons: | |
| SMDP | Lost connection to SMDP. Reasons: | |
Rule Service App
| Endpoint | Description |
|---|---|
| Health endpoint | 200 when the service is available |
| Liveness endpoint | 200 |
| Readiness endpoint | RabbitMQ, Kafka, MDM: 200. Database, MACMA: 503. |
| Dependency | Use cases | Impact |
|---|---|---|
| RABBITMQ | Lost connection to RabbitMQ. Reasons: | |
| KAFKA | Lost connection to Kafka. Reasons: | |
| DATABASE | Lost connection to the database. Reasons: | |
| MACMA | Lost connection to MACMA. Reasons: | |
| MDM | Lost connection to MDM. Reasons: | |
Stateful Function Executor
| Endpoint | Description |
|---|---|
| Health endpoint | 200 when the service is available. |
| Liveness endpoint | 200 |
| Readiness endpoint | 200 / 503 |
| Dependency | Use cases | Impact |
|---|---|---|
| RABBITMQ | Lost connection to RabbitMQ. Reasons: | |
| INFLUXDB | Lost connection to InfluxDB. Reasons: | |
Rule Value Aggregator
| Endpoint | Description |
|---|---|
| Health endpoint | 200 when the service is available. |
| Liveness endpoint | 503 / 200 |
| Readiness endpoint | 503 while the connection to Kafka is lost and Kafka Streams is not in the RUNNING state. 200 while the service is connected to Kafka and Kafka Streams is in the RUNNING state. |

| Dependency | Use cases | Impact |
|---|---|---|
| Kafka | Lost connection to Kafka. Reasons: | |
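Because this service (like the other Kafka Streams-based services below) reports 503 until its streams topology reaches the RUNNING state, rollout or monitoring tooling may have to wait for readiness. Below is a minimal sketch of such a wait loop; the in-cluster URL, path, and port are hypothetical, as the document does not specify them.

```python
import time
import requests

def wait_until_ready(url: str, timeout_s: float = 300.0, interval_s: float = 5.0) -> bool:
    """Poll a readiness endpoint until it returns 200 or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # 503 here means Kafka is unreachable or Kafka Streams
            # has not (yet) reached the RUNNING state.
            if requests.get(url, timeout=5).status_code == 200:
                return True
        except requests.ConnectionError:
            pass  # service not answering yet; keep polling
        time.sleep(interval_s)
    return False

# Hypothetical in-cluster URL; the real host, path, and port are
# deployment-specific and not given in this document.
if wait_until_ready("http://value-aggregator:8080/health/readiness"):
    print("Kafka Streams is RUNNING; service is ready.")
else:
    print("Service did not become ready in time.")
```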
Rule Value Provider
| Endpoint | Description |
|---|---|
| Health endpoint | 200 when the service is available. |
| Liveness endpoint | 503 / 200 |
| Readiness endpoint | 503 while the connection to Kafka is lost and Kafka Streams is not in the RUNNING state. 200 while the service is connected to Kafka and Kafka Streams is in the RUNNING state. |

| Dependency | Use cases | Impact |
|---|---|---|
| Kafka | Lost connection to Kafka. Reasons: | |
Rule Result Aggregator
| Endpoint | Description |
|---|---|
| Health endpoint | 200 when the service is available. |
| Liveness endpoint | 503 / 200 |
| Readiness endpoint | 503 while the connection to Kafka is lost and Kafka Streams is not in the RUNNING state. 200 while the service is connected to Kafka and Kafka Streams is in the RUNNING state. |

| Dependency | Use cases | Impact |
|---|---|---|
| Kafka | Lost connection to Kafka. Reasons: | |
Rule Function Executor
| Endpoint | Description |
|---|---|
| Health endpoint | 200 when the service is available. |
| Liveness endpoint | 503 / 200 |
| Readiness endpoint | 503 while the connection to Kafka is lost and Kafka Streams is not in the RUNNING state. 200 while the service is connected to Kafka and Kafka Streams is in the RUNNING state. |

| Dependency | Use cases | Impact |
|---|---|---|
| Kafka | Lost connection to Kafka. Reasons: | |