Failure handling

When an error occurs, work through the Quick Diagnosis steps first. The sub-chapter Failure handling guidelines then covers specific cases in more detail. If the issue remains unresolved, contact us as described in the sub-chapter How to reach BCI for unresolved issues.

Quick Diagnosis

Identify the failure
- Since Nexeed IAS is a modular application, the first step in handling a failure is identifying which module is causing the issue.
- Check whether the failure affects multiple modules or only the one you started investigating because of an alert or end-user feedback.
Look for the specific pod causing the failure
- Check the state of the pods in Kubernetes to identify the pod causing the issue. If any pods are not running or are in an error state, describe them and check the output for errors or warnings that may indicate a problem.
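The pod check above can be sketched in Python as follows. This is a minimal illustration, not part of the Nexeed IAS tooling: the function names find_unhealthy_pods and load_pods are hypothetical, and the sketch assumes that kubectl is installed and configured for the cluster. It reads the output of `kubectl get pods -o json` and lists pods that are not in the Running or Succeeded phase, or whose containers are waiting or not ready.

```python
import json
import subprocess


def find_unhealthy_pods(pod_list):
    """Return (namespace, name, reason) tuples for pods that look unhealthy.

    pod_list is the parsed JSON output of `kubectl get pods -o json`.
    """
    unhealthy = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        status = pod.get("status", {})
        phase = status.get("phase", "Unknown")
        reason = None
        # Any phase other than Running or Succeeded is worth a closer look.
        if phase not in ("Running", "Succeeded"):
            reason = phase
        for cs in status.get("containerStatuses", []):
            waiting = cs.get("state", {}).get("waiting")
            if waiting:
                # For example CrashLoopBackOff or ImagePullBackOff.
                reason = waiting.get("reason", "Waiting")
            elif phase == "Running" and not cs.get("ready", False):
                reason = reason or "NotReady"
        if reason:
            unhealthy.append(
                (meta.get("namespace", ""), meta.get("name", ""), reason)
            )
    return unhealthy


def load_pods(namespace):
    """Fetch the pod list with kubectl (needs kubectl and cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling `find_unhealthy_pods(load_pods("my-namespace"))`, where my-namespace is a placeholder for the namespace of the affected module, returns the pods to inspect next with `kubectl describe pod <name> -n <namespace>`.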
Check for known errors
- Check the Nexeed IAS documentation and support resources to see whether the issue is a known error with an available solution or workaround.
Check infrastructure services
- If the issue is not tied to a specific module, check the infrastructure services, such as the gateway, the databases, or the messaging queue, for problems that could affect the entire system.
Check the health of the Kubernetes cluster
- If you have followed all the above steps and the issue persists, check the health of the Kubernetes cluster, the nodes, and the control plane to see if any problems could affect the entire system.
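As a rough illustration of the node part of this cluster health check, the following Python sketch reads the output of `kubectl get nodes -o json` and reports nodes that are not Ready or that report resource pressure. The function names find_problem_nodes and load_nodes are hypothetical, and the sketch assumes that kubectl is installed and configured for the cluster. It does not cover the control plane, which has to be checked separately.

```python
import json
import subprocess

# Standard node condition types that signal resource pressure when "True".
PRESSURE_CONDITIONS = ("MemoryPressure", "DiskPressure", "PIDPressure")


def find_problem_nodes(node_list):
    """Return (node_name, problem) tuples for nodes that need attention.

    node_list is the parsed JSON output of `kubectl get nodes -o json`.
    """
    problems = []
    for node in node_list.get("items", []):
        name = node.get("metadata", {}).get("name", "")
        conditions = {
            cond.get("type"): cond.get("status")
            for cond in node.get("status", {}).get("conditions", [])
        }
        # A healthy node reports the Ready condition with status "True".
        if conditions.get("Ready") != "True":
            problems.append((name, "NotReady"))
        for pressure in PRESSURE_CONDITIONS:
            if conditions.get(pressure) == "True":
                problems.append((name, pressure))
    return problems


def load_nodes():
    """Fetch the node list with kubectl (needs kubectl and cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling `find_problem_nodes(load_nodes())` returns an empty list when every node is Ready and reports no pressure. Any node it does return can be inspected further with `kubectl describe node <name>`.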
Failure handling guidelines
Ansible operator troubleshooting
How to reach BCI for unresolved issues