CM Influx database migration
In the course of splitting the Condition Process Monitoring (CPM) module into Condition Monitoring (CM) and Process Quality (PQ), the existing CPM data in the Influx database needs to be migrated to a new Influx database for each respective module.
This guide explains the CPM to CM data migration. For the CPM to PQM data migration, please check the PQM migration docs.
To migrate the CM Influx data, use the guides Migrating Condition Monitoring timeseries data except raw measurements and Migrating raw measurements only below.
For more details on how to use the Influx data migration tool, see Appendix: Influx migration tool.
Prerequisites
- Prepare the new Influx Databases
- Clarify with the business which data needs to be migrated:
  - Should all data be migrated or just the data after a specific date?
  - Should the original data be deleted after the migration?
- Inspect the retention policies in the existing Influx DB and their disk sizes (see the inspection sketch after this list):
  - What is a suitable chunk size for migrating the data so that we don't run out of memory?
- Derive the fitting parameters for the migration tool (see below for details) and create a migration plan according to your previous findings
- Install Java JDK 17 on the machine on which you want to execute the migration tool
- Add the needed certificates to the Java KeyStore (optional if the 'disable-ssl' option is enabled; see the keytool sketch after this list)
- The JAR file can be found in Artifactory or on the network share (use the newest released version):
  \\rb-repobci.de.bosch.com\InternalShare\IAS\IAS2023.02\MigrationCPMToCM\CM\influx
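A minimal sketch for inspecting the retention policies and their on-disk sizes on an InfluxDB 1.x instance; the host and credentials are taken from the examples in this guide, and the data directory is the InfluxDB default, so adapt them to your environment:

influx -host si0vmc3101.de.bosch.com -port 8086 -ssl -username cpm -password '<password>' \
-execute 'SHOW RETENTION POLICIES ON cpm'

# approximate disk usage per retention policy (run on the Influx host, default data directory assumed)
du -sh /var/lib/influxdb/data/cpm/*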
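If SSL verification stays enabled, the self-signed certificates of the source and target Influx instances have to be imported into the Java TrustStore of the JDK 17 installation. A minimal sketch, assuming the certificate was exported to a local file source-influx.crt (file name and alias are placeholders):

# import the source certificate into the JDK's default cacerts store (default store password: changeit)
keytool -importcert -trustcacerts \
-alias influx-source \
-file source-influx.crt \
-keystore "$JAVA_HOME/lib/security/cacerts" \
-storepass changeit

Repeat the import with a different alias for the target instance's certificate if it is also self-signed.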
Note: Make sure that there is a valid and stable connection between the Source, the Target and the host where the Influx Migration Application is running.
The migration mainly depends on 3 variables: the data size per Influx Point that is to be migrated (an Event is small), the query-chunk-size and the quality of the HTTP connection between the Source, Target and host where the Influx Migration application is running.
Therefore, there is no golden value for setting the query-chunk-size; it all depends on how much data there is to migrate and how fast and reliable the HTTP connection is.
Warning: K9s's port forwarding feature proved to be very unreliable during our tests.
If you rely on this feature for the migration, there is a high chance that you will encounter problems.
However, there is a workaround: set ok-http-logging-interceptor-level to BODY.
This results in a lot of output, so grep for errors to make sure the migration ran successfully (see the sketch after this note).
We advise finding a better solution and do NOT rely on this workaround.
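A sketch of how the verbose output produced by this workaround could be captured and scanned for errors; the log file name is just an example and the exact error wording depends on the tool's output:

java -jar influx-migration-<release_version>.jar <options as in the examples below> 2>&1 | tee migration.log
# look for errors or stack traces in the captured output
grep -iE 'error|exception' migration.log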
Migrating Condition Monitoring timeseries data except raw measurements
Events and measurements have a small data size per Influx point.
Thus, a larger query-chunk-size can be used to speed up the migration process, but there are multiple source-retentions that need to be migrated.
It has to be decided whether the measurement raw data has to be migrated, as this can be a lot of data and the migration can take some time. That is why we propose migrating the measurement raw data in a separate step (see Migrating raw measurements only).
If the raw data should not be migrated in this step, then rp_msm_raw should be added to the source-retention-exclusions (as shown in the example).
If the migration is done within a trusted network, the --disable-ssl option can be used in order to trust the certificates of the Source and Target (see the variant after the example below). This means that the self-signed certificates do not need to be added to the Java TrustStore.
During our test phase, we used the following command to migrate Condition Monitoring timeseries data except raw measurements (rp_event, rp_msm_deferred, rp_msm_level1, rp_msm_level2, rp_msm_level3, rp_ppmp_meta_data):
Exclusions: rp_process and rp_msm_raw
java -jar influx-migration-<release_version>.jar \
--tenant-id=7311ea8c-5d48-43fe-acf9-980eedf24b6c \
--query-chunk-size=40000 \
--source-url=https://si0vmc3101.de.bosch.com:8086 \
--source-db=cpm \
--source-username=cpm \
--source-password=aklnnNBUIf-efekla \
--target-url=http://localhost:8086 \
--target-db=cpm \
--target-username=admin \
--target-password=admin_password \
--skip-continuous-queries=true \
--source-retention-exclusions=autogen,rp_process,rp_msm_raw
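If the migration runs inside a trusted network as described above, the SSL certificate checks can be switched off by appending the disable-ssl option to the same call (sketch; all other options stay as in the example above):

java -jar influx-migration-<release_version>.jar \
--disable-ssl=true \
<other options as in the example above>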
Migrating raw measurements only
Raw measurements have a small data size per Influx point.
Thus, a larger query-chunk-size can be used to speed up the migration process.
If the migration is done within a trusted network, the --disable-ssl option can be used in order to trust the certificates of the Source and Target. This implies that there will be no need to add the self-signed certificates to the Java TrustStore.
During our test phase, we used the following command to migrate raw measurements only (rp_msm_raw):
Exclusions: rp_event, rp_msm_deferred, rp_msm_level1, rp_msm_level2, rp_msm_level3, rp_ppmp_meta_data and rp_process
java -jar influx-migration-<release_version>.jar \
--tenant-id=7311ea8c-5d48-43fe-acf9-980eedf24b6c \
--query-chunk-size=40000 \
--source-url=https://si0vmc3101.de.bosch.com:8086 \
--source-db=cpm \
--source-username=cpm \
--source-password=aklnnNBUIf-efekla \
--target-url=http://localhost:8086 \
--target-db=cpm \
--target-username=admin \
--target-password=admin_password \
--skip-continuous-queries=true \
--source-retention-exclusions=autogen,rp_event,rp_msm_deferred,rp_msm_level1,rp_msm_level2,rp_msm_level3,rp_ppmp_meta_data,rp_process
General notes
- This tool streams the Influx points into memory and then writes them to the target DB. So make sure that you have a good network connection between the source host, the host where you execute the tool and the target host.
- This tool does NOT create, delete, or modify rights to a specific database in any way.
- As continuous queries do not belong to a retention policy, they will either all be copied or none (use --skip-continuous-queries).
- This tool does NOT delete any Continuous Queries whatsoever. The reason for this is that not all data is guaranteed to be moved to the new database, and then the Continuous Query might still be relevant to the remaining data in the original database.
- One of the stretch goals for this tool was to be able to continue the migration at a later stage (to continue where it was previously stopped). Although this was not specifically implemented, the tool will not duplicate any Data, Measurements, Retention Policies, Continuous Queries or anything else when it is run again with the same parameters.
- BE CAREFUL when choosing to add the --delete-source flag. The results are permanent and immediate. (It has been set to default=false to prevent this from happening accidentally.)
- ALWAYS check and double-check the results, especially Retention Policies and Continuous Queries that were created (see the verification sketch after this list).
- The tool will output the number of points copied per chunk to give a sense of progress. You can use a count query to know how many points to expect before starting the migration tool (see the count query sketch after this list).
- If the tool doesn't output progress for some time (> 30 s) and seems stuck, hit CTRL-C and decrease the chunk size with --query-chunk-size.
- Migrate raw measurements and other Condition Monitoring timeseries data in separate steps, see Migrating raw measurements only and Migrating Condition Monitoring timeseries data except raw measurements.
- Figure out which retention policies you do not want to migrate and use --source-retention-exclusions to exclude them.
- If the tool crashes at some point, you can run it again - existing points will just be overridden. If you know which data is missing, you can tweak the run with the --from-timestamp and --to-timestamp options.
- There might be some points that run out of retention when writing them into the database!
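A sketch of such a count query using the InfluxQL CLI; it counts the stored field values per measurement in one retention policy (here rp_event of the source database cpm), which gives a rough idea of the number of points to expect. Host and credentials are taken from the examples above and have to be adapted:

influx -host si0vmc3101.de.bosch.com -port 8086 -ssl -username cpm -password '<password>' \
-execute 'SELECT COUNT(*) FROM "cpm"."rp_event"./.*/'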
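To double-check the retention policies and continuous queries created on the target, the standard InfluxQL statements can be used; the connection details below are placeholders:

influx -host <target-host> -port 8086 -username admin -password '<password>' \
-execute 'SHOW RETENTION POLICIES ON <target-db>'

influx -host <target-host> -port 8086 -username admin -password '<password>' \
-execute 'SHOW CONTINUOUS QUERIES'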
Using the migration tool
The provided migration tool can be run via the command line.
Prerequisite: Java 17 installed.
The following parameters need to be provided and should be specified before running the tool:
Source Database Configuration
| Parameter | Required | Description | Example |
|---|---|---|---|
| source-db | yes | Name of the migration source database | cpm |
| source-url | yes | URL to the migration source Influx DB instance | https://rngvmc0129.de.bosch.com:8086 |
| source-username | yes | Username to read data from the source database. Recommendation: use a user with admin rights. | admin |
| source-password | yes | Password for the respective username to read data from the source database | influx-db-admin-secret! |
Target Database Configuration
| Parameter | Required | Description | Example |
|---|---|---|---|
| target-db | yes | Name of the migration target database | cm |
| target-url | yes | URL to the migration target Influx DB instance | https://rngvmc0129.de.bosch.com:8086 |
| target-username | yes | Username to write data to the target database. Recommendation: use a user with admin rights. | admin |
| target-password | yes | Password for the respective username to write data to the target database | influx-db-admin-secret! |
Migration Related Parameters
| Parameter | Required | Description | Default | Example |
|---|---|---|---|---|
| tenant-id | yes | Id of the tenant whose data should be migrated | - | 7311ea8c-5d48-43fe-acf9-980eedf24b6c |
| query-chunk-size | yes | The number of data points to be migrated at the same time | 10000 | 10000 |
| source-retention-exclusions | no | Comma-separated list of retention policies to be excluded from the migration. If not set or empty, all retention policies will be migrated | - | rp_process, rp_event |
| skip-continuous-queries | no | Whether the continuous queries shall be skipped | false | true |
| from-timestamp | no | Only data newer than this UTC ISO timestamp will be migrated | - | 2022-11-30T11:09:35+01:00 |
| to-timestamp | no | Only data up until this timestamp will be migrated | The UTC timestamp when the migration tool was started | 2022-11-30T12:09:35+01:00 |
| delete-source | no | Whether the source data should be deleted or not | false | false |
| disable-ssl | no | Whether to disable the SSL certificate checks | false | true |
| ok-http-logging-interceptor-level | no | Sets the logging level of the OK HTTP client. Accepted values: NONE, BASIC, HEADERS or BODY | NONE | BODY |
| ok-http-read-timeout | no | OK HTTP client read timeout in seconds | 30 | 30 |
| ok-http-write-timeout | no | OK HTTP client write timeout in seconds | 30 | 30 |
To get the help of all options:
java -jar influx-migration-<version>.jar --help
Example Run in Command Line:
java -jar influx-migration-0.0.1-SNAPSHOT.jar \
--source-url=https://rngvmc0129.de.bosch.com:8086 \
--source-db=cpm \
--source-username=admin \
--source-password=pool31-admin-influxdb-secret! \
--target-url=https://rngvmc0129.de.bosch.com:8086 \
--target-db=cpm_migration \
--target-username=admin \
--target-password=pool31-admin-influxdb-secret! \
--tenant-id=7311ea8c-5d48-43fe-acf9-980eedf24b6c \
--query-chunk-size=50 \
--source-retention-exclusions=rp_process,rp_msm_raw \
--from-timestamp=2022-11-30T11:09:35+01:00 \
--delete-source=false
With the above parameters, the tool will use the SSL protocol to establish the connections to both Source and Target, and it will create all retention policies of the source database 'cpm' except 'rp_process' and 'rp_msm_raw' on the target database 'cpm_migration'. Next, it will copy all data newer than 2022-11-30T11:09:35+01:00 from all retention policies except 'rp_process' and 'rp_msm_raw' from database 'cpm' to database 'cpm_migration' in batches of 50 data points each. It will also create Continuous Queries on 'cpm_migration' that existed on 'cpm' and are related to any but the excluded retention policies, and it will NOT delete the original source data.