3.3.8 Automated & manual backup and manual restore

Critical information on how to regularly back up and restore your citizen registration data in case of a server problem.

Automated backup process to an external server

If you configured a separate backup server and used the optional external backup server SSH properties in the Ansible script in step 3.3.2 Install dependencies, then OpenCRVS automatically backs up all databases every day to the following directories on the manager node.

Every 7 days the Mongo backup data is deleted to save disk space. The Ansible playbook sets this up as an option in the same step. The backup files are copied to that server in the middle of the night.

TO AVOID CITIZEN DATA LOSS, you must configure and test this automated backup process in production. Operationally, we highly recommend that once a week these files are saved to a password-protected and encrypted external hard drive and stored in a secure and approved location.
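
As a rough illustration only (the schedule, log path, and arguments shown here are assumptions, not the playbook's actual entries), the nightly job set up by the Ansible playbook is conceptually equivalent to a root crontab entry such as:

# Illustrative sketch only: the real entry is generated by the Ansible playbook
0 0 * * * bash /opt/opencrvs/infrastructure/emergency-backup-metadata.sh --ssh_user=<SSH_USER> --ssh_host=<SSH_HOST> --ssh_port=22 --production_ip=<PRODUCTION_IP> --remote_dir=/root/opencrvs-backups --replicas=1 >> /var/log/opencrvs-backup.log 2>&1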

Backup files

Mongo backups are saved as gzipped archives (.gz), named using the date of the backup, here:

/data/backups/mongo/hearth-dev-<date>.gz
/data/backups/mongo/openhim-dev-<date>.gz
/data/backups/mongo/user-mgnt-<date>.gz
/data/backups/mongo/application-config-<date>.gz
/data/backups/mongo/metrics-<date>.gz 
/data/backups/mongo/webhooks-<date>.gz // Only exists if a webhook integration is configured

Elasticsearch snapshot files and indices are saved here.

The entire elasticsearch folder contains all snapshots and must be preserved indefinitely.

/data/backups/elasticsearch

InfluxDB backup files are saved into a date named directory here:

/data/backups/influxdb/<date>

Commencing with OpenCRVS v1.2, Minio attachments are saved as a gzipped tar archive (.tar.gz), named using the date of the backup, here:

/data/backups/minio/ocrvs-<date>.tar.gz
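
To confirm that the nightly job is producing files, you can list the most recent backups on the manager node, for example:

ls -lh /data/backups/mongo
ls -lh /data/backups/elasticsearch
ls -lh /data/backups/influxdb
ls -lh /data/backups/minio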

Manual backup process

You can manually run the automated backup script at any point.

1. SSH into your server and navigate to the following directory:

cd /opt/opencrvs/infrastructure/

2. Ensure that your database secrets are available to the script as environment variables. You can do this by running:

export ELASTICSEARCH_ADMIN_USER=elastic
export ELASTICSEARCH_ADMIN_PASSWORD=<your elastic password>
export MONGODB_ADMIN_USER=<your mongo username>
export MONGODB_ADMIN_PASSWORD=<your mongo password>
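
Before running the script, you can confirm that the variables are set in your current shell (note that this prints the passwords to your terminal), for example:

env | grep -E 'ELASTICSEARCH_ADMIN|MONGODB_ADMIN'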

3. Run the backup script as shown below; the parameters are explained beneath the command. Even if you have not set up an external server that the manager node can SSH into, the files will still be backed up to the directory locations above; only the SSH copy will fail. Once the script finishes, you can use rsync to copy the files off your server and into a local directory manually.

If you are migrating between versions of OpenCRVS, you could use the SSH details of your new server running the newer version of OpenCRVS.

Replace the <parameters> with your own values, separating each with a space when calling the script. A filled-in example follows the parameter descriptions.

bash ./emergency-backup-metadata.sh --ssh_user=<SSH_USER> --ssh_host=<SSH_HOST> --ssh_port=<SSH_PORT> --production_ip=<PRODUCTION_IP> --remote_dir=<REMOTE_DIR> --replicas=<REPLICAS> --label=<LABEL>

SSH_USER (Mandatory): This is the SSH user for your backup server. If running manually you can enter anything, such as: root

SSH_HOST (Mandatory): This is the IP address for your backup server. If running manually you can enter anything, such as: 132.42.41.15

SSH_PORT (Mandatory): This is the SSH port for your backup server. If running manually you can enter anything, such as: 22

PRODUCTION_IP (Mandatory): This is the public IP address of the manager server node for your OpenCRVS cluster. It must be entered correctly.

REMOTE_DIR (Mandatory): This is the directory path on the backup server that you wish to copy the backup files into, e.g. /root/opencrvs-backups. If running manually you can enter anything.

REPLICAS (Mandatory): The number of servers in your OpenCRVS cluster. This value can only be 1, 3 or 5.

LABEL (Optional): Normally, today's date is used as the suffix that names the backup files in the directories explained above. If you want to name the backup files differently, you can enter a lowercase string here to be used as the replacement file name suffix.

If using a custom LABEL, it must be lowercase, otherwise Elasticsearch cannot make a snapshot.
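
A filled-in example, using hypothetical values that you should replace with your own:

bash ./emergency-backup-metadata.sh --ssh_user=root --ssh_host=132.42.41.15 --ssh_port=22 --production_ip=203.0.113.10 --remote_dir=/root/opencrvs-backups --replicas=1 --label=pre-upgrade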

4. When complete, exit your server:

exit

5. This command copies the backup files from your server to your current local directory using rsync:

rsync -av --ignore-existing --progress <ssh-user>@<opencrvs-manager-server-ip>:/data/backups .
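
For example, with a hypothetical SSH user and manager IP, this downloads everything under /data/backups into a backups folder inside your current working directory:

rsync -av --ignore-existing --progress root@203.0.113.10:/data/backups .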

Manual restore process

Before you attempt a data restore, please ensure that you have read and understood these two warnings.

Warning 1: Ensure you are restoring to a new, CLEAN, OpenCRVS installation that has NOT BEEN SEEDED.

You should only restore to a clean installation of OpenCRVS, TO AVOID CITIZEN DATA LOSS and potential issues that may be very difficult to debug. THESE COMMANDS ENTIRELY REPLACE ALL DATA AND BACKUPS ON YOUR OPENCRVS SERVER! THEY CANNOT BE REVERTED!

Warning 2: Prepare operationally for a restore. Test a restore on a test server first. Schedule the production restore when staff can cease operations.

If you are restoring a backup from a previous version of OpenCRVS, we have some large data migrations to run. These migrations may take several hours to complete. In this case, we recommend performing a data restore on an entirely new set of servers. When the restore and subsequent migrations are complete, you can then change your DNS settings to point to your new server with confidence. CEASE CIVIL REGISTRATION ACTIVITIES DURING A RESTORE. CONSIDER PERFORMING A RESTORE DURING NATIONAL HOLIDAYS TO AVOID RISK OF DATA LOSS.

1. To perform a restore, ensure that the backup files for the date you wish to restore from are present in the folders described above. If this is a new environment, you need to copy the backed-up files and folders into those locations using scp or rsync, as shown below (a combined example follows the individual commands).

Copy an Elasticsearch backup onto the server like this:

rsync -av --delete --progress <local-path-to-dir>/backups/elasticsearch <your-ssh-user>@<opencrvs-manager-server-ip>:/data/backups

Copy a Mongo backup onto the server like this:

rsync -av --delete --progress <local-path-to-dir>/backups/mongo <your-ssh-user>@<opencrvs-manager-server-ip>:/data/backups

Copy an InfluxDB backup onto the server like this:

rsync -av --delete --progress <local-path-to-dir>/backups/influxdb <your-ssh-user>@<opencrvs-manager-server-ip>:/data/backups

Copy a Minio backup onto the server like this:

rsync -av --delete --progress <local-path-to-dir>/backups/minio <your-ssh-user>@<opencrvs-manager-server-ip>:/data/backups

Copy a VS Export backup onto the server like this:

rsync -av --delete --progress <local-path-to-dir>/backups/vsexport <your-ssh-user>@<opencrvs-manager-server-ip>:/data/vsexport

As of OpenCRVS v1.3.0, copy a Metabase backup onto the server like this:

rsync -av --delete --progress <local-path-to-dir>/backups/metabase <your-ssh-user>@<opencrvs-manager-server-ip>:/data/metabase
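
As a convenience, the four backup types that live under /data/backups can also be copied in a single rsync call (the vsexport and Metabase backups still need to be copied separately to /data as shown above), for example:

rsync -av --delete --progress <local-path-to-dir>/backups/elasticsearch <local-path-to-dir>/backups/mongo <local-path-to-dir>/backups/influxdb <local-path-to-dir>/backups/minio <your-ssh-user>@<opencrvs-manager-server-ip>:/data/backups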

2. SSH into your server and navigate to the following directory:

cd /opt/opencrvs/infrastructure/

3. Restart Elasticsearch to pick up the new snapshots available to it:

docker service update --force --update-parallelism 1 --update-delay 30s opencrvs_elasticsearch
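
Before continuing, you can check that the Elasticsearch service has restarted and converged, for example:

docker service ps opencrvs_elasticsearch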

4. Run the restore script as shown below; the parameters are explained beneath the command. Replace the <parameters> with your own values, separating each with a space when calling the script. A filled-in example follows the parameter descriptions.

./emergency-restore-metadata.sh --label=<LABEL> --replicas=<REPLICAS>

LABEL (Mandatory): Normally, a date is used as the suffix that names the backup files, e.g. 01-01-2023, but if you named the suffix in a custom way using the LABEL parameter when creating the backup, enter that lowercase file suffix here.

REPLICAS (Mandatory): The number of servers in your OpenCRVS cluster. This value can only be 1, 3 or 5.

The script will prompt you to answer "Yes" or "No" to continue as this is a destructive and irreversible action.
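
A filled-in example, restoring the backups labelled with the date 01-01-2023 to a single-node cluster (values are illustrative):

./emergency-restore-metadata.sh --label=01-01-2023 --replicas=1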

5. You next need to take down and re-deploy OpenCRVS. This is because every time an OpenCRVS deployment is made, the database secrets that microservices use are rotated, but as you have just restored and migrated from a backup, those secrets are now out of date. You must take OpenCRVS down. SSH into the server and run:

docker stack down opencrvs
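
You can verify that the stack has been removed before redeploying, for example (the opencrvs stack should no longer appear in the output):

docker stack ls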

6. Exit the server:

exit

7. Now, re-deploy following the deployment instructions, with the --clear_data property set to "no" so that the restored databases are not cleared.

8. Once data has been restored to your server, migrations may run on your data depending on how old the backup is. You can check the progress of your migrations in Kibana.

If you are restoring from a backup that was created using the currently deployed version of OpenCRVS, migrations will not need to run; you can skip the rest of this step and proceed to step 9. Otherwise, if you are restoring from a backup that was created using a previous version of OpenCRVS, migrations will need to run. In that case, read the next warning and follow the Kibana monitoring instructions below closely.

We have an example production server, in which we have created 50,000 test birth and death registrations. Migrating the data on that server between a backup made using v1.1.* and v1.2.* took 5 hours to complete. DO NOT ATTEMPT TO USE OPENCRVS WHILE MIGRATIONS ARE RUNNING

To log in to Kibana, visit https://kibana.<your_domain> and log in with the username "elastic" and the ELASTICSEARCH_SUPERUSER_PASSWORD you configured earlier during installation.

In the left navigation, select "Observability", then "Logs"

In the search bar, enter this text: "tag: migration"

Click the "Stream live" button for live output from the migration microservice. You will see output like this, potentially for many hours!!! NOTE: After a few minutes the "Stream live" feature stops working to save resources. To get latest results, try refreshing the page in this scenario.

When the migrations are completed, you will see the word "Done".

9. You can log in to OpenCRVS and test that all your data has been restored successfully.

If, for whatever reason, migrations do not run due to an error, you will see the word "PENDING" next to each migration. Please contact us if this occurs, as there may be a bug requiring a hotfix. Once a hotfix has been applied, you can re-run migrations at any time by running this command on the server:

docker service update --force --update-parallelism 1 --update-delay 30s opencrvs_migration
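
If you prefer to watch migration output directly on the server rather than in Kibana (depending on your logging configuration, output may only be visible in Kibana), you can also tail the service logs, for example:

docker service logs --follow opencrvs_migration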
