Setup infrastructure
Data Center
OpenCRVS should only be provisioned on servers located in a data centre that is at least equivalent to a certified Tier 2 or Tier 3 data centre.
Implementers should refer to the “Uptime Institute” design documents for specific requirements associated with Tier 2 & 3 certification. At a high-level, the datacenter should have:
Uninterrupted power supply with independent, backup power generation
Air conditioning
24/7 security access for authorised technical staff only
Automatic server backup off-site
Failsafe internet connectivity
Security policies and procedures in place
Network administrator staff capable of configuring and maintaining a scalable VPN solution
We appreciate that connectivity is a challenge in many countries where we work. The data centre should have an absolute minimum of a 10 Mbps internet connection to the servers; otherwise, deploying to the servers will be unworkable.
Server environments
Before proceeding to discuss server specifications, it is important to understand the following server environment glossary that we will be referring to in our example countryconfig reference implementation and further sections.
production
A live environment containing citizen data, e.g. personally identifiable information (PII).
2FA codes generated for production user access
staging (pre-production / mirror)
A mirror of a live environment, used for final Quality Assurance of a production deployment containing a daily restored backup of citizen data (PII) from the previous day.
2FA codes generated for production user access
qa
A quality assurance environment for tester, trainer & developer use, supporting the Quality Assurance of releases and the training of staff.
Test 2FA codes of 6 zeros allow test user access.
backup
A low specification environment that simply stores encrypted backups from production for long term recovery.
Not applicable. OpenCRVS software does not run on this environment.
development
An environment you can use for training and development purposes only. NOT FOR PRODUCTION USE!!
Test 2FA codes of 6 zeros allow test user access.
Before proceeding to discuss network specifications, it is important to understand the following other concepts:
vpn: All servers must be protected behind a government virtual private network (VPN). Users must authenticate via the VPN to access OpenCRVS in a browser. The country should provide and operate the VPN. When using self-hosted GitHub Actions runners, place those runners inside the VPN or on the internal network so they can reach servers directly; no VPN tunnel from GitHub-hosted services is required.
Continuous provisioning & deployment via GitHub Actions: OpenCRVS provides GitHub Actions workflows for automated provisioning and deployment. A GitHub organisation is required. We recommend self-hosted runners deployed within your VPN/internal network.
bastion or jump: An optional bastion (jump) host can consolidate and control SSH access to servers behind the VPN without distributing VPN credentials. Bastions are useful for administrative SSH access, auditing and as an alternative deployment hop even when using self-hosted runners inside the VPN.
Server specifications
Refer to these minimum server specifications for the above environments. Note that the hard-disk space specifications are illustrative. Depending on the population size, number of records to migrate and number of supporting documents that are required to be captured during civil registration business processes, you may require more RAM / disk-space.
These are absolute minimum specifications.
Regardless, your system administrators must be capable of monitoring and increasing server disk-space on demand.
Minimum server specifications
development (learning / proof-of-concept) / qa
16 GB RAM · 4 vCPU · 320 GB disk · Ubuntu 24.04 LTS x64 (headless)
production / staging
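16 GB RAM · 8 vCPU · disk space calculated using the formula in the "Production / staging / backup disk space requirements" section · Ubuntu 24.04 LTS x64 (headless)
backup
1 GB RAM · 2 vCPU · disk space calculated using the formula in the "Production / staging / backup disk space requirements" section (recommend 2× application server size) · Ubuntu 24.04 LTS x64 (headless)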
Notes:
Disk-space values are minimums and illustrative; adjust based on population, attachments and retention needs.
Ensure administrators can monitor and expand disk capacity on demand.
Production clusters should follow recommendations in the "Server clusters by project" section for HA and scalability.
Ubuntu version
First, log in as root or, if you only have sudoer access, run sudo -i.
If you are not using the correct version of Ubuntu, either recreate the server or upgrade Ubuntu.
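As a minimal sketch of the version check, assuming the server exposes the standard /etc/os-release file, the following Python helper reports whether a host is running the Ubuntu 24.04 release listed in the minimum specifications. The function names are hypothetical and are not part of OpenCRVS.

```python
from pathlib import Path


def parse_os_release(text):
    """Parse KEY=value lines from the contents of /etc/os-release."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in single or double quotes.
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def is_required_ubuntu(text, required_version="24.04"):
    """Return True when the os-release text describes the required Ubuntu version."""
    fields = parse_os_release(text)
    return fields.get("ID") == "ubuntu" and fields.get("VERSION_ID") == required_version


if __name__ == "__main__":
    os_release = Path("/etc/os-release")
    if os_release.exists():
        print("Required Ubuntu version:", is_required_ubuntu(os_release.read_text()))
```

Run on the server itself, a result of False suggests the server should be recreated or upgraded before provisioning.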
Production / staging / backup disk space requirements
Calculated disk space doesn't include space for monitoring and logs. Please check "Monitoring disk space requirements" for more details.
Required disk space for production, staging and backup environments is calculated using the expected number of records per year and the estimated average number of attachments. The number of participating locations should be taken into account.
Please use the following formula:
attachments_per_year = number of births, deaths.. records per year * average number of attachments * 0.4MB
record_data_per_year = number of births, deaths.. records per year * 18.33kB
operating_system_requirements = 100GB
minimum_required_disk_space = operating_system_requirements + record_data_per_year + attachments_per_year
Using an average-sized country with a population of 30M, a crude birth rate of 17.299 births per 1000 people and a crude death rate of 7.7 deaths per 1000 people, we can estimate the number of records submitted every year:
Births per year: 518 970
Deaths per year: 231 000
Number of records per year: 749 970
Choosing an average of 3 attachments per record, we can calculate the total space needed per year:
Attachments: 749 970 * 3 * 0.4 MB = 899 964 MB, or about 899.96 GB
Record data: 749 970 * 18.33 kB = 13 746 950 kB, or about 13.75 GB
Combining that with the 100 GB reserved for the operating system, the minimum required disk space for application servers in this example is about 1013.71 GB (100 + 899.96 + 13.75).
For backup servers, we recommend twice the storage of the application servers, so in this case about 2027.42 GB.
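The calculation can be sketched in Python as follows. This is an illustrative helper, not part of OpenCRVS: the function and constant names are hypothetical, and it assumes decimal units (1 GB = 1000 MB = 1 000 000 kB). Without intermediate rounding, the formula gives roughly 1013.71 GB for the example country and twice that for the backup server.

```python
OPERATING_SYSTEM_REQUIREMENTS_GB = 100.0  # fixed allowance from the formula
ATTACHMENT_SIZE_MB = 0.4                  # average size of one attachment
RECORD_SIZE_KB = 18.33                    # average size of one record


def records_per_year(population, births_per_1000, deaths_per_1000):
    """Estimate yearly birth and death records from crude rates."""
    births = round(population * births_per_1000 / 1000)
    deaths = round(population * deaths_per_1000 / 1000)
    return births + deaths


def minimum_required_disk_space_gb(records, average_attachments):
    """Apply the formula using decimal units (assumption)."""
    attachments_gb = records * average_attachments * ATTACHMENT_SIZE_MB / 1000
    record_data_gb = records * RECORD_SIZE_KB / 1_000_000
    return OPERATING_SYSTEM_REQUIREMENTS_GB + record_data_gb + attachments_gb


records = records_per_year(30_000_000, 17.299, 7.7)
application_gb = minimum_required_disk_space_gb(records, 3)
backup_gb = 2 * application_gb  # backup servers: twice the application server size

print(f"records per year: {records}")
print(f"application server: {application_gb:.2f} GB")
print(f"backup server: {backup_gb:.2f} GB")
```

Substituting your own population, crude rates and expected number of attachments gives a first estimate. Monitoring and logging space still needs to be added on top.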
Work is ongoing in OpenCRVS to optimise storage in future versions.
Monitoring disk space requirements
In the default configuration, monitoring data is stored for 30 days. The size of monitoring data depends on the filebeat configuration (scrape frequency, collected metrics, labels, tags). OpenCRVS uses a custom filebeat configuration file, optimised to store only valuable data. The average disk usage for monitoring data is 200 MB per host per day. In general, the value can be calculated with the formula:
additional_disk_space = 200MB * days * hosts + 1GB * hosts
200MB: disk size per day
days: number of days to store logs
hosts: number of hosts to store logs
1GB: minimal extra space for each VM
For a single VM, at least 7 GB of additional disk space is needed to store monitoring data for 30 days: 200MB * 30 * 1 + 1GB * 1 = 7GB
For a Kubernetes cluster with 2 VMs, at least 14 GB of additional disk space will be needed: 200MB * 30 * 2 + 1GB * 2 = 14GB
If the scrape frequency, collected metrics, labels or tags have been adjusted, make sure the disk size per day value is up to date.
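A small Python sketch of this estimate is shown below. It assumes 200 MB per host per day plus 1 GB of extra space per VM, in decimal units, which reproduces the 7 GB and 14 GB figures above. The function name and parameters are hypothetical.

```python
def monitoring_disk_space_gb(hosts, days=30, mb_per_host_per_day=200.0,
                             extra_gb_per_host=1.0):
    """Estimate extra disk space (GB) needed for monitoring data.

    Assumes decimal units (1 GB = 1000 MB) and one extra GB per VM.
    """
    data_gb = mb_per_host_per_day * days * hosts / 1000
    return data_gb + extra_gb_per_host * hosts


print(monitoring_disk_space_gb(hosts=1))  # single VM, 30 days of retention
print(monitoring_disk_space_gb(hosts=2))  # Kubernetes cluster with 2 VMs
```

If the filebeat configuration has been changed, pass the measured daily volume as mb_per_host_per_day instead of relying on the default.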
Logging disk space requirements
Logging data is stored as an Elasticsearch index and can be accessed at any time in Kibana.
Logging data is generated by a few different sources:
Elastic APM agents installed within critical OpenCRVS components
OpenCRVS application and datastores logs
Operating system logs
By default, the OpenCRVS monitoring helm chart is configured to store data for 1 week only. Logging data usage cannot be reliably estimated in advance, but we recommend keeping at least 10 GB of disk space for logs.
Disk layout requirements
By default OpenCRVS stores citizen records, monitoring and logging data in the /data folder. There are a few options available for disk partitioning:
Single disk partition: the disk partition mounted as / has sufficient space to store all data produced by OpenCRVS. At provision time, the /data folder is created by ansible scripts.
Single disk partition with encryption: same as the previous option but, with encryption enabled, OpenCRVS will create an encrypted file on disk, /cryptfs_file_sparse.img, allocate the proper file size and mount the file as the /data partition.
If your data centre is physically secure we do not recommend encryption. If your data centre is insecure and you wish to enable encryption, pay close attention to the "optional disk encryption for lower security data centres" section below.
Dedicated disk partition for data: the system administrator may decide to use a dedicated disk partition (e.g. LVM, NAS) to store citizen data.
Other layouts are possible, but are not supported by the OpenCRVS installation scripts. The OpenCRVS Dependencies helm chart allows files to be stored in other ways by using Kubernetes storage classes.
Verify the disk has been partitioned correctly
We want to ensure the partition mounted to / has enough disk space. OpenCRVS citizen data will be stored in the following location:
Here is an example of a disk layout:
This server has 280GB available after the operating system has been deployed. You should set aside a further 50-75GB for Docker images, so only 205GB - 230GB is available for data. It is important to remember this value if you plan to configure disk encryption.
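The same subtraction can be sketched in Python. The helper names are hypothetical. shutil.disk_usage is the standard library call for reading the free space of a mounted partition, and the conversion assumes decimal gigabytes.

```python
import shutil


def free_space_gb(path="/"):
    """Free space on the partition containing `path`, in decimal GB."""
    return shutil.disk_usage(path).free / 1000**3


def space_left_for_data_gb(available_gb, docker_reserve_gb):
    """Space remaining for citizen data after reserving room for Docker images."""
    return available_gb - docker_reserve_gb


# Figures from the example server: 280GB available, 50-75GB reserved for Docker.
low = space_left_for_data_gb(280, 75)
high = space_left_for_data_gb(280, 50)
print(f"Available for data: {low}GB - {high}GB")
```

On a real server, free_space_gb("/") can replace the hard-coded 280 to obtain the figure to keep in mind when sizing an encrypted disk.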
Regarding optional disk encryption for lower security data centres
Only use encryption if your data centre is equivalent to a Tier 2 or lower, where physical security may not be at its optimum. If your data centre tier is higher, and extremely secure, there should be no need to encrypt the disk.
It is optional to LUKS encrypt the /data folder so that your data is encrypted at rest. You will be asked if you wish to encrypt and how much server space to allocate to the encrypted disk.
To use 200GB, you would enter "200g" when prompted.
The secret ENCRYPTION_KEY is used on reboot to decrypt and mount this folder. To take advantage of this feature, amend the location of the key to a secure location in infrastructure/server_setup/group_vars/all.yml:
All the secrets are explained in more detail in the section 4.3.1.1 Environment secrets and variables explained.
Server clusters by project
The number of servers required in a load balanced cluster is configurable depending on the project and population size. Please take note of these recommendations.
Proof-of-concept (P.O.C.)
For a proof-of-concept (P.O.C.) of OpenCRVS, we use 1 qa server with no backup, operating under the condition that no live citizen data is captured during a P.O.C: qa x 1
Pilot
A total of 4 servers are required for pilot implementations that capture citizen data. One for each environment: qa x 1, production x 1, staging x 1 & backup x 1.
National scale
For national scale implementations, we recommend deploying to a production server cluster of 2 - 5 production servers depending on population size.
It is recommended to deploy the production environment on a cluster of at least 2 servers. This ensures high availability and prevents downtime or data loss in the event of a server failure.
Population < 30M: qa x 1, production x 2, staging x 1 & backup x 1
Population 30M - 60M: qa x 1, production x 3, staging x 1 & backup x 1
Population 60M+: qa x 1, production x 5, staging x 1 & backup x 1
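These national scale recommendations can be expressed as a simple lookup. The sketch below is illustrative only. The function name is hypothetical, and because the 30M - 60M and 60M+ bands both mention 60M, it assumes a population of exactly 60M falls into the 60M+ band.

```python
def recommended_servers(population):
    """Return the recommended server count per environment at national scale."""
    if population < 30_000_000:
        production = 2
    elif population < 60_000_000:
        production = 3
    else:
        # Assumption: exactly 60M is treated as part of the 60M+ band.
        production = 5
    return {"qa": 1, "production": production, "staging": 1, "backup": 1}


print(recommended_servers(45_000_000))
```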
Network
Refer to the following network diagram as a reference example of how to network your server cluster.


Server administrator SSH access & permissions:
During provisioning, the server administrator requires SSH access through the provided VPN to all servers with sudo permissions.
During installation of OpenCRVS, the SSH configuration on all servers will be modified to block password-based SSH authentication and root user access, and to configure 2FA authentication and alerting for all future SSH access.
Once provisioned, there should be no need for technical staff to ever SSH into a server during day-to-day operations. Every SSH access going forward is audited via a Slack notification to all technical staff thanks to these provisioned alerts.
User access
The following users will access 3 of the environments: qa, production & staging, via a VPN client:
Existing Civil Registration staff that access the OpenCRVS client using the Chrome browser on desktops/laptops/mobile devices.
3rd party approved government staff (e.g. Healthcare staff in hospitals) that access the OpenCRVS client using the Chrome browser on desktops/mobile devices.
Your development and QA team that access the OpenCRVS client using the Chrome browser on desktops/laptops/mobile devices.
Potential future automated integrations from approved healthcare services using our APIs with VPN access.
Potential future automated integrations from external government services using our APIs with VPN access.
Automated continuous deployment scripts from a private GitHub code repository.
All user workstations / tablets / smartphones and integrating APIs will require compatible VPN clients and accounts.
Egress (outbound) internet access
In addition to serving user traffic the OpenCRVS infrastructure needs to be able to communicate outbound. This egress traffic includes things like pulling in latest updates, monitoring and emails.
Check that the servers have internet connectivity. The servers must be able to access Docker Hub, Sentry and other internet services such as Ubuntu update repositories and Email & SMS APIs. As a basic check, confirm that you can ping google.com from inside the servers.
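Where ICMP ping is blocked, a TCP connection test is a common substitute. The following Python sketch is an assumption-laden alternative to ping rather than an OpenCRVS tool: it tries to open a TCP connection to each host and port and records whether it succeeded. The function name and the choice of port 443 are illustrative, and the connect parameter exists only so the check can be exercised without network access.

```python
import socket


def check_egress(targets, timeout=5.0, connect=socket.create_connection):
    """Try a TCP connection to each (host, port) pair.

    Returns a dict mapping "host:port" to True (reachable) or False.
    """
    results = {}
    for host, port in targets:
        key = f"{host}:{port}"
        try:
            connection = connect((host, port), timeout=timeout)
        except OSError:
            # Covers DNS failures, refused connections and timeouts.
            results[key] = False
        else:
            connection.close()
            results[key] = True
    return results


# Example, to be run from inside a server:
# print(check_egress([("google.com", 443)]))
```

Pass the domains that your deployment actually depends on, together with the ports they use.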
If your VPN requires a whitelist of allowed domains, the following are the known domains which the servers require access to:
Email (SMTP) server
You must have a working SMTP server and SMTP user details to deploy OpenCRVS. Staff onboarding and monitoring requires an Email service.
The following variables are required to successfully deploy OpenCRVS on a server environment:
SMTP_HOST: Hostname or IP address of your SMTP server
SMTP_PORT: Port on which the SMTP server is listening
SMTP_SECURE: Use TLS for the connection
SMTP_USERNAME: Username or email address used to authenticate as an email client on the SMTP server
SMTP_PASSWORD: Password or API token, depending on your email provider
SENDER_EMAIL_ADDRESS: All emails will be sent with this address in the sender field
ALERT_EMAIL: Email address for alerting. This field is often used to integrate with Slack, Google Chat or any other corporate communication tool.
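Before deploying, it can help to confirm that none of these variables is missing or empty. The sketch below is a hypothetical pre-flight check, not part of the OpenCRVS deployment scripts. It assumes the values are available as environment variables or in any dict-like mapping.

```python
import os

REQUIRED_SMTP_VARIABLES = (
    "SMTP_HOST",
    "SMTP_PORT",
    "SMTP_SECURE",
    "SMTP_USERNAME",
    "SMTP_PASSWORD",
    "SENDER_EMAIL_ADDRESS",
    "ALERT_EMAIL",
)


def missing_smtp_variables(env):
    """Return the names of required variables that are absent or blank in `env`."""
    return [
        name
        for name in REQUIRED_SMTP_VARIABLES
        if not str(env.get(name, "")).strip()
    ]


missing = missing_smtp_variables(os.environ)
if missing:
    print("Missing SMTP variables:", ", ".join(missing))
else:
    print("All SMTP variables are set.")
```

The check only confirms that values are present. It does not verify that the SMTP server accepts the credentials.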