What is the OpenHIM Platform and what can you use it for?
OpenHIM Platform is an easy way to set up, manage and operate a Health Information Exchange (HIE). Specifically, it is the following:
A toolbox of open-source tools, grouped into packages, that are used within an HIE.
The glue that ties these tools together. This glue often takes the form of OpenHIM mediators, which are simply microservices that talk to OpenHIM.
A CLI tool to deploy and manage these packages.
The Problem
We at Jembi want to stop rebuilding solutions from near scratch each time we need an HIE implementation. It would be beneficial to us and others doing the same work to focus more on the unique needs of a country rather than the intricacies of a production deployment of an HIE.
Operating production-grade HIE systems is hard, because of these issues:
Need to support up to national scale
An always-present need for a high level of security
Difficulty of deploying complex systems that have many components
The Solution
OpenHIM Platform provides an opinionated way to deploy, secure and scale highly-available services for an HIE environment. It provides a set of services to solve common HIE challenges:
Patient matching
FHIR support
Reporting services
OpenHIM Platform is powered by the Instant OpenHIE v2 tool.
Master Patient Index
Note: This recipe is in a pre-release alpha stage. It's usable but do so at your own risk.
This recipe sets up an HIE that deploys JeMPI behind the OpenHIM with a mapping mediator configured to allow for FHIR-based communication with JeMPI. It also deploys Keycloak for user management and authentication.
To launch this package in dev mode copy and paste this into your terminal in a new folder (ensure you have the instant CLI installed):
The OpenHIM Platform includes a number of base packages which are useful for supporting Health Information Exchange workflows. Each section below describes the details of these packages.
Packages can be stood up individually using the instant package init -n <package_name> command, or they can be included in your own recipes. To do this, create a recipe that includes the necessary packages and any custom configuration packages.
Central Data Repository with Data Warehousing
Note: This recipe is in a pre-release alpha stage. It's usable but do so at your own risk.
This recipe sets up an HIE that does the following:
Accepts FHIR bundles submitted securely through an IOL (OpenHIM)
Stores Clinical FHIR data to a FHIR store (HAPI FHIR)
Stores Patient Demographic data to an MPI (JeMPI)
Pushes FHIR resources to Kafka for the reporting pipeline (and other systems) to use
Pulls FHIR data out of Kafka and maps it to flattened tables in the Data Warehouse (Clickhouse)
Allows for the Data Warehouse data to be visualised via a BI tool (Apache Superset)
To launch this package in dev mode copy and paste this into your terminal in a new folder (ensure you have the instant CLI installed):
Services
When deployed in --dev mode the location of the UIs will be as follows:
Service
URL
Auth
Extra UIs only exposed in --dev mode:
Service
URL
Auth
Example use
Use the following example Postman collection to see the interactions you can have with the system and how the system reacts.
Getting Started
What you need to start using OpenHIM Platform.
Prerequisites
Before getting started with OpenHIM Platform you will need to have the Instant OpenHIE tool installed and functional.
Central Data Repository (no reporting)
Note: This recipe is in a pre-release alpha stage. It's usable but do so at your own risk.
This recipe sets up an HIE that does the following:
Accepts FHIR bundles submitted securely through an IOL (OpenHIM)
Stores Clinical FHIR data to a FHIR store (HAPI FHIR)
Stores Patient Demographic data to an MPI (JeMPI)
Pushes FHIR resources to Kafka for other external systems to use
To launch this package in dev mode copy and paste this into your terminal in a new folder (ensure you have the instant CLI installed):
If you're a Windows user running the platform under WSL2, you should limit the amount of RAM and CPU that WSL uses. For more details, see: Limiting memory usage in WSL2.
Quick Start
Ensure Docker Swarm is initialised:
Download the latest OpenHIM Platform config file which configures Instant OpenHIE v2 to use OpenHIM Platform packages:
Download the latest environment variable file, which sets configuration options for OpenHIM Platform packages:
Launch some OpenHIM Platform packages, e.g.
This launches the OpenHIM and Kafka packages in dev mode (which exposes service ports for development purposes) using the config supplied in the env var file.
To destroy the setup packages and delete their data run:
Next, you might want to browse the recipes available in OpenHIM Platform. Each recipe bundles a set of packages and configuration to set up an HIE for a particular purpose.
For example, this command allows the most comprehensive recipe to be deployed with one command:
Alternatively you can also browse the individual set of packages that OpenHIM Platform offers. Each package's documentation lists the environment variables used to configure them.
For more information on how to start, stop and destroy packages using the command line, see the Instant OpenHIE 2 CLI docs.
Please join us on Discord for support or to chat about new features or ideas.
Recipes
Pre-defined recipes for common use cases
OpenHIM Platform comes bundled with a set of generic packages that can be deployed and configured to support a number of different use cases. To help users of OpenHIM Platform get started with something they can make use of immediately, a number of default OpenHIM Platform recipes are provided. These help you get started with everything you need set up and configured for a particular use case.
These recipes combine and configure multiple packages together so that a functional HIE is stood up that is pre-configured to support a particular use case.
We currently support the following default recipes:
Monitoring
A package for monitoring the platform services
The monitoring package sets up services to monitor the entire deployed stack. This includes the state of the servers involved in the docker swarm, the docker containers themselves and particular applications such as Kafka. It also captures the logs from the various services.
This monitoring package uses:
Grafana: for dashboards
Interoperability Layer OpenHIM
The interoperability layer that enables simpler data exchange between the different systems. It is also the security layer for the other systems.
A FHIR-based Shared Health record linked to an MPI for linking and matching patient demographics and a default reporting pipeline to transform and visualise FHIR data.
Central Data Repository
A FHIR-based Shared Health record linked to an MPI for linking and matching patient demographics. No reporting is included, but all FHIR data is pushed to Kafka for external systems to use.
Master Patient Index
A master patient index set up using JeMPI. It also includes OpenHIM as the API gateway providing security, a mapping mediator to allow FHIR-based communication with JeMPI and Keycloak to support user management.
Checking the transactions logs
Configuring the channels to route the events
User authentication logs
Service logs
Rerun the transactions tasks
Reprocess mediator launching
OpenHIM is based on three main services: openhim-core as the backend, openhim-console as the frontend and mongo as the database.
It is a mandatory component in the stack and the entry point for all incoming requests from the external systems.
Node Exporter: for monitoring host machine metrics like CPU, memory etc
To use the monitoring services, include the monitoring package id in your list of package ids when standing up the platform.
Adding application specific metrics
The monitoring service utilises service discovery to discover new metric endpoints to scrape.
To use custom metrics for an application, first configure that application to provide a Prometheus compatible metrics endpoint. Then, let the monitoring service know about it by configuring specific docker service labels that tell the monitoring service to add a new endpoint to scrape. E.g. see lines 8-9:
prometheus-job-service lets Prometheus know to enable monitoring for this container and prometheus-address gives the endpoint that Prometheus can access the metrics on. By default this is assumed to be at the path /metrics by Prometheus.
By using the prometheus-job-service label, Prometheus will only create a single target for your application even if it is replicated via service config in docker swarm. If you would like to monitor each replica separately (i.e. if metrics are only captured for that replica and not shared to some central location in the application cluster), you can instead use the prometheus-job-task label and Prometheus will create a target for each replica.
The full list of supported labels is given below:
prometheus-job-service - indicates this service should be monitored
prometheus-job-task - indicates each task in the replicated service should be monitored separately
prometheus-address - the service address Prometheus can scrape metrics from, can only be used with prometheus-job-service
prometheus-scheme - the scheme to use when scraping a task or service (e.g. http or https), defaults to http
prometheus-metrics-path - the path to the metrics endpoint on the target (defaults to /metrics)
prometheus-port - the port of the metrics endpoint. Only usable with prometheus-job-task, defaults to all exposed ports for the container if no label is present
All services must also be on the prometheus_public network to be able to be seen by Prometheus for metrics scraping.
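The constraints between these labels can be checked mechanically before deploying. The following is a minimal Python sketch and not part of the platform: the function name check_labels and the example label values (my-app, my-app:3000) are hypothetical, and only the label names and the documented "only usable with" rules come from the list above.

```python
# Hypothetical helper: validates monitoring labels against the rules listed above.
SUPPORTED_LABELS = {
    "prometheus-job-service",
    "prometheus-job-task",
    "prometheus-address",
    "prometheus-scheme",
    "prometheus-metrics-path",
    "prometheus-port",
}


def check_labels(labels):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    # Only look at labels in the prometheus- namespace; other labels are ignored.
    prom = {k: v for k, v in labels.items() if k.startswith("prometheus-")}

    for name in sorted(set(prom) - SUPPORTED_LABELS):
        problems.append(f"unsupported label: {name}")

    per_service = "prometheus-job-service" in prom
    per_task = "prometheus-job-task" in prom
    if not (per_service or per_task):
        problems.append("no prometheus-job-service or prometheus-job-task label")
    if "prometheus-address" in prom and not per_service:
        problems.append("prometheus-address can only be used with prometheus-job-service")
    if "prometheus-port" in prom and not per_task:
        problems.append("prometheus-port is only usable with prometheus-job-task")
    return problems


# Example label sets (the values are illustrative assumptions).
service_labels = {
    "prometheus-job-service": "my-app",
    "prometheus-address": "my-app:3000",
}
task_labels = {
    "prometheus-job-task": "my-app",
    "prometheus-address": "my-app:3000",  # not allowed together with job-task
}

print(check_labels(service_labels))
print(check_labels(task_labels))
```

A check like this only covers label combinations; it does not verify that the service is attached to the prometheus_public network.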
Adding additional dashboards
To add additional dashboards simply use docker configs to add new Grafana dashboard json files into this directory in the Grafana container: /etc/grafana/provisioning/dashboards/
That directory will be scanned periodically and new dashboards will automatically be added to Grafana.
Grafana dashboard json files may be exported directly from Grafana when saving dashboards, or you may look up one of the many existing dashboards in the Grafana marketplace.
As part of the Interoperability Layer setup we also do some initial config import for connecting the services together.
OpenHIM: Import a channel configuration that routes requests to the Data Store - HAPI FHIR service
This config importer will import channels and configuration according to the file openhim-import.json in the folder <path to project packages>/interoperability-layer-openhim/importer/volume.
Kafka Mapper Consumer
A Kafka consumer that maps FHIR resources to a flattened data structure.
Variable Name | Type | Relevance | Required | Default
GF_SECURITY_ADMIN_USER | String | Username of Grafana service | No | admin
GF_SECURITY_ADMIN_PASSWORD | String | Password of Grafana service | No | dev_password_only
Job Scheduler Ofelia
A job scheduling tool.
Environment Variables
Listed in this page are all environment variables needed to run Kafka mapper consumer.
Variable Name
Type
Relevance
Required
Default
KAFKA_HOST
String
Kafka hostname
No
Data Mapper Logstash
Generic Logstash pipeline for ELK stack.
Logstash provides a data transformation pipeline for analytics data. In the platform it is responsible for transforming FHIR messages into a flattened object that can be inserted into Elasticsearch.
Input
Logstash allows for different types of input to read the data: Kafka, HTTP ports, files, etc.
Local Development
Generic Logstash pipeline for ELK stack.
Adding pipelines and configs
To add Logstash config files, you can add files into the <path to project packages>/data-mapper-logstash/pipeline.
Environment Variables
Listed in this page are all environment variables needed to run Logstash.
Variable Name
Type
Relevance
Required
Default
Local Development
A Kafka consumer that maps FHIR resources to a flattened data structure
Kafka-mapper-consumer
A Kafka processor that consumes messages from Kafka topics. These messages are mapped according to the mapping defined in the file called fhir-mapping.json.
The flattened data is then sent to the Clickhouse database to be stored.
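To illustrate the idea of flattening, here is a minimal Python sketch. It does not reproduce the real fhir-mapping.json format, which is defined by the kafka-mapper-consumer project; the mapping shape used here (column name to dotted path), the helper names and the sample Patient are all assumptions made for illustration. An in-memory dict keyed by ID stands in for the Clickhouse table.

```python
def get_path(resource, path):
    """Follow a dotted path such as 'name.0.family' through nested dicts and lists."""
    current = resource
    for part in path.split("."):
        if isinstance(current, dict):
            current = current.get(part)
        elif isinstance(current, list) and part.isdigit() and int(part) < len(current):
            current = current[int(part)]
        else:
            return None  # path does not exist in this resource
    return current


def flatten(resource, mapping):
    """Build one flat row: each column takes the value found at its path."""
    return {column: get_path(resource, path) for column, path in mapping.items()}


def upsert(table, row):
    """Insert the row, or replace the existing row that has the same id."""
    table[row["id"]] = row


# Hypothetical mapping: column name -> path inside the FHIR resource.
patient_mapping = {"id": "id", "gender": "gender", "family_name": "name.0.family"}

patient = {
    "resourceType": "Patient",
    "id": "p1",
    "gender": "female",
    "name": [{"family": "Doe", "given": ["Jane"]}],
}

table = {}
upsert(table, flatten(patient, patient_mapping))

# A later version of the same resource updates the existing row.
patient["name"][0]["family"] = "Smith"
upsert(table, flatten(patient, patient_mapping))

print(table)
```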
Local Development
A job scheduling tool.
Ofelia - Job Scheduler
The platform uses image: mcuadros/ofelia:v0.3.6 which has the following limitations:
Filters
With a set of filters and plugins, the data can be transformed, filtered, and conditioned.
This allows the creation of a structured and flattened object out of many nested and long resources.
Accessing the different fields will be much easier and we will get rid of the unused data.
Output
To save the data, Logstash provides a set of outputs such as: Elasticsearch, S3, files, etc.
Developing the Logstash configs locally
To make changes to the Logstash configs without having to repeatedly start and stop the service, set the LOGSTASH_DEV_MOUNT env var in your .env file to true. This attaches the service's config files to those on your local machine.
Cluster
When attaching Logstash to an Elasticsearch cluster, ensure you use the ES_HOSTS environment variable, e.g. ES_HOSTS="analytics-datastore-elastic-search-1:9200","analytics-datastore-elastic-search-2:9200","analytics-datastore-elastic-search-3:9200", and reference it in your Logstash configs, e.g. hosts => [$ES_HOSTS]
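Because each host in ES_HOSTS is individually quoted and the list is comma-separated without spaces, it can be easy to mistype by hand. A small Python sketch for generating the value is shown here; the function name es_hosts_value is hypothetical, and the node names are the ones from the example above.

```python
def es_hosts_value(hosts):
    """Join host:port pairs into a quoted, comma-separated string."""
    return ",".join(f'"{host}"' for host in hosts)


# Three cluster nodes, named as in the example above.
nodes = [f"analytics-datastore-elastic-search-{i}:9200" for i in range(1, 4)]

# Prints a line suitable for pasting into an env var file.
print(f"ES_HOSTS={es_hosts_value(nodes)}")
```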
Notes
With LOGSTASH_DEV_MOUNT=true, you have to set the LOGSTASH_PACKAGE_PATH variable with the absolute path to package containing your Logstash config files, i.e., LOGSTASH_PACKAGE_PATH=/home/user/Documents/Projects/platform/data-mapper-logstash.
WARNING: do not edit the pipeline files from within the Logstash container, or the group ID and user ID will change, and subsequently will result in file permission errors on your local file system.
Each topic has its own table mapping, plugin and filter and one topic may be mapped in different ways.
An example of fhir-mapping.json can be found in the package.
Each new message with a new ID is inserted as a new row in the table defined in the mapping. An update of the message results in a corresponding update in the Clickhouse database.
Link to GitHub repo: https://github.com/jembi/kafka-mapper-consumer.
kafka
KAFKA_PORT | Number | Kafka port | No | 9092
CLICKHOUSE_HOST | String | Clickhouse hostname | No | analytics-datastore-clickhouse
CLICKHOUSE_PORT | String | Clickhouse port | No | 8123
LOGSTASH_DEV_MOUNT | Boolean | DEV mount mode enabling flag | No | false
LOGSTASH_PACKAGE_PATH | String | Logstash package absolute path | Yes if LOGSTASH_DEV_MOUNT is true |
LS_JAVA_OPTS | String | JVM heap size, it should be no less than 4GB and no more than 8GB (maximum of 50-75% of total RAM) | No | -Xmx2g -Xms2g
ES_ELASTIC | String | ElasticSearch Logstash user password | Yes | dev_password_only
ES_HOSTS | String | Elasticsearch connection string | Yes | analytics-datastore-elastic-search:9200
KIBANA_SSL | Boolean | SSL protocol requirement | No | True
LOGSTASH_MEMORY_LIMIT | String | RAM usage limit | No | 3G
LOGSTASH_MEMORY_RESERVE | String | Reserved RAM | No | 500M
LOGSTASH_INSTANCES | Number | Number of service replicas | No | 1
Ofelia does not support config.ini files when run in docker mode (which enables scheduling jobs with docker labels) thus we need to always use the config.ini file for creating jobs.
Ofelia does not support attaching to a running instance of a service.
Ofelia does not support job-run (which allows you to launch a job with a specified image name) labels on non-ofelia services (ie. you may not specify a job of type job-run within the nginx package as ofelia will not pick it up)
Ofelia only initializes jobs when it stands up and does not listen for new containers with new labels to update its schedules. Ofelia therefore needs to be brought up again every time a change is made to a job that is configured in another service's labels.
Example of a job config
An example job config can be found in the file config.example.ini in the folder <path to project packages>/job-scheduler-ofelia/.
Launching this package executes the following two steps:
Running Clickhouse service
Running config importer to run the initial SQL script
Initializing ClickHouse
The config importer will be launched to run a NodeJS script after ClickHouse has started.
It will run SQL queries to initialize the tables and the schema, and can also include initial seed data if required.
The config importer looks for two files clickhouseTables.js and clickhouseConfig.js found in <path to project packages>/analytics-datastore-clickhouse/importer/config.
For a specific implementation, this folder can be overridden.
Analytics Datastore - Clickhouse
Clickhouse is a SQL datastore.
Analytics Datastore - Elasticsearch
Elasticsearch is the datastore for the Elastic (ELK) Stack.
Environment Variables
Listed in this page are all environment variables needed to run and initialize Elasticsearch.
Variable Name | Type | Relevance | Required | Default
ES_ELASTIC | String | Elasticsearch super-user password | Yes | dev_password_only
Environment Variables
Listed in this page are all environment variables needed to run Clickhouse.
Variable Name | Type | Relevance | Required | Default
CLICKHOUSE_HOST | String | The service name (host) of Clickhouse | No | analytics-datastore-clickhouse
CLICKHOUSE_PORT
Environment Variables
Listed in this page are all environment variables needed to run Kibana.
Variable Name | Type | Relevance | Required | Default
SANTEMPI_INSTANCES | Number | Number of service replicas | No | 1
SANTEMPI_MAIN_CONNECTION_STRING
Note
The value of the environment variable SANTEMPI_REPMGR_PARTNER_NODES differs between cluster and single mode.
Default value for SANTEMPI_MAIN_CONNECTION_STRING:
Default value for SANTEMPI_AUDIT_CONNECTION_STRING:
ES_KIBANA_SYSTEM | String | The password for the user Kibana uses to connect and communicate with Elasticsearch | Yes | dev_password_only
ES_LOGSTASH_SYSTEM | String | The password for the user Logstash uses to map and transform the data before storing it in Elasticsearch | Yes | dev_password_only
ES_BEATS_SYSTEM | String | The password for the user the Beats use when storing monitoring information in Elasticsearch | Yes | dev_password_only
ES_REMOTE_MONITORING_USER | String | The password for the user Metricbeat uses when collecting and storing monitoring information in Elasticsearch. It has the remote_monitoring_agent and remote_monitoring_collector built-in roles | Yes | dev_password_only
ES_APM_SYSTEM | String | The password for the user the APM server uses when storing monitoring information in Elasticsearch | Yes | dev_password_only
ES_LEADER_NODE | String | The leader service name (the service name in single mode, the leader service name in cluster mode). This is used by the config importer to select the service on which to initialize the mapping inside Elasticsearch | Yes | analytics-datastore-elastic-search
ES_HEAP_SIZE | String | The heap size is the amount of RAM allocated to the Java Virtual Machine of a node in Elasticsearch. -Xms and -Xmx should be set to the same value (50% of the total available RAM, to a maximum of 31GB) | No | -Xms2048m -Xmx2048m
ES_SSL | Boolean | This variable is used only for the config importer of Elasticsearch (internal connection between the Elasticsearch and importer docker services) | No | false
ES_MEMORY_LIMIT | String | RAM usage limit of Elasticsearch service | No | 3G
ES_MEMORY_RESERVE | String | Reserved RAM for Elasticsearch service | No | 500M
ES_PATH_REPO | String | The path to the repository in the container to store Elasticsearch backup snapshots | No | /backups/elasticsearch
Number
The port that the service of Clickhouse is exposed to
No
8123
Running in Clustered Mode
Pre-Deploy Configuration
If running in clustered mode, take note that each machine has to have the following vm.max_map_count setting:
sysctl -w vm.max_map_count=262144
String
Connection string to SanteMPI
No
Check below table
SANTEMPI_AUDIT_CONNECTION_STRING | String | Audit connection string to SanteMPI | No | Check below table
SANTEMPI_POSTGRESQL_PASSWORD | String | SanteMPI postgreSQL password | No | SanteDB123
SANTEMPI_POSTGRESQL_USERNAME | String | SanteMPI postgreSQL username | No | santempi
SANTEMPI_REPMGR_PRIMARY_HOST | String | SanteMPI postgreSQL replicas manager primary host | No | santempi-psql-1
SANTEMPI_REPMGR_PARTNER_NODES | String | SanteMPI postgreSQL replicas manager nodes hosts | Yes | santempi-psql-1,santempi-psql-2,santempi-psql-
server=santempi-psql-1;port=5432; database=santedb; user id=santedb; password=SanteDB123; pooling=true; MinPoolSize=5; MaxPoolSize=15; Timeout=60;
server=santempi-psql-1;port=5432; database=auditdb; user id=santedb; password=SanteDB123; pooling=true; MinPoolSize=5; MaxPoolSize=15; Timeout=60;
Local Development
Elasticsearch is the datastore for the Elastic (ELK) Stack
Launching
Launching this package involves the following steps:
[Cluster mode] Creating certificates and configuring the nodes
Running Elasticsearch
Setting Elasticsearch passwords
Importing Elasticsearch index
Importing
To initialize the index mapping in Elasticsearch, a helper container is launched to import a config file to Elasticsearch. The config importer looks for a file named fhir-enrich-report.json in <path to project packages>/analytics-datastore-elastic-search/importer.
The file fhir-enrich-report.json will contain the mapping of the index fhir-enrich-reports.
If no mapping is specified, Elasticsearch creates a dynamic mapping for the incoming data. This dynamic mapping may cause issues once data starts flowing, as it does not necessarily match the data types we expect when querying the data back out of Elasticsearch.
Therefore, the mapping should be initialized in Elasticsearch using the config importer.
The file fhir-enrich-report.json is just an example; the name and the mapping can be overridden.
Running in Dev Mode
When running in DEV mode, Elasticsearch is reachable at:
http://127.0.0.1:9201/
Elasticsearch Backups
For detailed steps about creating backups see: .
Elasticsearch offers the functionality to save a backup in different ways, for further understanding, you can use this link: .
Elasticsearch Restore
To see how to restore snapshots in Elasticsearch: .
Local Development
Accessing the Service
Superset -
Client Registry - SanteMPI
A patient matcher and deduplicator for the platform
The config importer written in JS will import the file superset-export.zip that exists in the folder <path to project packages>/dashboard-visualiser-superset/importer/config. The assets that will be imported to Superset are the following:
The link to the Clickhouse database
The dataset saved from Clickhouse DB
The dashboards
The charts
If you make any changes to these objects, remember to export them and save the file as superset-export.zip in the folder specified above. Note that it is not possible to export all of these objects from the Superset UI. Instead, use the Postman collection CARES DISI CDR -> Superset export assets, which contains two requests. The export takes three steps:
Run the Get Token Superset request to get the token (please make sure that you are using the correct request URL). An example of a response from Superset that will be displayed: { "access_token": "eyJ0eXAiOiJKV1...." }
Copy the access token and put it into the second request Export superset assets in the Authorization section.
Run the second request, Export superset assets. You can save the response into a file called superset-export.zip in the folder specified above.
The config importer will import the file kibana-export.ndjson that exists in the folder <path to project packages>/dashboard-visualiser-kibana/importer.
The saved objects that will be imported are the index patterns and dashboards.
If you made any changes to these objects please don't forget to export them and save the file kibana-export.ndjson under the folder specified above.
When seeking to make changes to the Jsreport scripts/templates without having to repeatedly start and stop the service, one can set the JS_REPORT_DEV_MOUNT environment variable in your .env file to true to attach the service's content files to those on your local machine.
You have to run the set-permissions.sh script before and after launching Jsreport when JS_REPORT_DEV_MOUNT=true.
REMEMBER TO EXPORT THE JSREXPORT FILE WHEN YOU'RE DONE EDITING THE SCRIPTS. More info is available at
With JS_REPORT_DEV_MOUNT=true, you have to set the JS_REPORT_PACKAGE_PATH variable with the absolute path to the Jsreport package on your local machine, i.e., JS_REPORT_PACKAGE_PATH=/home/user/Documents/Projects/platform/dashboard-visualiser-jsreport
Export & Import
After editing the templates in Jsreport, you will need to save these changes. It is advised to export a file containing all the changes, named export.jsrexport, and put it into the folder <path to project packages>/dashboard-visualiser-jsreport/importer.
The config importer of Jsreport will import the export.jsrexport and then all the templates, assets, and scripts will be loaded in Jsreport.
Environment Variables
Listed in this page are all environment variables needed to run Ofelia.
The Ofelia service does not make use of any environment variables. However, when specifying jobs in the config.ini file(s) we can pass any environment variable in.
Example:
In the example above, OPENHIM_MONGO_URL is an environment variable.
Message Bus - Kafka
Kafka is a stream processing platform which groups like-messages together, such that the number of sequential writes to disk can be increased, thus effectively increasing database speeds.
Components
The message-bus-kafka package consists of a few components, those being Kafka, Kafdrop, and Kminion.
The services consuming from and producing to Kafka might crash if Kafka is unreachable, so bear this in mind when making changes to or restarting the Kafka service.
Kafka
The core stream-processing element of the message-bus-kafka package.
Kafdrop
Kafdrop is a web user-interface for viewing Kafka topics and browsing consumer-groups.
Kminion
A Prometheus exporter for Kafka.
Kafka Unbundler Consumer
A kafka processor to unbundle resources into their own kafka topics.
The Kafka unbundler consumes resources from the 2xx topic in Kafka, splits them according to their resource type and sends them back to Kafka on new topics.
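The splitting step can be sketched in a few lines of Python. This is an illustration only and not the unbundler's actual code: the function name unbundle is hypothetical, the choice of the lower-cased resource type as the target topic name is an assumption, and producing to Kafka is left out so that the example stays self-contained.

```python
from collections import defaultdict


def unbundle(bundle):
    """Group the resources of a FHIR Bundle by resource type.

    Returns a dict mapping an assumed topic name (the lower-cased
    resourceType) to the list of resources destined for that topic.
    """
    by_topic = defaultdict(list)
    for entry in bundle.get("entry", []):
        resource = entry.get("resource")
        if not resource or "resourceType" not in resource:
            continue  # skip entries that carry no typed resource
        by_topic[resource["resourceType"].lower()].append(resource)
    return dict(by_topic)


# A small sample bundle, as it might arrive on the 2xx topic.
bundle = {
    "resourceType": "Bundle",
    "type": "transaction",
    "entry": [
        {"resource": {"resourceType": "Patient", "id": "p1"}},
        {"resource": {"resourceType": "Observation", "id": "o1"}},
        {"resource": {"resourceType": "Observation", "id": "o2"}},
    ],
}

for topic, resources in unbundle(bundle).items():
    print(topic, [r["id"] for r in resources])
```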
Remember to shut down Jsreport before changing git branches if JS_REPORT_DEV_MOUNT=true, otherwise, the dev mount will persist the Jsreport scripts/templates across your branches.
Listed in this page are all environment variables needed to run Jsreport.
Variable Name | Type | Relevance | Required | Default
JS_REPORT_LICENSE_KEY | String | Service license key | Yes |
Environment Variables
Listed in this page are all environment variables needed to run Kibana.
Variable Name | Type | Relevance | Required | Default
ES_KIBANA_SYSTEM | String | ElasticSearch auth username | Yes |
KIBANA_INSTANCES
Environment Variables
Listed in this page are all environment variables needed to run Superset.
Variable Name | Type | Relevance | Required | Default
SUPERSET_USERNAME | String | Service username | No | admin
SUPERSET_FIRSTNAME
Dashboard Visualiser - Superset
Superset is a visualisation tool meant for querying data from a SQL-type database.
Version upgrade process (with rollback capability)
By default, if you simply update the image that the superset service uses to a later version, then when the container is scheduled it will automatically run a database migration and the version of superset will be upgraded. The problem, however, is that if there is an issue with this newer version you cannot roll back the upgrade, since the database migration that ran will cause the older version to throw an error and the container will no longer start. As such, it is recommended to first create a postgres dump of the superset postgres database before attempting to upgrade superset's version.
Exec into the postgres container as the root user (otherwise you will get write permission issues)
Run the pg_dump command on the superset database. The database name is stored in SUPERSET_POSTGRESQL_DATABASE and defaults to superset
Copy the dumped sql script outside the container
Update the superset version (either through a platform deploy or with a docker command on the server directly -- docker service update superset_dashboard-visualiser-superset --image apache/superset:tag)
Rolling back upgrade
In the event that something goes wrong you'll need to roll back the database changes too, i.e. run the superset_backup.sql script we created before upgrading the superset version
Copy the superset_backup.sql script into the container
Exec into the postgres container
Run the sql script (where -d superset is the database name stored in SUPERSET_POSTGRESQL_DATABASE)
Environment Variables
Listed in this page are all environment variables needed to run hapi-fhir package.
Variable Name | Type | Relevance | Required | Default
REPMGR_PRIMARY_HOST | String | Service name of the primary replication manager host (PostgreSQL) | No | postgres-1
JS_REPORT | String | Jsreport service password | No | dev_password_only
JS_REPORT_USERNAME | String | Jsreport service username | No | admin
JS_REPORT_SECRET
String
Secret password for the authentication of a cookie session related to the extension used in Jsreport
PostgreSQL replica set (host and port of the replicas)
Yes
postgres-1:5432
HAPI_FHIR_CPU_LIMIT | Number | CPU limit usage for hapi-fhir service | No | 0 (unlimited)
HAPI_FHIR_CPU_RESERVE | Number | Reserved CPU usage for hapi-fhir service | No | 0.05
HAPI_FHIR_MEMORY_LIMIT | String | RAM limit usage for hapi-fhir service | No | 3G
HAPI_FHIR_MEMORY_RESERVE | String | Reserved RAM usage for hapi-fhir service | No | 500M
HF_POSTGRES_CPU_LIMIT | Number | CPU limit usage for postgreSQL service | No | 0 (unlimited)
HF_POSTGRES_CPU_RESERVE | Number | Reserved CPU usage for postgreSQL service | No | 0.05
HF_POSTGRES_MEMORY_LIMIT | String | RAM limit usage for postgreSQL service | No | 3G
HF_POSTGRES_MEMORY_RESERVE | String | Reserved RAM usage for postgreSQL service | No | 500M
HAPI_FHIR_INSTANCES | Number | Number of hapi-fhir service replicas | No | 1
HF_POSTGRESQL_USERNAME | String | Hapi-fhir PostgreSQL username | Yes | admin
HF_POSTGRESQL_PASSWORD | String | Hapi-fhir PostgreSQL password | Yes | instant101
HF_POSTGRESQL_DATABASE | String | Hapi-fhir PostgreSQL database | No | hapi
REPMGR_PASSWORD | String | Hapi-fhir PostgreSQL Replication Manager password | Yes |
Message Bus Helper Hapi Proxy
A helper package for the Kafka message bus.
A helper for the Kafka message bus service. It sends data to the HAPI FHIR datastore and then to the Kafka message bus, based on the response from HAPI FHIR.
More particularly:
It receives messages from OpenHIM
It sends the data to the HAPI FHIR server and waits for the response
It gets the response. According to the response status, it will send the message to the topic that corresponds to that status (2xx, 4xx, 5xx, ... )
It will send back the response from HAPI FHIR to OpenHIM as well
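The routing rule in the third step amounts to bucketing the HTTP status code by its first digit. A minimal Python sketch of that rule follows; the function name topic_for_status is hypothetical, and only the 2xx/4xx/5xx style of topic names is taken from the description above.

```python
def topic_for_status(status):
    """Map an HTTP status code to its status-class topic name, e.g. 201 -> '2xx'."""
    if not 100 <= status <= 599:
        raise ValueError(f"not a valid HTTP status code: {status}")
    return f"{status // 100}xx"


# A created resource, a validation failure and a server error.
for code in (201, 422, 500):
    print(code, "->", topic_for_status(code))
```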
Reverse Proxy Nginx
Reverse proxy for secure and insecure nginx configurations.
, Kafka's topics are imported to Kafka. The topics are specified using the , and must be of syntax:
topic or topic:partition:replicationFactor
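To make the two accepted forms concrete, the following Python sketch parses a single topic entry. It is illustrative only: the names TopicSpec and parse_topic are hypothetical, and how the platform itself parses or separates multiple entries is not shown here.

```python
from typing import NamedTuple, Optional


class TopicSpec(NamedTuple):
    name: str
    partitions: Optional[int]          # None means "not specified"
    replication_factor: Optional[int]  # None means "not specified"


def parse_topic(spec):
    """Parse 'topic' or 'topic:partition:replicationFactor' into a TopicSpec."""
    parts = spec.split(":")
    if len(parts) == 1 and parts[0]:
        return TopicSpec(parts[0], None, None)
    if len(parts) == 3 and parts[0]:
        return TopicSpec(parts[0], int(parts[1]), int(parts[2]))
    raise ValueError(f"expected 'topic' or 'topic:partition:replicationFactor', got {spec!r}")


# A plain topic, and a topic with 3 partitions and a replication factor of 1.
print(parse_topic("2xx"))
print(parse_topic("metrics:3:1"))
```

Any other shape, such as a topic with only a partition count, is rejected rather than guessed at.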
FHIR Datastore HAPI FHIR
A FHIR compliant server for the platform.
The HAPI FHIR service will be used for two mandatory functionalities:
A validator of FHIR messages
A storage of FHIR message
Environment Variables
A kafka processor to unbundle resources into their own kafka topics.
Variable Name
Type
Relevance
Required
Default
Environment Variables
Listed in this page are all environment variables needed to run the Message Bus Kafka.
Variable Name
Type
Relevance
Required
Default
Local Development
A FHIR compliant server for the platform.
Instant OpenHIE FHIR Data Store Component
This component consists of two services:
Postgres
Environment Variables
Listed in this page are all environment variables needed to run Reverse Proxy Nginx.
Variable Name
Type
Relevance
Required
Default
Environment Variables
Variable Name
Description
Default
Environment Variables
Listed in this page are all environment variables needed to run Hapi-proxy.
Variable Name
Type
Relevance
Required
Default
OpenHIM Data
OpenHIM backup & restore
OpenHIM transaction logs and other data is stored in the Mongo database.
Restoring this data means restoring the full transaction history, which is mandatory to recover if something unexpected happens and all the data is lost.
In the following sections, we will cover:
Already implemented jobs to create backups periodically
How to restore the backups
Terraform
A tool that enables infrastructure as code to set up servers in AWS EC2.
Cloud Dev environments
To set up a developer's development environment in AWS, run this terraform project. The scripts will allow the joining of an existing VPC, the creation of a public subnet and a variable number of EC2 instances that the user will have SSH access to. Alarms have been created in the scripts which will auto-shutdown the instances after a configurable period, based on CPU metrics. A Lambda scheduled event can also be configured which can run at a regular interval to shut down any instances that may still be running.
Pre-requisites
A validator
Incoming messages from an EMR, or bundles sent from Postman, are not always well structured; they may be malformed or missing required elements.
HAPI FHIR will use a FHIR IG to validate these messages.
It will reject any invalid resources and return errors according to the IG.
HAPI FHIR is the first check to make sure the data injected into the rest of the system conforms to the requirements.
A storage
All validated incoming messages will be stored, backed by a PostgreSQL database.
This allows HAPI FHIR to check for correct links and references between the resources, and provides another store to fall back on in case data is lost.
Using topics 2xx, 3xx, and metrics (partition=3, replicationFactor=1) as an example, we would declare:
The following job may be used to set up a backup job for clustered Mongo:
Restore
In order to restore from a backup you would need to launch a Mongo container with access to the backup file and the mongo_backup network by running the following command:
docker run -d --network=mongo_backup --mount type=bind,source=/backups,target=/backups mongo:4.2
Then exec into the container and run mongorestore:
This should only be done once per AWS account, as there is a limit of 5 VPCs per region. Please check whether this has already been run; if it has, use the existing VPC_ID and SUBNET_ID and skip to the next section.
Navigate to the infrastructure/terraform/vpc directory
Initialize Terraform project:
Execute the following:
Copy the output for the next step, e.g. for ICAP this has already been run and this is the result:
Creating EC2 instances
Navigate to the infrastructure/terraform directory
Initialize Terraform project:
The following properties have to be set:
The configuration can be done using a Terraform variable file. Create a file called my.tfvars. Below is an example that illustrates the structure of the environment variables file. This example is of a configuration that you can use for the ICAP CDR. Please replace {user} with your own user.
The AWS account to be used is defined in the ~/.aws/credentials file. If you don't have this file, make sure you have configured the AWS CLI.
The sample file above has access to 3 accounts and the options for <account_name> could be "default", "jembi-sandbox", "jembi-icap"
Optionally, add ACCOUNT = "<account_name>" to my.tfvars if you want to use something other than default.
The flag for specifying an environment variables file is -var-file. Create the AWS stack by running:
Once the script has run successfully, the IP addresses and domains for the servers will be displayed:
SSH access should be now available - use the default 'ubuntu' user - ssh ubuntu@<ip_address>
PUBLIC_KEY_PATH - path to the user's public key file that gets injected into the servers created
PROJECT_NAME - unique project name that is used to identify each VPC and its resources
HOSTED_ZONE_ID - (only if you are creating domains, which by default you are) the hosted zone to use, this must be created in the AWS console
DOMAIN_NAME - the base domain name to use
SUBNET_ID - the subnet id to use, copy this from the previous step
VPC_ID - the VPC id to use, copy this from the previous step
This page gives a list of common commands and examples for easy reference.
Install the latest Instant OpenHIE binary locally:
Launch a particular package (with metadata initialisation):
OpenFn
Introduction
Welcome to the documentation for the openfn package! This package is designed to provide a platform for seamless integration and automation of data workflows. Whether you are a developer, data analyst, or data scientist, this package will help you streamline your data processing tasks.
Environment Variables
The following environment variables can be used to configure Traefik:
Variable | Value | Description
Local Development
Reverse proxy for secure and insecure nginx configurations.
Nginx Reverse Proxy
This package can be used to secure all data transferred to and from services using SSL encryption, and to generate the SSL certificates.
Instead of configuring each package separately, this package holds all of the Nginx configuration.
It generates Staging or Production certificates from Let's Encrypt to ensure a secure connection (when SSL is enabled).
Provisioning remote servers
Infrastructure tools for the OpenHIM Platform
Deploying from your local environment to a remote server or cluster is easy. All you have to do is ensure the remote servers are set up as a Docker Swarm cluster. Then, from your local environment, you may target a remote environment by using the `DOCKER_HOST` env var, e.g.
Setting up new servers
In addition, as part of the OpenHIM Platform GitHub repository, we also provide scripts to easily set up new servers. The Terraform scripts are able to instantiate servers in AWS, and the Ansible scripts are able to configure those servers to be ready to accept OpenHIM Platform packages.
Architecture
OpenHIM Platform builds on Instant OpenHIE v2 as the deployment tool which provides the concepts of packages, profiles and the CLI utility that powers the ability to launch OpenHIM Platform packages and recipes. Please read the the for some foundational concepts about how packages are run.
On this page we will discuss the architecture of OpenHIM Platform which is a set of packages and recipes that use those packages that create a fully functional HIE from scratch.
Modularity and flexibility
OpenHIM Platform packages can be stood up individually, so an implementer can use as much or as little of the package set as necessary. However, OpenHIM Platform is designed with some key features that are only available if a number of packages are set up together. Recipes group those sets of packages together so that it is easier to deploy them all at once.
It is responsible for routing network traffic to the correct service.
Structure of Reverse Proxy Nginx package
The current package contains the following:
config: A folder that contains the general Nginx config for secure and insecure mode.
package-conf-insecure: A folder that contains all the insecure configs related to the services that need outside access.
package-conf-secure: A folder that contains all the secure configs related to the services that need outside access.
A job using Ofelia exists to renew the certificates automatically based on the certificate renewal period.
Adding new packages that require external access will require adding the Nginx config needed in this package.
Disaster Recovery Process
Backup & restore process.
Two major procedures should exist in order to recover lost data:
Creating backups continuously
Restoring the backups
This includes the different databases: MongoDB, PostgreSQL DB and Elasticsearch.
The current implementation will create continuous backups for MongoDB (to backup all the transactions of OpenHIM) and PostgreSQL (to backup the HAPI FHIR data) as follows:
Daily backups (for 7 days rotation)
Weekly backups (for 4 weeks rotation)
Monthly backups (for 3 months rotation)
More details can be found on each service's backup & restore page.
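To make the rotation windows concrete, here is a small sketch of an expiry check. It is not the platform's backup job: the function names are hypothetical, "3 months" is approximated as 90 days, and treating a backup as expired only once its age exceeds the window is an assumption.

```python
from datetime import date, timedelta
from typing import Iterable, List, Tuple

# Retention windows taken from the rotation list above.
RETENTION = {
    "daily": timedelta(days=7),
    "weekly": timedelta(weeks=4),
    "monthly": timedelta(days=90),  # assumption: 3 months approximated as 90 days
}


def is_expired(kind: str, created: date, today: date) -> bool:
    """Return True when a backup of the given kind is older than its window."""
    return today - created > RETENTION[kind]


def backups_to_delete(backups: Iterable[Tuple[str, date]], today: date) -> List[Tuple[str, date]]:
    """Filter (kind, created) pairs down to the ones outside their window."""
    return [(kind, created) for kind, created in backups if is_expired(kind, created, today)]
```

With these windows, a daily backup taken on 1 January would still be kept on 8 January and would be eligible for deletion from 9 January.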
Enable or disable TLS encryption.
TLS_CHALLENGE | http | The challenge type to use for TLS certificate generation.
WEB_ENTRY_POINT | web | The entry point for web traffic.
REDIRECT_TO_HTTPS | true | Enable or disable automatic redirection to HTTPS.
CERT_RESOLVER | le | The certificate resolver to use for obtaining TLS certificates.
Provision of the remote servers in single and cluster mode: user and firewall configurations, docker installation, docker authentication and docker swarm provision.
All the passwords are saved securely using Keepass.
The inventories contain a configuration for each environment (development, production and staging), which includes the list of users and their SSH keys, the Docker credentials and the definition of the hosts.
Start a particular package (WITHOUT metadata initialisation):
Destroy (delete all data too) a particular package:
Launch a particular recipe (with metadata initialisation) using profiles (which are defined in the config.yaml file):
Stop a particular recipe:
Start a particular recipe (WITHOUT metadata initialisation):
Destroy (delete all data too) a particular recipe:
Add --dev to any `instant` command to expose development ports to the host for packages
Usage
Once you have added the openfn package, you can start using it in your projects. Here is how to instantiate the package
instant package init -n openfn --dev
Demo
To get a hands-on experience with the openfn package, try out the demo. The demo showcases the package's capabilities and provides a sample project used to export data from CDR to NDR with transformations. It utilizes a Kafka queue and a custom adapter to map Bundles to be compliant with the FHIR Implementation Guide (IG).
Configure the Kafka trigger
Change the trigger type from webhook to “Kafka Consumer”
Enter in configuration details → see docs
Kafka topic: {whichever you want to use} (e.g., “cdr-ndr”)
Hosts: {cdr host name}
Initial offset reset policy: earliest
Connection timeout: 30 (default value, but can be adjusted)
Warning: Check Disable this trigger to ensure that consumption doesn’t start until you are ready to run the workflow! Once unchecked, it will immediately start consuming messages off the topic.
Documentation
For more detailed information on the openfn package and its functionalities, please refer to the official documentation. The documentation covers various topics, including installation instructions, usage guidelines, and advanced features.
Client: This section represents client systems that might want to interact with the HIE, they have a particular set of interactions that the core OpenHIM Platform packages enable support for.
Platform core: These are a set of packages which instantiate applications and mediators that enable the core client workflows to be executed. This includes accepting FHIR bundles, splitting patient demographic data into an MPI and clinical data into a FHIR data store, as well as managing the linkage from patient demographics to clinical data in a way that isn't affected by potential linking and unlinking of patient records via the MPI. More on this later.
Platform pluggable services: By design, once the core packages have processed FHIR requests, they are pushed into Kafka for secondary use by other systems. Typically this includes data analytics pipelines, and a default implementation is included in OpenHIM Platform. However, the data can be read and used for any purpose, e.g. syncing to another HIE or sending to a national data warehouse.
Platform core workflow
This is how the core packages interact to split the data into two separate stores, a clinical data store and a patient demographics store.
The reasons for doing this are as follows:
With the clinical and patient demographic data split, it is easier to link and unlink patient identities, as no data in the clinical store needs to change. Clinical resources continue to reference the source patient ID, and whatever happens to that patient, whether they are grouped together with other identities in the MPI or not, that ID remains constant.
The split of data is a useful security feature, as the clinical data and the Personally Identifiable Information (PII) are stored separately. An attacker would need to compromise both to relate clinical information to a particular person.
It prevents duplicate information being stored in multiple places; a clear source of truth for each type of information is identified. This prevents data from getting out of sync when it is stored in multiple places.
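The idea of the split can be illustrated with a short sketch. This is not the platform's mediator code: `split_bundle` is a hypothetical helper, and treating only `Patient` resources as demographic data is a simplifying assumption.

```python
from typing import Any, Dict, List, Tuple

Resource = Dict[str, Any]


def split_bundle(bundle: Dict[str, Any]) -> Tuple[List[Resource], List[Resource]]:
    """Split a FHIR bundle into (demographic, clinical) resources.

    Assumption: Patient resources are the demographic data destined for the MPI;
    every other resource is treated as clinical data for the FHIR data store.
    """
    demographics: List[Resource] = []
    clinical: List[Resource] = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") == "Patient":
            demographics.append(resource)
        else:
            # Clinical resources are left untouched, so they keep referencing
            # the source patient ID regardless of later linking in the MPI.
            clinical.append(resource)
    return demographics, clinical
```

Because the clinical resources are passed through unchanged, a reference such as `Patient/123` stays the same whatever happens to that identity in the MPI.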
Ansible
A tool that enables infrastructure as code for provision of the servers.
Platform Deploy
Prerequisites
Linux OS to run commands
Install Ansible (as per )
Ansible Docker Community Collection installed
Infrastructure and Servers
Please see the /inventories/{ENVIRONMENT}/hosts file for IP details of the designated servers. Set these to the server that you created via Terraform or to an on-premises server.
Ansible
SSH Access
To authenticate yourself on the remote servers your ssh key will need to be added to the sudoers var in the /inventories/{ENVIRONMENT}/group_vars/all.yml.
To have docker access you need to add your ssh key to the docker_users var in the /inventories/{ENVIRONMENT}/group_vars/all.yml file.
An authorised user will need to run the provision_servers.yml playbook to add the SSH key of the person who will run the Ansible scripts to the servers.
Configuration
Before running the Ansible script, add the server to your known_hosts file, otherwise Ansible will throw an error. For each server run:
To run a playbook you can use:
Alternatively, to run all provisioning playbooks with the development inventory (most common for setting up a dev server), use:
Vault
The vault password required for running the playbooks can be found in the database.kdbx KeePass file.
To encrypt a new secret with the Ansible vault run:
The New password is the original Ansible Vault password.
Keepass
Copies of all the passwords used here are kept in the encrypted database.kdbx file.
Please ask your admin for the decryption password of the database.kdbx file.
Performance Testing
The performance scripts are located in the test folder. To run a script against a local or remote server, follow the steps below.
Steps
Make sure you have the necessary dependencies installed, more importantly, the k6 binary. Refer to this documentation Building a k6 binary
Set the [BASE_URL] variable to the URL of your server. By default, it is set to "http://localhost:5001", but you can change it to the appropriate URL.
If there are any additional dependencies or configurations required by the [generateBundle] function or any other imported modules, make sure those are set up correctly.
Open your terminal or command prompt and navigate to the directory where the scripts are located, e.g.
Run the script using the k6 run command followed by the filename. In this case, you would run [k6 run load.js]
The script will start executing and sending HTTP POST requests to the specified server. The requests will be sent at a constant arrival rate defined in the [options] object
The script includes some thresholds defined in the [options] object. These thresholds define the performance criteria for the script. If any of the thresholds are exceeded, the script will report a failure.
Monitor the output in the terminal to see the results of the script execution. It will display information such as the number of virtual users (VUs), request statistics, and any failures that occurred.
To visualize the output in Grafana, run the k6 script with the following environment variable and flag set: K6_PROMETHEUS_RW_SERVER_URL=http://localhost:9090/api/v1/write ./k6 run -o experimental-prometheus-rw script.js
Sample load test result
The test results were obtained from running on Ubuntu 22.04 OS, 64GB RAM and 12 Cores.
✓ status code is 200
Metric | Value
Sample volume test results
Metric | Value
Resource Allocations
Allot CPU and RAM resources to services, per service, per server.
What it Means
CPU
CPU allocations are specified as a portion of the total number of cores on the host system, i.e., a CPU limit of 2 in a 6-core system is an effective limit of 33.33% of the CPU, and a CPU limit of 6 in a 6-core system is an effective limit of 100% of the CPU.
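The arithmetic above can be checked with a few lines of code. The helper name `effective_cpu_percent` is hypothetical and exists only to illustrate the calculation.

```python
def effective_cpu_percent(cpu_limit: float, host_cores: int) -> float:
    """Return the share of the host CPU, in percent, that a CPU limit allows."""
    if host_cores <= 0:
        raise ValueError("host_cores must be positive")
    if cpu_limit <= 0:
        raise ValueError("cpu_limit must be positive")
    # A limit cannot grant more than the cores that exist on the host.
    return min(cpu_limit / host_cores, 1.0) * 100
```

For a 6-core host, `round(effective_cpu_percent(2, 6), 2)` gives `33.33` and `effective_cpu_percent(6, 6)` gives `100.0`, matching the figures in the text.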
RAM
Memory allocations are specified as a number followed by their multiplier, i.e., 500M, 1G, 10G, etc.
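As a sketch of how such values translate to bytes, the helper below parses the number and its multiplier. `parse_memory` is a hypothetical name, and the binary multipliers (1K = 1024 bytes) are an assumption based on how Docker interprets memory limits.

```python
import re

# Assumption: binary multipliers, as Docker uses for memory limits.
_UNITS = {"B": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_memory(value: str) -> int:
    """Convert a memory allocation such as '500M' or '1G' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([BKMG])", value.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised memory allocation: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]
```

A check like `parse_memory(allocation) >= parse_memory(jvm_heap)` is one way to confirm that a service is not given less memory than its JVM heap size.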
Defaults
As a default, each package contained in Platform is allocated a maximum of 3 GB of RAM, and 100% CPU usage.
Allocating Resources per Package
The resource allocation can be set on a per-package basis, as specified by the relevant environment variables found in the relevant .
Notes
Be wary of allocating CPU limits to ELK Stack services. These seem to fail with CPU limits and their already implemented health checks.
Take note to not allocate less memory to ELK Stack services than their JVM heap sizes.
Exit code 137 indicates an out-of-memory failure. When running into this, it means that the service has been allocated too little memory.
Reverse Proxy Traefik
Reverse proxy for secure Traefik configurations.
This package is an alternative to the Reverse Proxy Nginx package. It exposes packages using both subdomains and subdirectories to host the following services:
Package | Hosted
Please ensure that the ENV "DOMAIN_NAME_HOST_TRAEFIK" is set, in this documentation we will be using the placeholder "domain" for its value
Subdomain-Based Reverse Proxy
The following packages do not support subdirectories and require a subdomain of the domain to be accessed through the reverse proxy
Superset
Set the following environment variable in the package-metadata.json in the "./dashboard-visualiser-superset" directory
JeMPI
Set the following environment variables in the package-metadata.json in the "./client-registry-jempi" directory
SanteMPI
Set the following environment variables in the package-metadata.json in the "./client-registry-santempi" directory
Enabling Kibana
Set the following environment variables in the package-metadata.json in the "./dashboard-visualiser-kibana" directory
Subdirectory
Enabling Minio
Set the following environment variables in the package-metadata.json in the "monitoring" directory
MinIO Configuration
The MinIO server is configured to run with the following port settings:
API Port: 9090
Console Port: 9001
Ensure that your Traefik configuration reflects these ports to properly route traffic to the MinIO services. The API can be accessed at https://<domain>/minio and the Console at https://<domain>/minio-console.
Update your Traefik labels in the docker-compose.yml to match these settings:
Enabling Grafana
Set the following environment variables in the package-metadata.json in the "monitoring" directory
JS Report
Set the following environment variables in the package-metadata.json in the "dashboard-visualiser-jsreport" directory
OpenHIM
Set the following environment variables in the package-metadata.json in the "./interoperability-layer-openhim" directory
Note: Only the Backend services are accessible through subdirectory paths, not the frontend
Elasticsearch can save backups in several ways. For details, see the Elasticsearch documentation: Register a snapshot repository.
Elasticsearch Restore
To see how to restore snapshots in Elasticsearch: .
Config Importing
This section defines the configuration importing methods used in the Platform
Overview
Certain packages in the Platform require configuration to enable their intended functionality in a stack. For instance, the OpenHIM package requires the setting of users, channels, roles, and so on. Other packages, such as JS Report or Kibana, require importing of pre-configured dashboards stored in compressed files.
Most services in the Platform can be configured by sending a request containing the required configuration files to the relevant service API. To achieve this, the Platform leverages a helper container to make that API call.
Community
We encourage any contributions and suggestions! If you would like to get involved, please visit us on . Feel free to submit an issue or to create a PR to see your features included in the project.
If you'd like to chat about OpenHIM Platform please join our .
We look forward to growing the set of capabilities within OpenHIM Platform together!
HAPI FHIR Data
FHIR messages Backup & Restore.
Validated messages from HAPI FHIR will be stored in PostgreSQL database.
The following content will detail the backup and restore process of this data.
Backups
This section assumes Postgres backups are made using pg_basebackup
Development
Adding Packages
The Go CLI runs all services from the jembi/platform docker image. When adding new packages or updating existing packages in Platform, you will need to build or update your local jembi/platform image.
If a package uses a config importer, its configuration can be found in the relevant package's importer section.
The Helper Container
The Process
As part of the package-launching process, the service to be configured is deployed and then awaits configuration. Before the configuration can take place, the process waits for the service to join the Docker internal network. Once the service has joined the network, the helper container is launched and makes the API request to configure the service.
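The wait-then-configure sequence can be sketched as below. This is an illustration only and not the helper container's code: the function names, retry count and delay are hypothetical, and the readiness check and configuration request are passed in as callables rather than tied to any real service API.

```python
import time
from typing import Callable


def wait_for_service(
    is_ready: Callable[[], bool],
    attempts: int = 30,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll is_ready until it returns True or the attempts run out."""
    for attempt in range(attempts):
        if is_ready():
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)  # back off before the next check
    return False


def configure_when_ready(
    is_ready: Callable[[], bool],
    send_config: Callable[[], None],
    **wait_options,
) -> bool:
    """Send the configuration request only once the service is reachable."""
    if not wait_for_service(is_ready, **wait_options):
        return False
    send_config()
    return True
```

In practice `is_ready` might check that the service answers on the internal network and `send_config` would make the API call with the configuration files; both are left abstract here.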
Images
jembi/api-config-importer
For reference on how to use the jembi/api-config-importer image, see the repo here.
jembi/instantohie-config-importer
For reference on how to use the jembi/instantohie-config-importer image, see the repo here.
As you add new packages to the platform remember to list them in the config.yml file - otherwise the added package will not be detected by the platform-cli tool.
To start up HAPI FHIR and ensure that backups can be made, make sure you have created the HAPI FHIR bind mount directory (e.g. /backup)
Disaster Recovery
NB! DO NOT UNTAR OR EDIT THE FILE PERMISSIONS OF THE POSTGRES BACKUP FILE
Postgres (HAPI FHIR)
Preliminary steps:
Do a destroy of fhir-datastore-hapi-fhir using the CLI binary (./platform-linux for linux)
Make sure the Postgres volumes on nodes other than the swarm leader have been removed as well! You will need to ssh into each server and manually remove them.
Do an init of fhir-datastore-hapi-fhir using the CLI binary
After running the preliminary steps, run the following commands on the node hosting the Postgres leader:
NOTE: The value of the REPMGR_PRIMARY_HOST variable in your .env file indicates the Postgres leader
Retrieve the Postgres leader's container-ID using: docker ps -a.
Hereafter called postgres_leader_container_id
Run the following command:
docker exec -t <postgres_leader_container_id> pg_ctl stop -D /bitnami/postgresql/data
Wait for the Postgres leader container to die and start up again.
You can monitor this using: docker ps -a
Run the following command:
docker rm <postgres_leader_container_id>
Retrieve the new Postgres leader's container-ID using docker ps -a. Be wary not to use the old postgres_leader_container_id
Retrieve the Postgres backup file's name as an absolute path (/backups/postgresql_xxx).
Hereafter called backup_file
Run the following commands in the order listed:
Do a down of fhir-datastore-hapi-fhir using the CLI binary
Example: ./instant-linux package down -n=fhir-datastore-hapi-fhir --env-file=.env.*
Wait for the down operation to complete
Do an init of fhir-datastore-hapi-fhir using the CLI binary
Example: ./instant-linux package init -n=fhir-datastore-hapi-fhir --env-file=.env.*
Postgres should now be recovered
Note: After performing the data recovery, it is possible to get an error from HAPI FHIR (500 internal server error) while the data is still being replicated across the cluster. Wait a minute and try again.