Deployment of full OHDSI technology stack for a non-cluster environment

About OHDSI Broadsea

You can use OHDSI Broadsea to run Docker containers on your VM or server (hereafter, the host). The containers include the necessary OHDSI technologies such as ATLAS, WebAPI, Achilles, the R Methods Library, and others.

Refer to README.md for general information on dependencies and installation:

  • Broadsea Dependencies: https://github.com/OHDSI/Broadsea#broadsea-dependencies
  • Quick Start Broadsea Deployment: https://github.com/OHDSI/Broadsea#quick-start-broadsea-deployment

Database Setup

Broadsea supports Apache Impala, Oracle, MS SQL Server, and PostgreSQL. This guide shows how to install and run PostgreSQL on the same host that runs the Docker containers.

After installing PostgreSQL on the host, create the user and database with psql commands, assuming the following values:

  • username: dpm360
  • password: dpm360-password
  • database name: dpm360db
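
The user and database creation can be sketched as follows. This is a minimal example, not the project's official procedure: the helper name create_db_sql is hypothetical, making dpm360 the owner of the database is a design choice the guide does not state, and the commented apply step assumes the PostgreSQL superuser is reachable through the postgres OS account via sudo.

```shell
# Values from the text above; change them for a real deployment.
DB_USER="dpm360"
DB_PASS="dpm360-password"
DB_NAME="dpm360db"

# Hypothetical helper: print the SQL that creates the role and the database.
create_db_sql() {
  cat <<SQL
CREATE USER ${DB_USER} WITH PASSWORD '${DB_PASS}';
CREATE DATABASE ${DB_NAME} OWNER ${DB_USER};
SQL
}

# Review the statements first ...
create_db_sql

# ... then apply them (assumption: superuser access through the postgres account):
#   create_db_sql | sudo -u postgres psql
```

Printing the statements before piping them into psql makes it easy to check the names and password before anything is changed.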

Next, configure PostgreSQL so that the Docker containers can access the database. Confirm the IP address of docker0 (the virtual network bridge on the host) with
ip address show dev docker0
Note that this address is visible before any container starts (it is usually enough that the Docker service is running). Here, we assume the IP is 172.17.0.1.
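
To pick the address out of that output automatically, something like the following may help. It is only a sketch: first_ipv4 is a hypothetical helper, and it relies on the one-line format of the -o option of the iproute2 ip tool, where the fourth field holds the address and prefix length.

```shell
# Hypothetical helper: read `ip -4 -o address show` output on stdin and
# print the first IPv4 address without its prefix length.
first_ipv4() {
  awk '$3 == "inet" { split($4, parts, "/"); print parts[1]; exit }'
}

# Query the bridge only when the ip tool is installed.
if command -v ip >/dev/null 2>&1; then
  DOCKER0_IP="$(ip -4 -o address show dev docker0 2>/dev/null | first_ipv4)"
  echo "docker0 address: ${DOCKER0_IP:-not found}"
fi
```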

Using this IP address, modify configuration files:

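/etc/postgresql/10/main/pg_hba.conf:
add the following line
host    all    all    172.17.0.1/0    md5
Note that a /0 mask matches every client address, so this rule admits any host that presents a valid password; narrow the range to your Docker subnet if that is too permissive.
/etc/postgresql/10/main/postgresql.conf:
change the listen_addresses variable to
listen_addresses = 'localhost, 172.17.0.1'
Restart PostgreSQL afterwards, because a change to listen_addresses only takes effect on restart.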
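
If you script the pg_hba.conf change, an idempotent append avoids duplicate rules on repeated runs. The sketch below is illustrative only: ensure_line is a hypothetical helper, the paths are those of the PostgreSQL 10 package layout used above, and the restart command assumes a systemd-managed service. The listen_addresses edit is left as a manual step.

```shell
# Hypothetical helper: append LINE to FILE unless an identical line is already there.
# Usage: ensure_line FILE LINE
ensure_line() {
  grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Example, to be run with root privileges (paths and service name are assumptions):
#   ensure_line /etc/postgresql/10/main/pg_hba.conf \
#     'host    all    all    172.17.0.1/0    md5'
#   systemctl restart postgresql
```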

Broadsea Container Deployment and Run

To run the Docker container for the OHDSI stack (ATLAS, WebAPI, and Achilles), follow the instructions below.

  • change directory to <dpm360 root dir>/installer/express/broadsea-example
  • modify docker-compose.yml for your environment
services:
  broadsea-webtools:
    ports:
      - "18080:8080" # change host port 18080 if needed
    environment:
      - WEBAPI_URL=http://172.17.0.1:18080 # confirm address and port
      - datasource_url=jdbc:postgresql://172.17.0.1:5432/dpm360db # confirm address and postgresql configuration
      - datasource_username=dpm360 # confirm postgresql configuration
      - datasource_password=dpm360-password # confirm postgresql configuration
      - flyway_datasource_url=jdbc:postgresql://172.17.0.1:5432/dpm360db # confirm address and postgresql configuration
      - flyway_datasource_username=dpm360 # confirm postgresql configuration
      - flyway_datasource_password=dpm360-password # confirm postgresql configuration
  • run docker-compose up -d
  • wait a while, then open <host>:18080/atlas/ to confirm that it has started
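
The waiting in the last step can be automated with a small polling loop. This is a sketch under assumptions: wait_for_url is a hypothetical helper, the default of 30 attempts 10 seconds apart is arbitrary, and the example URL assumes you run it on the host with the port mapping shown above.

```shell
# Hypothetical helper: poll URL until it answers or the attempts run out.
# Usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_url() {
  url="$1"; attempts="${2:-30}"; delay="${3:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: fail on HTTP errors, -s: silent, -o /dev/null: discard the body
    if curl -fs --max-time 10 -o /dev/null "$url"; then
      echo "up: $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "gave up: $url" >&2
  return 1
}

# Example (assumption: run on the host, port 18080 as configured above):
#   wait_for_url http://localhost:18080/atlas/
```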

Define and Populate OHDSI OMOP-CDM Database on PostgreSQL

The next step is to set up PostgreSQL by defining tables and importing data. We provide a custom Docker image to initialize the CDM database with Athena vocabularies. Currently this only supports SynPUF 1k data, but we plan to release a more general tool to set up the database.

First, create a vocabulary archive. All necessary vocabulary files can be downloaded from the ATHENA download site: http://athena.ohdsi.org. A tutorial for Athena is available at https://www.youtube.com/watch?v=2WdwBASZYLk; the download guide starts at 10:04. Following that guidance, create vocabs.tar.gz and put the file at:

<dpm360 root dir>/installer/express/cdm-init-example-local/data/vocabs.tar.gz

Confirm that vocabs.tar.gz includes the following files at its top level (it must have no directory structure):

  CONCEPT_ANCESTOR.csv
  CONCEPT_CLASS.csv
  CONCEPT_RELATIONSHIP.csv
  CONCEPT_SYNONYM.csv
  CONCEPT.csv
  DOMAIN.csv
  DRUG_STRENGTH.csv
  RELATIONSHIP.csv
  VOCABULARY.csv
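
One way to produce an archive with no directory structure is to run tar from inside the directory that holds the CSV files. The sketch below assumes the Athena download has already been unpacked into a single directory; pack_flat is a hypothetical helper and the paths in the example are placeholders.

```shell
# Hypothetical helper: pack every *.csv in SRC_DIR into OUT with no directory prefix.
# OUT must be an absolute path, because the command runs inside SRC_DIR.
# Usage: pack_flat SRC_DIR OUT
pack_flat() {
  # The subshell keeps the cd local; the glob expands to bare file names.
  ( cd "$1" && tar -zcf "$2" *.csv )
}

# Example (placeholder paths):
#   pack_flat "$HOME/athena_download" "$PWD/vocabs.tar.gz"
#   tar -tzf vocabs.tar.gz    # no listed entry should contain a "/"
```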

Next, obtain SynPUF 1k (CDM 5.3.1) data from here. The archive must be repacked so that the CSV files sit at its top level. Try the following:

tar -zxvf synpuf1k.tar.gz
cd synpuf1k531
tar -zcvf synpuf1k.tar.gz *.csv
Then put the repacked file at:

<dpm360 root dir>/installer/express/cdm-init-example-local/data/synpuf1k.tar.gz

Confirm that synpuf1k.tar.gz includes the following files at its top level (it must have no directory structure):

  visit_occurrence.csv
  care_site.csv
  cdm_source.csv
  condition_era.csv
  condition_occurrence.csv
  cost.csv
  death.csv
  device_exposure.csv
  drug_era.csv
  drug_exposure.csv
  location.csv
  measurement.csv
  observation_period.csv
  observation.csv
  payer_plan_period.csv
  person.csv
  procedure_occurrence.csv
  provider.csv
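
The confirmation above can also be scripted. The following sketch lists the archive with tar and checks both conditions: no entry contains a directory separator, and every expected file is present. check_archive is a hypothetical helper, and the example names only a few of the files from the list.

```shell
# Hypothetical helper: succeed only if ARCHIVE has no directory structure
# and contains every listed file.
# Usage: check_archive ARCHIVE FILE...
check_archive() {
  archive="$1"; shift
  listing="$(tar -tzf "$archive")" || return 1
  # Any "/" in an entry name means the archive is not flat.
  if printf '%s\n' "$listing" | grep -q '/'; then
    echo "not flat: $archive" >&2
    return 1
  fi
  for name in "$@"; do
    if ! printf '%s\n' "$listing" | grep -qxF -- "$name"; then
      echo "missing: $name" >&2
      return 1
    fi
  done
  return 0
}

# Example (run from the cdm-init-example-local directory):
#   check_archive data/synpuf1k.tar.gz person.csv visit_occurrence.csv provider.csv
```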

The following instructions then run a docker container to prepare the database.

  • change directory to <dpm360 root dir>/installer/express/cdm-init-example-local
  • modify docker-compose.yml for your environment
services:
  cdmInitJob:
    image: ibmcom/dpm360-cdm_init:1.2
    volumes:
      - ./data:/data # /data is mounted to ./data of the host
    environment:
      - CDM_URL=file:///data/vocabs.tar.gz # /data is mounted to the host, confirm file name is correct
      - SYNPUF1K_URL=file:///data/synpuf1k.tar.gz # /data is mounted to the host, confirm file name is correct
  • run docker-compose up
  • wait until it finishes

Achilles computes statistics on your OMOP CDM database.

Follow the instructions below to run a Docker container that runs Achilles against your database.

  • change directory to <dpm360 root dir>/installer/express/achilles-example
  • run docker-compose up
  • wait until it finishes
  • open <host>:18080/atlas/ and click "Data source" to see the statistics of your database

Model Registry

You can run MLflow on the host as the Model Registry, which can be connected to lightsaber (the model training framework) and service builder (a microservice builder that uses the trained model). Guidance is being prepared.

What To Do Next

  • use ATLAS at <host>:18080/atlas/ to define cohorts and outcomes
  • use cohort tools to extract features and build training data
  • use lightsaber to build and train the model on that data
  • use service builder to deploy a service using the trained model