hattivatti

hattivatti submits pgsc_calc jobs to the Puhti HPC cluster at CSC. Because genomes are sensitive data, jobs are configured to execute securely. hattivatti is a proof of concept for testing sensitive data submission to CSC.

Run hattivatti

See Releases for the most recent stable version of hattivatti. The development version can be run with:

$ git clone https://github.com/ebi-gdp/hattivatti.git --branch dev
$ cd hattivatti
$ cargo run

Documentation

$ cargo doc --open

Deployment notes

Puhti is currently on RHEL 7 with an old version of glibc.

GitHub Actions builds with rust-buster to match the glibc version (2.28).

Cronjob

The cron shell starts with a minimal environment, so the module command must be loaded explicitly:

$ # load 'module' command
$ source /appl/profile/zz-csc-env.sh

Set environment variables

Sensitive variables:

$ export GLOBUS_SECRET_TOKEN=<...>
$ export AWS_ACCESS_KEY_ID=<...>
$ export AWS_SECRET_ACCESS_KEY=<...>

Configuration variables:

$ export RUST_LOG=info
$ export NXF_SINGULARITY_CACHEDIR=<path>
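
Because a cron job fails silently when a variable is missing, it can help to check the variables before starting. The following is a minimal sketch, not part of hattivatti; `require_vars` is a hypothetical helper name, and it only tests that each variable is set and non-empty, not that its value is valid.

```shell
# Hypothetical helper, not part of hattivatti: checks that each named
# variable is set and non-empty in the current shell.
# Prints the missing names to stderr and returns 1 if any are missing.
require_vars() {
    rv_status=0
    for rv_name in "$@"; do
        # Look up the value of the variable whose name is stored in rv_name.
        # eval is acceptable here because the names are fixed by the caller.
        eval "rv_value=\${$rv_name:-}"
        if [ -z "$rv_value" ]; then
            echo "missing environment variable: $rv_name" >&2
            rv_status=1
        fi
    done
    return "$rv_status"
}
```

It could be called as `require_vars GLOBUS_SECRET_TOKEN AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY NXF_SINGULARITY_CACHEDIR || exit 1` before hattivatti is started.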

Clone pgsc_calc

$ cd /scratch/project_XXXXXX/
$ nextflow clone https://github.com/PGScatalog/pgsc_calc.git

Run hattivatti

$ hattivatti --schema-dir repo/data/schemas --work-dir work
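
The cron steps above could be combined in a small wrapper. This is only a sketch under assumptions: `run_hattivatti` and `CSC_ENV` are hypothetical names that are not shipped with hattivatti, and the sensitive and configuration variables are expected to be exported already.

```shell
# Hypothetical cron wrapper, not shipped with hattivatti.
# CSC_ENV defaults to the profile script sourced in the step above.
CSC_ENV="${CSC_ENV:-/appl/profile/zz-csc-env.sh}"

# Usage: run_hattivatti <schema-dir> <work-dir>
run_hattivatti() {
    # cron starts with a minimal environment, so load the 'module' command first
    if [ -r "$CSC_ENV" ]; then
        . "$CSC_ENV"
    else
        echo "warning: cannot read $CSC_ENV" >&2
    fi
    hattivatti --schema-dir "$1" --work-dir "$2"
}
```

With the paths used above, the call would be `run_hattivatti repo/data/schemas work`.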

Backup database (optional)

After hattivatti exits, the database has no open connections, so it can be copied safely.

$ module load allas
$ rclone copyto work/hattivatti.db s3allas://bucket/hattivatti/hattivatti.db

Software dependencies

  • curl
  • jq
  • nextflow
    • java 16
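
A quick way to confirm that the dependencies listed above are on PATH is sketched below. `missing_deps` is a hypothetical helper, not part of hattivatti, and it only checks that each command exists; it does not verify the Java version.

```shell
# Hypothetical helper: print the name of every listed command
# that cannot be found on PATH, one per line.
missing_deps() {
    for md_cmd in "$@"; do
        command -v "$md_cmd" >/dev/null 2>&1 || echo "$md_cmd"
    done
}
```

For example, `missing_deps curl jq nextflow java` prints nothing when all four commands are available.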