From 0f638714f41a0145e637c4fbbef90b73feb3b9b0 Mon Sep 17 00:00:00 2001
From: Benjamin Wingfield <bwingfield@ebi.ac.uk>
Date: Mon, 3 Jul 2023 15:56:57 +0100
Subject: [PATCH] update README

---
 README.md | 86 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 84 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 651cc1b..220e09f 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,85 @@
-# hattivatti
+# `hattivatti`
 
-> A Finnish [job submitter](https://github.com/ebi-gdp/jobsubmitter).
+`hattivatti` submits [`pgsc_calc`](https://github.com/PGScatalog/pgsc_calc) jobs
+to [Puhti HPC](https://docs.csc.fi/computing/systems-puhti/) at CSC. Jobs are
+configured to execute in a secure way because genomes are sensitive
+data. `hattivatti` is a proof of concept for testing sensitive data submission
+to CSC.
+
+## Run `hattivatti`
+
+See [Releases](https://github.com/ebi-gdp/hattivatti/releases) for the most recent
+stable versions of `hattivatti`. The development version can be run with:
+
+```
+$ git clone https://github.com/ebi-gdp/hattivatti.git --branch dev
+$ cargo run
+```
+
+## Documentation
+
+```
+$ cargo doc --open
+```
+
+## Deployment notes
+
+Puhti is currently on RHEL 7 with an old version of glibc.
+
+GitHub Actions builds with rust-buster to match glibc version (2.28).
+
+### Cronjob
+
+cron shell doesn't load much:
+
+```
+$ # load 'module' command
+$ source /appl/profile/zz-csc-env.sh
+```
+
+### Set environment variables
+
+Sensitive variables:
+
+```
+$ export GLOBUS_SECRET_TOKEN=<...>
+$ export AWS_ACCESS_KEY_ID=<...>
+$ export AWS_SECRET_ACCESS_KEY=<...>
+$ export NXF_SINGULARITY_CACHEDIR=<...>
+```
+
+Configuration variables:
+
+```
+$ export RUST_LOG=info
+$ export NXF_SINGULARITY_CACHEDIR=<path>
+```
+
+### Clone pgsc_calc
+
+```
+$ cd /scratch/project_XXXXXX/
+$ nextflow clone https://github.com/PGScatalog/pgsc_calc.git
+```
+
+### Run hattivatti
+
+```
+$ hattivatti --schema-dir repo/data/schemas --work-dir work
+```
+
+### Backup database (optional)
+
+After hattivatti executes, the database will have no connections.
+
+```
+$ module load allas
+$ rclone copy work/hattivatti.db s3allas://bucket/hattivatti/hattivatti.db
+```
+
+### Software dependencies
+
+* `curl`
+* `jq`
+* `nextflow`
+  * `java 16`
-- 
GitLab