diff --git a/README.md b/README.md
index 651cc1ba397834521767e2c9b840980de36198e8..220e09fb2c5780c99dc6b24b1ec74e161455859a 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,85 @@
-# hattivatti
+# `hattivatti`
 
-> A Finnish [job submitter](https://github.com/ebi-gdp/jobsubmitter).
+`hattivatti` submits [`pgsc_calc`](https://github.com/PGScatalog/pgsc_calc) jobs
+to [Puhti HPC](https://docs.csc.fi/computing/systems-puhti/) at CSC. Jobs are
+configured to execute in a secure way because genomes are sensitive
+data. `hattivatti` is a proof of concept for testing sensitive data submission
+to CSC.
+
+## Run `hattivatti`
+
+See [Releases](https://github.com/ebi-gdp/hattivatti/releases) for the most recent
+stable versions of `hattivatti`. The development version can be run with:
+
+```
+$ git clone https://github.com/ebi-gdp/hattivatti.git --branch dev
+$ cargo run
+```
+
+## Documentation
+
+```
+$ cargo doc --open
+```
+
+## Deployment notes
+
+Puhti is currently on RHEL 7 with an old version of glibc.
+
+GitHub Actions builds with rust-buster to match glibc version (2.28).
+
+### Cronjob
+
+cron shell doesn't load much:
+
+```
+$ # load 'module' command
+$ source /appl/profile/zz-csc-env.sh
+```
+
+### Set environment variables
+
+Sensitive variables:
+
+```
+$ export GLOBUS_SECRET_TOKEN=<...>
+$ export AWS_ACCESS_KEY_ID=<...>
+$ export AWS_SECRET_ACCESS_KEY=<...>
+$ export NXF_SINGULARITY_CACHEDIR=<...>
+```
+
+Configuration variables:
+
+```
+$ export RUST_LOG=info
+$ export NXF_SINGULARITY_CACHEDIR=<path>
+```
+
+### Clone pgsc_calc
+
+```
+$ cd /scratch/project_XXXXXX/
+$ nextflow clone https://github.com/PGScatalog/pgsc_calc.git
+```
+
+### Run hattivatti
+
+```
+$ hattivatti --schema-dir repo/data/schemas --work-dir work
+```
+
+### Backup database (optional)
+
+After hattivatti executes, the database will have no connections.
+
+```
+$ module load allas
+$ rclone copy work/hattivatti.db s3allas://bucket/hattivatti/hattivatti.db
+```
+
+### Software dependencies
+
+* `curl`
+* `jq`
+* `nextflow`
+  * `java 16`
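
Because the deployment notes say the cron shell loads very little, a cron job may start without the variables from "Set environment variables". The sketch below shows one way a wrapper script might check for them before launching `hattivatti`. The function name `check_required_env` is hypothetical and is not part of `hattivatti`. Only the variable names in the usage note come from the README.

```shell
#!/bin/sh
# Hypothetical pre-flight check (an assumption, not shipped with hattivatti).
# Reports each named variable that is unset or empty on stderr and
# returns non-zero if at least one is missing.
check_required_env() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing environment variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A cron wrapper could call `check_required_env GLOBUS_SECRET_TOKEN AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY NXF_SINGULARITY_CACHEDIR || exit 1` after sourcing `/appl/profile/zz-csc-env.sh` and before invoking `hattivatti`, so that a missing secret fails early, before any job is submitted.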