Intro
Context
Large-scale numerical experiments are central to much of contemporary scientific and mathematical research. Performing these experiments in a valid, reproducible, and scalable fashion is not easy. Even a small, typical project may need thousands of executions, and 10,000 or more is not uncommon. Good tools to coordinate and organize these experiments are therefore crucial.
nf-nest
This webpage documents nf-nest, a collection of small but powerful utilities built on Nextflow to help accomplish this. Link to GitHub repository.
Aspects taken into account:
- Automating cross-product of input parameters in experiments.
- Automating creation of submission scripts and ordering of jobs (taking care of moving files across file systems, dynamic memory requirements, etc.).
- Automating the gathering of results from many runs.
- Caching jobs that have already run.
- Robustness to failure.
- Reproducibility via Apptainer and Docker, supporting both x86 and Apple silicon.
- Support for GPU programming.
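To make the first item in the list above concrete, here is a minimal, language-agnostic sketch of what a cross-product of input parameters means. It is written in plain Python and is not the nf-nest or Nextflow API; the parameter names (`seed`, `n_chains`, `model`) and the helper `cross_product` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical parameter grid; the names and values are illustrative only.
grid = {
    "seed": [1, 2, 3],
    "n_chains": [10, 20],
    "model": ["normal", "funnel"],
}


def cross_product(grid):
    """Return one dict per combination of parameter values."""
    names = list(grid)
    value_lists = (grid[name] for name in names)
    return [dict(zip(names, values)) for values in product(*value_lists)]


configs = cross_product(grid)
print(len(configs))  # 3 * 2 * 2 = 12 combinations
print(configs[0])    # {'seed': 1, 'n_chains': 10, 'model': 'normal'}
```

Each resulting dictionary corresponds to one execution of the experiment. The number of executions is the product of the number of values per parameter, which is how even a small project quickly reaches thousands of runs.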
Technology stack
nf-nest uses the following open source projects:
- Nextflow: can be thought of as an “operating system” for coordinating numerical experiments.
- Julia: a programming language to unlock full access to high performance computation on both CPUs and GPUs.
While some features of nf-nest are Julia-specific, other parts are language-agnostic.
Background
Scientific workflow
A scientific workflow is a directed acyclic graph in which each node is a process, and each edge from node \(n\) to node \(n'\) denotes that at least one output of process \(n\) is fed as an input to process \(n'\).
Here is an example from a workflow covered later in this tutorial:
In a nutshell, Nextflow will submit one or more SLURM jobs for each node in this graph, gather the results, and produce some nice reports.
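The graph definition above can be illustrated with a small sketch. It uses Python's standard `graphlib` module (Python 3.9 or later) and does not show how Nextflow represents workflows internally; the process names (`compile`, `run`, `aggregate`, `plot`) and the helper `execution_order` are hypothetical. The point is only that a directed acyclic graph always admits an ordering in which every process runs after the processes that feed it.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each key is a process, and each value is the set of
# processes whose outputs it consumes (its predecessors in the graph).
workflow = {
    "compile": set(),
    "run": {"compile"},
    "aggregate": {"run"},
    "plot": {"aggregate"},
}


def execution_order(dag):
    """Return a process ordering in which every process follows its inputs."""
    return list(TopologicalSorter(dag).static_order())


order = execution_order(workflow)
print(order)  # ['compile', 'run', 'aggregate', 'plot']
```

If the graph contained a cycle, `static_order` would raise `graphlib.CycleError`, which is why the definition requires the graph to be acyclic.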
More information
Both Nextflow and Julia have excellent and extensive documentation.
See also this Nextflow tutorial.