HRA workflows for annotating h5ad files using different annotation tools.
- cwl-runner
- Docker
Docker images can be built locally by running ./scripts/build-containers.sh. By default, the script builds all containers. To build a single image or a subset of images, pass the container names as arguments to the build script, e.g. ./scripts/build-containers.sh azimuth gene-expression.
Download model data by running cwl-runner download-models.cwl.
The first step is to create a job file that specifies the inputs to the pipeline. The file can be written in either JSON or YAML.

    matrix:
      class: File
      path: path/to/data.h5ad
    organ: UBERON:0002048 # Uberon id for lung
    algorithms:
      # Algorithm specific options are documented in the container's options.yml
      - azimuth:
          referenceDataDir:
            class: Directory
            path: path/to/models/directory

After creating a job file, running the annotation tools is as simple as running cwl-runner pipeline.cwl my-job.yml (replace my-job.yml with your job file).
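
When many h5ad files need to be annotated, the job file can be generated instead of written by hand. The following is a minimal sketch in Python, assuming the JSON form of a job file mirrors the YAML example above key for key. The helper names `build_job` and `write_job` are hypothetical and not part of this repository, and all paths are placeholders.

```python
import json
from pathlib import Path


def build_job(matrix_path, organ, algorithms):
    """Return a job description shaped like the YAML example above.

    `algorithms` is a list of one-key dicts, e.g. [{"azimuth": {...}}],
    matching the list-of-mappings layout under `algorithms:`.
    """
    return {
        "matrix": {"class": "File", "path": str(matrix_path)},
        "organ": organ,
        "algorithms": algorithms,
    }


def write_job(job, out_path):
    """Serialize the job as JSON and return the path that was written."""
    out_path = Path(out_path)
    out_path.write_text(json.dumps(job, indent=2) + "\n", encoding="utf-8")
    return out_path


# Same values as the example job file; the paths are placeholders.
azimuth_options = {
    "referenceDataDir": {"class": "Directory", "path": "path/to/models/directory"}
}
job = build_job("path/to/data.h5ad", "UBERON:0002048", [{"azimuth": azimuth_options}])

if __name__ == "__main__":
    print(write_job(job, "my-job.json"))
```

The generated file would then be passed to the pipeline in place of my-job.yml, for example cwl-runner pipeline.cwl my-job.json.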
An annotation tool generally has the following file structure:

    containers/
      my-annotation-tool/
        Dockerfile
        options.yml
        pipeline.cwl
        download-data.cwl (optional)
        context/*
          code and assets...

Each file serves the following purpose:
- Dockerfile - Instructions for building the Docker image.
- options.yml - CWL definition of the tool-specific options.
- pipeline.cwl - Main CWL pipeline for running the tool.
  - 3 inputs: "matrix", "organ", and "options"
  - 3 outputs: "annotations", "annotated_matrix", and "report"
- download-data.cwl (optional) - Downloads models and other data required for running the tool.
  - Implement this pipeline when the model data is too large to embed directly in the Docker image.
- context/* - Directory containing the code and assets that implement the tool.
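
Before wiring a new tool into the main pipeline, it can help to confirm that its directory matches the layout described above. The sketch below is one possible check, not something shipped with this repository: the function name `check_tool_layout` is hypothetical, and the file names come from the list above. Because download-data.cwl is optional, it is deliberately not checked.

```python
import sys
from pathlib import Path

# Required file names, taken from the layout described above.
REQUIRED_FILES = ("Dockerfile", "options.yml", "pipeline.cwl")


def check_tool_layout(tool_dir):
    """Return a list of problems found in a tool directory (empty if none)."""
    tool_dir = Path(tool_dir)
    problems = [
        f"missing file: {name}"
        for name in REQUIRED_FILES
        if not (tool_dir / name).is_file()
    ]
    # The context directory holds the code and assets implementing the tool.
    if not (tool_dir / "context").is_dir():
        problems.append("missing directory: context")
    return problems


if __name__ == "__main__":
    # Each command-line argument is treated as a tool directory to check.
    for tool in sys.argv[1:]:
        issues = check_tool_layout(tool)
        print(tool, "OK" if not issues else "; ".join(issues))
```

Saved under a name of your choosing (check_layout.py here is only an example), it could be run as python check_layout.py containers/my-annotation-tool.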
After implementing a new algorithm, a few changes are needed to enable the tool in the main pipeline. The files to update are pipeline.cwl, ./steps/annotate.cwl, and ./steps/run-one.cwl. Once the new tool has been added to the top-level pipeline, it can be used by specifying it in a job file:

    matrix:
      class: File
      path: path/to/data.h5ad
    organ: UBERON:0002048 # Uberon id for lung
    algorithms:
      - my-annotation-tool:
          # Options specific to my-annotation-tool
          option1: value1
          option2: value2
          ...