This Snakemake workflow automatically generates all results and figures from the initial Varlociraptor paper focusing on somatic variant calling.
Rerunning the workflow requires a lot of computation time and some unavoidable external resources that have to be deployed manually. We therefore hope that the Snakemake report in the supplementary material of the paper, which provides all results together with comprehensive provenance information (workflow steps, parameters, software versions, code), will already yield sufficient information in most cases. If you nevertheless intend to rerun the analysis, feel free to follow the steps below, and please inform us about any issues.
The workflow requires a 64-bit Linux installation with GLIBC 2.5 or newer (i.e. any Linux distribution that is newer than CentOS 6). Note that the restriction of this workflow to Linux is purely a design decision (to save space and ensure reproducibility) and not related to Conda/Bioconda. In general, Bioconda packages are available for both Linux and MacOS.
This workflow can be used to recreate all results found in the paper.
If you are on a Linux system with GLIBC 2.5 or newer (i.e. any Linux distribution that is newer than CentOS 6), you can simply install Miniconda3 with
curl -o /tmp/miniconda.sh https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh && bash /tmp/miniconda.sh
Make sure to answer yes to the question whether your PATH variable shall be modified.
Afterwards, open a new shell/terminal.
Otherwise, e.g., on MacOS or if you don't want to modify your system setup, install Docker, run
docker run -it continuumio/miniconda3 /bin/bash
and execute all the following steps within that container.
If you want to use an existing Miniconda installation, please be aware that this is only possible if it uses Python 3 by default. You can check this via
python --version
Further, ensure it is up to date with
conda update --all
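The Python 3 check above can also be scripted. The following is a minimal sketch, not part of the workflow itself; the helper name `python_major` is hypothetical and only serves as an illustration.

```shell
# Hypothetical helper (not part of the workflow): extract the major version
# from a string such as "Python 3.7.1", as printed by `python --version`.
python_major() {
    # Strip the leading "Python " prefix, then everything from the first dot on.
    version="${1#Python }"
    echo "${version%%.*}"
}

# Only run the check if a `python` executable is on the PATH.
if command -v python >/dev/null 2>&1; then
    # Python 2 prints its version to stderr, so capture both streams.
    if [ "$(python_major "$(python --version 2>&1)")" = "3" ]; then
        echo "Default python is Python 3"
    else
        echo "Default python is not Python 3" >&2
    fi
fi
```

If the check reports that the default is not Python 3, the existing installation is not suitable and a fresh Miniconda3 installation (or the Docker route) is the safer choice.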
Set up Bioconda with
conda config --add channels defaults
conda config --add channels conda-forge
conda config --add channels bioconda
Install Snakemake >=5.4.0 with
conda install snakemake
If you already have an older version of Snakemake, please make sure it is updated to >=5.4.0.
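To verify the version requirement automatically, something along the following lines may help. This is a sketch under assumptions: the helper `version_ge` is hypothetical, and it relies on `sort -V` (version sort) from GNU coreutils, which should be available on the Linux systems targeted here.

```shell
# Hypothetical helper (not part of the workflow): succeed if version $1
# is greater than or equal to version $2.
version_ge() {
    # Version-sort both strings; $1 >= $2 holds exactly if $2 sorts first.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Only run the check if Snakemake is installed.
if command -v snakemake >/dev/null 2>&1; then
    if version_ge "$(snakemake --version)" "5.4.0"; then
        echo "Snakemake is recent enough"
    else
        echo "Please update Snakemake to >=5.4.0" >&2
    fi
fi
```

A plain string comparison would not work here, since for example 5.10.0 is newer than 5.4.0 but sorts before it alphabetically; version sort handles this case.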
First, create a working directory:
mkdir varlociraptor-workflow
cd varlociraptor-workflow
Then, download the workflow archive from https://doi.org/10.5281/zenodo.3361700 and unpack it with
tar -xf workflow.tar.gz
In this special case there are unfortunately unavoidable additional requirements, due to licensing restrictions and data size.
- The required real data has to be obtained from EGA (EGAD00001002142). After downloading it, edit the config.yaml so that the corresponding entries point to the right paths.
- The required simulated data has to be obtained from Zenodo: https://doi.org/10.5281/zenodo.1421298. Download it, convert it back to BAM, and edit the config.yaml so that the corresponding entries point to the right paths.
- The synthetic data can be obtained by running a separate workflow: https://doi.org/10.5281/zenodo.3630241.
Once all data is obtained, edit the config.yaml under datasets: to point to the right file paths in your system.
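Since a wrong path only surfaces once the workflow runs, it can be worth checking the files up front. The sketch below assumes nothing about the layout of config.yaml: the helper `report_missing` is hypothetical, and the paths in the example call are placeholders that have to be replaced with the paths you actually entered.

```shell
# Hypothetical helper (not part of the workflow): report every given path
# that is not an existing regular file; return non-zero if any is missing.
report_missing() {
    missing=0
    for path in "$@"; do
        if [ ! -f "$path" ]; then
            echo "missing: $path" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example call with placeholder paths (replace them with your own):
# report_missing /path/to/tumor.bam /path/to/normal.bam
```

The function prints one line per missing file to stderr, so all problems are listed in a single pass rather than one at a time.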
Execute the analysis workflow with Snakemake
snakemake --use-conda
Please be patient while the analysis finishes: as noted above, rerunning the workflow requires a lot of computation time.
Results can be found in the folder figs/.
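If you have been running the workflow in the Docker container (see above), you can obtain the results with
docker cp <container-id>:/varlociraptor-workflow/figs .
with <container-id> being the ID of the container. The path assumes that the working directory was created in the root directory of the container, as in the steps above; adjust it if you created the working directory elsewhere.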
- If you see an error like
ImportError: No module named 'appdirs'
when starting Snakemake, you are likely suffering from a bug in an older conda version. Make sure to update your conda installation with
conda update --all
and then reinstall the appdirs and snakemake packages with
conda install -f appdirs snakemake
- If you see an error like
ImportError: Missing required dependencies ['numpy']
you are likely suffering from a bug in an older conda version. Make sure to update your conda installation with
conda update --all
and then reinstall the snakemake package with
conda install -f snakemake