A Reproducible Pipeline for Untargeted Metabolomics Data Analysis.
This program is released as open source software under the terms of the MIT License.
Please refer to our wiki for how to install and use RUMP.
RUMP can display usage information on the command line:
nextflow main.nf --help true
RUMP accepts .mzXML and .mzXL files. Files are processed in parallel using MZmine-2.53; several statistics are calculated with Python 3 code; an interactive report is generated with MultiQC; pathway analysis is done with mummichog; and unknown metabolites are searched with CEU Mass Mediator. Note that the processes related to the unknown search with CEU Mass Mediator are turned off by default because the server is unstable; they can be turned on by setting the parameter --unknown_search to "1".
- Student's t-test: Tests whether there is a statistically significant difference in peak intensities between the two groups of samples.
- Venn diagram: Reports the number of peaks that are significantly enriched in one of the groups, and the number of peaks that show no significant difference between the two groups.
- Principal component analysis: Performs dimensionality reduction on the peak intensities of the two sample groups and visualizes the difference.
- Hierarchical clustering: Clusters all samples and plots a heatmap to show the differences between samples and peaks.
- Bar plot: Plots the metabolites with the top-10 and bottom-10 fold changes for the comparison between the two groups. (Note: the figure will display abnormally if there is an infinite fold-change value.)
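The t-test and fold-change steps above can be illustrated with a minimal, standard-library sketch. This is not RUMP's own code: the function names (`student_t`, `fold_change`, `top_and_bottom`) are hypothetical, fold change is assumed here to be a ratio of mean intensities, and the p-value lookup is left out because it needs a t distribution that the standard library does not provide. The last function shows one possible way to keep infinite fold changes out of a top-10/bottom-10 selection.

```python
import math
from statistics import mean, variance


def student_t(group_a, group_b):
    """Pooled-variance (Student's) t statistic and degrees of freedom.

    Assumes each group has at least two samples and that the pooled
    variance is non-zero.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


def fold_change(group_a, group_b):
    """Ratio of mean intensities (an assumption about the definition).

    Returns infinity when only group_b has a zero mean, and NaN when
    both means are zero.
    """
    mean_a, mean_b = mean(group_a), mean(group_b)
    if mean_b == 0:
        return math.inf if mean_a > 0 else math.nan
    return mean_a / mean_b


def top_and_bottom(fold_changes, n=10):
    """Drop non-finite fold changes, then return the n largest and n smallest peaks."""
    finite = {peak: fc for peak, fc in fold_changes.items() if math.isfinite(fc)}
    ranked = sorted(finite, key=finite.get, reverse=True)
    return ranked[:n], ranked[-n:]
```

With fewer than 2n finite values, the two returned lists overlap, so a caller may want to deduplicate before plotting.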
Logs and error reports are stored under the logs/ folder after running.
Run the following command to clean all the files generated by Nextflow:
bash clear.sh
RUMP returns the following exit status values:
- 3: Positive file groups are not the same as negative file groups, please check design files.
- 4: Not all input files are in .mzXML format, please check input data folders.
- 5: One or more input files do not exist.
- Other Linux reserved exit codes
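For scripted runs, the exit status values listed above can be turned into readable messages. The following is a minimal sketch, not part of RUMP: `describe_exit_status` and `run_rump` are hypothetical helpers, and the sketch assumes that the status reported by the `nextflow` command is the value documented above.

```python
import subprocess

# Messages taken from the exit status list above.
RUMP_EXIT_CODES = {
    3: "Positive file groups are not the same as negative file groups; check the design files.",
    4: "Not all input files are in .mzXML format; check the input data folders.",
    5: "One or more input files do not exist.",
}


def describe_exit_status(code):
    """Map an exit status to a human-readable message."""
    if code == 0:
        return "Pipeline finished successfully."
    # Anything else is treated as a general (for example Linux reserved) exit code.
    fallback = f"Exit status {code} is not RUMP-specific; check the logs/ folder."
    return RUMP_EXIT_CODES.get(code, fallback)


def run_rump(extra_args=()):
    """Hypothetical wrapper: run the pipeline and explain its exit status.

    Assumes nextflow is on PATH and main.nf is in the current directory.
    """
    result = subprocess.run(["nextflow", "main.nf", *extra_args])
    return result.returncode, describe_exit_status(result.returncode)
```

A call such as `run_rump(["--unknown_search", "1"])` would pass the parameter described earlier; any other arguments a real run needs are not shown here.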
Please submit questions, bug reports and feature requests to the issue tracker on GitHub.
- SECIM Core provided valuable guidance related to parameter settings of data processing.
- Sample data is from Metabolomics Workbench PR000188.
- Data used for continuous integration is from MetaboLights MTBLS146.