Pipeline QC comments #33

Open
7 tasks done
ao508 opened this issue Aug 9, 2019 · 0 comments
ao508 commented Aug 9, 2019

TODO

  • update MAF output filename to data_mutations_extended.txt
  • update MAF meta filename to meta_mutations_extended.txt
  • clinical file metadata header lines must start with #
  • CNA file header should use Hugo_Symbol for the gene column (it currently shows GENE SYMBOL)
  • CNA file for the test study has duplicate sample ids in its header. Headers for data types like CNA and expression should always be treated as if they might contain duplicate sample ids. If a duplicate is found, use only the first occurrence of that sample id.
  • extra file found in output directory called expression_cna.txt - this should not exist
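
A rough sketch of the "keep only the first sample id" rule from the duplicate-header item above, in case it helps. This is not the pipeline's code: the function name `dedupe_sample_columns`, the sample ids, and the values are all made up for illustration, and it assumes a tab-delimited file where every row has as many fields as the header.

```python
import csv
import io


def dedupe_sample_columns(rows):
    """Drop columns whose header repeats an earlier one, keeping the first."""
    header = rows[0]
    seen = set()
    keep = []  # indices of the columns to retain, in original order
    for idx, name in enumerate(header):
        if name not in seen:
            seen.add(name)
            keep.append(idx)
    # Apply the same column selection to the header and every data row.
    return [[row[i] for i in keep] for row in rows]


# Hypothetical CNA content; SAMPLE_A appears twice in the header.
raw = (
    "Hugo_Symbol\tSAMPLE_A\tSAMPLE_B\tSAMPLE_A\n"
    "TP53\t-1\t0\t2\n"
    "KRAS\t0\t1\t1\n"
)
rows = list(csv.reader(io.StringIO(raw), delimiter="\t"))
deduped = dedupe_sample_columns(rows)
for row in deduped:
    print("\t".join(row))
```

Selecting column indices once from the header and reusing them for every row keeps the data values aligned with the sample id that was kept, so the second SAMPLE_A column is dropped along with its values.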

For documentation:

  • add a reference to the cBioPortal file formats documentation in README.md so that anyone using the pipeline can consult the cBioPortal docs to understand the file formats generated by the GDC pipeline