Improve dataset/source info and remove duplication #7

Open
danielabutano opened this issue May 27, 2020 · 0 comments
Labels
enhancement New feature or request

Comments

Member

danielabutano commented May 27, 2020

  • There are 2 FlyBase Controlled Vocabulary datasets with no entities associated. Delete them from datasets.xml?

  • GO and The Gene Ontology are probably the same. To merge them, add the dataset (and datasource) properties in project.xml and set them to the same values as in datasets.xml, for example The Gene Ontology (The Gene Ontology Consortium).

  • GO Annotation data set and GO Annotation for Drosophila melanogaster are probably the same. In this converter the dataset/source are hardcoded, so the only way to merge them is to use the same values in datasets.xml. Use GO Annotation data set and GO Annotation in datasets.xml.

  • Human gene identifiers and NCBI Entrez Gene identifiers are probably the same.
    In this converter the dataset/source are hardcoded, so the only way to merge them is to use the same values.
    Use NCBI Entrez Gene identifiers and NCBI in datasets.xml.

  • NCBI PubMed to gene mapping and PubMed to gene mapping are probably the same. In this converter the dataset/source are hardcoded, so the only way to merge them is to use the same values.
    Use PubMed to gene mapping and NCBI in datasets.xml.

  • Sequence Ontology and The Sequence Ontology are probably the same. To merge them, add the dataset and datasource properties in project.xml and set them to the same values as in datasets.xml.

  • UniProt data set and Uniprot data set are probably the same. The names are case sensitive; use UniProt data set and UniProt in datasets.xml.

  • WormBase gene identifiers and Wormbase gene identifiers are probably the same. The names are case sensitive; use WormBase gene identifiers and WormBase in datasets.xml.

  • miRBase Targets and microRNA Targets are probably the same. To merge them, set gff3.dataSetTitle and gff3.dataSourceName in project.xml to the same values as in datasets.xml.
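
Some of the pairs above differ only by letter case or by a leading "The", so candidates like these could probably be flagged mechanically before the values are aligned by hand. Below is a minimal sketch of that idea, not part of any existing tooling: the function names `normalise` and `find_duplicate_candidates` are hypothetical, and the input list is typed in from this issue rather than read from datasets.xml or the database.

```python
from collections import defaultdict


def normalise(name: str) -> str:
    """Build a comparison key: collapse whitespace, lower-case, drop a leading 'the '."""
    key = " ".join(name.split()).lower()
    if key.startswith("the "):
        key = key[len("the "):]
    return key


def find_duplicate_candidates(names):
    """Group names that share a normalised key; return groups with more than one distinct spelling."""
    groups = defaultdict(list)
    for name in names:
        groups[normalise(name)].append(name)
    return [sorted(set(group)) for group in groups.values() if len(set(group)) > 1]


# Dataset names copied from the list above (assumed spelling, for illustration only).
names = [
    "UniProt data set",
    "Uniprot data set",
    "WormBase gene identifiers",
    "Wormbase gene identifiers",
    "Sequence Ontology",
    "The Sequence Ontology",
    "GO",
    "The Gene Ontology",
]

for group in find_duplicate_candidates(names):
    print(group)
```

This only catches spelling-level variants. Pairs such as GO / The Gene Ontology or miRBase Targets / microRNA Targets have genuinely different names and would still need a manual check.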

danielabutano added the enhancement (New feature or request) label May 27, 2020