Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deletions in .r generate warning for auto-mapping to long form of .c (eg .r:6_8del -> .c:6_8delTTG) when the original is valid #584

Open
John-F-Wagstaff opened this issue Feb 5, 2024 · 6 comments

Comments

@John-F-Wagstaff
Copy link
Collaborator

John-F-Wagstaff commented Feb 5, 2024

Describe the bug
we are getting warnings out like "NM_004006.2:r.6_8del automapped to NM_004006.2:c.6_8delTTG" for .r type variants when NM_004006.2:r.6_8del is valid and more concise. The more concise form is preferred so we do not want to include the extended form in warnings (unless it is somehow needed to describe the error). We possibly don't want to warn on auto-mapping as it is not an error. We already filter auto-mapping warnings out in some of our endpoints, since any warning that is not a warning may confuse the users. If we do want to return a message we may want to tweak the warning message to flag that it is informative rather than cautionary.

To Reproduce
input a deletion like NM_004006.2:r.6_8del into vv

import json
import VariantValidator
var = vval.validate(json.dumps(['NM_004006.2:r.6_8del']),'GRCh38',select_transcripts='all',liftover_level=None)
print(var.format_as_dict(with_meta=True))

{'flag': 'gene_variant', 'NM_004006.2:c.6_8del': {'selected_assembly': 'GRCh38', 'submitted_variant': 'NM_004006.2:r.6_8del', 'gene_symbol': 'DMD', 'gene_ids': {'hgnc_id': 'HGNC:2928', 'entrez_gene_id': '1756', 'ensembl_gene_id': 'ENSG00000198947', 'ucsc_id': 'uc004dda.2', 'omim_id': ['300377'], 'ccds_ids': ['CCDS14233', 'CCDS55395', 'CCDS14232', 'CCDS48091', 'CCDS94585', 'CCDS14231', 'CCDS94586', 'CCDS14234', 'CCDS14230', 'CCDS14229', 'CCDS55394']}, 'annotations': {'db_xref': {'CCDS': 'CCDS14233.1', 'select': 'RefSeq', 'ncbigene': '1756', 'ensemblgene': None, 'hgnc': 'HGNC:2928'}, 'chromosome': 'X', 'map': 'Xp21.2-p21.1', 'note': 'dystrophin', 'variant': 'DP427M', 'refseq_select': True, 'mane_select': False, 'ensembl_select': False, 'mane_plus_clinical': False}, 'transcript_description': 'Homo sapiens dystrophin (DMD), transcript variant Dp427m, mRNA', 'hgvs_transcript_variant': 'NM_004006.2:c.6_8del', 'rna_variant_descriptions': {'usage_warnings': ['RNA (r.) descriptions are independent of cDNA descriptions (c.)', 'RNA descriptions must only be used if the RNA has been sequenced and must not be inferred from a cDNA description', 'c. and g. descriptions provided by VariantValidator must only be used if the DNA sequence has been confirmed'], 'rna_variant': 'NM_004006.2:r.6_8del', 'translation': 'NP_003997.1:p.Trp4del', 'translation_slr': 'NP_003997.1:p.W4del'}, 'genome_context_intronic_sequence': '', 'refseqgene_context_intronic_sequence': '', 'hgvs_refseqgene_variant': 'NG_012232.1:g.133303_133305del', 'hgvs_predicted_protein_consequence': {'tlr': 'NP_003997.1:p.(Trp4del)', 'slr': 'NP_003997.1:p.(W4del)', 'lrg_tlr': 'LRG_199p1:p.(Trp4del)', 'lrg_slr': 'LRG_199p1:p.(W4del)'}, 'validation_warnings': ['NM_004006.2:r.6_8del automapped to NM_004006.2:c.6_8delTTG', 'A more recent version of the selected reference sequence NM_004006.2 is available (NM_004006.3): NM_004006.3:c.6_8del MUST be fully validated prior to use in reports: select_variants=NM_004006.3:c.6_8del'], 'hgvs_lrg_transcript_variant': 'LRG_199t1:c.6_8del', 'hgvs_lrg_variant': 'LRG_199:g.133303_133305del', 'alt_genomic_loci': [], 'primary_assembly_loci': {'grch38': {'hgvs_genomic_description': 'NC_000023.11:g.33211305_33211307del', 'vcf': {'chr': 'X', 'pos': '33211304', 'ref': 'CCAA', 'alt': 'C'}}, 'hg38': {'hgvs_genomic_description': 'NC_000023.11:g.33211305_33211307del', 'vcf': {'chr': 'chrX', 'pos': '33211304', 'ref': 'CCAA', 'alt': 'C'}}}, 'variant_exonic_positions': {'NC_000023.11': {'start_exon': '1', 'end_exon': '1'}, 'NG_012232.1': {'start_exon': '1', 'end_exon': '1'}}, 'reference_sequence_records': {'transcript': 'https://www.ncbi.nlm.nih.gov/nuccore/NM_004006.2', 'protein': 'https://www.ncbi.nlm.nih.gov/nuccore/NP_003997.1', 'refseqgene': 'https://www.ncbi.nlm.nih.gov/nuccore/NG_012232.1', 'lrg': 'http://ftp.ebi.ac.uk/pub/databases/lrgex/LRG_199.xml'}}, 'metadata': {'variantvalidator_version': '2.2.1.dev521+g1d4df7a', 'variantvalidator_hgvs_version': '2.2.1.dev0+g69b1a7c.d20240131', 'vvta_version': 'vvta', 'vvseqrepo_db': 'VV_SR_2024_01/master', 'vvdb_version': 'vvdb_2022_11'}}

Expected behavior
NM_004006.2:r.6_8del is valid and should be returned, without warning or with an eddited warning

@Peter-J-Freeman
Copy link
Collaborator

Not quite correct. All r. variants are mapped onto c. for processing because hgvs.py does not correctly handle r. variants.

There are 2 issues. The warnings should never contain the excess bases e.g. delTTAT shoud be del
Do we keep warning about r. to c. as it may be confusing to users

@John-F-Wagstaff
Copy link
Collaborator Author

John-F-Wagstaff commented Feb 6, 2024

OK, I have edited a bit to make it more clear, also add the implication that we might want to keep the warning message in some form, I had thought you where more emphatic about wanting rid of it.

Currently we don't have any feature to handle user warnings that would be flagged with INFO in our logs, or not printed depending on log level. It is possible that we might want to add some kind of handling for non critical informational warnings, at least in the long term, if we want to keep these type of warning in, rather than filtering all of the output.
edit: do you want me to submit this as a low priority feature request, separate to this bug?

@ifokkema
Copy link
Collaborator

ifokkema commented Feb 6, 2024

FYI: The LOVD3 VV library tosses all "automapped to..." messages because it assumes they are just internal messages about automapping. So, for us, they have no value.

@Peter-J-Freeman
Copy link
Collaborator

I'm wondering if they have any real value either. @John-F-Wagstaff we could also just toss them

@John-F-Wagstaff
Copy link
Collaborator Author

John-F-Wagstaff commented Feb 6, 2024

We could, I am happy either way you decide on this.

A lot of them have the flavour of "if we get an error later than this will explain why/how we got to this point" but I am not sure how user-helpful they are in action. If we don't want to add a 'info' type flag for VV, to trigger optional extra warnings, then I can see that some (most/all?) of them might be better off just removed from the user visible output.

If we decide that they are not useful to the users, then they probably want to go in as a logger.info() call instead, for our internal use, if that does not cause issues itself.

@Peter-J-Freeman
Copy link
Collaborator

I think we will just filter out when I do an error message clean. We can the reinstate if needed. Don;t really use them for bug fixing either so limited internal use as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants