Incorrect Annotation for Frameshift Variant in VEP 113 #1796

Open
GSYongWu opened this issue Nov 19, 2024 · 7 comments

@GSYongWu

Hi,
I would like to report an issue with the variant annotation in VEP version 113. Specifically, variants that should be annotated as frameshift_variant are being incorrectly annotated as inframe_insertion.

Example Variant:

Variant: 17:58740533_A>AAAGCCCTGACTTTAAGGATACATGATTC
Current VEP 113 Annotation: inframe_insertion & stop_retained_variant
VEP 111 Annotation: stop_gained & frameshift_variant

HGVSp Results:

  • VEP 113 HGVSp: p.S489_L490insSPDFKDT*
  • VEP 111 HGVSp: p.L490Sfs*8

It appears that the results from VEP 111 are correct.
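As a quick sanity check on the expected consequence, the net length of the insertion can be compared against the codon size. The following is a minimal Python sketch, not part of VEP; the helper name `classify_indel` is hypothetical. It assumes the whole inserted sequence falls within coding sequence and ignores splice sites, strand, and any normalization or shifting.

```python
def classify_indel(ref: str, alt: str) -> str:
    """Classify a VCF-style REF/ALT pair by its net length change only.

    Assumption: the change lies entirely in coding sequence. This is a
    rough plausibility check, not a consequence predictor.
    """
    net = len(alt) - len(ref)
    if net == 0:
        return "substitution"
    kind = "insertion" if net > 0 else "deletion"
    # A net change that is not a multiple of 3 disrupts the reading frame.
    frame = "in-frame" if net % 3 == 0 else "frameshift"
    return f"{frame} {kind} ({abs(net)} bp)"


# The variant reported above: 17:58740533_A>AAAGCCCTGACTTTAAGGATACATGATTC
print(classify_indel("A", "AAAGCCCTGACTTTAAGGATACATGATTC"))
```

The inserted sequence is 28 bp long, which is not a multiple of 3, so a length-based check points to a frameshift, in line with the VEP 111 annotation.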

Could you please investigate this discrepancy? Correct annotation of such variants is crucial for downstream analyses.
Thank you for your attention to this matter.

Best regards,

@olaaustine olaaustine self-assigned this Nov 19, 2024
@olaaustine
Contributor

Hi @GSYongWu,
I hope you are well.
Could you share more information so we can try to reproduce the issue, such as the assembly and your VEP command?
Thank you
Ola

@GSYongWu
Author

Hi,
I used the genome version GRCh37.
My command is:
"/usr/bin/perl ensembl-vep-release-113.0/vep --offline --no_stats --buffer_size 10000 --fork 4 --ccds --uniprot --hgvs --symbol --shift_3prime 1 --numbers --canonical --protein --biotype --hgvsg --variant_class --total_length --force_overwrite --allele_number --no_escape --vcf --dir vepdb --fasta genome/hs37d5.fa --format vcf --input_file clincal.merge.vcf --output_file clincal.merge.out.vep.vcf --refseq --use_given_ref --no_check_variants_order"

@olaaustine
Contributor

olaaustine commented Nov 20, 2024

Hi @GSYongWu,
We have been able to identify the issue. It lies in --shift_3prime, an improvement introduced to the code recently.
As a workaround while we get this sorted, can you run the command without --shift_3prime 1 to see if that fixes the problem?
Let us know if there are still any issues.
Thank you
Ola.
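For anyone scripting this workaround, here is a minimal Python sketch that drops an option and its value from a command string before rerunning it. The helper name `drop_option` is hypothetical, and the command shown is a shortened form of the one posted above, used only for illustration.

```python
import shlex


def drop_option(command: str, option: str, takes_value: bool = True) -> list[str]:
    """Return the command's arguments with `option` (and its value) removed."""
    cleaned = []
    skip_next = False
    for arg in shlex.split(command):
        if skip_next:
            # This argument is the value of the option we just dropped.
            skip_next = False
            continue
        if arg == option:
            skip_next = takes_value
            continue
        cleaned.append(arg)
    return cleaned


# Shortened version of the command from this thread (illustrative only).
original = (
    "perl ensembl-vep-release-113.0/vep --offline --hgvs "
    "--shift_3prime 1 --vcf --refseq"
)
print(shlex.join(drop_option(original, "--shift_3prime")))
```

The resulting argument list can be passed to `subprocess.run` as is. The sketch assumes the option and its value are separate space-delimited tokens, as in the command above.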

@olaaustine olaaustine added the bug label Nov 20, 2024
@GSYongWu
Author

Hi, Ola
Thank you very much for your reply.

I tried removing this parameter, and the issue with the reported mutation sites was indeed resolved. However, other sites were affected. For example, at the site 7:55248981_T>TCCAGGAAGC, the HGVSp should be p.A763_Y764insQEA, but after removing the --shift_3prime parameter, the HGVSp is empty. The consequence also changed from inframe_insertion to splice_region_variant.

Best regards,

@olaaustine
Contributor

Hi @GSYongWu,
I hope you are well.
Thank you for your response and for letting us know that the workaround fixes the first problem.
Regarding the variant mentioned above: its HGVSc on the Ensembl transcript is
ENST00000275493.2:c.2284-3_2289dup, so it affects a splice site.
This annotation is consistent across the different releases mentioned above without the --shift_3prime parameter.
Let me know if this helps.
Thank you
Ola.
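To make the HGVSc above concrete, the following minimal Python sketch counts how many duplicated bases lie in the intron and how many in the exon. It is not VEP code; the name `dup_span` is hypothetical, and the pattern handles only the simple form seen here (a start with an upstream intronic offset and an exonic end in the same exon).

```python
import re

# Matches only the simple form seen in this thread: c.<pos>-<offset>_<pos>dup
PATTERN = re.compile(r"c\.(\d+)-(\d+)_(\d+)dup$")


def dup_span(hgvsc: str) -> tuple[int, int]:
    """Return (intronic_bases, exonic_bases) covered by the duplication.

    Assumption: the start lies in the intron upstream of the exon that
    contains the end position.
    """
    match = PATTERN.search(hgvsc)
    if match is None:
        raise ValueError(f"unsupported HGVSc form: {hgvsc}")
    start, offset, end = (int(group) for group in match.groups())
    intronic = offset          # c.2284-3, c.2284-2, c.2284-1
    exonic = end - start + 1   # c.2284 through c.2289
    return intronic, exonic


print(dup_span("ENST00000275493.2:c.2284-3_2289dup"))
```

Under these assumptions the duplication covers 3 intronic and 6 exonic bases, 9 in total, which matches the 9 inserted bases in 7:55248981_T>TCCAGGAAGC.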

@GSYongWu
Author

Hi, Ola
Thank you for your response.
This is unrelated to the version; I am discussing what the correct result for this mutation should be.
I think this result is incorrect. The mutation is a duplication (dup): it does not change the splice site, but it does alter the protein-coding sequence without shifting the frame. There should therefore be an HGVSp, and the consequence should be inframe_insertion.
There is a similar mutation in COSMIC: 7:55248980_C>CTCCAGGAAGCCT
[Image: WeCom screenshot_17322628226573]
If removing the --shift_3prime parameter in different versions still does not yield the correct results, then I think this parameter is essential for me.

Best regards,

@olaaustine
Contributor

Hi @GSYongWu,
I mentioned versions because, although the first variant described is a bug, the annotation for the variant 7:55248981_T>TCCAGGAAGC remains consistent across different versions.
To understand how VEP handles shifting, please take a look at this documentation.
Let us know if you have any more questions
Thank you
Ola
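For readers unfamiliar with 3' shifting in general, the idea can be sketched on a toy sequence: an insertion in a repetitive region can be moved rightward without changing the resulting sequence. The Python below is a minimal illustration only and is not VEP's implementation. The reference string is invented, the name `shift_insertion_right` is hypothetical, and strand and transcript structure are ignored.

```python
def shift_insertion_right(ref: str, index: int, inserted: str) -> tuple[int, str]:
    """Move an insertion as far right as possible along `ref`.

    `index` is the 0-based position in `ref` before which `inserted` is
    placed. Each step rotates the inserted sequence by one base, so the
    resulting full sequence is unchanged.
    """
    while index < len(ref) and ref[index] == inserted[0]:
        inserted = inserted[1:] + inserted[0]
        index += 1
    return index, inserted


def apply_insertion(ref: str, index: int, inserted: str) -> str:
    """Build the sequence that results from the insertion."""
    return ref[:index] + inserted + ref[index:]


toy_ref = "AACAGCAGTT"  # invented sequence with a CAG repeat
new_index, new_seq = shift_insertion_right(toy_ref, 2, "CAG")

# Both placements describe the same resulting sequence.
assert apply_insertion(toy_ref, 2, "CAG") == apply_insertion(toy_ref, new_index, new_seq)
print(new_index, new_seq)
```

Because the two placements are equivalent at the sequence level, the position at which a tool reports the insertion (for example, near a splice site versus inside the exon) can change the consequence it assigns.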
