Change URLs to DOIs in <doi> field #1621
Build successful. You can preview it here: https://preview.aclanthology.org/fix-paclic-dois
Let's also adjust the schema to catch this kind of mistake in the future.
First, I made this "request changes" because I think some discussion is needed.
Yes, the DOI service resolves IDs such as 2065/12156. No, this is not a valid DOI. The DOI handbook states:
The DOI prefix shall be composed of a directory indicator followed by a registrant code. These two components shall be separated by a full stop (period).
The directory indicator shall be "10"
In other words: every DOI needs to start with the character sequence "10.".
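To make the rule concrete, here is a minimal sketch of such a check in Python. The pattern and the function name `looks_like_doi` are assumptions for illustration and not code from this repository; the sketch also assumes a registrant code is numeric and may be subdivided by further full stops.

```python
import re

# Assumption: directory indicator "10", a full stop, a numeric registrant code
# (optionally subdivided by further full stops), a slash, then a non-empty suffix.
DOI_PATTERN = re.compile(r"^10\.\d+(\.\d+)*/\S+$")


def looks_like_doi(value: str) -> bool:
    """Return True if value has the basic shape of a DOI (prefix "10.", then a suffix)."""
    return DOI_PATTERN.match(value) is not None
```

Under this check, a handle such as 2065/12156 is rejected because it lacks the "10." prefix, and so is any value that still carries a resolver URL in front of the identifier.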
Someone obviously put in the handle URI for these papers. We could 1) keep them in the doi field because we know they (currently!) resolve even though they are not DOIs, 2) find out whether they have proper DOIs and insert them, or 3) remove the DOI fields and maybe add the handle URI as some other field.
It is too late for me to form a definitive opinion, but I would be hesitant to put a non-DOI identifier into a DOI field. DOIs are specifically made to be exact and we would water that down.
Yes, I was about to write the same thing while you were posting this, @akoehn. :) Funnily enough, even the currently generated nonsense link on the website resolves: In that case, I'm not sure we currently have a mechanism to handle these cases. I think the DOI field is currently the only way to link to an external website like that. |
So are there valid DOIs for these, then?
No, but those are valid permanent identifiers. Maybe we should treat the field that way instead of only supporting doi.
I think we should
We can split this into two steps, for example doing (1) in this PR and then adding (2) later when someone has time. Reading the doc raises a separate question: we generate our DOI suffixes as |
We could also do something like |
Making a new field would be very little work; it just produces extra code for what is currently a rare exception. The |
Many PACLIC proceedings have URLs in their <doi> entry in the XML, not DOIs. This fixes that.
Technically, the current entries are Handle URLs, not DOI URLs, but from spot-checking it seems that they are actually valid DOIs (DOI uses Handle internally).
Compare, for example:
(h/t https://twitter.com/gchrupala/status/1451552455519506448)
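The URL-to-identifier change described above can be sketched roughly as follows. This is an illustration only, not the script used for this PR: the function name `strip_resolver` and the list of resolver hosts are assumptions, and the exact form of the URLs in the XML is not shown in this thread.

```python
from urllib.parse import urlparse

# Assumption: these are the resolver hosts that may appear in front of an identifier.
RESOLVER_HOSTS = {"doi.org", "dx.doi.org", "hdl.handle.net"}


def strip_resolver(value: str) -> str:
    """Return the bare identifier if value is a resolver URL, else value unchanged."""
    parsed = urlparse(value)
    if parsed.scheme in ("http", "https") and parsed.netloc.lower() in RESOLVER_HOSTS:
        # The identifier is the URL path without its leading slash.
        return parsed.path.lstrip("/")
    return value
```

Stripping the resolver only removes the URL part; whether what remains is a DOI (it must start with "10.") is a separate question, which is what the discussion above is about.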