Escaping and brace nesting in bibtex field values #2075

Open
Omikhleia opened this issue Jun 17, 2024 · 2 comments
Labels
enhancement Software improvement or feature request

Comments

@Omikhleia
Member

Extracted from review comment #2048 (comment)

BibTeX allows two types of syntax for (leaf) string values in fields:

  • Braced {My Awesome Book}
  • Quoted "My Awesome Book"

Our bibtex implementation currently supports both on the surface, but things get much trickier inside these strings:

  • There can be nested braced content with a specific or mixed meaning (making TeX commands "robust", bypassing casing rules, or even acting as a separator). Examples:
    • {{\relax Ph}ilip Doe}, {{\LaTeX}} etc. = some TeX-based mess
    • {Doe, John and {National Aeronautics and Space Administration}} = avoid the organization being seen and parsed as a name (split at and)
    • (In addition to name fields, in BibLaTeX) special handling of multiple-value institution, publisher, and location fields, e.g. {Johnson {and} Smith} vs London and Paris
    • {My {A}we{S}ome Book}, when lower-cased, title-cased, or otherwise re-cased, would keep the A and S capitals.
  • In the quoted variant, we currently assume that backslash-escaping quotes should work, e.g. "My \"Awesome\" Book", but the official way documented in "Tame the BeaST. The B to X of BibTeX", version 1.4 (2009), seems to be "My {"}Awesome{"} Book"
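
To make the brace-nesting cases above concrete, here is a minimal Python sketch (not the project's actual grammar; the helper names `split_top_level` and `lowercase_unprotected` are hypothetical). It assumes the outer delimiters have already been stripped from the value, that the separator is the literal lowercase ` and `, and it ignores backslash-escaped braces:

```python
from __future__ import annotations


def split_top_level(value: str, sep: str = " and ") -> list[str]:
    """Split `value` at `sep`, but only where the brace depth is zero."""
    parts: list[str] = []
    depth = 0
    start = 0
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        elif depth == 0 and value.startswith(sep, i):
            # Separator found outside any braces: close the current part.
            parts.append(value[start:i])
            i += len(sep)
            start = i
            continue
        i += 1
    parts.append(value[start:])
    return [p.strip() for p in parts]


def lowercase_unprotected(value: str) -> str:
    """Lower-case only the characters that sit outside braces."""
    out: list[str] = []
    depth = 0
    for ch in value:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        # Anything at depth > 0 is brace-protected and kept as written.
        out.append(ch.lower() if depth == 0 else ch)
    return "".join(out)
```

With these assumptions, `split_top_level("Doe, John and {National Aeronautics and Space Administration}")` yields two items, the organization staying in one piece, while `"Johnson {and} Smith"` stays a single item and `"London and Paris"` splits in two. `lowercase_unprotected("My {A}we{S}ome Book")` gives `my {A}we{S}ome book`.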

--> For fully correct support, we may need to revise our grammar at some point (and/or consolidate some entries differently).
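
As a rough illustration of what a revised rule for quoted values could look like, here is a Python sketch of a scanner in which a `"` only ends the value at brace depth zero, so the `{"}` form works and a backslash offers no protection. This is an assumption-laden sketch, not our current grammar; `read_quoted` is a hypothetical name:

```python
from __future__ import annotations


def read_quoted(text: str, pos: int = 0) -> tuple[str, int]:
    """Read a "..." value starting at text[pos].

    Returns (content, index just past the closing quote).
    A double quote inside braces does not terminate the value.
    """
    if text[pos] != '"':
        raise ValueError("expected opening quote")
    depth = 0
    i = pos + 1
    while i < len(text):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            if depth == 0:
                raise ValueError("unbalanced closing brace")
            depth -= 1
        elif ch == '"' and depth == 0:
            # Only a quote outside braces closes the value.
            return text[pos + 1:i], i + 1
        i += 1
    raise ValueError("unterminated quoted value")
```

Under this rule, `"My {"}Awesome{"} Book"` is read as one value, whereas `"My \"Awesome\" Book"` ends at the first inner quote and yields only `My \`.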

@alerque
Member

alerque commented Jun 17, 2024

Have you run across ABNF, EBNF, or any other formal grammars for BibTeX, or is there just a hodge-podge of implementations we're copying that might not even all match each other? Any links to grammars or parsers in any language would probably be useful when we go to address this.

@alerque
Member

alerque commented Jun 23, 2024

From the soon-to-be-closed PR, a still-relevant comment:

For some elements (e.g. @string), I checked a few (non-grammar-based) implementations as well as a few ABNF grammars. But the rest is mostly from the biblatex manual and Tame the Beast. Problem is that most ABNF specifications I've found are incomplete... So yes, it ends up as a bit of a hodge-podge... (Hence also the "notes" on the format added in the documentation, as it's not clear to me that there's some full complete definition agreed by everyone and formally defined... Biber/biblatex keep adding stuff!)

@alerque alerque moved this to Todo in Bibliography cleanup Jun 29, 2024