XML Grammar for tree-sitter
Based on the W3C XML 1.0 recommendation
Example files come from W3C samples, generated data, and Wikimedia dumps.
Finished sections from the XML specification:
- Document
- Character Range (currently doing with `/./`)
- White Space (currently doing with `/\s/`)
- Names and Tokens
- Literals
- Character Data
- Comments
- Processing Instructions (haven't quite finished the pi_target definition)
- CDATA Sections (not sure about the cdata element; I need to test it more)
- Prolog
- Document Type Definition
- External Subset
- Standalone Document Declaration
- Language Identification
- Element
- Start-tag
- End-tag
- Content of Elements
- Tags for Empty Elements
- Element Type Declaration
- Element-content Models
- Mixed-content Declaration
- Attribute-list Declaration
- Attribute Types
- Enumerated Attribute Types
- Attribute Defaults
- Conditional Section ($.ignore might not be good enough)
- Character Reference
- Entity Reference
- Entity Declaration
- External Entity Declaration
- Text Declaration
- Well-Formed External Parsed Entity
- Encoding Declaration
- Notation Declarations
- Characters
Note that "finished" here only means I have written these rules into the grammar. I'm still working on organization and figuring out which nodes need to be visible or hidden. Help is appreciated!
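The Character Range and White Space items in the list are currently approximated with /./ and /\s/. As a rough sketch of what tighter patterns could look like, the snippet below writes out the Char and S productions from the XML 1.0 recommendation as plain JavaScript regular expressions and compares them with the placeholders. The names `XML_CHAR` and `XML_SPACE` are hypothetical and not part of this grammar. The comparison shows JavaScript regex semantics only. tree-sitter compiles patterns with its own engine, so whether it accepts these exact patterns, and how /./ and /\s/ behave there, would still need testing.

```javascript
// XML 1.0 production [2] Char:
//   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
// The `u` flag makes the regex work on code points, which the last range needs.
// XML_CHAR is a hypothetical name used only for this illustration.
const XML_CHAR =
  /^[\u{9}\u{A}\u{D}\u{20}-\u{D7FF}\u{E000}-\u{FFFD}\u{10000}-\u{10FFFF}]$/u;

// XML 1.0 production [3] S: (#x20 | #x9 | #xD | #xA)+
// XML_SPACE is likewise a hypothetical name.
const XML_SPACE = /^[\u{20}\u{9}\u{D}\u{A}]+$/u;

// Where the placeholders and the spec disagree (JavaScript semantics):
console.log(/./.test('\n'), XML_CHAR.test('\n'));           // false true
console.log(/./.test('\u0001'), XML_CHAR.test('\u0001'));   // true false
console.log(/\s/.test('\u00A0'), XML_SPACE.test('\u00A0')); // true false
console.log(/\s/.test('\f'), XML_SPACE.test('\f'));         // true false
```

In short, /./ rejects line breaks that Char allows and accepts control characters that Char forbids, while /\s/ accepts several characters (form feed, no-break space, and others) that S does not.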