14. Grammar Appendix
This chapter provides a partial formal grammar for constructs whose prose definitions benefit from a machine-checkable form. MdFlow does not ship a complete formal grammar; where a PEG fragment is given, the fragment is normative and the prose is explanatory.
The full collection of PEG fragments lives in src/grammar/index.peg as a single file.
14.1 Notation
PEG rules follow the conventions of Bryan Ford's original PEG paper (2004), using:
- A / B: ordered choice.
- A B: sequence.
- A*, A+, A?: repetition.
- &A, !A: positive / negative lookahead.
- [...]: character class.
- "...": literal string.
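Ordered choice is the operator most easily misread by readers used to context-free grammars: the first alternative that matches wins, and later alternatives are not tried at that position. The following is a minimal illustrative sketch in Python. The combinator names (lit, seq, choice, not_) are hypothetical and are not part of MdFlow or of src/grammar/index.peg.

```python
from typing import Callable, Optional

# A parser takes (text, pos) and returns the new position, or None on failure.
Parser = Callable[[str, int], Optional[int]]


def lit(s: str) -> Parser:
    """Literal string: "..." in the notation above."""
    return lambda text, pos: pos + len(s) if text.startswith(s, pos) else None


def seq(*parsers: Parser) -> Parser:
    """Sequence: A B. Fails as soon as one element fails."""
    def run(text: str, pos: int) -> Optional[int]:
        cur: Optional[int] = pos
        for p in parsers:
            cur = p(text, cur)
            if cur is None:
                return None
        return cur
    return run


def choice(*parsers: Parser) -> Parser:
    """Ordered choice: A / B. The first alternative that succeeds wins."""
    def run(text: str, pos: int) -> Optional[int]:
        for p in parsers:
            end = p(text, pos)
            if end is not None:
                return end
        return None
    return run


def not_(p: Parser) -> Parser:
    """Negative lookahead: !A. Succeeds without consuming input if A fails."""
    return lambda text, pos: pos if p(text, pos) is None else None


# "*" / "**" never reaches its second alternative on the input "**".
short_first = choice(lit("*"), lit("**"))
long_first = choice(lit("**"), lit("*"))
```

With these definitions, short_first consumes one character of "**" while long_first consumes two, which is why the order of alternatives in a fragment matters.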
14.2 Lexical fragments
space = ' ' / '\t'
newline = '\r\n' / '\n' / '\r'
eol = newline / !.
blank_line = space* newline
indent = space{0,3}
14.3 URL filter
Normative for §10.3.1.
url = url_with_scheme / relative_url
url_with_scheme = scheme ':' path
scheme = [a-zA-Z] [a-zA-Z0-9+.-]*
path = (![\s<>] .)*
relative_url = path
Host post-processing (strip NULs, fold scheme case, single-layer percent-decode of the first :) is prose-normative per §10.3.1.1 and cannot be expressed as a pure PEG rule. Implementations MUST apply this processing to the parsed url_with_scheme.scheme before matching it against the whitelist.
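The host-side steps can be sketched in Python as follows. This is one possible reading only: §10.3.1.1 remains the normative text, and the order of the steps, the handling of the encoded colon, the treatment of relative URLs, and the contents of ALLOWED_SCHEMES are all assumptions made for illustration.

```python
import re
from typing import Optional, Tuple

# Hypothetical whitelist; the real list is defined elsewhere in the specification.
ALLOWED_SCHEMES = frozenset({"http", "https", "mailto"})

# scheme = [a-zA-Z] [a-zA-Z0-9+.-]*
SCHEME_RE = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*")
ENCODED_COLON_RE = re.compile(r"%3[aA]")


def normalize(raw: str) -> str:
    """Strip NULs, then decode one percent-encoded colon if it is the first colon."""
    s = raw.replace("\x00", "")
    m = ENCODED_COLON_RE.search(s)
    # Assumption: decode only when no literal ':' precedes the encoded one,
    # and decode a single layer (no repeated decoding).
    if m is not None and ":" not in s[:m.start()]:
        s = s[:m.start()] + ":" + s[m.end():]
    return s


def split_scheme(url: str) -> Tuple[Optional[str], str]:
    """Return (lower-cased scheme, path), or (None, url) for a relative URL."""
    m = SCHEME_RE.match(url)
    if m is not None and url[m.end():m.end() + 1] == ":":
        return m.group(0).lower(), url[m.end() + 1:]
    return None, url


def is_allowed(raw: str) -> bool:
    """Assumption: relative URLs pass; schemes must be whitelisted."""
    scheme, _path = split_scheme(normalize(raw))
    return scheme is None or scheme in ALLOWED_SCHEMES
```

Under these assumptions, "HTTPS://example.com" is accepted after case folding, while a javascript scheme hidden behind a NUL byte or an encoded colon is rejected.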
14.4 Emphasis
Normative for §7.4. This rule captures MdFlow's greedy-matcher semantics; it intentionally diverges from CommonMark's delimiter-stack algorithm.
emphasis = strong / emph
strong = "**" !space strong_body "**"
       / "__" !space &underscore_open strong_body "__" &underscore_close
emph = "*" !space emph_body "*"
     / "_" !space &underscore_open emph_body "_" &underscore_close
strong_body = (!("**") inline)+
emph_body = (!("*"/"**") inline)+
underscore_open = ![A-Za-z0-9] . // previous char not alphanumeric
underscore_close = ![A-Za-z0-9] // next char not alphanumeric
The underscore_open and underscore_close lookarounds use host-language context (the character before the opening _ and the character after the closing _); implementations MAY encode this as a pre-check.
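A pre-check of this kind might look like the following Python sketch. The function names and the position conventions (open_pos is the index of the first _ of the opening delimiter, close_end is the index just past the last _ of the closing delimiter) are hypothetical. Treating the start and end of input as non-alphanumeric is an assumption.

```python
def is_ascii_alnum(ch: str) -> bool:
    """True for a single character in [A-Za-z0-9]."""
    return len(ch) == 1 and ch.isascii() and ch.isalnum()


def underscore_open_ok(text: str, open_pos: int) -> bool:
    """The character before the opening delimiter must not be alphanumeric."""
    return open_pos == 0 or not is_ascii_alnum(text[open_pos - 1])


def underscore_close_ok(text: str, close_end: int) -> bool:
    """The character after the closing delimiter must not be alphanumeric."""
    return close_end >= len(text) or not is_ascii_alnum(text[close_end])
```

For example, the first underscore in snake_case_name fails underscore_open_ok because it follows the letter e, so that text is not treated as emphasis.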
14.5 Code span
Normative for §7.3.
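Read operationally, the code_span rule of this section opens on a run of backticks, scans for a closing run of the same length, and strips one leading and one trailing space when both are present. The Python sketch below is informal and not normative. The function names are hypothetical, and it assumes that backtick runs are compared as maximal runs (so a longer run does not close a shorter opener), a point the fragment does not spell out.

```python
import re
from typing import Optional, Tuple

BACKTICK_RUN = re.compile(r"`+")


def strip_edges(content: str) -> str:
    """Strip a single leading and trailing space, only if both are present."""
    if len(content) >= 2 and content[0] == " " and content[-1] == " ":
        return content[1:-1]
    return content


def match_code_span(text: str, pos: int = 0) -> Optional[Tuple[str, int]]:
    """Return (content, end position), or None if no code span starts at pos."""
    opener = BACKTICK_RUN.match(text, pos)
    if opener is None:
        return None
    run_length = len(opener.group(0))
    # Look for the first later run whose length equals the opener's length.
    for closer in BACKTICK_RUN.finditer(text, opener.end()):
        if len(closer.group(0)) == run_length:
            content = text[opener.end():closer.start()]
            return strip_edges(content), closer.end()
    return None
```

A double-backtick opener therefore lets the content contain a single backtick, and an opener with no matching closer yields no code span.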
code_span = run:backticks space_trim content:(!same_run .)* same_run
    { return Code(strip_edges(content)) }
backticks = "`"+
same_run = &{ same_length(run) } "`"+
space_trim = // single leading and trailing space stripped if both present
14.6 Link destination
Normative for §7.6.
link = "[" label:inline_body "]" "(" dest:link_dest title:link_title? ")"
link_dest = "<" bracketed_dest ">" / bare_dest
bracketed_dest = (!("<"/">") .)*
bare_dest = bare_dest_char+
bare_dest_char = !space !["()"] .
               / "(" bare_dest_char* ")" // balanced parens
               / "\\" [!-~] // escape
link_title = space+ quoted
quoted = '"' (!'"' .)* '"'
       / "'" (!"'" .)* "'"
       / "(" (!")" .)* ")"
14.7 Custom-element attribute
Normative for §8.4.
attributes = (space+ attribute)*
attribute = name:attr_name (space* "=" space* value:attr_value)?
attr_name = [a-z_:] [a-z0-9_:.-]*
attr_value = dquoted / squoted / unquoted
dquoted = '"' (![\"] .)* '"'
squoted = "'" (![\'] .)* "'"
unquoted = (![ \t\n\r\"\'=<>`] .)+
Attribute names matching ^on MUST be rejected after parsing. This is a post-filter; it cannot be expressed in PEG without negative-lookahead gymnastics.
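The post-filter can be sketched in Python as below. The function names are hypothetical, and the sketch deliberately only partitions the parsed attributes: whether rejection drops the attribute or the whole element is not stated in this chapter and is left to the caller.

```python
from typing import List, Optional, Tuple

Attribute = Tuple[str, Optional[str]]  # (name, value or None for a bare attribute)


def is_rejected_name(name: str) -> bool:
    """True for names matching ^on. attr_name is already lower-case by grammar."""
    return name.startswith("on")


def partition_attributes(
    attrs: List[Attribute],
) -> Tuple[List[Attribute], List[Attribute]]:
    """Split parsed attributes into (accepted, rejected), preserving order."""
    accepted = [a for a in attrs if not is_rejected_name(a[0])]
    rejected = [a for a in attrs if is_rejected_name(a[0])]
    return accepted, rejected
```

Because ^on is a plain prefix match, names such as once or online are rejected along with onclick.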
14.8 Custom element start tag
Normative for §8.3.
tag_name = [a-z] [a-z0-9]* "-" [a-z0-9-]*
void_tag = "<" tag:tag_name attrs:attributes space* "/>"
open_tag = "<" tag:tag_name attrs:attributes space* ">"
close_tag = "</" tag:tag_name space* ">"
The tag name must match the whitelist; this check is applied after parsing.
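A post-parse whitelist check might be sketched in Python as follows. The regular expression transcribes tag_name directly; the entries in ALLOWED_TAGS are hypothetical placeholders, since the real whitelist is not given in this chapter.

```python
import re

# tag_name = [a-z] [a-z0-9]* "-" [a-z0-9-]*
TAG_NAME = re.compile(r"[a-z][a-z0-9]*-[a-z0-9-]*")

# Hypothetical whitelist entries, for illustration only.
ALLOWED_TAGS = frozenset({"md-note", "md-chart"})


def is_wellformed_tag_name(name: str) -> bool:
    """True if the whole string matches the tag_name fragment."""
    return TAG_NAME.fullmatch(name) is not None


def is_allowed_tag(name: str) -> bool:
    """Syntactic check first, then the whitelist lookup."""
    return is_wellformed_tag_name(name) and name in ALLOWED_TAGS
```

Keeping the two checks separate mirrors the split in the text: the grammar decides what parses as a tag, and the whitelist decides what is accepted.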
14.9 Streaming boundary
Normative for §5 Streaming Model.
The streaming boundary is not a syntactic construct but a state transition. A token's PEG rule is classified as streamable if evaluating it on a truncated input can produce a well-formed partial result with a pending marker. The following table summarizes streamable block rules:
| Block kind | Streamable? | Pending while… |
|---|---|---|
| Paragraph | yes | no blank line seen |
| Heading | no | always complete at LF |
| ThematicBreak | no | always complete at LF |
| CodeBlock | yes | closing fence not seen |
| BlockQuote | yes | children still pending |
| List | yes | next line could extend |
| Table | yes | next line could be row |
| CustomBlock | yes | closing tag not seen |
An inline's streamability is inherited from its containing block.
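The table and the inheritance rule can be encoded as a simple lookup. The Python sketch below is illustrative; the enum and function names are hypothetical, and the pending conditions are copied from the table as plain descriptions rather than implemented as checks.

```python
from enum import Enum
from typing import Dict, Optional


class BlockKind(Enum):
    PARAGRAPH = "Paragraph"
    HEADING = "Heading"
    THEMATIC_BREAK = "ThematicBreak"
    CODE_BLOCK = "CodeBlock"
    BLOCK_QUOTE = "BlockQuote"
    LIST = "List"
    TABLE = "Table"
    CUSTOM_BLOCK = "CustomBlock"


# Pending condition per block kind; None marks a non-streamable kind
# (always complete at LF).
PENDING_WHILE: Dict[BlockKind, Optional[str]] = {
    BlockKind.PARAGRAPH: "no blank line seen",
    BlockKind.HEADING: None,
    BlockKind.THEMATIC_BREAK: None,
    BlockKind.CODE_BLOCK: "closing fence not seen",
    BlockKind.BLOCK_QUOTE: "children still pending",
    BlockKind.LIST: "next line could extend",
    BlockKind.TABLE: "next line could be row",
    BlockKind.CUSTOM_BLOCK: "closing tag not seen",
}


def is_streamable(kind: BlockKind) -> bool:
    """A block kind is streamable if it has a pending condition."""
    return PENDING_WHILE[kind] is not None


def inline_is_streamable(containing_block: BlockKind) -> bool:
    """Inlines inherit streamability from their containing block."""
    return is_streamable(containing_block)
```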
14.10 Full grammar file
The above fragments are collected in src/grammar/index.peg, which is the normative artifact. The prose in this chapter is explanatory.
14.11 Grammar maintenance
Grammar fragments MUST stay in sync with the prose and with the implementation. A disagreement between prose and grammar is an editorial error, and one of the two MUST be corrected in a subsequent draft. Per §2.4, when a vector disagrees with both, the vector is the authority and both MUST be updated to match it.