highlight

A zero-dependency syntax-highlighting plugin for fenced code blocks, with pluggable language definitions. It replaces basic code-block rendering with colorized output. Recognized languages are intercepted before markdown-code (priority 95 vs. 100); unrecognized languages fall through.

Details

| Property | Value |
| --- | --- |
| Name | highlight |
| Priority | 95 |
| Type | Block |
| Factory | highlight(options?) |
| Built-in languages | 10 |
| Bundle size | ~5.7 KB gzip (entire plugin, all built-in languages) |
| Dependencies | none (only @generative-dom/core for plugin types) |

How It Works

The plugin intercepts fenced code blocks that have a recognized language tag. Lower priority values run first, so with priority 95 against markdown-code's 100 the plugin gets the first chance to match code fences. For unrecognized languages it returns null and the match falls through to markdown-code.
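The fall-through behavior can be sketched as a priority-ordered dispatch loop. This is only an illustration: `BlockPlugin`, `BlockMatch`, `dispatch`, and the two stand-in plugins are hypothetical names, not the actual @generative-dom/core types.

```typescript
// Hypothetical shapes for illustration; not the real @generative-dom/core API.
interface BlockMatch { plugin: string; lang: string; code: string }
interface BlockPlugin {
  name: string;
  priority: number; // lower value = tried earlier
  matchBlock(lang: string, code: string): BlockMatch | null;
}

const KNOWN_ALIASES = new Set(['javascript', 'js', 'json']);

const highlightLike: BlockPlugin = {
  name: 'highlight',
  priority: 95,
  // Claim the fence only when the language tag is recognized.
  matchBlock: (lang, code) =>
    KNOWN_ALIASES.has(lang) ? { plugin: 'highlight', lang, code } : null,
};

const markdownCodeLike: BlockPlugin = {
  name: 'markdown-code',
  priority: 100,
  // Fallback: accepts any fenced block.
  matchBlock: (lang, code) => ({ plugin: 'markdown-code', lang, code }),
};

function dispatch(plugins: BlockPlugin[], lang: string, code: string): BlockMatch | null {
  const ordered = [...plugins].sort((a, b) => a.priority - b.priority);
  for (const p of ordered) {
    const m = p.matchBlock(lang, code);
    if (m !== null) return m; // first non-null match wins
  }
  return null;
}
```

Under these assumptions, a `js` fence is claimed by the highlight-like plugin, while a fence tagged with an unknown language ends up with the markdown-code stand-in.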

Code content is tokenized by a shared charcode-hot-path engine driven by per-language LangDef data objects. Tokens become <span class="hl-*"> elements with textContent only — HTML in code is never interpreted (XSS-safe by construction).
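To make the idea of a charcode-driven tokenizer concrete, here is a heavily simplified sketch. `MiniDef`, `MiniToken`, and `miniTokenize` are hypothetical; the real engine and `LangDef` cover far more (block comments, numbers, operators, extra rules). It handles only keywords, one line-comment prefix, and double-quoted strings.

```typescript
// Simplified, hypothetical shapes; the real LangDef and HlToken carry more fields.
type MiniType = 'keyword' | 'string' | 'comment' | 'default';
interface MiniToken { type: MiniType; value: string }
interface MiniDef { keywords: ReadonlySet<string>; lineComment: string }

const isWordCode = (c: number): boolean =>
  (c >= 48 && c <= 57) ||   // 0-9
  (c >= 65 && c <= 90) ||   // A-Z
  (c >= 97 && c <= 122) ||  // a-z
  c === 95 || c === 36;     // _ and $

function miniTokenize(code: string, def: MiniDef): MiniToken[] {
  const out: MiniToken[] = [];
  let i = 0;
  while (i < code.length) {
    const start = i;
    const c = code.charCodeAt(i);
    if (def.lineComment !== '' && code.startsWith(def.lineComment, i)) {
      // Line comment: consume up to (not including) the newline.
      while (i < code.length && code.charCodeAt(i) !== 10) i++;
      out.push({ type: 'comment', value: code.slice(start, i) });
    } else if (c === 34) {
      // Double-quoted string with backslash escapes; an unterminated
      // string simply runs to end of input.
      i++;
      while (i < code.length && code.charCodeAt(i) !== 34) {
        i += code.charCodeAt(i) === 92 ? 2 : 1;
      }
      i = Math.min(i + 1, code.length);
      out.push({ type: 'string', value: code.slice(start, i) });
    } else if (isWordCode(c)) {
      while (i < code.length && isWordCode(code.charCodeAt(i))) i++;
      const word = code.slice(start, i);
      out.push({ type: def.keywords.has(word) ? 'keyword' : 'default', value: word });
    } else {
      // Anything else (whitespace, operators) is emitted one code unit at a time.
      i++;
      out.push({ type: 'default', value: code.slice(start, i) });
    }
  }
  return out;
}
```

Every branch advances the cursor and slices the consumed range, so concatenating the token values reproduces the input exactly.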

Supported Languages

| Language | Aliases | Token classes used |
| --- | --- | --- |
| JavaScript | javascript, js | keyword, string, comment, number, operator, punctuation |
| TypeScript | typescript, ts | keyword, type, string, comment, number, operator, punctuation |
| HTML | html | keyword, string, comment, operator, punctuation |
| CSS | css | keyword, string, comment, number, operator, punctuation |
| JSON | json | keyword, string, number, operator, punctuation |
| C | c | keyword, type, builtin (preprocessor), string, comment, number, operator, punctuation |
| C++ | cpp, c++ | keyword, type, string (incl. R"…" raw), comment, number, operator, punctuation |
| Bash | bash, sh, shell, zsh | keyword, builtin (declarations + variables), string, comment, number, operator, punctuation |
| Julia | julia, jl | keyword, type, builtin (@macro), string (incl. :symbol, triple-quoted), comment (incl. #= =#), number, operator, punctuation |
| Zig | zig | keyword, type, builtin (@import etc.), string (incl. \\ multi-line), comment (//, ///, //!), number, operator, punctuation |

CSS Classes

Each token type receives a CSS class. Two new classes were added in v2 for richer themes: hl-type (built-in type vocabulary) and hl-builtin (preprocessor directives, macro invocations, shell variables, Zig builtins).

| Token Type | CSS Class | Typical use |
| --- | --- | --- |
| Keyword | hl-keyword | control flow, declarations |
| Type | hl-type | int, Int64, string, u32, etc. |
| Builtin | hl-builtin | #include, @import, $VAR, @show |
| String | hl-string | "…", '…', `…`, :symbol, raw strings |
| Comment | hl-comment | //, /* */, <!-- -->, #, #= =#, /// |
| Number | hl-number | 42, 0xFF, 1.0e-3, 1_000_000 |
| Operator | hl-operator | =, +, &&, ?: |
| Punctuation | hl-punctuation | (, ), ,, ;, . |
| Default | hl-default | identifiers, whitespace |

Rendering

markdown
```javascript
const greeting = "hello";
// A comment
```

Renders as (whitespace becomes text nodes — no <span class="hl-default"> wrappers around pure spaces, reducing DOM nodes by ~40% on typical code):

html
<pre><code class="language-javascript"><span class="hl-keyword">const</span> <span class="hl-default">greeting</span> <span class="hl-operator">=</span> <span class="hl-string">"hello"</span><span class="hl-punctuation">;</span>
<span class="hl-comment">// A comment</span>
</code></pre>
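The whitespace rule above can be sketched without a DOM by rendering through a small node factory. `RToken`, `NodeFactory`, `renderWith`, and `planFactory` are hypothetical names for illustration; the plugin's own `renderTokens` takes different arguments.

```typescript
// Hypothetical shapes; the real renderTokens(tokens, codeEl, ctx) works on DOM nodes.
interface RToken { type: string; value: string }
interface NodeFactory<N> {
  span(className: string, text: string): N;
  text(text: string): N;
}

const isBlank = (s: string): boolean => s.trim() === '';

function renderWith<N>(tokens: readonly RToken[], make: NodeFactory<N>): N[] {
  return tokens.map((t) =>
    t.type === 'default' && isBlank(t.value)
      ? make.text(t.value)                  // pure whitespace: bare text node
      : make.span(`hl-${t.type}`, t.value), // everything else: classed span
  );
}

// DOM-free factory that records what would be created.
type PlanNode =
  | { kind: 'span'; cls: string; text: string }
  | { kind: 'text'; text: string };

const planFactory: NodeFactory<PlanNode> = {
  span: (cls, text) => ({ kind: 'span', cls, text }),
  text: (text) => ({ kind: 'text', text }),
};
```

In a browser, an equivalent factory would presumably build spans with `document.createElement('span')`, set `className` and `textContent`, and use `document.createTextNode` for whitespace, so code content is never parsed as HTML.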

API

Default usage — all 10 built-in languages

ts
import { highlight } from '@generative-dom/plugin-highlight';

const plugin = highlight();

HighlightOptions

ts
interface HighlightOptions {
  /** Add or override language defs. User-provided wins on alias clash. */
  languages?: Record<string, LangDef>;
  /** Restrict to these aliases. Aliases not listed fall through to markdown-code. */
  include?: string[];
}

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| languages | Record<string, LangDef> | {} | Adds or overrides languages by alias |
| include | string[] | undefined | Whitelist of aliases (others fall through) |
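One plausible way the two options could combine is sketched below. `MiniLang`, `MiniOptions`, and `buildAliasMap` are hypothetical. Registering both the record key and the def's own `aliases` is an assumption, as is applying `include` after the merge.

```typescript
// Hypothetical, trimmed-down def; the real LangDef has more fields.
interface MiniLang { readonly aliases: readonly string[]; readonly label: string }

interface MiniOptions {
  languages?: Record<string, MiniLang>;
  include?: string[];
}

function buildAliasMap(
  builtins: readonly MiniLang[],
  options: MiniOptions = {},
): Map<string, MiniLang> {
  const map = new Map<string, MiniLang>();
  for (const def of builtins) {
    for (const alias of def.aliases) map.set(alias, def);
  }
  // User-provided defs are applied second, so they win on alias clash.
  // Assumption: both the record key and the def's aliases are registered.
  for (const [key, def] of Object.entries(options.languages ?? {})) {
    map.set(key, def);
    for (const alias of def.aliases) map.set(alias, def);
  }
  // With `include`, every alias not listed is dropped (and would fall through).
  if (options.include !== undefined) {
    const allowed = new Set(options.include);
    for (const alias of [...map.keys()]) {
      if (!allowed.has(alias)) map.delete(alias);
    }
  }
  return map;
}
```

Filtering last means `include` also applies to user-supplied languages, which is a design choice of this sketch rather than documented behavior.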

Subset for smaller surface area

ts
// A chat UI that only ever streams TS + JSON code blocks.
highlight({ include: ['typescript', 'ts', 'json'] });

Custom language authoring

ts
import { highlight } from '@generative-dom/plugin-highlight';
import type { LangDef } from '@generative-dom/plugin-highlight';

const SQL: LangDef = {
  aliases: ['sql'],
  keywords: new Set([
    'SELECT', 'FROM', 'WHERE', 'JOIN', 'ON',
    'AND', 'OR', 'NOT', 'INSERT', 'UPDATE', 'DELETE',
    'CREATE', 'TABLE', 'INDEX', 'PRIMARY', 'KEY', 'FOREIGN',
  ]),
  types: new Set([
    'INT', 'INTEGER', 'VARCHAR', 'TEXT', 'BOOLEAN', 'DATE',
    'TIMESTAMP', 'NUMERIC', 'DECIMAL', 'BLOB',
  ]),
  lineComment: ['--'],
  blockComment: [['/*', '*/']],
  strings: [
    { open: "'", close: "'", escape: true },
    { open: '"', close: '"', escape: true },
  ],
};

const plugin = highlight({ languages: { sql: SQL } });

LangDef interface

ts
interface LangDef {
  /** Aliases that map to this def (e.g. ['javascript', 'js']) */
  readonly aliases: readonly string[];
  /** Keywords → hl-keyword */
  readonly keywords: ReadonlySet<string>;
  /** Built-in types → hl-type */
  readonly types?: ReadonlySet<string>;
  /** Builtins / library calls → hl-builtin */
  readonly builtins?: ReadonlySet<string>;
  /** Line comment prefixes (e.g. ['//'] or ['#']) */
  readonly lineComment?: readonly string[];
  /** Block comment open/close pairs (e.g. [['/*', '*/']]) */
  readonly blockComment?: readonly (readonly [string, string])[];
  /** Ordered string delimiter rules — first match wins */
  readonly strings?: readonly StringRule[];
  /** Language-specific patterns (preprocessor, $VAR, @macro, etc.) — run BEFORE the built-in dispatch */
  readonly extraRules?: readonly ExtraRule[];
}

Styling

The plugin provides CSS classes but no default styles. Add your own:

css
.hl-keyword     { color: #c678dd; }                       /* mauve */
.hl-type        { color: #f9e2af; }                       /* yellow */
.hl-builtin     { color: #fab387; }                       /* peach */
.hl-string      { color: #98c379; }                       /* green */
.hl-comment     { color: #5c6370; font-style: italic; }
.hl-number      { color: #d19a66; }                       /* orange */
.hl-operator    { color: #56b6c2; }                       /* cyan */
.hl-punctuation { color: #abb2bf; }
.hl-default     { color: #abb2bf; }

Architecture (v2)

@generative-dom/plugin-highlight/
  src/
    types.ts            HlTokenType, HlToken, LangDef, StringRule, ExtraRule
    engine.ts           tokenize(code, def) — charcode hot path, no regex inner loop
    render.ts           renderTokens — whitespace as text nodes, fragment batching
    plugin.ts           highlight() factory + matchBlock + render
    index.ts            public exports
    lang/               one file per built-in language
      javascript.ts
      typescript.ts
      html.ts
      css.ts
      json.ts
      c.ts
      cpp.ts
      bash.ts
      julia.ts
      zig.ts
      index.ts          BUILTIN_LANGS alias map

Edge Cases

  • HTML tags inside code are escaped and displayed as text, not rendered (XSS-safe — no innerHTML ever).
  • Very long lines are not wrapped by the plugin — CSS white-space and overflow control wrapping.
  • Empty code blocks with a language tag render as an empty highlighted block.
  • Nested template literals with ${} in JavaScript are tokenized as one string (the engine doesn't recurse into expression interpolations — keeps the cost predictable).
  • Heredoc bodies in Bash, and other syntactic forms that would need a second pass, are not tokenized internally; only the opening line is highlighted correctly.
  • C++ raw strings R"delim(…)delim" are matched best-effort. Unbalanced or runaway raw strings stop at end-of-input rather than locking up the tokenizer.
  • Unrecognized languages fall through to markdown-code for basic rendering.
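The raw-string behavior described above can be approximated with a linear scan that never backtracks. `scanCppRawString` is a hypothetical helper, not the plugin's implementation. It relies on the C++ rule that a raw-string delimiter is at most 16 characters and contains no whitespace, parentheses, backslashes, or quotes.

```typescript
// Returns the exclusive end index of a C++ raw string starting at `start`,
// or -1 when the text at `start` is not a raw-string opener.
function scanCppRawString(code: string, start: number): number {
  if (!code.startsWith('R"', start)) return -1;
  const open = code.indexOf('(', start + 2);
  if (open === -1 || open - (start + 2) > 16) return -1; // delimiter too long or missing
  const delim = code.slice(start + 2, open);
  if (!/^[^\s()\\"]*$/.test(delim)) return -1; // invalid delimiter characters
  const closer = `)${delim}"`;
  const close = code.indexOf(closer, open + 1);
  // Unterminated raw string: consume to end of input instead of rescanning.
  return close === -1 ? code.length : close + closer.length;
}
```

Because the scan uses two forward `indexOf` calls, a runaway raw string costs a single pass over the remaining input.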

Performance

  • Single fenced block, JS, ~50 lines: < 0.5 ms tokenize on a developer laptop.
  • 5 KB adversarial input (mixed delimiters): < 100 ms across all 10 languages; see the "Cross-language invariants" test suite.
  • Round-trip is byte-identical: tokens.map(t => t.value).join('') === input for every language and every input. Tested on 10 realistic samples + adversarial input + whitespace-only edge cases.
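The round-trip invariant lends itself to a small generic check that works with any tokenizer. `roundTripFailures` and the `splitRuns` stand-in are hypothetical; with the real package one would presumably pass a wrapper around the exported `tokenize(code, def)`.

```typescript
// Hypothetical minimal token shape; only `value` matters for the invariant.
interface ValueToken { value: string }
type TokenizeFn = (code: string) => ValueToken[];

// Returns the samples whose concatenated token values do not reproduce the input.
function roundTripFailures(tokenizeFn: TokenizeFn, samples: readonly string[]): string[] {
  return samples.filter(
    (src) => tokenizeFn(src).map((t) => t.value).join('') !== src,
  );
}

// Stand-in tokenizer for the demo: alternating runs of whitespace and non-whitespace.
const splitRuns: TokenizeFn = (code) =>
  (code.match(/\s+|\S+/g) ?? []).map((value) => ({ value }));
```

An empty result means every sample survived the round trip; a lossy tokenizer (for example, one that trims whitespace) shows up as a non-empty list.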

Public exports

ts
import {
  highlight,             // plugin factory
  type HighlightOptions, // factory options
  type LangDef,          // for custom languages
  type StringRule,
  type ExtraRule,
  type HlTokenType,
  type HlToken,
  // built-in language constants — for inspection or composition
  BUILTIN_LANGS,
  LANG_JAVASCRIPT, LANG_TYPESCRIPT,
  LANG_HTML, LANG_CSS, LANG_JSON,
  LANG_C, LANG_CPP, LANG_BASH, LANG_JULIA, LANG_ZIG,
  // escape hatches
  tokenize,              // tokenize(code, def): HlToken[]
  renderTokens,          // renderTokens(tokens, codeEl, ctx)
} from '@generative-dom/plugin-highlight';