`mdlint` is a Markdown linter that enforces specific rules about how Markdown is written in the Fuchsia Source Tree. It parses Hoedown syntax, as used on the fuchsia.dev site.
Configure, and build:

```sh
fx set core.x64 --with //tools/mdlint:host  # or similar
fx build
```
Example invocation running specific rules over `//docs`, and reporting all findings:

```sh
fx mdlint --root-dir docs \
    --enable no-extra-space-on-right \
    --enable casing-of-anchors \
    --enable bad-lists \
    --enable verify-internal-links
```
Example invocation running all rules over `//docs`, and only reporting findings within Markdown documents whose filenames match `docs/contribute/governance`:

```sh
fx mdlint --root-dir docs \
    --enable all \
    --filter-filenames docs/contribute/governance
```
To run the tests, configure:

```sh
fx set core.x64 --with //tools/mdlint:tests  # or similar
```

Then test:

```sh
fx test mdlint_core_tests
fx test mdlint_rules_tests
```
The linter parses Markdown files successively, typically all files under a root directory.
Each Markdown file is read as a stream of UTF-8 characters (runes), which is then tokenized into a stream of tokens. We recognize specific patterns from this token stream, giving rise to a stream of patterns. This layered processing is similar to how streaming XML parsers are structured, and offers hook points for linting rules to operate at various levels of abstraction.
Because Markdown attaches important meaning to whitespace characters (e.g. a leading space to form a list element), and because the meaning of certain constructs depends on their context (e.g. links, or section headers), tokenization differs slightly from what is typically done for more standard programming languages.
Tokenization segments the stream of runes into meaningful chunks, called tokens. All whitespace runes are themselves tokens, and are preserved in the token stream. For instance, the text `Hello, World!` consists of three tokens: a text token (`Hello,`), a whitespace token (` `), and a text token (`World!`).
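To make the token stream concrete, here is a minimal, self-contained sketch of this kind of whitespace-preserving tokenization. The `token` type, its `kind` values, and the `tokenize` helper are hypothetical illustrations, not the actual types in `//tools/mdlint/core`.

```go
package main

import (
	"fmt"
	"unicode"
)

// token is a hypothetical stand-in for mdlint's real token type: a kind
// plus the exact runes it covers, so no whitespace is ever lost.
type token struct {
	kind    string // "text" or "space" in this simplified sketch
	content string
}

// tokenize splits a string into text and whitespace tokens, preserving
// every rune of the input in some token.
func tokenize(input string) []token {
	var toks []token
	emit := func(kind string, runes []rune) {
		if len(runes) > 0 {
			toks = append(toks, token{kind: kind, content: string(runes)})
		}
	}
	var cur []rune
	curKind := ""
	for _, r := range input {
		kind := "text"
		if unicode.IsSpace(r) {
			kind = "space"
		}
		if kind != curKind {
			emit(curKind, cur)
			cur, curKind = nil, kind
		}
		cur = append(cur, r)
	}
	emit(curKind, cur)
	return toks
}

func main() {
	// Prints: {text "Hello,"} {space " "} {text "World!"}
	for _, tok := range tokenize("Hello, World!") {
		fmt.Printf("{%s %q} ", tok.kind, tok.content)
	}
	fmt.Println()
}
```

The real tokenizer additionally classifies tokens by their context (links, headings, anchors), as described next.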
Certain tokens are classified differently depending on their preceding context. Consider for instance `a sentence (with a parenthesis)`, which is simply text tokens separated by whitespace tokens, as opposed to `a [sentence](with-a-link)`, where we instead need to identify both the link text (`sentence`) and its corresponding URL (`with-a-link`). Other similar examples are headings, which are denoted by a series of pound runes (`#`) at the start of a line, or heading anchors (`{#like-so}`), which may only appear on a heading line.
Once a Markdown document has been tokenized, the stream of tokens is pattern matched and recognized into a stream of patterns. As an example, depending on placement, the text `[Example]` could be a link's text (as in `[Example](some-url)`), a link's cross reference (as in `[link text][Example]`), both a link's text and its cross reference (a bare `[Example]`), or the start of a cross reference definition (`[Example]: some-url`).
Implementation-wise, the recognition work is done in the recognizer, which bridges a `LintRuleOverTokens` rule to a `LintRuleOverPatterns` rule.
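The sketch below illustrates this bridging idea with toy stand-ins: a token-level hook buffers tokens and, when the last few tokens form the shape `[text](url)`, emits a higher-level link pattern to a pattern-level hook. The interface and method names here (`overTokens`, `overPatterns`, `onNext`, `onLinkByURL`) are hypothetical; the real interfaces live in `//tools/mdlint/core` and differ in detail.

```go
package main

import "fmt"

// Toy stand-ins for the token-level and pattern-level rule interfaces.
type overTokens interface{ onNext(tok string) }
type overPatterns interface{ onLinkByURL(text, url string) }

// recognizer consumes tokens and, when it spots the token shape
// "[", text, "]", "(", url, ")", reports a link pattern to the wrapped
// pattern-level rule. Real recognition is considerably more involved.
type recognizer struct {
	inner overPatterns
	buf   []string
}

func (r *recognizer) onNext(tok string) {
	r.buf = append(r.buf, tok)
	n := len(r.buf)
	if n >= 6 &&
		r.buf[n-6] == "[" && r.buf[n-4] == "]" &&
		r.buf[n-3] == "(" && r.buf[n-1] == ")" {
		r.inner.onLinkByURL(r.buf[n-5], r.buf[n-2])
	}
}

// printLinks is a toy pattern-level rule that simply prints each link.
type printLinks struct{}

func (printLinks) onLinkByURL(text, url string) {
	fmt.Printf("link %q -> %q\n", text, url)
}

func main() {
	var rule overTokens = &recognizer{inner: printLinks{}}
	for _, tok := range []string{"a", " ", "[", "sentence", "]", "(", "with-a-link", ")"} {
		rule.onNext(tok)
	}
	// Output: link "sentence" -> "with-a-link"
}
```

Keeping recognition in one place like this lets pattern-level rules stay oblivious to token-level details such as whitespace handling.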
Two kinds of rules are supported: rules over tokens, and rules over patterns. Both share common behavior, which we describe first.

**Common behavior**

All rules are invoked through a common set of hooks as linting proceeds, independent of their kind.

**Over tokens**

Rules over tokens are additionally invoked after a document starts to parse, before a document completes, and for every token encountered.

**Over patterns**

Rules over patterns are additionally invoked after a document starts to parse, before a document completes, and for every pattern encountered. A non-exhaustive list of patterns includes links by URL, links by cross reference, and cross reference definitions.
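As a rough illustration of these hooks, here is a self-contained toy rule over tokens that counts tokens per document and reports the count when the document completes. The `tokenRule` interface and its `onDocStart`/`onNext`/`onDocEnd` methods are hypothetical simplifications of the hooks described above, not the real `core.LintRuleOverTokens` interface.

```go
package main

import "fmt"

// tokenRule is a hypothetical, simplified rule over tokens: it is told
// when a document starts, sees every token, and is told when the
// document completes.
type tokenRule interface {
	onDocStart(filename string)
	onNext(tok string)
	onDocEnd()
}

// tokenCounter counts the tokens of the current document.
type tokenCounter struct {
	filename string
	count    int
}

func (c *tokenCounter) onDocStart(filename string) {
	c.filename = filename
	c.count = 0
}

func (c *tokenCounter) onNext(tok string) {
	c.count++
}

func (c *tokenCounter) onDocEnd() {
	fmt.Printf("%s: %d tokens\n", c.filename, c.count)
}

func main() {
	var rule tokenRule = &tokenCounter{}
	rule.onDocStart("example.md")
	for _, tok := range []string{"Hello,", " ", "World!"} {
		rule.onNext(tok)
	}
	rule.onDocEnd()
	// Output: example.md: 3 tokens
}
```

In the real linter, a rule like this would be registered with `core.RegisterLintRuleOverTokens`, as shown in the next section.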
Each rule should be defined in its own file, named like `example_rule.go`. Rules should include a description, which by convention is placed in the test file. The code should follow this pattern:
```go
package rules

import (
	"go.fuchsia.dev/fuchsia/tools/mdlint/core"
)

func init() {
	// or core.RegisterLintRuleOverPatterns(...)
	core.RegisterLintRuleOverTokens(exampleRuleName, newExampleRule)
}

const exampleRuleName = "example-rule"

type exampleRule struct {
	...
}

var _ core.LintRuleOverTokens = (*exampleRule)(nil) // or core.LintRuleOverPatterns

func newExampleRule(reporter core.Reporter) core.LintRuleOverTokens {
	return &exampleRule{
		...
	}
}

// followed by the implementation
```
Rules should be tested using sample Markdown documents, with the help of the provided testing utilities:
```go
// Description of the rule, with details of the checks provided.

func TestExampleRule_firstCase(t *testing.T) {
	ruleTestCase{
		files: map[string]string{
			"first.md": `Sample Markdown document

Use a «marker» to denote expected warnings.

You can place markers on whitespace, for instance« » denotes
an expected warning on a non-trimmed line.`,
			"second.md": `Another Markdown document here.`,
		},
		// or runOverPatterns
	}.runOverTokens(t, newExampleRule)
}
```
In multi-file tests, we rely on the non-deterministic iteration order of maps to ensure that rules do not depend on a specific file order for their correctness. Consider running new tests multiple times with the `go test` flag `-count` (e.g. `-count=10`) to verify the robustness of your rule.