| --- |
| title: CommonMark Spec |
| author: John MacFarlane |
| version: 0.22 |
| date: 2015-08-23 |
| license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)' |
| ... |
| |
| # Introduction |
| |
| ## What is Markdown? |
| |
| Markdown is a plain text format for writing structured documents, |
| based on conventions used for indicating formatting in email and |
| usenet posts. It was developed in 2004 by John Gruber, who wrote |
| the first Markdown-to-HTML converter in Perl, and it soon became |
| widely used in websites. By 2014 there were dozens of |
| implementations in many languages. Some of them extended basic |
| Markdown syntax with conventions for footnotes, definition lists, |
| tables, and other constructs, and some allowed output not just in |
| HTML but in LaTeX and many other formats. |
| |
| ## Why is a spec needed? |
| |
| John Gruber's [canonical description of Markdown's |
| syntax](http://daringfireball.net/projects/markdown/syntax) |
| does not specify the syntax unambiguously. Here are some examples of |
| questions it does not answer: |
| |
| 1. How much indentation is needed for a sublist? The syntax description says that |
| continuation paragraphs need to be indented four spaces, but is |
| not fully explicit about sublists. It is natural to think that |
| they, too, must be indented four spaces, but `Markdown.pl` does |
| not require that. This is hardly a "corner case," and divergences |
| between implementations on this issue often lead to surprises for |
| users in real documents. (See [this comment by John |
| Gruber](http://article.gmane.org/gmane.text.markdown.general/1997).) |
| |
| 2. Is a blank line needed before a block quote or header? |
| Most implementations do not require the blank line. However, |
| this can lead to unexpected results in hard-wrapped text, and |
| also to ambiguities in parsing (note that some implementations |
| put the header inside the blockquote, while others do not). |
| (John Gruber has also spoken [in favor of requiring the blank |
| lines](http://article.gmane.org/gmane.text.markdown.general/2146).) |
| |
| 3. Is a blank line needed before an indented code block? |
| (`Markdown.pl` requires it, but this is not mentioned in the |
| documentation, and some implementations do not require it.) |
| |
| ``` markdown |
| paragraph |
| code? |
| ``` |
| |
| 4. What is the exact rule for determining when list items get |
| wrapped in `<p>` tags? Can a list be partially "loose" and partially |
| "tight"? What should we do with a list like this? |
| |
| ``` markdown |
| 1. one |
| |
| 2. two |
| 3. three |
| ``` |
| |
| Or this? |
| |
| ``` markdown |
| 1. one |
| - a |
| |
| - b |
| 2. two |
| ``` |
| |
| (There are some relevant comments by John Gruber |
| [here](http://article.gmane.org/gmane.text.markdown.general/2554).) |
| |
| 5. Can list markers be indented? Can ordered list markers be right-aligned? |
| |
| ``` markdown |
| 8. item 1 |
| 9. item 2 |
| 10. item 2a |
| ``` |
| |
| 6. Is this one list with a horizontal rule in its second item, |
| or two lists separated by a horizontal rule? |
| |
| ``` markdown |
| * a |
| * * * * * |
| * b |
| ``` |
| |
| 7. When list markers change from numbers to bullets, do we have |
| two lists or one? (The Markdown syntax description suggests two, |
| but the Perl scripts and many other implementations produce one.) |
| |
| ``` markdown |
| 1. fee |
| 2. fie |
| - foe |
| - fum |
| ``` |
| |
| 8. What are the precedence rules for the markers of inline structure? |
| For example, is the following a valid link, or does the code span |
| take precedence? |
| |
| ``` markdown |
| [a backtick (`)](/url) and [another backtick (`)](/url). |
| ``` |
| |
| 9. What are the precedence rules for markers of emphasis and strong |
| emphasis? For example, how should the following be parsed? |
| |
| ``` markdown |
| *foo *bar* baz* |
| ``` |
| |
| 10. What are the precedence rules between block-level and inline-level |
| structure? For example, how should the following be parsed? |
| |
| ``` markdown |
| - `a long code span can contain a hyphen like this |
| - and it can screw things up` |
| ``` |
| |
| 11. Can list items include section headers? (`Markdown.pl` does not |
| allow this, but does allow blockquotes to include headers.) |
| |
| ``` markdown |
| - # Heading |
| ``` |
| |
| 12. Can list items be empty? |
| |
| ``` markdown |
| * a |
| * |
| * b |
| ``` |
| |
| 13. Can link references be defined inside block quotes or list items? |
| |
| ``` markdown |
| > Blockquote [foo]. |
| > |
| > [foo]: /url |
| ``` |
| |
| 14. If there are multiple definitions for the same reference, which takes |
| precedence? |
| |
| ``` markdown |
| [foo]: /url1 |
| [foo]: /url2 |
| |
| [foo][] |
| ``` |
| |
| In the absence of a spec, early implementers consulted `Markdown.pl` |
| to resolve these ambiguities. But `Markdown.pl` was quite buggy, and |
| gave manifestly bad results in many cases, so it was not a |
| satisfactory replacement for a spec. |
| |
| Because there is no unambiguous spec, implementations have diverged |
| considerably. As a result, users are often surprised to find that |
| a document that renders one way on one system (say, a GitHub wiki) |
| renders differently on another (say, converting to DocBook using |
| pandoc). To make matters worse, because nothing in Markdown counts |
| as a "syntax error," the divergence often isn't discovered right away. |
| |
| ## About this document |
| |
| This document attempts to specify Markdown syntax unambiguously. |
| It contains many examples with side-by-side Markdown and |
| HTML. These are intended to double as conformance tests. An |
| accompanying script `spec_tests.py` can be used to run the tests |
| against any Markdown program: |
| |
| python test/spec_tests.py --spec spec.txt --program PROGRAM |
| |
| Since this document describes how Markdown is to be parsed into |
| an abstract syntax tree, it would have made sense to use an abstract |
| representation of the syntax tree instead of HTML. But HTML is capable |
| of representing the structural distinctions we need to make, and the |
| choice of HTML for the tests makes it possible to run the tests against |
| an implementation without writing an abstract syntax tree renderer. |
| |
| This document is generated from a text file, `spec.txt`, written |
| in Markdown with a small extension for the side-by-side tests. |
| The script `tools/makespec.py` can be used to convert `spec.txt` into |
| HTML or CommonMark (which can then be converted into other formats). |
| |
| In the examples, the `→` character is used to represent tabs. |
| |
| # Preliminaries |
| |
| ## Characters and lines |
| |
| Any sequence of [character]s is a valid CommonMark |
| document. |
| |
| A [character](@character) is a Unicode code point. Although some |
| code points (for example, combining accents) do not correspond to |
| characters in an intuitive sense, all code points count as characters |
| for purposes of this spec. |
| |
| This spec does not specify an encoding; it treats lines as composed |
| of [character]s rather than bytes. A conforming parser may be limited |
| to a certain encoding. |
| |
| A [line](@line) is a sequence of zero or more [character]s |
| other than newline (`U+000A`) or carriage return (`U+000D`), |
| followed by a [line ending] or by the end of file. |
| |
| A [line ending](@line-ending) is a newline (`U+000A`), a carriage return |
| (`U+000D`) not followed by a newline, or a carriage return and a |
| following newline. |
| |
| A line containing no characters, or a line containing only spaces |
| (`U+0020`) or tabs (`U+0009`), is called a [blank line](@blank-line). |
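The definitions of line, line ending, and blank line above can be sketched in a few lines of Python. This is only an illustration under the assumption that the input is already a decoded string; the names `split_lines` and `is_blank_line` are hypothetical and are not part of this spec or its tooling.

```python
import re

# A line ending is "\r\n", a lone "\r", or a lone "\n".
# "\r\n" is listed first so it is consumed as a single ending.
_LINE_ENDING = re.compile(r"\r\n|\r|\n")


def split_lines(text):
    """Split text into lines, dropping the line endings themselves."""
    lines = _LINE_ENDING.split(text)
    # A final line ending (or empty input) leaves an empty trailing piece
    # that is not a line of its own.
    if lines[-1] == "":
        lines.pop()
    return lines


def is_blank_line(line):
    """True if the line is empty or holds only spaces and tabs."""
    return all(ch in " \t" for ch in line)
```

For instance, `split_lines("a\r\nb\rc\n")` yields the three lines `a`, `b`, and `c`, regardless of which line ending style each one used.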
| |
| The following definitions of character classes will be used in this spec: |
| |
| A [whitespace character](@whitespace-character) is a space |
| (`U+0020`), tab (`U+0009`), newline (`U+000A`), line tabulation (`U+000B`), |
| form feed (`U+000C`), or carriage return (`U+000D`). |
| |
| [Whitespace](@whitespace) is a sequence of one or more [whitespace |
| character]s. |
| |
| A [Unicode whitespace character](@unicode-whitespace-character) is |
| any code point in the Unicode `Zs` class, or a tab (`U+0009`), |
| carriage return (`U+000D`), newline (`U+000A`), or form feed |
| (`U+000C`). |
| |
| [Unicode whitespace](@unicode-whitespace) is a sequence of one |
| or more [Unicode whitespace character]s. |
| |
| A [space](@space) is `U+0020`. |
| |
| A [non-whitespace character](@non-whitespace-character) is any character |
| that is not a [whitespace character]. |
| |
| An [ASCII punctuation character](@ascii-punctuation-character) |
| is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`, |
| `*`, `+`, `,`, `-`, `.`, `/`, `:`, `;`, `<`, `=`, `>`, `?`, `@`, |
| `[`, `\`, `]`, `^`, `_`, `` ` ``, `{`, `|`, `}`, or `~`. |
| |
| A [punctuation character](@punctuation-character) is an [ASCII |
| punctuation character] or anything in |
| the Unicode classes `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`. |
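As a rough sketch, the character classes defined above map onto Python's standard library as follows. The function names are hypothetical; the sketch assumes that `unicodedata.category` reflects the Unicode general categories named above, and it relies on `string.punctuation` containing the same 32 ASCII punctuation characters listed above.

```python
import string
import unicodedata

# space, tab, newline, line tabulation, form feed, carriage return
_WHITESPACE = {" ", "\t", "\n", "\x0b", "\x0c", "\r"}
_PUNCT_CLASSES = {"Pc", "Pd", "Pe", "Pf", "Pi", "Po", "Ps"}


def is_whitespace_char(ch):
    return ch in _WHITESPACE


def is_unicode_whitespace_char(ch):
    # Zs class, plus tab, carriage return, newline, and form feed.
    return unicodedata.category(ch) == "Zs" or ch in {"\t", "\r", "\n", "\x0c"}


def is_ascii_punctuation_char(ch):
    return ch in string.punctuation


def is_punctuation_char(ch):
    return is_ascii_punctuation_char(ch) or unicodedata.category(ch) in _PUNCT_CLASSES
```

Note that the two whitespace classes are not nested: line tabulation (`U+000B`) is a whitespace character but not a Unicode whitespace character, while a no-break space (`U+00A0`) is the reverse.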
| |
| ## Tabs |
| |
| Tabs in lines are not expanded to [spaces][space]. However, |
| in contexts where indentation is significant for the |
| document's structure, tabs behave as if they were replaced |
| by spaces with a tab stop of 4 characters. |
| |
| . |
| →foo→baz→→bim |
| . |
| <pre><code>foo→baz→→bim |
| </code></pre> |
| . |
| |
| . |
| →foo→baz→→bim |
| . |
| <pre><code>foo→baz→→bim |
| </code></pre> |
| . |
| |
| . |
| a→a |
| ὐ→a |
| . |
| <pre><code>a→a |
| ὐ→a |
| </code></pre> |
| . |
| |
| . |
| - foo |
| |
| →bar |
| . |
| <ul> |
| <li> |
| <p>foo</p> |
| <p>bar</p> |
| </li> |
| </ul> |
| . |
| |
| . |
| >→foo→bar |
| . |
| <blockquote> |
| <p>foo→bar</p> |
| </blockquote> |
| . |
| |
| . |
| foo |
| →bar |
| . |
| <pre><code>foo |
| bar |
| </code></pre> |
| . |
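One way to honor the tab rule without rewriting the line is to measure indentation in columns and leave the characters untouched. The following is a minimal sketch of that idea; `indent_width` and `TAB_STOP` are hypothetical names, not part of any reference implementation.

```python
TAB_STOP = 4


def indent_width(line):
    """Count the columns of leading indentation in a line.

    A space advances one column; a tab advances to the next multiple
    of TAB_STOP. The line itself is not modified.
    """
    column = 0
    for ch in line:
        if ch == " ":
            column += 1
        elif ch == "\t":
            column += TAB_STOP - (column % TAB_STOP)
        else:
            break
    return column
```

Under this sketch, a line that starts with two spaces and a tab has an indentation of four columns, the same as a line that starts with a single tab.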
| |
| |
| ## Insecure characters |
| |
| For security reasons, the Unicode character `U+0000` must be replaced |
| with the replacement character (`U+FFFD`). |
| |
| # Blocks and inlines |
| |
| We can think of a document as a sequence of |
| [blocks](@block)---structural elements like paragraphs, block |
| quotations, lists, headers, rules, and code blocks. Some blocks (like |
| block quotes and list items) contain other blocks; others (like |
| headers and paragraphs) contain [inline](@inline) content---text, |
| links, emphasized text, images, code, and so on. |
| |
| ## Precedence |
| |
| Indicators of block structure always take precedence over indicators |
| of inline structure. So, for example, the following is a list with |
| two items, not a list with one item containing a code span: |
| |
| . |
| - `one |
| - two` |
| . |
| <ul> |
| <li>`one</li> |
| <li>two`</li> |
| </ul> |
| . |
| |
| This means that parsing can proceed in two steps: first, the block |
| structure of the document can be discerned; second, text lines inside |
| paragraphs, headers, and other block constructs can be parsed for inline |
| structure. The second step requires information about link reference |
| definitions that will be available only at the end of the first |
| step. Note that the first step requires processing lines in sequence, |
| but the second can be parallelized, since the inline parsing of |
| one block element does not affect the inline parsing of any other. |
| |
| ## Container blocks and leaf blocks |
| |
| We can divide blocks into two types: |
| [container block](@container-block)s, |
| which can contain other blocks, and [leaf block](@leaf-block)s, |
| which cannot. |
| |
| # Leaf blocks |
| |
| This section describes the different kinds of leaf block that make up a |
| Markdown document. |
| |
| ## Horizontal rules |
| |
| A line consisting of 0-3 spaces of indentation, followed by a sequence |
| of three or more matching `-`, `_`, or `*` characters, each followed |
| optionally by any number of spaces, forms a |
| [horizontal rule](@horizontal-rule). |
| |
| . |
| *** |
| --- |
| ___ |
| . |
| <hr /> |
| <hr /> |
| <hr /> |
| . |
| |
| Wrong characters: |
| |
| . |
| +++ |
| . |
| <p>+++</p> |
| . |
| |
| . |
| === |
| . |
| <p>===</p> |
| . |
| |
| Not enough characters: |
| |
| . |
| -- |
| ** |
| __ |
| . |
| <p>-- |
| ** |
| __</p> |
| . |
| |
| One to three spaces indent are allowed: |
| |
| . |
| *** |
| *** |
| *** |
| . |
| <hr /> |
| <hr /> |
| <hr /> |
| . |
| |
| Four spaces is too many: |
| |
| . |
| *** |
| . |
| <pre><code>*** |
| </code></pre> |
| . |
| |
| . |
| Foo |
| *** |
| . |
| <p>Foo |
| ***</p> |
| . |
| |
| More than three characters may be used: |
| |
| . |
| _____________________________________ |
| . |
| <hr /> |
| . |
| |
| Spaces are allowed between the characters: |
| |
| . |
| - - - |
| . |
| <hr /> |
| . |
| |
| . |
| ** * ** * ** * ** |
| . |
| <hr /> |
| . |
| |
| . |
| - - - - |
| . |
| <hr /> |
| . |
| |
| Spaces are allowed at the end: |
| |
| . |
| - - - - |
| . |
| <hr /> |
| . |
| |
| However, no other characters may occur in the line: |
| |
| . |
| _ _ _ _ a |
| |
| a------ |
| |
| ---a--- |
| . |
| <p>_ _ _ _ a</p> |
| <p>a------</p> |
| <p>---a---</p> |
| . |
| |
| It is required that all of the [non-whitespace character]s be the same. |
| So, this is not a horizontal rule: |
| |
| . |
| *-* |
| . |
| <p><em>-</em></p> |
| . |
| |
| Horizontal rules do not need blank lines before or after: |
| |
| . |
| - foo |
| *** |
| - bar |
| . |
| <ul> |
| <li>foo</li> |
| </ul> |
| <hr /> |
| <ul> |
| <li>bar</li> |
| </ul> |
| . |
| |
| Horizontal rules can interrupt a paragraph: |
| |
| . |
| Foo |
| *** |
| bar |
| . |
| <p>Foo</p> |
| <hr /> |
| <p>bar</p> |
| . |
| |
| If a line of dashes that meets the above conditions for being a |
| horizontal rule could also be interpreted as the underline of a [setext |
| header], the interpretation as a |
| [setext header] takes precedence. Thus, for example, |
| this is a setext header, not a paragraph followed by a horizontal rule: |
| |
| . |
| Foo |
| --- |
| bar |
| . |
| <h2>Foo</h2> |
| <p>bar</p> |
| . |
| |
| When both a horizontal rule and a list item are possible |
| interpretations of a line, the horizontal rule takes precedence: |
| |
| . |
| * Foo |
| * * * |
| * Bar |
| . |
| <ul> |
| <li>Foo</li> |
| </ul> |
| <hr /> |
| <ul> |
| <li>Bar</li> |
| </ul> |
| . |
| |
| If you want a horizontal rule in a list item, use a different bullet: |
| |
| . |
| - Foo |
| - * * * |
| . |
| <ul> |
| <li>Foo</li> |
| <li> |
| <hr /> |
| </li> |
| </ul> |
| . |
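The line-level part of the horizontal rule definition can be approximated with a single regular expression. This is a sketch only: `is_horizontal_rule` is a hypothetical helper, it assumes the line ending has already been removed, and it deliberately ignores the precedence rules relative to setext headers and list items described above, which depend on surrounding context.

```python
import re

# 0-3 spaces, then three or more of the same rule character,
# each optionally followed by spaces, and nothing else.
_HRULE = re.compile(r" {0,3}([-_*])(?: *\1){2,} *")


def is_horizontal_rule(line):
    return _HRULE.fullmatch(line) is not None
```

The backreference `\1` enforces the requirement that every non-whitespace character in the line is the same one.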
| |
| ## ATX headers |
| |
| An [ATX header](@atx-header) |
| consists of a string of characters, parsed as inline content, between an |
| opening sequence of 1--6 unescaped `#` characters and an optional |
| closing sequence of any number of unescaped `#` characters. |
| The opening sequence of `#` characters cannot be followed directly by a |
| [non-whitespace character]. The optional closing sequence of `#`s must be |
| preceded by a [space] and may be followed by spaces only. The opening |
| `#` character may be indented 0-3 spaces. The raw contents of the |
| header are stripped of leading and trailing spaces before being parsed |
| as inline content. The header level is equal to the number of `#` |
| characters in the opening sequence. |
| |
| Simple headers: |
| |
| . |
| # foo |
| ## foo |
| ### foo |
| #### foo |
| ##### foo |
| ###### foo |
| . |
| <h1>foo</h1> |
| <h2>foo</h2> |
| <h3>foo</h3> |
| <h4>foo</h4> |
| <h5>foo</h5> |
| <h6>foo</h6> |
| . |
| |
| More than six `#` characters is not a header: |
| |
| . |
| ####### foo |
| . |
| <p>####### foo</p> |
| . |
| |
| At least one space is required between the `#` characters and the |
| header's contents, unless the header is empty. Note that many |
| implementations currently do not require the space. However, the |
| space was required by the |
| [original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py), |
| and it helps prevent things like the following from being parsed as |
| headers: |
| |
| . |
| #5 bolt |
| |
| #foobar |
| . |
| <p>#5 bolt</p> |
| <p>#foobar</p> |
| . |
| |
| This is not a header, because the first `#` is escaped: |
| |
| . |
| \## foo |
| . |
| <p>## foo</p> |
| . |
| |
| Contents are parsed as inlines: |
| |
| . |
| # foo *bar* \*baz\* |
| . |
| <h1>foo <em>bar</em> *baz*</h1> |
| . |
| |
| Leading and trailing blanks are ignored in parsing inline content: |
| |
| . |
| # foo |
| . |
| <h1>foo</h1> |
| . |
| |
| One to three spaces indentation are allowed: |
| |
| . |
| ### foo |
| ## foo |
| # foo |
| . |
| <h3>foo</h3> |
| <h2>foo</h2> |
| <h1>foo</h1> |
| . |
| |
| Four spaces are too much: |
| |
| . |
| # foo |
| . |
| <pre><code># foo |
| </code></pre> |
| . |
| |
| . |
| foo |
| # bar |
| . |
| <p>foo |
| # bar</p> |
| . |
| |
| A closing sequence of `#` characters is optional: |
| |
| . |
| ## foo ## |
| ### bar ### |
| . |
| <h2>foo</h2> |
| <h3>bar</h3> |
| . |
| |
| It need not be the same length as the opening sequence: |
| |
| . |
| # foo ################################## |
| ##### foo ## |
| . |
| <h1>foo</h1> |
| <h5>foo</h5> |
| . |
| |
| Spaces are allowed after the closing sequence: |
| |
| . |
| ### foo ### |
| . |
| <h3>foo</h3> |
| . |
| |
| A sequence of `#` characters with anything but [space]s following it |
| is not a closing sequence, but counts as part of the contents of the |
| header: |
| |
| . |
| ### foo ### b |
| . |
| <h3>foo ### b</h3> |
| . |
| |
| The closing sequence must be preceded by a space: |
| |
| . |
| # foo# |
| . |
| <h1>foo#</h1> |
| . |
| |
| Backslash-escaped `#` characters do not count as part |
| of the closing sequence: |
| |
| . |
| ### foo \### |
| ## foo #\## |
| # foo \# |
| . |
| <h3>foo ###</h3> |
| <h2>foo ###</h2> |
| <h1>foo #</h1> |
| . |
| |
| ATX headers need not be separated from surrounding content by blank |
| lines, and they can interrupt paragraphs: |
| |
| . |
| **** |
| ## foo |
| **** |
| . |
| <hr /> |
| <h2>foo</h2> |
| <hr /> |
| . |
| |
| . |
| Foo bar |
| # baz |
| Bar foo |
| . |
| <p>Foo bar</p> |
| <h1>baz</h1> |
| <p>Bar foo</p> |
| . |
| |
| ATX headers can be empty: |
| |
| . |
| ## |
| # |
| ### ### |
| . |
| <h2></h2> |
| <h1></h1> |
| <h3></h3> |
| . |
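The ATX header rules above can be sketched as a small line parser. The helper name `parse_atx_header` is hypothetical, and the sketch returns the raw header content before inline parsing, so backslash escapes such as `\#` are still present in the result. It assumes the line ending has already been removed.

```python
import re

# 0-3 spaces, 1-6 "#" characters, then whitespace or end of line.
_ATX_OPEN = re.compile(r"^ {0,3}(#{1,6})(?:[ \t]+|$)")
# Optional closing sequence: "#"s preceded by a space (or at the very
# start of the content) and followed only by spaces.
_ATX_CLOSE = re.compile(r"(?:^|[ ])#+ *$")


def parse_atx_header(line):
    """Return (level, raw_content) for an ATX header line, else None."""
    m = _ATX_OPEN.match(line)
    if m is None:
        return None
    level = len(m.group(1))
    content = _ATX_CLOSE.sub("", line[m.end():])
    return level, content.strip(" ")
```

For example, `parse_atx_header("## foo ##")` gives `(2, "foo")`, while `parse_atx_header("#5 bolt")` gives `None` because the opening sequence is followed directly by a non-whitespace character.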
| |
| ## Setext headers |
| |
| A [setext header](@setext-header) |
| consists of a line of text, containing at least one [non-whitespace character], |
| with no more than 3 spaces indentation, followed by a [setext header |
| underline]. The line of text must be |
| one that, were it not followed by the setext header underline, |
| would be interpreted as part of a paragraph: it cannot be |
| interpretable as a [code fence], [ATX header][ATX headers], |
| [block quote][block quotes], [horizontal rule][horizontal rules], |
| [list item][list items], or [HTML block][HTML blocks]. |
| |
| A [setext header underline](@setext-header-underline) is a sequence of |
| `=` characters or a sequence of `-` characters, with no more than 3 |
| spaces indentation and any number of trailing spaces. If a line |
| containing a single `-` can be interpreted as an |
| empty [list items], it should be interpreted this way |
| and not as a [setext header underline]. |
| |
| The header is a level 1 header if `=` characters are used in the |
| [setext header underline], and a level 2 |
| header if `-` characters are used. The contents of the header are the |
| result of parsing the first line as Markdown inline content. |
| |
| In general, a setext header need not be preceded or followed by a |
| blank line. However, it cannot interrupt a paragraph, so when a |
| setext header comes after a paragraph, a blank line is needed between |
| them. |
| |
| Simple examples: |
| |
| . |
| Foo *bar* |
| ========= |
| |
| Foo *bar* |
| --------- |
| . |
| <h1>Foo <em>bar</em></h1> |
| <h2>Foo <em>bar</em></h2> |
| . |
| |
| The underlining can be any length: |
| |
| . |
| Foo |
| ------------------------- |
| |
| Foo |
| = |
| . |
| <h2>Foo</h2> |
| <h1>Foo</h1> |
| . |
| |
| The header content can be indented up to three spaces, and need |
| not line up with the underlining: |
| |
| . |
| Foo |
| --- |
| |
| Foo |
| ----- |
| |
| Foo |
| === |
| . |
| <h2>Foo</h2> |
| <h2>Foo</h2> |
| <h1>Foo</h1> |
| . |
| |
| Four spaces indent is too much: |
| |
| . |
| Foo |
| --- |
| |
| Foo |
| --- |
| . |
| <pre><code>Foo |
| --- |
| |
| Foo |
| </code></pre> |
| <hr /> |
| . |
| |
| The setext header underline can be indented up to three spaces, and |
| may have trailing spaces: |
| |
| . |
| Foo |
| ---- |
| . |
| <h2>Foo</h2> |
| . |
| |
| Four spaces is too much: |
| |
| . |
| Foo |
| --- |
| . |
| <p>Foo |
| ---</p> |
| . |
| |
| The setext header underline cannot contain internal spaces: |
| |
| . |
| Foo |
| = = |
| |
| Foo |
| --- - |
| . |
| <p>Foo |
| = =</p> |
| <p>Foo</p> |
| <hr /> |
| . |
| |
| Trailing spaces in the content line do not cause a line break: |
| |
| . |
| Foo |
| ----- |
| . |
| <h2>Foo</h2> |
| . |
| |
| Nor does a backslash at the end: |
| |
| . |
| Foo\ |
| ---- |
| . |
| <h2>Foo\</h2> |
| . |
| |
| Since indicators of block structure take precedence over |
| indicators of inline structure, the following are setext headers: |
| |
| . |
| `Foo |
| ---- |
| ` |
| |
| <a title="a lot |
| --- |
| of dashes"/> |
| . |
| <h2>`Foo</h2> |
| <p>`</p> |
| <h2>&lt;a title=&quot;a lot</h2> |
| <p>of dashes&quot;/&gt;</p> |
| . |
| |
| The setext header underline cannot be a [lazy continuation |
| line] in a list item or block quote: |
| |
| . |
| > Foo |
| --- |
| . |
| <blockquote> |
| <p>Foo</p> |
| </blockquote> |
| <hr /> |
| . |
| |
| . |
| - Foo |
| --- |
| . |
| <ul> |
| <li>Foo</li> |
| </ul> |
| <hr /> |
| . |
| |
| A setext header cannot interrupt a paragraph: |
| |
| . |
| Foo |
| Bar |
| --- |
| |
| Foo |
| Bar |
| === |
| . |
| <p>Foo |
| Bar</p> |
| <hr /> |
| <p>Foo |
| Bar |
| ===</p> |
| . |
| |
| But in general a blank line is not required before or after: |
| |
| . |
| --- |
| Foo |
| --- |
| Bar |
| --- |
| Baz |
| . |
| <hr /> |
| <h2>Foo</h2> |
| <h2>Bar</h2> |
| <p>Baz</p> |
| . |
| |
| Setext headers cannot be empty: |
| |
| . |
| |
| ==== |
| . |
| <p>====</p> |
| . |
| |
| Setext header text lines must not be interpretable as block |
| constructs other than paragraphs. So, the line of dashes |
| in these examples gets interpreted as a horizontal rule: |
| |
| . |
| --- |
| --- |
| . |
| <hr /> |
| <hr /> |
| . |
| |
| . |
| - foo |
| ----- |
| . |
| <ul> |
| <li>foo</li> |
| </ul> |
| <hr /> |
| . |
| |
| . |
| foo |
| --- |
| . |
| <pre><code>foo |
| </code></pre> |
| <hr /> |
| . |
| |
| . |
| > foo |
| ----- |
| . |
| <blockquote> |
| <p>foo</p> |
| </blockquote> |
| <hr /> |
| . |
| |
| If you want a header with `> foo` as its literal text, you can |
| use backslash escapes: |
| |
| . |
| \> foo |
| ------ |
| . |
| <h2>&gt; foo</h2> |
| . |
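Recognizing a setext header underline in isolation is straightforward, as the sketch below suggests. The name `setext_underline_level` is hypothetical. The sketch checks only the shape of the underline line itself; it does not check that the preceding line is paragraph text, and it does not handle the case where a single `-` should be read as an empty list item.

```python
import re

# 0-3 spaces, a run of "=" or a run of "-", then optional trailing spaces.
_SETEXT_UNDERLINE = re.compile(r" {0,3}(=+|-+) *")


def setext_underline_level(line):
    """Return 1 for an "=" underline, 2 for a "-" underline, else None."""
    m = _SETEXT_UNDERLINE.fullmatch(line)
    if m is None:
        return None
    return 1 if m.group(1)[0] == "=" else 2
```

A block parser would call this only when the previous line is an open paragraph of exactly one line; otherwise a line of dashes falls through to the horizontal rule check.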
| |
| ## Indented code blocks |
| |
| An [indented code block](@indented-code-block) is composed of one or more |
| [indented chunk]s separated by blank lines. |
| An [indented chunk](@indented-chunk) is a sequence of non-blank lines, |
| each indented four or more spaces. The contents of the code block are |
| the literal contents of the lines, including trailing |
| [line ending]s, minus four spaces of indentation. |
| An indented code block has no [info string]. |
| |
| An indented code block cannot interrupt a paragraph, so there must be |
| a blank line between a paragraph and a following indented code block. |
| (A blank line is not needed, however, between a code block and a following |
| paragraph.) |
| |
| . |
| a simple |
| indented code block |
| . |
| <pre><code>a simple |
| indented code block |
| </code></pre> |
| . |
| |
| If there is any ambiguity between an interpretation of indentation |
| as a code block and as indicating that material belongs to a [list |
| item][list items], the list item interpretation takes precedence: |
| |
| . |
| - foo |
| |
| bar |
| . |
| <ul> |
| <li> |
| <p>foo</p> |
| <p>bar</p> |
| </li> |
| </ul> |
| . |
| |
| . |
| 1. foo |
| |
| - bar |
| . |
| <ol> |
| <li> |
| <p>foo</p> |
| <ul> |
| <li>bar</li> |
| </ul> |
| </li> |
| </ol> |
| . |
| |
| |
| The contents of a code block are literal text, and do not get parsed |
| as Markdown: |
| |
| . |
| <a/> |
| *hi* |
| |
| - one |
| . |
| <pre><code>&lt;a/&gt; |
| *hi* |
| |
| - one |
| </code></pre> |
| . |
| |
| Here we have three chunks separated by blank lines: |
| |
| . |
| chunk1 |
| |
| chunk2 |
| |
| |
| |
| chunk3 |
| . |
| <pre><code>chunk1 |
| |
| chunk2 |
| |
| |
| |
| chunk3 |
| </code></pre> |
| . |
| |
| Any initial spaces beyond four will be included in the content, even |
| in interior blank lines: |
| |
| . |
| chunk1 |
| |
| chunk2 |
| . |
| <pre><code>chunk1 |
| |
| chunk2 |
| </code></pre> |
| . |
| |
| An indented code block cannot interrupt a paragraph. (This |
| allows hanging indents and the like.) |
| |
| . |
| Foo |
| bar |
| |
| . |
| <p>Foo |
| bar</p> |
| . |
| |
| However, any non-blank line with fewer than four leading spaces ends |
| the code block immediately. So a paragraph may occur immediately |
| after indented code: |
| |
| . |
| foo |
| bar |
| . |
| <pre><code>foo |
| </code></pre> |
| <p>bar</p> |
| . |
| |
| And indented code can occur immediately before and after other kinds of |
| blocks: |
| |
| . |
| # Header |
| foo |
| Header |
| ------ |
| foo |
| ---- |
| . |
| <h1>Header</h1> |
| <pre><code>foo |
| </code></pre> |
| <h2>Header</h2> |
| <pre><code>foo |
| </code></pre> |
| <hr /> |
| . |
| |
| The first line can be indented more than four spaces: |
| |
| . |
| foo |
| bar |
| . |
| <pre><code> foo |
| bar |
| </code></pre> |
| . |
| |
| Blank lines preceding or following an indented code block |
| are not included in it: |
| |
| . |
| |
| |
| foo |
| |
| |
| . |
| <pre><code>foo |
| </code></pre> |
| . |
| |
| Trailing spaces are included in the code block's content: |
| |
| . |
| foo |
| . |
| <pre><code>foo |
| </code></pre> |
| . |
| |
| |
| ## Fenced code blocks |
| |
| A [code fence](@code-fence) is a sequence |
| of at least three consecutive backtick characters (`` ` ``) or |
| tildes (`~`). (Tildes and backticks cannot be mixed.) |
| A [fenced code block](@fenced-code-block) |
| begins with a code fence, indented no more than three spaces. |
| |
| The line with the opening code fence may optionally contain some text |
| following the code fence; this is trimmed of leading and trailing |
| spaces and called the [info string](@info-string). |
| The [info string] may not contain any backtick |
| characters. (The reason for this restriction is that otherwise |
| some inline code would be incorrectly interpreted as the |
| beginning of a fenced code block.) |
| |
| The content of the code block consists of all subsequent lines, until |
| a closing [code fence] of the same type as the code block |
| began with (backticks or tildes), and with at least as many backticks |
| or tildes as the opening code fence. If the leading code fence is |
| indented N spaces, then up to N spaces of indentation are removed from |
| each line of the content (if present). (If a content line is not |
| indented, it is preserved unchanged. If it is indented less than N |
| spaces, all of the indentation is removed.) |
| |
| The closing code fence may be indented up to three spaces, and may be |
| followed only by spaces, which are ignored. If the end of the |
| containing block (or document) is reached and no closing code fence |
| has been found, the code block contains all of the lines after the |
| opening code fence until the end of the containing block (or |
| document). (An alternative spec would require backtracking in the |
| event that a closing code fence is not found. But this makes parsing |
| much less efficient, and there seems to be no real downside to the |
| behavior described here.) |
| |
| A fenced code block may interrupt a paragraph, and does not require |
| a blank line either before or after. |
| |
| The content of a fenced code block is treated as literal text, not parsed |
| as inlines. The first word of the [info string] is typically used to |
| specify the language of the code sample, and rendered in the `class` |
| attribute of the `code` tag. However, this spec does not mandate any |
| particular treatment of the [info string]. |
| |
| Here is a simple example with backticks: |
| |
| . |
| ``` |
| < |
| > |
| ``` |
| . |
| <pre><code>&lt; |
| &gt; |
| </code></pre> |
| . |
| |
| With tildes: |
| |
| . |
| ~~~ |
| < |
| > |
| ~~~ |
| . |
| <pre><code>&lt; |
| &gt; |
| </code></pre> |
| . |
| |
| The closing code fence must use the same character as the opening |
| fence: |
| |
| . |
| ``` |
| aaa |
| ~~~ |
| ``` |
| . |
| <pre><code>aaa |
| ~~~ |
| </code></pre> |
| . |
| |
| . |
| ~~~ |
| aaa |
| ``` |
| ~~~ |
| . |
| <pre><code>aaa |
| ``` |
| </code></pre> |
| . |
| |
| The closing code fence must be at least as long as the opening fence: |
| |
| . |
| ```` |
| aaa |
| ``` |
| `````` |
| . |
| <pre><code>aaa |
| ``` |
| </code></pre> |
| . |
| |
| . |
| ~~~~ |
| aaa |
| ~~~ |
| ~~~~ |
| . |
| <pre><code>aaa |
| ~~~ |
| </code></pre> |
| . |
| |
| Unclosed code blocks are closed by the end of the document |
| (or the enclosing [block quote] or [list item]): |
| |
| . |
| ``` |
| . |
| <pre><code></code></pre> |
| . |
| |
| . |
| ````` |
| |
| ``` |
| aaa |
| . |
| <pre><code> |
| ``` |
| aaa |
| </code></pre> |
| . |
| |
| . |
| > ``` |
| > aaa |
| |
| bbb |
| . |
| <blockquote> |
| <pre><code>aaa |
| </code></pre> |
| </blockquote> |
| <p>bbb</p> |
| . |
| |
| A code block can have all empty lines as its content: |
| |
| . |
| ``` |
| |
| |
| ``` |
| . |
| <pre><code> |
| |
| </code></pre> |
| . |
| |
| A code block can be empty: |
| |
| . |
| ``` |
| ``` |
| . |
| <pre><code></code></pre> |
| . |
| |
| Fences can be indented. If the opening fence is indented, |
| content lines will have equivalent opening indentation removed, |
| if present: |
| |
| . |
| ``` |
| aaa |
| aaa |
| ``` |
| . |
| <pre><code>aaa |
| aaa |
| </code></pre> |
| . |
| |
| . |
| ``` |
| aaa |
| aaa |
| aaa |
| ``` |
| . |
| <pre><code>aaa |
| aaa |
| aaa |
| </code></pre> |
| . |
| |
| . |
| ``` |
| aaa |
| aaa |
| aaa |
| ``` |
| . |
| <pre><code>aaa |
| aaa |
| aaa |
| </code></pre> |
| . |
| |
| Four spaces indentation produces an indented code block: |
| |
| . |
| ``` |
| aaa |
| ``` |
| . |
| <pre><code>``` |
| aaa |
| ``` |
| </code></pre> |
| . |
| |
| Closing fences may be indented by 0-3 spaces, and their indentation |
| need not match that of the opening fence: |
| |
| . |
| ``` |
| aaa |
| ``` |
| . |
| <pre><code>aaa |
| </code></pre> |
| . |
| |
| . |
| ``` |
| aaa |
| ``` |
| . |
| <pre><code>aaa |
| </code></pre> |
| . |
| |
| This is not a closing fence, because it is indented 4 spaces: |
| |
| . |
| ``` |
| aaa |
| ``` |
| . |
| <pre><code>aaa |
| ``` |
| </code></pre> |
| . |
| |
| |
| Code fences (opening and closing) cannot contain internal spaces: |
| |
| . |
| ``` ``` |
| aaa |
| . |
| <p><code></code> |
| aaa</p> |
| . |
| |
| . |
| ~~~~~~ |
| aaa |
| ~~~ ~~ |
| . |
| <pre><code>aaa |
| ~~~ ~~ |
| </code></pre> |
| . |
| |
| Fenced code blocks can interrupt paragraphs, and can be followed |
| directly by paragraphs, without a blank line between: |
| |
| . |
| foo |
| ``` |
| bar |
| ``` |
| baz |
| . |
| <p>foo</p> |
| <pre><code>bar |
| </code></pre> |
| <p>baz</p> |
| . |
| |
| Other blocks can also occur before and after fenced code blocks |
| without an intervening blank line: |
| |
| . |
| foo |
| --- |
| ~~~ |
| bar |
| ~~~ |
| # baz |
| . |
| <h2>foo</h2> |
| <pre><code>bar |
| </code></pre> |
| <h1>baz</h1> |
| . |
| |
| An [info string] can be provided after the opening code fence. |
| Opening and closing spaces will be stripped, and the first word, prefixed |
| with `language-`, is used as the value for the `class` attribute of the |
| `code` element within the enclosing `pre` element. |
| |
| . |
| ```ruby |
| def foo(x) |
| return 3 |
| end |
| ``` |
| . |
| <pre><code class="language-ruby">def foo(x) |
| return 3 |
| end |
| </code></pre> |
| . |
| |
| . |
| ~~~~ ruby startline=3 $%@#$ |
| def foo(x) |
| return 3 |
| end |
| ~~~~~~~ |
| . |
| <pre><code class="language-ruby">def foo(x) |
| return 3 |
| end |
| </code></pre> |
| . |
| |
| . |
| ````; |
| ```` |
| . |
| <pre><code class="language-;"></code></pre> |
| . |
| |
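| As a rough sketch of how the `class` value is derived from an info |
| string (the function name `code_class` is hypothetical, and HTML |
| escaping of the resulting attribute value is left out): |
| |
```python
def code_class(info_string):
    """Return the value for the `class` attribute of the `code` element,
    or None when the info string is empty."""
    # Leading and trailing spaces are stripped; only the first word counts.
    words = info_string.strip().split()
    if not words:
        return None
    return "language-" + words[0]
```
| |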
| [Info string]s for backtick code blocks cannot contain backticks: |
| |
| . |
| ``` aa ``` |
| foo |
| . |
| <p><code>aa</code> |
| foo</p> |
| . |
| |
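| Recognizing an opening fence might be sketched as follows. The names |
| `OPENING_FENCE` and `parse_opening_fence` are hypothetical. The sketch |
| rejects backticks in the info string of a backtick fence, as in the |
| example above; leaving the info string of a tilde fence unrestricted |
| is an assumption of this sketch, not something the examples here |
| settle. |
| |
```python
import re

# 0-3 spaces of indentation, at least three backticks or three tildes
# (never mixed), then the rest of the line as the raw info string.
OPENING_FENCE = re.compile(r"^( {0,3})(`{3,}|~{3,})(.*)$")

def parse_opening_fence(line):
    """Return (fence_char, fence_length, indent, info_string), or None
    if `line` does not open a fenced code block."""
    m = OPENING_FENCE.match(line)
    if m is None:
        return None
    indent, fence, info = m.groups()
    # A backtick fence whose info string contains a backtick is not a
    # fence at all; the line is left to be parsed as inline code.
    if fence[0] == "`" and "`" in info:
        return None
    return fence[0], len(fence), len(indent), info.strip()
```
| |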
| Closing code fences cannot have [info string]s: |
| |
| . |
| ``` |
| ``` aaa |
| ``` |
| . |
| <pre><code>``` aaa |
| </code></pre> |
| . |
| |
| |
| ## HTML blocks |
| |
| An [HTML block](@html-block) is a group of lines that is treated |
| as raw HTML (and will not be escaped in HTML output). |
| |
| There are seven kinds of [HTML block], which can be defined |
| by their start and end conditions. The block begins with a line that |
| meets a [start condition](@start-condition) (after up to three spaces |
| of optional indentation). It ends with the first subsequent line that |
| meets a matching [end condition](@end-condition), or the last line of |
| the document, if no line is encountered that meets the |
| [end condition]. If the first line meets both the [start condition] |
| and the [end condition], the block will contain just that line. |
| |
| 1. **Start condition:** line begins with the string `<script`, |
| `<pre`, or `<style` (case-insensitive), followed by whitespace, |
| the string `>`, or the end of the line.\ |
| **End condition:** line contains an end tag |
| `</script>`, `</pre>`, or `</style>` (case-insensitive; it |
| need not match the start tag). |
| |
| 2. **Start condition:** line begins with the string `<!--`.\ |
| **End condition:** line contains the string `-->`. |
| |
| 3. **Start condition:** line begins with the string `<?`.\ |
| **End condition:** line contains the string `?>`. |
| |
| 4. **Start condition:** line begins with the string `<!` |
| followed by an uppercase ASCII letter.\ |
| **End condition:** line contains the character `>`. |
| |
| 5. **Start condition:** line begins with the string |
| `<![CDATA[`.\ |
| **End condition:** line contains the string `]]>`. |
| |
| 6. **Start condition:** line begins with the string `<` or `</` |
| followed by one of the strings (case-insensitive) `address`, |
| `article`, `aside`, `base`, `basefont`, `blockquote`, `body`, |
| `caption`, `center`, `col`, `colgroup`, `dd`, `details`, `dialog`, |
| `dir`, `div`, `dl`, `dt`, `fieldset`, `figcaption`, `figure`, |
| `footer`, `form`, `frame`, `frameset`, `h1`, `head`, `header`, `hr`, |
| `html`, `iframe`, `legend`, `li`, `link`, `main`, `menu`, `menuitem`, |
| `meta`, `nav`, `noframes`, `ol`, `optgroup`, `option`, `p`, `param`, |
| `section`, `source`, `summary`, `table`, `tbody`, `td`, |
| `tfoot`, `th`, `thead`, `title`, `tr`, `track`, `ul`, followed |
| by [whitespace], the end of the line, the string `>`, or |
| the string `/>`.\ |
| **End condition:** line is followed by a [blank line]. |
| |
| 7. **Start condition:** line begins with a complete [open tag] |
| or [closing tag] (with any [tag name] other than `script`, |
| `style`, or `pre`) followed only by [whitespace] |
| or the end of the line.\ |
| **End condition:** line is followed by a [blank line]. |
| |
| All types of [HTML blocks] except type 7 may interrupt |
| a paragraph. Blocks of type 7 may not interrupt a paragraph. |
| (This restriction is intended to prevent unwanted interpretation |
| of long tags inside a wrapped paragraph as starting HTML blocks.) |
| |
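| The start conditions for types 1 to 6 lend themselves to a |
| table-driven check. The following is a sketch only: the names |
| `BLOCK_TAGS`, `START_CONDITIONS`, and `html_block_start_type` are |
| hypothetical, and type 7 is omitted because it needs a full parser |
| for [open tag]s and [closing tag]s. |
| |
```python
import re

# Tag names from start condition 6, as listed above.
BLOCK_TAGS = (
    "address|article|aside|base|basefont|blockquote|body|caption|center|col|"
    "colgroup|dd|details|dialog|dir|div|dl|dt|fieldset|figcaption|figure|"
    "footer|form|frame|frameset|h1|head|header|hr|html|iframe|legend|li|link|"
    "main|menu|menuitem|meta|nav|noframes|ol|optgroup|option|p|param|"
    "section|source|summary|table|tbody|td|tfoot|th|thead|title|tr|track|ul"
)

# (type, pattern) pairs, tried in order against the de-indented line.
START_CONDITIONS = [
    (1, re.compile(r"<(?:script|pre|style)(?:\s|>|$)", re.IGNORECASE)),
    (2, re.compile(r"<!--")),
    (3, re.compile(r"<\?")),
    (4, re.compile(r"<![A-Z]")),
    (5, re.compile(r"<!\[CDATA\[")),
    (6, re.compile(r"</?(?:" + BLOCK_TAGS + r")(?:\s|/?>|$)", re.IGNORECASE)),
]

def html_block_start_type(line):
    """Return the HTML block type (1-6) whose start condition `line`
    meets, or None. Type 7 is not handled by this sketch."""
    stripped = line.lstrip(" ")
    if len(line) - len(stripped) > 3:
        return None  # four or more spaces: an indented code block instead
    for kind, pattern in START_CONDITIONS:
        if pattern.match(stripped):
            return kind
    return None
```
| |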
| Some simple examples follow. Here are some basic HTML blocks |
| of type 6: |
| |
| . |
| <table> |
| <tr> |
| <td> |
| hi |
| </td> |
| </tr> |
| </table> |
| |
| okay. |
| . |
| <table> |
| <tr> |
| <td> |
| hi |
| </td> |
| </tr> |
| </table> |
| <p>okay.</p> |
| . |
| |
| . |
| <div> |
| *hello* |
| <foo><a> |
| . |
| <div> |
| *hello* |
| <foo><a> |
| . |
| |
| A block can also start with a closing tag: |
| |
| . |
| </div> |
| *foo* |
| . |
| </div> |
| *foo* |
| . |
| |
| Here we have two HTML blocks with a Markdown paragraph between them: |
| |
| . |
| <DIV CLASS="foo"> |
| |
| *Markdown* |
| |
| </DIV> |
| . |
| <DIV CLASS="foo"> |
| <p><em>Markdown</em></p> |
| </DIV> |
| . |
| |
| The tag on the first line can be partial, as long |
| as it is split where there would be whitespace: |
| |
| . |
| <div id="foo" |
| class="bar"> |
| </div> |
| . |
| <div id="foo" |
| class="bar"> |
| </div> |
| . |
| |
| . |
| <div id="foo" class="bar |
| baz"> |
| </div> |
| . |
| <div id="foo" class="bar |
| baz"> |
| </div> |
| . |
| |
| An open tag need not be closed: |
| |
| . |
| <div> |
| *foo* |
| |
| *bar* |
| . |
| <div> |
| *foo* |
| <p><em>bar</em></p> |
| . |
| |
| |
| A partial tag need not even be completed (garbage |
| in, garbage out): |
| |
| . |
| <div id="foo" |
| *hi* |
| . |
| <div id="foo" |
| *hi* |
| . |
| |
| . |
| <div class |
| foo |
| . |
| <div class |
| foo |
| . |
| |
| The initial tag doesn't even need to be a valid |
| tag, as long as it starts like one: |
| |
| . |
| <div *???-&&&-<--- |
| *foo* |
| . |
| <div *???-&&&-<--- |
| *foo* |
| . |
| |
| In type 6 blocks, the initial tag need not be on a line by |
| itself: |
| |
| . |
| <div><a href="bar">*foo*</a></div> |
| . |
| <div><a href="bar">*foo*</a></div> |
| . |
| |
| . |
| <table><tr><td> |
| foo |
| </td></tr></table> |
| . |
| <table><tr><td> |
| foo |
| </td></tr></table> |
| . |
| |
| Everything until the next blank line or end of document |
| gets included in the HTML block. So, in the following |
| example, what looks like a Markdown code block |
| is actually part of the HTML block: |
| |
| . |
| <div></div> |
| ``` c |
| int x = 33; |
| ``` |
| . |
| <div></div> |
| ``` c |
| int x = 33; |
| ``` |
| . |
| |
| To start an [HTML block] with a tag that is *not* in the |
| list of block-level tags in (6), you must put the tag by |
| itself on the first line (and it must be complete): |
| |
| . |
| <a href="foo"> |
| *bar* |
| </a> |
| . |
| <a href="foo"> |
| *bar* |
| </a> |
| . |
| |
| In type 7 blocks, the [tag name] can be anything: |
| |
| . |
| <Warning> |
| *bar* |
| </Warning> |
| . |
| <Warning> |
| *bar* |
| </Warning> |
| . |
| |
| . |
| <i class="foo"> |
| *bar* |
| </i> |
| . |
| <i class="foo"> |
| *bar* |
| </i> |
| . |
| |
| . |
| </ins> |
| *bar* |
| . |
| </ins> |
| *bar* |
| . |
| |
| These rules are designed to allow us to work with tags that |
| can function as either block-level or inline-level tags. |
| The `<del>` tag is a nice example. We can surround content with |
| `<del>` tags in three different ways. In this case, we get a raw |
| HTML block, because the `<del>` tag is on a line by itself: |
| |
| . |
| <del> |
| *foo* |
| </del> |
| . |
| <del> |
| *foo* |
| </del> |
| . |
| |
| In this case, we get a raw HTML block that just includes |
| the `<del>` tag (because it ends with the following blank |
| line). So the contents get interpreted as CommonMark: |
| |
| . |
| <del> |
| |
| *foo* |
| |
| </del> |
| . |
| <del> |
| <p><em>foo</em></p> |
| </del> |
| . |
| |
| Finally, in this case, the `<del>` tags are interpreted |
| as [raw HTML] *inside* the CommonMark paragraph. (Because |
| the tag is not on a line by itself, we get inline HTML |
| rather than an [HTML block].) |
| |
| . |
| <del>*foo*</del> |
| . |
| <p><del><em>foo</em></del></p> |
| . |
| |
| HTML tags designed to contain literal content |
| (`script`, `style`, `pre`), comments, processing instructions, |
| and declarations are treated somewhat differently. |
| Instead of ending at the first blank line, these blocks |
| end at the first line containing a corresponding end tag. |
| As a result, these blocks can contain blank lines: |
| |
| A pre tag (type 1): |
| |
| . |
| <pre language="haskell"><code> |
| import Text.HTML.TagSoup |
| |
| main :: IO () |
| main = print $ parseTags tags |
| </code></pre> |
| . |
| <pre language="haskell"><code> |
| import Text.HTML.TagSoup |
| |
| main :: IO () |
| main = print $ parseTags tags |
| </code></pre> |
| . |
| |
| A script tag (type 1): |
| |
| . |
| <script type="text/javascript"> |
| // JavaScript example |
| |
| document.getElementById("demo").innerHTML = "Hello JavaScript!"; |
| </script> |
| . |
| <script type="text/javascript"> |
| // JavaScript example |
| |
| document.getElementById("demo").innerHTML = "Hello JavaScript!"; |
| </script> |
| . |
| |
| A style tag (type 1): |
| |
| . |
| <style |
| type="text/css"> |
| h1 {color:red;} |
| |
| p {color:blue;} |
| </style> |
| . |
| <style |
| type="text/css"> |
| h1 {color:red;} |
| |
| p {color:blue;} |
| </style> |
| . |
| |
| If there is no matching end tag, the block will end at the |
| end of the document (or the enclosing [block quote] or |
| [list item]): |
| |
| . |
| <style |
| type="text/css"> |
| |
| foo |
| . |
| <style |
| type="text/css"> |
| |
| foo |
| . |
| |
| . |
| > <div> |
| > foo |
| |
| bar |
| . |
| <blockquote> |
| <div> |
| foo |
| </blockquote> |
| <p>bar</p> |
| . |
| |
| . |
| - <div> |
| - foo |
| . |
| <ul> |
| <li> |
| <div> |
| </li> |
| <li>foo</li> |
| </ul> |
| . |
| |
| The end tag can occur on the same line as the start tag: |
| |
| . |
| <style>p{color:red;}</style> |
| *foo* |
| . |
| <style>p{color:red;}</style> |
| <p><em>foo</em></p> |
| . |
| |
| . |
| <!-- foo -->*bar* |
| *baz* |
| . |
| <!-- foo -->*bar* |
| <p><em>baz</em></p> |
| . |
| |
| Note that anything on the last line after the |
| end tag will be included in the [HTML block]: |
| |
| . |
| <script> |
| foo |
| </script>1. *bar* |
| . |
| <script> |
| foo |
| </script>1. *bar* |
| . |
| |
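| The difference between the two families of end conditions can be |
| sketched as follows. The names `END_MARKERS` and `html_block_length` |
| are hypothetical, the block type is assumed to be known already, and |
| the sketch ignores enclosing [block quote]s and [list item]s, which |
| can end a block early. |
| |
```python
import re

# End markers for block types 1-5; a block of these types ends with the
# first line (possibly the start line itself) that contains the marker.
END_MARKERS = {
    1: re.compile(r"</(?:script|pre|style)>", re.IGNORECASE),
    2: re.compile(r"-->"),
    3: re.compile(r"\?>"),
    4: re.compile(r">"),
    5: re.compile(r"\]\]>"),
}

def html_block_length(lines, kind):
    """Return how many of `lines` belong to the HTML block of type `kind`
    that starts at lines[0]."""
    if kind in END_MARKERS:
        for i, line in enumerate(lines):
            if END_MARKERS[kind].search(line):
                return i + 1  # the whole end line is part of the block
        return len(lines)     # no end marker: runs to the end of input
    # Types 6 and 7 stop before the first blank line.
    for i, line in enumerate(lines):
        if line.strip() == "":
            return i
    return len(lines)
```
| |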
| A comment (type 2): |
| |
| . |
| <!-- Foo |
| |
| bar |
| baz --> |
| . |
| <!-- Foo |
| |
| bar |
| baz --> |
| . |
| |
| |
| A processing instruction (type 3): |
| |
| . |
| <?php |
| |
| echo '>'; |
| |
| ?> |
| . |
| <?php |
| |
| echo '>'; |
| |
| ?> |
| . |
| |
| A declaration (type 4): |
| |
| . |
| <!DOCTYPE html> |
| . |
| <!DOCTYPE html> |
| . |
| |
| CDATA (type 5): |
| |
| . |
| <![CDATA[ |
| function matchwo(a,b) |
| { |
| if (a < b && a < 0) then { |
| return 1; |
| |
| } else { |
| |
| return 0; |
| } |
| } |
| ]]> |
| . |
| <![CDATA[ |
| function matchwo(a,b) |
| { |
| if (a < b && a < 0) then { |
| return 1; |
| |
| } else { |
| |
| return 0; |
| } |
| } |
| ]]> |
| . |
| |
| The opening tag can be indented 1-3 spaces, but not 4: |
| |
| . |
| <!-- foo --> |
| |
| <!-- foo --> |
| . |
| <!-- foo --> |
| <pre><code><!-- foo --> |
| </code></pre> |
| . |
| |
| . |
| <div> |
| |
| <div> |
| . |
| <div> |
| <pre><code><div> |
| </code></pre> |
| . |
| |
| An HTML block of types 1--6 can interrupt a paragraph, and need not be |
| preceded by a blank line. |
| |
| . |
| Foo |
| <div> |
| bar |
| </div> |
| . |
| <p>Foo</p> |
| <div> |
| bar |
| </div> |
| . |
| |
| However, a following blank line is needed, except at the end of |
| a document, and except for blocks of types 1--5, above: |
| |
| . |
| <div> |
| bar |
| </div> |
| *foo* |
| . |
| <div> |
| bar |
| </div> |
| *foo* |
| . |
| |
| HTML blocks of type 7 cannot interrupt a paragraph: |
| |
| . |
| Foo |
| <a href="bar"> |
| baz |
| . |
| <p>Foo |
| <a href="bar"> |
| baz</p> |
| . |
| |
| This rule differs from John Gruber's original Markdown syntax |
| specification, which says: |
| |
| > The only restrictions are that block-level HTML elements — |
| > e.g. `<div>`, `<table>`, `<pre>`, `<p>`, etc. — must be separated from |
| > surrounding content by blank lines, and the start and end tags of the |
| > block should not be indented with tabs or spaces. |
| |
| In some ways Gruber's rule is more restrictive than the one given |
| here: |
| |
| - It requires that an HTML block be preceded by a blank line. |
| - It does not allow the start tag to be indented. |
| - It requires a matching end tag, which it also does not allow to |
| be indented. |
| |
| Most Markdown implementations (including some of Gruber's own) do not |
| respect all of these restrictions. |
| |
| There is one respect, however, in which Gruber's rule is more liberal |
| than the one given here, since it allows blank lines to occur inside |
| an HTML block. There are two reasons for disallowing them here. |
| First, it removes the need to parse balanced tags, which is |
| expensive and can require backtracking from the end of the document |
| if no matching end tag is found. Second, it provides a very simple |
| and flexible way of including Markdown content inside HTML tags: |
| simply separate the Markdown from the HTML using blank lines. |
| |
| Compare: |
| |
| . |
| <div> |
| |
| *Emphasized* text. |
| |
| </div> |
| . |
| <div> |
| <p><em>Emphasized</em> text.</p> |
| </div> |
| . |
| |
| . |
| <div> |
| *Emphasized* text. |
| </div> |
| . |
| <div> |
| *Emphasized* text. |
| </div> |
| . |
| |
| Some Markdown implementations have adopted a convention of |
| interpreting content inside tags as Markdown if the open tag has |
| the attribute `markdown=1`. The rule given above seems a simpler and |
| more elegant way of achieving the same expressive power, which is also |
| much simpler to parse. |
| |
| The main potential drawback is that one can no longer paste HTML |
| blocks into Markdown documents with 100% reliability. However, |
| *in most cases* this will work fine, because the blank lines in |
| HTML are usually followed by HTML block tags. For example: |
| |
| . |
| <table> |
| |
| <tr> |
| |
| <td> |
| Hi |
| </td> |
| |
| </tr> |
| |
| </table> |
| . |
| <table> |
| <tr> |
| <td> |
| Hi |
| </td> |
| </tr> |
| </table> |
| . |
| |
| There are problems, however, if the inner tags are indented |
| *and* separated by blank lines, as then they will be interpreted as |
| an indented code block: |
| |
| . |
| <table> |
| |
| <tr> |
| |
| <td> |
| Hi |
| </td> |
| |
| </tr> |
| |
| </table> |
| . |
| <table> |
| <tr> |
| <pre><code><td> |
| Hi |
| </td> |
| </code></pre> |
| </tr> |
| </table> |
| . |
| |
| Fortunately, blank lines are usually not necessary and can be |
| deleted. The exception is inside `<pre>` tags, but as described |
| above, raw HTML blocks starting with `<pre>` *can* contain blank |
| lines. |
| |
| ## Link reference definitions |
| |
| A [link reference definition](@link-reference-definition) |
| consists of a [link label], indented up to three spaces, followed |
| by a colon (`:`), optional [whitespace] (including up to one |
| [line ending]), a [link destination], |
| optional [whitespace] (including up to one |
| [line ending]), and an optional [link |
| title], which if it is present must be separated |
| from the [link destination] by [whitespace]. |
| No further [non-whitespace character]s may occur on the line. |
| |
| A [link reference definition] |
| does not correspond to a structural element of a document. Instead, it |
| defines a label which can be used in [reference link]s |
| and reference-style [images] elsewhere in the document. [Link |
| reference definitions] can come either before or after the links that use |
| them. |
| |
| . |
| [foo]: /url "title" |
| |
| [foo] |
| . |
| <p><a href="/url" title="title">foo</a></p> |
| . |
| |
| . |
| [foo]: |
| /url |
| 'the title' |
| |
| [foo] |
| . |
| <p><a href="/url" title="the title">foo</a></p> |
| . |
| |
| . |
| [Foo*bar\]]:my_(url) 'title (with parens)' |
| |
| [Foo*bar\]] |
| . |
| <p><a href="my_(url)" title="title (with parens)">Foo*bar]</a></p> |
| . |
| |
| . |
| [Foo bar]: |
| <my url> |
| 'title' |
| |
| [Foo bar] |
| . |
| <p><a href="my%20url" title="title">Foo bar</a></p> |
| . |
| |
| The title may extend over multiple lines: |
| |
| . |
| [foo]: /url ' |
| title |
| line1 |
| line2 |
| ' |
| |
| [foo] |
| . |
| <p><a href="/url" title=" |
| title |
| line1 |
| line2 |
| ">foo</a></p> |
| . |
| |
| However, it may not contain a [blank line]: |
| |
| . |
| [foo]: /url 'title |
| |
| with blank line' |
| |
| [foo] |
| . |
| <p>[foo]: /url 'title</p> |
| <p>with blank line'</p> |
| <p>[foo]</p> |
| . |
| |
| The title may be omitted: |
| |
| . |
| [foo]: |
| /url |
| |
| [foo] |
| . |
| <p><a href="/url">foo</a></p> |
| . |
| |
| The link destination may not be omitted: |
| |
| . |
| [foo]: |
| |
| [foo] |
| . |
| <p>[foo]:</p> |
| <p>[foo]</p> |
| . |
| |
| Both title and destination can contain backslash escapes |
| and literal backslashes: |
| |
| . |
| [foo]: /url\bar\*baz "foo\"bar\baz" |
| |
| [foo] |
| . |
| <p><a href="/url%5Cbar*baz" title="foo"bar\baz">foo</a></p> |
| . |
| |
| A link can come before its corresponding definition: |
| |
| . |
| [foo] |
| |
| [foo]: url |
| . |
| <p><a href="url">foo</a></p> |
| . |
| |
| If there are several matching definitions, the first one takes |
| precedence: |
| |
| . |
| [foo] |
| |
| [foo]: first |
| [foo]: second |
| . |
| <p><a href="first">foo</a></p> |
| . |
| |
| As noted in the section on [Links], matching of labels is |
| case-insensitive (see [matches]). |
| |
| . |
| [FOO]: /url |
| |
| [Foo] |
| . |
| <p><a href="/url">Foo</a></p> |
| . |
| |
| . |
| [ΑΓΩ]: /φου |
| |
| [αγω] |
| . |
| <p><a href="/%CF%86%CE%BF%CF%85">αγω</a></p> |
| . |
| |
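| Label matching and the first-definition-wins rule might be sketched |
| like this. The function names are hypothetical, and the normalization |
| assumes the matching rule from the section on [Links] (Unicode case |
| fold, with internal whitespace collapsed to a single space). |
| |
```python
def normalize_label(label):
    """Normalize a link label for matching: trim it, collapse internal
    whitespace (including line endings) to one space, and case-fold."""
    return " ".join(label.split()).casefold()

def collect_definitions(pairs):
    """Build a lookup table from (label, destination) pairs given in
    document order; the first definition of a label takes precedence."""
    definitions = {}
    for label, destination in pairs:
        # setdefault keeps an existing entry, so later duplicates are ignored.
        definitions.setdefault(normalize_label(label), destination)
    return definitions
```
| |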
| Here is a link reference definition with no corresponding link. |
| It contributes nothing to the document. |
| |
| . |
| [foo]: /url |
| . |
| . |
| |
| Here is another one: |
| |
| . |
| [ |
| foo |
| ]: /url |
| bar |
| . |
| <p>bar</p> |
| . |
| |
| This is not a link reference definition, because there are |
| [non-whitespace character]s after the title: |
| |
| . |
| [foo]: /url "title" ok |
| . |
| <p>[foo]: /url "title" ok</p> |
| . |
| |
| This is a link reference definition, but it has no title: |
| |
| . |
| [foo]: /url |
| "title" ok |
| . |
| <p>"title" ok</p> |
| . |
| |
| This is not a link reference definition, because it is indented |
| four spaces: |
| |
| . |
| [foo]: /url "title" |
| |
| [foo] |
| . |
| <pre><code>[foo]: /url "title" |
| </code></pre> |
| <p>[foo]</p> |
| . |
| |
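| The single-line cases above can be approximated with a regular |
| expression. This is a deliberately simplified sketch: the names |
| `SIMPLE_DEFINITION` and `parse_simple_definition` are hypothetical, |
| and it does not handle definitions spread over several lines, escaped |
| brackets in labels, `<...>` destinations, backslash escapes, or |
| titles in parentheses. |
| |
```python
import re

SIMPLE_DEFINITION = re.compile(
    r"^ {0,3}"                          # at most three spaces of indentation
    r"\[([^\[\]]+)\]:"                  # label (no brackets here), then a colon
    r"[ \t]*(\S+)"                      # destination
    r"(?:[ \t]+(\"[^\"]*\"|'[^']*'))?"  # optional quoted title
    r"[ \t]*$"                          # nothing else may follow on the line
)

def parse_simple_definition(line):
    """Return (label, destination, title) for a one-line definition,
    or None if `line` is not one. `title` is None when omitted."""
    m = SIMPLE_DEFINITION.match(line)
    if m is None:
        return None
    label, destination, title = m.groups()
    if title is not None:
        title = title[1:-1]  # drop the surrounding quotes
    return label, destination, title
```
| |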
| This is not a link reference definition, because it occurs inside |
| a code block: |
| |
| . |
| ``` |
| [foo]: /url |
| ``` |
| |
| [foo] |
| . |
| <pre><code>[foo]: /url |
| </code></pre> |
| <p>[foo]</p> |
| . |
| |
| A [link reference definition] cannot interrupt a paragraph. |
| |
| . |
| Foo |
| [bar]: /baz |
| |
| [bar] |
| . |
| <p>Foo |
| [bar]: /baz</p> |
| <p>[bar]</p> |
| . |
| |
| However, it can directly follow other block elements, such as headers |
| and horizontal rules, and it need not be followed by a blank line. |
| |
| . |
| # [Foo] |
| [foo]: /url |
| > bar |
| . |
| <h1><a href="/url">Foo</a></h1> |
| <blockquote> |
| <p>bar</p> |
| </blockquote> |
| . |
| |
| Several [link reference definition]s |
| can occur one after another, without intervening blank lines. |
| |
| . |
| [foo]: /foo-url "foo" |
| [bar]: /bar-url |
| "bar" |
| [baz]: /baz-url |
| |
| [foo], |
| [bar], |
| [baz] |
| . |
| <p><a href="/foo-url" title="foo">foo</a>, |
| <a href="/bar-url" title="bar">bar</a>, |
| <a href="/baz-url">baz</a></p> |
| . |
| |
| [Link reference definition]s can occur |
| inside block containers, like lists and block quotations. They |
| affect the entire document, not just the container in which they |
| are defined: |
| |
| . |
| [foo] |
| |
| > [foo]: /url |
| . |
| <p><a href="/url">foo</a></p> |
| <blockquote> |
| </blockquote> |
| . |
| |
| |
| ## Paragraphs |
| |
| A sequence of non-blank lines that cannot be interpreted as other |
| kinds of blocks forms a [paragraph](@paragraph). |
| The contents of the paragraph are the result of parsing the |
| paragraph's raw content as inlines. The paragraph's raw content |
| is formed by concatenating the lines and removing initial and final |
| [whitespace]. |
| |
| A simple example with two paragraphs: |
| |
| . |
| aaa |
| |
| bbb |
| . |
| <p>aaa</p> |
| <p>bbb</p> |
| . |
| |
| Paragraphs can contain multiple lines, but no blank lines: |
| |
| . |
| aaa |
| bbb |
| |
| ccc |
| ddd |
| . |
| <p>aaa |
| bbb</p> |
| <p>ccc |
| ddd</p> |
| . |
| |
| Multiple blank lines between paragraphs have no effect: |
| |
| . |
| aaa |
| |
| |
| bbb |
| . |
| <p>aaa</p> |
| <p>bbb</p> |
| . |
| |
| Leading spaces are skipped: |
| |
| . |
| aaa |
| bbb |
| . |
| <p>aaa |
| bbb</p> |
| . |
| |
| Lines after the first may be indented any amount, since indented |
| code blocks cannot interrupt paragraphs. |
| |
| . |
| aaa |
| bbb |
| ccc |
| . |
| <p>aaa |
| bbb |
| ccc</p> |
| . |
| |
| However, the first line may be indented at most three spaces, |
| or an indented code block will be triggered: |
| |
| . |
| aaa |
| bbb |
| . |
| <p>aaa |
| bbb</p> |
| . |
| |
| . |
| aaa |
| bbb |
| . |
| <pre><code>aaa |
| </code></pre> |
| <p>bbb</p> |
| . |
| |
| Final spaces are stripped before inline parsing, so a paragraph |
| that ends with two or more spaces will not end with a [hard line |
| break]: |
| |
| . |
| aaa |
| bbb |
| . |
| <p>aaa<br /> |
| bbb</p> |
| . |
| |
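| How a paragraph's raw content is assembled can be sketched as |
| follows (the function name `paragraph_raw_content` is hypothetical). |
| Leading whitespace is dropped from every line, while trailing |
| whitespace is removed only at the very end of the paragraph, so |
| spaces before an interior line ending survive and can still form a |
| [hard line break]. |
| |
```python
def paragraph_raw_content(lines):
    """Join a paragraph's lines into the raw content handed to the
    inline parser. `lines` must not include line endings."""
    # Leading spaces on each line are skipped.
    joined = "\n".join(line.lstrip(" \t") for line in lines)
    # Only the paragraph's final whitespace is removed.
    return joined.rstrip(" \t")
```
| |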
| ## Blank lines |
| |
| [Blank line]s between block-level elements are ignored, |
| except for the role they play in determining whether a [list] |
| is [tight] or [loose]. |
| |
| Blank lines at the beginning and end of the document are also ignored. |
| |
| . |
| |
| |
| aaa |
| |
| |
| # aaa |
| |
| |
| . |
| <p>aaa</p> |
| <h1>aaa</h1> |
| . |
| |
| |
| # Container blocks |
| |
| A [container block] is a block that has other |
| blocks as its contents. There are two basic kinds of container blocks: |
| [block quotes] and [list items]. |
| [Lists] are meta-containers for [list items]. |
| |
| We define the syntax for container blocks recursively. The general |
| form of the definition is: |
| |
| > If X is a sequence of blocks, then the result of |
| > transforming X in such-and-such a way is a container of type Y |
| > with these blocks as its content. |
| |
| So, we explain what counts as a block quote or list item by explaining |
| how these can be *generated* from their contents. This should suffice |
| to define the syntax, although it does not give a recipe for *parsing* |
| these constructions. (A recipe is provided below in the section entitled |
| [A parsing strategy](#appendix-a-parsing-strategy).) |
| |
| ## Block quotes |
| |
| A [block quote marker](@block-quote-marker) |
| consists of 0-3 spaces of initial indent, plus (a) the character `>` together |
| with a following space, or (b) a single character `>` not followed by a space. |
| |
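| As an illustrative sketch (the names `BLOCK_QUOTE_MARKER` and |
| `strip_block_quote_marker` are hypothetical), a block quote marker |
| can be removed from the start of a line like this: |
| |
```python
import re

# 0-3 spaces of indent, then `>`, then at most one following space.
BLOCK_QUOTE_MARKER = re.compile(r"^ {0,3}> ?")

def strip_block_quote_marker(line):
    """Return `line` without its leading block quote marker, or None
    if the line does not begin with one."""
    m = BLOCK_QUOTE_MARKER.match(line)
    if m is None:
        return None
    return line[m.end():]
```
| |
| Because the marker absorbs one space after the `>`, a line needs five |
| spaces after the `>` for its remainder to keep the four spaces of an |
| indented code block. |
| |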
| The following rules define [block quotes]: |
| |
| 1. **Basic case.** If a string of lines *Ls* constitute a sequence |
| of blocks *Bs*, then the result of prepending a [block quote |
| marker] to the beginning of each line in *Ls* |
| is a [block quote](#block-quotes) containing *Bs*. |
| |
| 2. **Laziness.** If a string of lines *Ls* constitute a [block |
| quote](#block-quotes) with contents *Bs*, then the result of deleting |
| the initial [block quote marker] from one or |
| more lines in which the next [non-whitespace character] after the [block |
| quote marker] is [paragraph continuation |
| text] is a block quote with *Bs* as its content. |
| [Paragraph continuation text](@paragraph-continuation-text) is text |
| that will be parsed as part of the content of a paragraph, but does |
| not occur at the beginning of the paragraph. |
| |
| 3. **Consecutiveness.** A document cannot contain two [block |
| quotes] in a row unless there is a [blank line] between them. |
| |
| Nothing else counts as a [block quote](#block-quotes). |
| |
| Here is a simple example: |
| |
| . |
| > # Foo |
| > bar |
| > baz |
| . |
| <blockquote> |
| <h1>Foo</h1> |
| <p>bar |
| baz</p> |
| </blockquote> |
| . |
| |
| The spaces after the `>` characters can be omitted: |
| |
| . |
| ># Foo |
| >bar |
| > baz |
| . |
| <blockquote> |
| <h1>Foo</h1> |
| <p>bar |
| baz</p> |
| </blockquote> |
| . |
| |
| The `>` characters can be indented 1-3 spaces: |
| |
| . |
| > # Foo |
| > bar |
| > baz |
| . |
| <blockquote> |
| <h1>Foo</h1> |
| <p>bar |
| baz</p> |
| </blockquote> |
| . |
| |
| Four spaces gives us a code block: |
| |
| . |
| > # Foo |
| > bar |
| > baz |
| . |
| <pre><code>> # Foo |
| > bar |
| > baz |
| </code></pre> |
| . |
| |
| The Laziness clause allows us to omit the `>` before a |
| paragraph continuation line: |
| |
| . |
| > # Foo |
| > bar |
| baz |
| . |
| <blockquote> |
| <h1>Foo</h1> |
| <p>bar |
| baz</p> |
| </blockquote> |
| . |
| |
| A block quote can contain some lazy and some non-lazy |
| continuation lines: |
| |
| . |
| > bar |
| baz |
| > foo |
| . |
| <blockquote> |
| <p>bar |
| baz |
| foo</p> |
| </blockquote> |
| . |
| |
| Laziness only applies to lines that would have been continuations of |
| paragraphs had they been prepended with [block quote marker]s. |
| For example, the `> ` cannot be omitted in the second line of |
| |
| ``` markdown |
| > foo |
| > --- |
| ``` |
| |
| without changing the meaning: |
| |
| . |
| > foo |
| --- |
| . |
| <blockquote> |
| <p>foo</p> |
| </blockquote> |
| <hr /> |
| . |
| |
| Similarly, if we omit the `> ` in the second line of |
| |
| ``` markdown |
| > - foo |
| > - bar |
| ``` |
| |
| then the block quote ends after the first line: |
| |
| . |
| > - foo |
| - bar |
| . |
| <blockquote> |
| <ul> |
| <li>foo</li> |
| </ul> |
| </blockquote> |
| <ul> |
| <li>bar</li> |
| </ul> |
| . |
| |
| For the same reason, we can't omit the `> ` in front of |
| subsequent lines of an indented or fenced code block: |
| |
| . |
| > foo |
| bar |
| . |
| <blockquote> |
| <pre><code>foo |
| </code></pre> |
| </blockquote> |
| <pre><code>bar |
| </code></pre> |
| . |
| |
| . |
| > ``` |
| foo |
| ``` |
| . |
| <blockquote> |
| <pre><code></code></pre> |
| </blockquote> |
| <p>foo</p> |
| <pre><code></code></pre> |
| . |
| |
| Note that in the following case, we have a paragraph |
| continuation line: |
| |
| . |
| > foo |
| - bar |
| . |
| <blockquote> |
| <p>foo |
| - bar</p> |
| </blockquote> |
| . |
| |
| To see why, note that in |
| |
| ```markdown |
| > foo |
| > - bar |
| ``` |
| |
| the `- bar` is indented too far to start a list, and can't |
| be an indented code block because indented code blocks cannot |
| interrupt paragraphs, so it is a [paragraph continuation line]. |
| |
| A block quote can be empty: |
| |
| . |
| > |
| . |
| <blockquote> |
| </blockquote> |
| . |
| |
| . |
| > |
| > |
| > |
| . |
| <blockquote> |
| </blockquote> |
| . |
| |
| A block quote can have initial or final blank lines: |
| |
| . |
| > |
| > foo |
| > |
| . |
| <blockquote> |
| <p>foo</p> |
| </blockquote> |
| . |
| |
| A blank line always separates block quotes: |
| |
| . |
| > foo |
| |
| > bar |
| . |
| <blockquote> |
| <p>foo</p> |
| </blockquote> |
| <blockquote> |
| <p>bar</p> |
| </blockquote> |
| . |
| |
| (Most current Markdown implementations, including John Gruber's |
| original `Markdown.pl`, will parse this example as a single block quote |
| with two paragraphs. But it seems better to allow the author to decide |
| whether two block quotes or one are wanted.) |
| |
| Consecutiveness means that if we put these block quotes together, |
| we get a single block quote: |
| |
| . |
| > foo |
| > bar |
| . |
| <blockquote> |
| <p>foo |
| bar</p> |
| </blockquote> |
| . |
| |
| To get a block quote with two paragraphs, use: |
| |
| . |
| > foo |
| > |
| > bar |
| . |
| <blockquote> |
| <p>foo</p> |
| <p>bar</p> |
| </blockquote> |
| . |
| |
| Block quotes can interrupt paragraphs: |
| |
| . |
| foo |
| > bar |
| . |
| <p>foo</p> |
| <blockquote> |
| <p>bar</p> |
| </blockquote> |
| . |
| |
| In general, blank lines are not needed before or after block |
| quotes: |
| |
| . |
| > aaa |
| *** |
| > bbb |
| . |
| <blockquote> |
| <p>aaa</p> |
| </blockquote> |
| <hr /> |
| <blockquote> |
| <p>bbb</p> |
| </blockquote> |
| . |
| |
| However, because of laziness, a blank line is needed between |
| a block quote and a following paragraph: |
| |
| . |
| > bar |
| baz |
| . |
| <blockquote> |
| <p>bar |
| baz</p> |
| </blockquote> |
| . |
| |
| . |
| > bar |
| |
| baz |
| . |
| <blockquote> |
| <p>bar</p> |
| </blockquote> |
| <p>baz</p> |
| . |
| |
| . |
| > bar |
| > |
| baz |
| . |
| <blockquote> |
| <p>bar</p> |
| </blockquote> |
| <p>baz</p> |
| . |
| |
| It is a consequence of the Laziness rule that any number |
| of initial `>`s may be omitted on a continuation line of a |
| nested block quote: |
| |
| . |
| > > > foo |
| bar |
| . |
| <blockquote> |
| <blockquote> |
| <blockquote> |
| <p>foo |
| bar</p> |
| </blockquote> |
| </blockquote> |
| </blockquote> |
| . |
| |
| . |
| >>> foo |
| > bar |
| >>baz |
| . |
| <blockquote> |
| <blockquote> |
| <blockquote> |
| <p>foo |
| bar |
| baz</p> |
| </blockquote> |
| </blockquote> |
| </blockquote> |
| . |
| |
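| To make the nested case concrete, the following sketch counts how |
| many block quote markers a line actually carries (the names `MARKER` |
| and `strip_all_markers` are hypothetical). For the three lines of the |
| last example it reports depths 3, 1, and 2, yet all three lines end |
| up in the same paragraph at depth 3, because the missing markers are |
| supplied by laziness. |
| |
```python
import re

# One block quote marker: 0-3 spaces, `>`, and at most one space.
MARKER = re.compile(r" {0,3}> ?")

def strip_all_markers(line):
    """Return (depth, rest): the number of leading block quote markers
    on `line` and the text that remains after removing them."""
    depth = 0
    while True:
        m = MARKER.match(line)
        if m is None:
            return depth, line
        depth += 1
        line = line[m.end():]
```
| |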
| When including an indented code block in a block quote, |
| remember that the [block quote marker] includes |
| both the `>` and a following space. So *five spaces* are needed after |
| the `>`: |
| |
| . |
| > code |
| |
| > not code |
| . |
| <blockquote> |
| <pre><code>code |
| </code></pre> |
| </blockquote> |
| <blockquote> |
| <p>not code</p> |
| </blockquote> |
| . |
| |
| |
| ## List items |
| |
| A [list marker](@list-marker) is a |
| [bullet list marker] or an [ordered list marker]. |
| |
| A [bullet list marker](@bullet-list-marker) |
| is a `-`, `+`, or `*` character. |
| |
| An [ordered list marker](@ordered-list-marker) |
| is a sequence of 1--9 arabic digits (`0-9`), followed by either a |
| `.` character or a `)` character. (The reason for the length |
| limit is that with 10 digits we start seeing integer overflows |
| in some browsers.) |
| |
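| These definitions translate fairly directly into a regular |
| expression. The sketch below uses hypothetical names (`LIST_MARKER`, |
| `parse_list_marker`) and covers only a marker followed by at least |
| one space and some content; it ignores empty list items and does not |
| check whether the line should instead be read as another kind of |
| block, such as a horizontal rule. |
| |
```python
import re

LIST_MARKER = re.compile(
    r"^ {0,3}"                        # up to three spaces of indentation
    r"(?:([-+*])|(\d{1,9})([.)]))"    # bullet, or 1-9 digits plus `.` or `)`
    r" +(?=\S)"                       # at least one space, then content
)

def parse_list_marker(line):
    """Return ("bullet", None) or ("ordered", start_number) if `line`
    begins with a list marker followed by content, else None."""
    m = LIST_MARKER.match(line)
    if m is None:
        return None
    bullet, digits, _delimiter = m.groups()
    if bullet is not None:
        return "bullet", None
    return "ordered", int(digits)
```
| |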
| The following rules define [list items]: |
| |
| 1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of |
| blocks *Bs* starting with a [non-whitespace character] and not separated |
| from each other by more than one blank line, and *M* is a list |
| marker of width *W* followed by 0 < *N* < 5 spaces, then the result |
| of prepending *M* and the following spaces to the first line of |
| *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a |
| list item with *Bs* as its contents. The type of the list item |
| (bullet or ordered) is determined by the type of its list marker. |
| If the list item is ordered, then it is also assigned a start |
| number, based on the ordered list marker. |
| |
| For example, let *Ls* be the lines |
| |
| . |
| A paragraph |
| with two lines. |
| |
| indented code |
| |
| > A block quote. |
| . |
| <p>A paragraph |
| with two lines.</p> |
| <pre><code>indented code |
| </code></pre> |
| <blockquote> |
| <p>A block quote.</p> |
| </blockquote> |
| . |
| |
| And let *M* be the marker `1.`, and *N* = 2. Then rule #1 says |
| that the following is an ordered list item with start number 1, |
| and the same contents as *Ls*: |
| |
| . |
| 1. A paragraph |
| with two lines. |
| |
| indented code |
| |
| > A block quote. |
| . |
| <ol> |
| <li> |
| <p>A paragraph |
| with two lines.</p> |
| <pre><code>indented code |
| </code></pre> |
| <blockquote> |
| <p>A block quote.</p> |
| </blockquote> |
| </li> |
| </ol> |
| . |
| |
| The most important thing to notice is that the position of |
| the text after the list marker determines how much indentation |
| is needed in subsequent blocks in the list item. If the list |
| marker takes up two spaces, and there are three spaces between |
| the list marker and the next [non-whitespace character], then blocks |
| must be indented five spaces in order to fall under the list |
| item. |
| |
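| The arithmetic behind this can be sketched as follows. The names |
| `ITEM_START`, `content_offset`, and `belongs_to_item` are |
| hypothetical, and the sketch covers only the basic case of 1 to 4 |
| spaces after the marker. It also adds any indentation of the marker |
| itself to *W + N*, and it ignores blank lines and lazy continuation |
| lines. |
| |
```python
import re

# Optional indentation, a list marker, then 1-4 spaces before content.
ITEM_START = re.compile(r"^( {0,3})([-+*]|\d{1,9}[.)])( {1,4})(?=\S)")

def content_offset(first_line):
    """Return how many spaces later lines must be indented by to fall
    under the list item opened by `first_line`, or None."""
    m = ITEM_START.match(first_line)
    if m is None:
        return None
    indent, marker, spaces = m.groups()
    # marker width W plus following spaces N, shifted by the marker's indent
    return len(indent) + len(marker) + len(spaces)

def belongs_to_item(line, offset):
    """Return True if the non-blank `line` is indented far enough."""
    stripped = line.lstrip(" ")
    return stripped != "" and len(line) - len(stripped) >= offset
```
| |
| For a two-character marker followed by three spaces, as in the |
| paragraph above, `content_offset` yields five. |
| |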
| Here are some examples showing how far content must be indented to be |
| put under the list item: |
| |
| . |
| - one |
| |
| two |
| . |
| <ul> |
| <li>one</li> |
| </ul> |
| <p>two</p> |
| . |
| |
| . |
| - one |
| |
| two |
| . |
| <ul> |
| <li> |
| <p>one</p> |
| <p>two</p> |
| </li> |
| </ul> |
| . |
| |
| . |
| - one |
| |
| two |
| . |
| <ul> |
| <li>one</li> |
| </ul> |
| <pre><code> two |
| </code></pre> |
| . |
| |
| . |
| - one |
| |
| two |
| . |
| <ul> |
| <li> |
| <p>one</p> |
| <p>two</p> |
| </li> |
| </ul> |
| . |
| |
| It is tempting to think of this in terms of columns: the continuation |
| blocks must be indented at least to the column of the first |
| [non-whitespace character] after the list marker. However, that is not quite right. |
| The spaces after the list marker determine how much relative indentation |
| is needed. Which column this indentation reaches will depend on |
| how the list item is embedded in other constructions, as shown by |
| this example: |
| |
| . |
| > > 1. one |
| >> |
| >> two |
| . |
| <blockquote> |
| <blockquote> |
| <ol> |
| <li> |
| <p>one</p> |
| <p>two</p> |
| </li> |
| </ol> |
| </blockquote> |
| </blockquote> |
| . |
| |
| Here `two` occurs in the same column as the list marker `1.`, |
| but is actually contained in the list item, because there is |
| sufficient indentation after the last containing blockquote marker. |
| |
| The converse is also possible. In the following example, the word `two` |
| occurs far to the right of the initial text of the list item, `one`, but |
| it is not considered part of the list item, because it is not indented |
| far enough past the blockquote marker: |
| |
| . |
| >>- one |
| >> |
| > > two |
| . |
| <blockquote> |
| <blockquote> |
| <ul> |
| <li>one</li> |
| </ul> |
| <p>two</p> |
| </blockquote> |
| </blockquote> |
| . |
| |
| Note that at least one space is needed between the list marker and |
| any following content, so these are not list items: |
| |
| . |
| -one |
| |
| 2.two |
| . |
| <p>-one</p> |
| <p>2.two</p> |
| . |
| |
| A list item may not contain blocks that are separated by more than |
| one blank line. Thus, two blank lines will end a list, unless the |
| two blanks are contained in a [fenced code block]. |
| |
| . |
| - foo |
| |
| bar |
| |
| - foo |
| |
| |
| bar |
| |
| - ``` |
| foo |
| |
| |
| bar |
| ``` |
| |
| - baz |
| |
| + ``` |
| foo |
| |
| |
| bar |
| ``` |
| . |
| <ul> |
| <li> |
| <p>foo</p> |
| <p>bar</p> |
| </li> |
| <li> |
| <p>foo</p> |
| </li> |
| </ul> |
| <p>bar</p> |
| <ul> |
| <li> |
| <pre><code>foo |
| |
| |
| bar |
| </code></pre> |
| </li> |
| <li> |
| <p>baz</p> |
| <ul> |
| <li> |
| <pre><code>foo |
| |
| |
| bar |
| </code></pre> |
| </li> |
| </ul> |
| </li> |
| </ul> |
| . |
| |
| A list item may contain any kind of block: |
| |
| . |
| 1. foo |
| |
| ``` |
| bar |
| ``` |
| |
| baz |
| |
| > bam |
| . |
| <ol> |
| <li> |
| <p>foo</p> |
| <pre><code>bar |
| </code></pre> |
| <p>baz</p> |
| <blockquote> |
| <p>bam</p> |
| </blockquote> |
| </li> |
| </ol> |
| . |
| |
| Note that ordered list start numbers must be nine digits or fewer: |
| |
| . |
| 123456789. ok |
| . |
| <ol start="123456789"> |
| <li>ok</li> |
| </ol> |
| . |
| |
| . |
| 1234567890. not ok |
| . |
| <p>1234567890. not ok</p> |
| . |
| |
| A start number may begin with 0s: |
| |
| . |
| 0. ok |
| . |
| <ol start="0"> |
| <li>ok</li> |
| </ol> |
| . |
| |
| . |
| 003. ok |
| . |
| <ol start="3"> |
| <li>ok</li> |
| </ol> |
| . |
| |
| A start number may not be negative: |
| |
| . |
| -1. not ok |
| . |
| <p>-1. not ok</p> |
| . |
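
The constraints on ordered list markers illustrated above can be sketched as a single pattern. This is a simplified recognizer, not a parser: the names `ORDERED_MARKER` and `parse_ordered_marker` are hypothetical, tabs are not handled, and whether a recognized marker actually starts a list item still depends on the surrounding context. The allowance of up to three leading spaces anticipates rule #4.

```python
import re

# 0-3 spaces of indentation, 1-9 digits, a "." or ")" delimiter,
# then either a space or the end of the line (an empty item).
ORDERED_MARKER = re.compile(r"^ {0,3}(\d{1,9})([.)])(?: |$)")


def parse_ordered_marker(line: str):
    """Return (start_number, delimiter), or None if the line does not
    begin with an ordered list marker."""
    match = ORDERED_MARKER.match(line)
    if match is None:
        return None
    # int() discards leading zeros, so "003." yields start number 3.
    return int(match.group(1)), match.group(2)


assert parse_ordered_marker("123456789. ok") == (123456789, ".")
assert parse_ordered_marker("1234567890. not ok") is None  # ten digits
assert parse_ordered_marker("003. ok") == (3, ".")
assert parse_ordered_marker("-1. not ok") is None           # no sign allowed
assert parse_ordered_marker("2.two") is None                # space required
```

Note that `-1. not ok` is rejected as an ordered list marker only; this sketch does not try to decide how the line is otherwise parsed.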
| |
| |
| 2. **Item starting with indented code.** If a sequence of lines *Ls* |
| constitutes a sequence of blocks *Bs* starting with an indented code |
| block and not separated from each other by more than one blank line, |
| and *M* is a list marker of width *W* followed by |
| one space, then the result of prepending *M* and the following |
| space to the first line of *Ls*, and indenting subsequent lines of |
| *Ls* by *W + 1* spaces, is a list item with *Bs* as its contents. |
| If a line is empty, then it need not be indented. The type of the |
| list item (bullet or ordered) is determined by the type of its list |
| marker. If the list item is ordered, then it is also assigned a |
| start number, based on the ordered list marker. |
| |
| An indented code block will have to be indented four spaces beyond |
| the edge of the region where text will be included in the list item. |
| In the following case that is 6 spaces: |
| |
| . |
| - foo |
| |
| bar |
| . |
| <ul> |
| <li> |
| <p>foo</p> |
| <pre><code>bar |
| </code></pre> |
| </li> |
| </ul> |
| . |
| |
| And in this case it is 11 spaces: |
| |
| . |
| 10. foo |
| |
| bar |
| . |
| <ol start="10"> |
| <li> |
| <p>foo</p> |
| <pre><code>bar |
| </code></pre> |
| </li> |
| </ol> |
| . |
| |
| If the *first* block in the list item is an indented code block, |
| then by rule #2, the contents must be indented *one* space after the |
| list marker: |
| |
| . |
| indented code |
| |
| paragraph |
| |
| more code |
| . |
| <pre><code>indented code |
| </code></pre> |
| <p>paragraph</p> |
| <pre><code>more code |
| </code></pre> |
| . |
| |
| . |
| 1. indented code |
| |
| paragraph |
| |
| more code |
| . |
| <ol> |
| <li> |
| <pre><code>indented code |
| </code></pre> |
| <p>paragraph</p> |
| <pre><code>more code |
| </code></pre> |
| </li> |
| </ol> |
| . |
| |
| Note that an additional space indent is interpreted as space |
| inside the code block: |
| |
| . |
| 1. indented code |
| |
| paragraph |
| |
| more code |
| . |
| <ol> |
| <li> |
| <pre><code> indented code |
| </code></pre> |
| <p>paragraph</p> |
| <pre><code>more code |
| </code></pre> |
| </li> |
| </ol> |
| . |
| |
| Note that rules #1 and #2 only apply to two cases: (a) cases |
| in which the lines to be included in a list item begin with a |
| [non-whitespace character], and (b) cases in which |
| they begin with an indented code |
| block. In a case like the following, where the first block begins with |
| a three-space indent, the rules do not allow us to form a list item by |
| indenting the whole thing and prepending a list marker: |
| |
| . |
| foo |
| |
| bar |
| . |
| <p>foo</p> |
| <p>bar</p> |
| . |
| |
| . |
| - foo |
| |
| bar |
| . |
| <ul> |
| <li>foo</li> |
| </ul> |
| <p>bar</p> |
| . |
| |
| This is not a significant restriction, because when a block begins |
| with 1-3 spaces indent, the indentation can always be removed without |
| a change in interpretation, allowing rule #1 to be applied. So, in |
| the above case: |
| |
| . |
| - foo |
| |
| bar |
| . |
| <ul> |
| <li> |
| <p>foo</p> |
| <p>bar</p> |
| </li> |
| </ul> |
| . |
| |
| 3. **Item starting with a blank line.** If a sequence of lines *Ls* |
| starting with a single [blank line] constitutes a (possibly empty) |
| sequence of blocks *Bs*, not separated from each other by more than |
| one blank line, and *M* is a list marker of width *W*, |
| then the result of prepending *M* to the first line of *Ls*, and |
| indenting subsequent lines of *Ls* by *W + 1* spaces, is a list |
| item with *Bs* as its contents. |
| If a line is empty, then it need not be indented. The type of the |
| list item (bullet or ordered) is determined by the type of its list |
| marker. If the list item is ordered, then it is also assigned a |
| start number, based on the ordered list marker. |
| |
| Here are some list items that start with a blank line but are not empty: |
| |
| . |
| - |
| foo |
| - |
| ``` |
| bar |
| ``` |
| - |
| baz |
| . |
| <ul> |
| <li>foo</li> |
| <li> |
| <pre><code>bar |
| </code></pre> |
| </li> |
| <li> |
| <pre><code>baz |
| </code></pre> |
| </li> |
| </ul> |
| . |
| |
| A list item can begin with at most one blank line. |
| In the following example, `foo` is not part of the list |
| item: |
| |
| . |
| - |
| |
| foo |
| . |
| <ul> |
| <li></li> |
| </ul> |
| <p>foo</p> |
| . |
| |
| Here is an empty bullet list item: |
| |
| . |
| - foo |
| - |
| - bar |
| . |
| <ul> |
| <li>foo</li> |
| <li></li> |
| <li>bar</li> |
| </ul> |
| . |
| |
| It does not matter whether there are spaces following the [list marker]: |
| |
| . |
| - foo |
| - |
| - bar |
| . |
| <ul> |
| <li>foo</li> |
| <li></li> |
| <li>bar</li> |
| </ul> |
| . |
| |
| Here is an empty ordered list item: |
| |
| . |
| 1. foo |
| 2. |
| 3. bar |
| . |
| <ol> |
| <li>foo</li> |
| <li></li> |
| <li>bar</li> |
| </ol> |
| . |
| |
| A list may start or end with an empty list item: |
| |
| . |
| * |
| . |
| <ul> |
| <li></li> |
| </ul> |
| . |
| |
| |
| 4. **Indentation.** If a sequence of lines *Ls* constitutes a list item |
| according to rule #1, #2, or #3, then the result of indenting each line |
| of *Ls* by 1-3 spaces (the same for each line) also constitutes a |
| list item with the same contents and attributes. If a line is |
| empty, then it need not be indented. |
| |
| Indented one space: |
| |
| . |
| 1. A paragraph |
| with two lines. |
| |
| indented code |
| |
| > A block quote. |
| . |
| <ol> |
| <li> |
| <p>A paragraph |
| with two lines.</p> |
| <pre><code>indented code |
| </code></pre> |
| <blockquote> |
| <p>A block quote.</p> |
| </blockquote> |
| </li> |
| </ol> |
| . |
| |
| Indented two spaces: |
| |
| . |
| 1. A paragraph |
| with two lines. |
| |
| indented code |
| |
| > A block quote. |
| . |
| <ol> |
| <li> |
| <p>A paragraph |
| with two lines.</p> |
| <pre><code>indented code |
| </code></pre> |
| <blockquote> |
| <p>A block quote.</p> |
| </blockquote> |
| </li> |
| </ol> |
| . |
| |
| Indented three spaces: |
| |
| . |
| 1. A paragraph |
| with two lines. |
| |
| indented code |
| |
| > A block quote. |
| . |
| <ol> |
| <li> |
| <p>A paragraph |
| with two lines.</p> |
| <pre><code>indented code |
| </code></pre> |
| <blockquote> |
| <p>A block quote.</p> |
| </blockquote> |
| </li> |
| </ol> |
| . |
| |
| Four spaces indent gives a code block: |
| |
| . |
| 1. A paragraph |
| with two lines. |
| |
| indented code |
| |
| > A block quote. |
| . |
| <pre><code>1. A paragraph |
| with two lines. |
| |
| indented code |
| |
| > A block quote. |
| </code></pre> |
| . |
| |
| |
| 5. **Laziness.** If a sequence of lines *Ls* constitutes a [list |
| item](#list-items) with contents *Bs*, then the result of deleting |
| some or all of the indentation from one or more lines in which the |
| next [non-whitespace character] after the indentation is |
| [paragraph continuation text] is a |
| list item with the same contents and attributes. The unindented |
| lines are called |
| [lazy continuation line](@lazy-continuation-line)s. |
| |
| Here is an example with [lazy continuation line]s: |
| |
| . |
| 1. A paragraph |
| with two lines. |
| |
| indented code |
| |
| > A block quote. |
| . |
| <ol> |
| <li> |
| <p>A paragraph |
| with two lines.</p> |
| <pre><code>indented code |
| </code></pre> |
| <blockquote> |
| <p>A block quote.</p> |
| </blockquote> |
| </li> |
| </ol> |
| . |
| |
| Indentation can be partially deleted: |
| |
| . |
| 1. A paragraph |
| with two lines. |
| . |
| <ol> |
| <li>A paragraph |
| with two lines.</li> |
| </ol> |
| . |
| |
| These examples show how laziness can work in nested structures: |
| |
| . |
| > 1. > Blockquote |
| continued here. |
| . |
| <blockquote> |
| <ol> |
| <li> |
| <blockquote> |
| <p>Blockquote |
| continued here.</p> |
| </blockquote> |
| </li> |
| </ol> |
| </blockquote> |
| . |
| |
| . |
| > 1. > Blockquote |
| > continued here. |
| . |
| <blockquote> |
| <ol> |
| <li> |
| <blockquote> |
| <p>Blockquote |
| continued here.</p> |
| </blockquote> |
| </li> |
| </ol> |
| </blockquote> |
| . |
| |
| |
| 6. **That's all.** Nothing that is not counted as a list item by rules |
| #1--5 counts as a [list item](#list-items). |
| |
| The rules for sublists follow from the general rules above. A sublist |
| must be indented the same number of spaces a paragraph would need to be |
| in order to be included in the list item. |
| |
| So, in this case we need two spaces indent: |
| |
| . |
| - foo |
| - bar |
| - baz |
| . |
| <ul> |
| <li>foo |
| <ul> |
| <li>bar |
| <ul> |
| <li>baz</li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| </ul> |
| . |
| |
| One is not enough: |
| |
| . |
| - foo |
| - bar |
| - baz |
| . |
| <ul> |
| <li>foo</li> |
| <li>bar</li> |
| <li>baz</li> |
| </ul> |
| . |
| |
| Here we need four, because the list marker is wider: |
| |
| . |
| 10) foo |
| - bar |
| . |
| <ol start="10"> |
| <li>foo |
| <ul> |
| <li>bar</li> |
| </ul> |
| </li> |
| </ol> |
| . |
| |
| Three is not enough: |
| |
| . |
| 10) foo |
| - bar |
| . |
| <ol start="10"> |
| <li>foo</li> |
| </ol> |
| <ul> |
| <li>bar</li> |
| </ul> |
| . |
| |
| A list may be the first block in a list item: |
| |
| . |
| - - foo |
| . |
| <ul> |
| <li> |
| <ul> |
| <li>foo</li> |
| </ul> |
| </li> |
| </ul> |
| . |
| |
| . |
| 1. - 2. foo |
| . |
| <ol> |
| <li> |
| <ul> |
| <li> |
| <ol start="2"> |
| <li>foo</li> |
| </ol> |
| </li> |
| </ul> |
| </li> |
| </ol> |
| . |
| |
| A list item can contain a header: |
| |
| . |
| - # Foo |
| - Bar |
| --- |
| baz |
| . |
| <ul> |
| <li> |
| <h1>Foo</h1> |
| </li> |
| <li> |
| <h2>Bar</h2> |
| baz</li> |
| </ul> |
| . |
| |
| ### Motivation |
| |
| John Gruber's Markdown spec says the following about list items: |
| |
| 1. "List markers typically start at the left margin, but may be indented |
| by up to three spaces. List markers must be followed by one or more |
| spaces or a tab." |
| |
| 2. "To make lists look nice, you can wrap items with hanging indents.... |
| But if you don't want to, you don't have to." |
| |
| 3. "List items may consist of multiple paragraphs. Each subsequent |
| paragraph in a list item must be indented by either 4 spaces or one |
| tab." |
| |
| 4. "It looks nice if you indent every line of the subsequent paragraphs, |
| but here again, Markdown will allow you to be lazy." |
| |
| 5. "To put a blockquote within a list item, the blockquote's `>` |
| delimiters need to be indented." |
| |
| 6. "To put a code block within a list item, the code block needs to be |
| indented twice — 8 spaces or two tabs." |
| |
| These rules specify that a paragraph under a list item must be indented |
| four spaces (presumably, from the left margin, rather than the start of |
| the list marker, but this is not said), and that code under a list item |
| must be indented eight spaces instead of the usual four. They also say |
| that a block quote must be indented, but not by how much; however, the |
| example given has four spaces indentation. Although nothing is said |
| about other kinds of block-level content, it is certainly reasonable to |
| infer that *all* block elements under a list item, including other |
| lists, must be indented four spaces. This principle has been called the |
| *four-space rule*. |
| |
| The four-space rule is clear and principled, and if the reference |
| implementation `Markdown.pl` had followed it, it probably would have |
| become the standard. However, `Markdown.pl` allowed paragraphs and |
| sublists to start with only two spaces indentation, at least on the |
| outer level. Worse, its behavior was inconsistent: a sublist of an |
| outer-level list needed two spaces indentation, but a sublist of this |
| sublist needed three spaces. It is not surprising, then, that different |
| implementations of Markdown have developed very different rules for |
| determining what comes under a list item. (Pandoc and python-Markdown, |
| for example, stuck with Gruber's syntax description and the four-space |
| rule, while discount, redcarpet, marked, PHP Markdown, and others |
| followed `Markdown.pl`'s behavior more closely.) |
| |
| Unfortunately, given the divergences between implementations, there |
| is no way to give a spec for list items that will be guaranteed not |
| to break any existing documents. However, the spec given here should |
| correctly handle lists formatted with either the four-space rule or |
| the more forgiving `Markdown.pl` behavior, provided they are laid out |
| in a way that is natural for a human to read. |
| |
| The strategy here is to let the width and indentation of the list marker |
| determine the indentation necessary for blocks to fall under the list |
| item, rather than having a fixed and arbitrary number. The writer can |
| think of the body of the list item as a unit which gets indented to the |
| right enough to fit the list marker (and any indentation on the list |
| marker). (The laziness rule, #5, then allows continuation lines to be |
| unindented if needed.) |
| |
| This rule is superior, we claim, to any rule requiring a fixed level of |
| indentation from the margin. The four-space rule is clear but |
| unnatural. It is quite unintuitive that |
| |
| ``` markdown |
| - foo |
| |
| bar |
| |
| - baz |
| ``` |
| |
| should be parsed as two lists with an intervening paragraph, |
| |
| ``` html |
| <ul> |
| <li>foo</li> |
| </ul> |
| <p>bar</p> |
| <ul> |
| <li>baz</li> |
| </ul> |
| ``` |
| |
| as the four-space rule demands, rather than a single list, |
| |
| ``` html |
| <ul> |
| <li> |
| <p>foo</p> |
| <p>bar</p> |
| <ul> |
| <li>baz</li> |
| </ul> |
| </li> |
| </ul> |
| ``` |
| |
| The choice of four spaces is arbitrary. It can be learned, but it is |
| not likely to be guessed, and it trips up beginners regularly. |
| |
| Would it help to adopt a two-space rule? The problem is that such |
| a rule, together with the rule allowing 1--3 spaces indentation of the |
| initial list marker, allows text that is indented *less than* the |
| original list marker to be included in the list item. For example, |
| `Markdown.pl` parses |
| |
| ``` markdown |
| - one |
| |
| two |
| ``` |
| |
| as a single list item, with `two` a continuation paragraph: |
| |
| ``` html |
| <ul> |
| <li> |
| <p>one</p> |
| <p>two</p> |
| </li> |
| </ul> |
| ``` |
| |
| and similarly |
| |
| ``` markdown |
| > - one |
| > |
| > two |
| ``` |
| |
| as |
| |
| ``` html |
| <blockquote> |
| <ul> |
| <li> |
| <p>one</p> |
| <p>two</p> |
| </li> |
| </ul> |
| </blockquote> |
| ``` |
| |
| This is extremely unintuitive. |
| |
| Rather than requiring a fixed indent from the margin, we could require |
| a fixed indent (say, two spaces, or even one space) from the list marker (which |
| may itself be indented). This proposal would remove the last anomaly |
| discussed. Unlike the spec presented above, it would count the following |
| as a list item with a subparagraph, even though the paragraph `bar` |
| is not indented as far as the first paragraph `foo`: |
| |
| ``` markdown |
| 10. foo |
| |
| bar |
| ``` |
| |
| Arguably this text does read like a list item with `bar` as a subparagraph, |
| which may count in favor of the proposal. However, under this proposal, indented |
| code would have to be indented six spaces after the list marker. And this |
| would break a lot of existing Markdown, which has the pattern: |
| |
| ``` markdown |
| 1. foo |
| |
| indented code |
| ``` |
| |
| where the code is indented eight spaces. The spec above, by contrast, will |
| parse this text as expected, since the code block's indentation is measured |
| from the beginning of `foo`. |
| |
| The one case that needs special treatment is a list item that *starts* |
| with indented code. How much indentation is required in that case, since |
| we don't have a "first paragraph" to measure from? Rule #2 simply stipulates |
| that in such cases, we require one space indentation from the list marker |
| (and then the normal four spaces for the indented code). This will match the |
| four-space rule in cases where the list marker plus its initial indentation |
| takes four spaces (a common case), but diverge in other cases. |
| |
| ## Lists |
| |
| A [list](@list) is a sequence of one or more |
| list items [of the same type]. The list items |
| may be separated by single [blank lines], but two |
| blank lines end all containing lists. |
| |
| Two list items are [of the same type](@of-the-same-type) |
| if they begin with a [list marker] of the same type. |
| Two list markers are of the |
| same type if (a) they are bullet list markers using the same character |
| (`-`, `+`, or `*`) or (b) they are ordered list numbers with the same |
| delimiter (either `.` or `)`). |
| |
| A list is an [ordered list](@ordered-list) |
| if its constituent list items begin with |
| [ordered list marker]s, and a |
| [bullet list](@bullet-list) if its constituent list |
| items begin with [bullet list marker]s. |
| |
| The [start number](@start-number) |
| of an [ordered list] is determined by the list number of |
| its initial list item. The numbers of subsequent list items are |
| disregarded. |
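
The definitions above can be sketched as a grouping step over the markers of adjacent list items. The helper names `marker_type` and `split_into_lists` are hypothetical, and the sketch assumes the items are siblings in the same container with no pair of blank lines between them.

```python
from itertools import groupby


def marker_type(marker: str) -> tuple:
    """Classify a list marker as ("bullet", char) or ("ordered", delimiter)."""
    if marker in ("-", "+", "*"):
        return ("bullet", marker)
    digits, delimiter = marker[:-1], marker[-1:]
    if digits.isascii() and digits.isdigit() and delimiter in (".", ")"):
        return ("ordered", delimiter)
    raise ValueError(f"not a list marker: {marker!r}")


def split_into_lists(markers: list) -> list:
    """Group consecutive markers of the same type into lists."""
    lists = []
    for (kind, _), group in groupby(markers, key=marker_type):
        group = list(group)
        entry = {"kind": kind, "items": len(group)}
        if kind == "ordered":
            # Only the first item's number matters; later ones are ignored.
            entry["start"] = int(group[0][:-1])
        lists.append(entry)
    return lists


# "7.", "3.", "9." form one ordered list that starts at 7.
assert split_into_lists(["7.", "3.", "9."]) == [
    {"kind": "ordered", "items": 3, "start": 7}
]
```

Because the grouping key includes the bullet character or the delimiter, switching from `-` to `+`, or from `.` to `)`, begins a new list even though both neighbors are bullet lists or ordered lists.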
| |
| A list is [loose](@loose) if any of its constituent |
| list items are separated by blank lines, or if any of its constituent |
| list items directly contain two block-level elements with a blank line |
| between them. Otherwise a list is [tight](@tight). |
| (The difference in HTML output is that paragraphs in a loose list are |
| wrapped in `<p>` tags, while paragraphs in a tight list are not.) |
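
As a rough sketch of this definition, suppose a parser has already recorded, for each list item, where blank lines occurred. The `Item` class, its fields, and the `render_ul` helper are assumptions made for illustration; they cover bullet lists whose items contain only paragraphs, and no HTML escaping is attempted.

```python
from dataclasses import dataclass


@dataclass
class Item:
    paragraphs: list             # paragraph texts directly inside the item
    blank_inside: bool = False   # blank line between two direct child blocks
    blank_after: bool = False    # blank line between this item and the next


def is_loose(items: list) -> bool:
    """A list is loose if items are separated by a blank line, or if any
    item directly contains two blocks with a blank line between them."""
    separated = any(item.blank_after for item in items[:-1])
    spaced_blocks = any(item.blank_inside for item in items)
    return separated or spaced_blocks


def render_ul(items: list) -> str:
    loose = is_loose(items)
    out = ["<ul>"]
    for item in items:
        if not item.paragraphs:
            out.append("<li></li>")
        elif loose:
            # Loose list: every paragraph is wrapped in <p> tags.
            out.append("<li>")
            out.extend(f"<p>{text}</p>" for text in item.paragraphs)
            out.append("</li>")
        else:
            # Tight list: paragraph text is emitted without <p> tags.
            out.append("<li>" + "\n".join(item.paragraphs) + "</li>")
    out.append("</ul>")
    return "\n".join(out)


assert render_ul([Item(["foo"]), Item(["bar"])]) == (
    "<ul>\n<li>foo</li>\n<li>bar</li>\n</ul>"
)
```

Looseness is a property of the whole list: a single blank line between any two items causes the paragraphs of every item to be wrapped in `<p>` tags.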
| |
| Changing the bullet or ordered list delimiter starts a new list: |
| |
| . |
| - foo |
| - bar |
| + baz |
| . |
| <ul> |
| <li>foo</li> |
| <li>bar</li> |
| </ul> |
| <ul> |
| <li>baz</li> |
| </ul> |
| . |
| |
| . |
| 1. foo |
| 2. bar |
| 3) baz |
| . |
| <ol> |
| <li>foo</li> |
| <li>bar</li> |
| </ol> |
| <ol start="3"> |
| <li>baz</li> |
| </ol> |
| . |
| |
| In CommonMark, a list can interrupt a paragraph. That is, |
| no blank line is needed to separate a paragraph from a following |
| list: |
| |
| . |
| Foo |
| - bar |
| - baz |
| . |
| <p>Foo</p> |
| <ul> |
| <li>bar</li> |
| <li>baz</li> |
| </ul> |
| . |
| |
| `Markdown.pl` does not allow this, through fear of triggering a list |
| via a numeral in a hard-wrapped line: |
| |
| . |
| The number of windows in my house is |
| 14. The number of doors is 6. |
| . |
| <p>The number of windows in my house is</p> |
| <ol start="14"> |
| <li>The number of doors is 6.</li> |
| </ol> |
| . |
| |
| Oddly, `Markdown.pl` *does* allow a blockquote to interrupt a paragraph, |
| even though the same considerations might apply. We think that the two |
| cases should be treated the same. Here are two reasons for allowing |
| lists to interrupt paragraphs: |
| |
| First, it is natural and not uncommon for people to start lists without |
| blank lines: |
| |
| I need to buy |
| - new shoes |
| - a coat |
| - a plane ticket |
| |
| Second, we are attracted to a |
| |
| > [principle of uniformity](@principle-of-uniformity): |
| > if a chunk of text has a certain |
| > meaning, it will continue to have the same meaning when put into a |
| > container block (such as a list item or blockquote). |
| |
| (Indeed, the spec for [list items] and [block quotes] presupposes |
| this principle.) This principle implies that if |
| |
| * I need to buy |
| - new shoes |
| - a coat |
| - a plane ticket |
| |
| is a list item containing a paragraph followed by a nested sublist, |
| as all Markdown implementations agree it is (though the paragraph |
| may be rendered without `<p>` tags, since the list is "tight"), |
| then |
| |
| I need to buy |
| - new shoes |
| - a coat |
| - a plane ticket |
| |
| by itself should be a paragraph followed by a nested sublist. |
| |
| Our adherence to the [principle of uniformity] |
| thus inclines us to think that there are two coherent packages: |
| |
| 1. Require blank lines before *all* lists and blockquotes, |
| including lists that occur as sublists inside other list items. |
| |
| 2. Require blank lines in none of these places. |
| |
| [reStructuredText](http://docutils.sourceforge.net/rst.html) takes |
| the first approach, for which there is much to be said. But the second |
| seems more consistent with established practice with Markdown. |
| |
| There can be blank lines between items, but two blank lines end |
| a list: |
| |
| . |
| - foo |
| |
| - bar |
| |
| |
| - baz |
| . |
| <ul> |
| <li> |
| <p>foo</p> |
| </li> |
| <li> |
| <p>bar</p> |
| </li> |
| </ul> |
| <ul> |
| <li>baz</li> |
| </ul> |
| . |
| |
| As illustrated above in the section on [list items], |
| two blank lines between blocks *within* a list item will also end a |
| list: |
| |
| . |
| - foo |
| |
| |
| bar |
| - baz |
| . |
| <ul> |
| <li>foo</li> |
| </ul> |
| <p>bar</p> |
| <ul> |
| <li>baz</li> |
| </ul> |
| . |
| |
| Indeed, two blank lines will end *all* containing lists: |
| |
| . |
| - foo |
| - bar |
| - baz |
| |
| |
| bim |
| . |
| <ul> |
| <li>foo |
| <ul> |
| <li>bar |
| <ul> |
| <li>baz</li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| </ul> |
| <pre><code> bim |
| </code></pre> |
| . |
| |
| Thus, two blank lines can be used to separate consecutive lists of |
| the same type, or to separate a list from an indented code block |
| that would otherwise be parsed as a subparagraph of the final list |
| item: |
| |
| . |
| - foo |
| - bar |
| |
| |
| - baz |
| - bim |
| . |
| <ul> |
| <li>foo</li> |
| <li>bar</li> |
| </ul> |
| <ul> |
| <li>baz</li> |
| <li>bim</li> |
| </ul> |
| . |
| |
| . |
| - foo |
| |
| notcode |
| |
| - foo |
| |
| |
| code |
| . |
| <ul> |
| <li> |
| <p>foo</p> |
| <p>notcode</p> |
| </li> |
| <li> |
| <p>foo</p> |
| </li> |
| </ul> |
| <pre><code>code |
| </code></pre> |
| . |
| |
| List items need not be indented to the same level. The following |
| list items will be treated as items at the same list level, |
| since none is indented enough to belong to the previous list |
| item: |
| |
| . |
| - a |
| - b |
| - c |
| - d |
| - e |
| - f |
| - g |
| - h |
| - i |
| . |
| <ul> |
| <li>a</li> |
| <li>b</li> |
| <li>c</li> |
| <li>d</li> |
| <li>e</li> |
| <li>f</li> |
| <li>g</li> |
| <li>h</li> |
| <li>i</li> |
| </ul> |
| . |
| |
| . |
| 1. a |
| |
| 2. b |
| |
| 3. c |
| . |
| <ol> |
| <li> |
| <p>a</p> |
| </li> |
| <li> |
| <p>b</p> |
| </li> |
| <li> |
| <p>c</p> |
| </li> |
| </ol> |
| . |
| |
| This is a loose list, because there is a blank line between |
| two of the list items: |
| |
| . |
| - a |
| - b |
| |
| - c |
| . |
| <ul> |
| <li> |
| <p>a</p> |
| </li> |
| <li> |
| <p>b</p> |
| </li> |
| <li> |
| <p>c</p> |
| </li> |
| </ul> |
| . |
| |
| So is this, with an empty second item: |
| |
| . |
| * a |
| * |
| |
| * c |
| . |
| <ul> |
| <li> |
| <p>a</p> |
| </li> |
| <li></li> |
| <li> |
| <p>c</p> |
| </li> |
| </ul> |
| . |
| |
| These are loose lists, even though there is no blank line between the items, |
| because one of the items directly contains two block-level elements |
| with a blank line between them: |
| |
| . |
| - a |
| - b |
| |
| c |
| - d |
| . |
| <ul> |
| <li> |
| <p>a</p> |
| </li> |
| <li> |
| <p>b</p> |
| <p>c</p> |
| </li> |
| <li> |
| <p>d</p> |
| </li> |
| </ul> |
| . |
| |
| . |
| - a |
| - b |
| |
| [ref]: /url |
| - d |
| . |
| <ul> |
| <li> |
| <p>a</p> |
| </li> |
| <li> |
| <p>b</p> |
| </li> |
| <li> |
| <p>d</p> |
| </li> |
| </ul> |
| . |
| |
| This is a tight list, because the blank lines are in a code block: |
| |
| . |
| - a |
| - ``` |
| b |
| |
| |
| ``` |
| - c |
| . |
| <ul> |
| <li>a</li> |
| <li> |
| <pre><code>b |
| |
| |
| </code></pre> |
| </li> |
| <li>c</li> |
| </ul> |
| . |
|