markdown_it
This document can be opened to execute with Jupytext!
markdown-it-py may be used as an API via the markdown_it
package.
The raw text is first parsed to syntax ‘tokens’, then these are converted to other formats using ‘renderers’.
+++
The simplest way to understand how text will be parsed is using:
from markdown_it import MarkdownIt
md = MarkdownIt() md.render("some *text*")
for token in md.parse("some *text*"): print(token) print()
+++
The MarkdownIt
class is instantiated with parsing configuration options, dictating the syntax rules and additional options for the parser and renderer. You can define this configuration via a preset name ('zero'
, 'commonmark'
or 'default'
), or by directly supplying a dictionary.
from markdown_it.presets import zero zero.make()
md = MarkdownIt("zero") md.options
print(md.get_active_rules())
print(md.get_all_rules())
You can find all the parsing rules in the source code: parser_core.py
, parser_block.py
, parser_inline.py
. Any of the parsing rules can be enabled/disabled, and these methods are chainable:
md.render("- __*emphasise this*__")
md.enable(["list", "emphasis"]).render("- __*emphasise this*__")
You can temporarily modify rules with the reset_rules
context manager.
with md.reset_rules(): md.disable("emphasis") print(md.render("__*emphasise this*__")) md.render("__*emphasise this*__")
Additionally renderInline
runs the parser with all block syntax rules disabled.
md.renderInline("__*emphasise this*__")
Plugins load collections of additional syntax rules and render methods into the parser
from markdown_it import MarkdownIt from markdown_it.extensions.front_matter import front_matter_plugin from markdown_it.extensions.footnote import footnote_plugin md = ( MarkdownIt() .use(front_matter_plugin) .use(footnote_plugin) .enable('table') ) text = (""" --- a: 1 --- a | b - | - 1 | 2 A footnote [^1] [^1]: some details """) md.render(text)
+++
Before rendering, the text is parsed to a flat token stream of block level syntax elements, with nesting defined by opening (1) and closing (-1) attributes:
md = MarkdownIt("commonmark") tokens = md.parse(""" Here's some *text* 1. a list > a *quote*""") [(t.type, t.nesting) for t in tokens]
Naturally all openings should eventually be closed, such that:
sum([t.nesting for t in tokens]) == 0
All tokens are the same class, which can also be created outside the parser:
tokens[0]
from markdown_it.token import Token token = Token("paragraph_open", "p", 1, block=True, map=[1, 2]) token == tokens[0]
The 'inline'
type token contain the inline tokens as children:
tokens[1]
You can serialize a token (and its children) to a JSONable dictionary using:
print(tokens[1].as_dict())
This dictionary can also be deserialized:
Token.from_dict(tokens[1].as_dict())
In some use cases nest_tokens
may be useful, to collapse the opening/closing tokens into single tokens:
from markdown_it.token import nest_tokens nested_tokens = nest_tokens(tokens) [t.type for t in nested_tokens]
This introduces a single additional class NestedTokens
, containing an opening
, closing
and children
, which can be a list of mixed Token
and NestedTokens
.
nested_tokens[0]
+++
After the token stream is generated, it's passed to a renderer. It then plays all the tokens, passing each to a rule with the same name as token type.
Renderer rules are located in md.renderer.rules
and are simple functions with the same signature:
def function(renderer, tokens, idx, options, env): return htmlResult
+++
You can inject render methods into the instantiated render class.
md = MarkdownIt("commonmark") def render_em_open(self, tokens, idx, options, env): return '<em class="myclass">' md.add_render_rule("em_open", render_em_open) md.render("*a*")
This is a slight change to the JS version, where the renderer argument is at the end. Also add_render_rule
method is specific to Python, rather than adding directly to the md.renderer.rules
, this ensures the method is bound to the renderer.
+++
You can also subclass a render and add the method there:
from markdown_it.renderer import RendererHTML class MyRenderer(RendererHTML): def em_open(self, tokens, idx, options, env): return '<em class="myclass">' md = MarkdownIt("commonmark", renderer_cls=MyRenderer) md.render("*a*")
Plugins can support multiple render types, using the __ouput__
attribute (this is currently a Python only feature).
from markdown_it.renderer import RendererHTML class MyRenderer1(RendererHTML): __output__ = "html1" class MyRenderer2(RendererHTML): __output__ = "html2" def plugin(md): def render_em_open1(self, tokens, idx, options, env): return '<em class="myclass1">' def render_em_open2(self, tokens, idx, options, env): return '<em class="myclass2">' md.add_render_rule("em_open", render_em_open1, fmt="html1") md.add_render_rule("em_open", render_em_open2, fmt="html2") md = MarkdownIt("commonmark", renderer_cls=MyRenderer1).use(plugin) print(md.render("*a*")) md = MarkdownIt("commonmark", renderer_cls=MyRenderer2).use(plugin) print(md.render("*a*"))
Here‘s a more concrete example; let’s replace images with vimeo links to player's iframe:
import re from markdown_it import MarkdownIt vimeoRE = re.compile(r'^https?:\/\/(www\.)?vimeo.com\/(\d+)($|\/)') def render_vimeo(self, tokens, idx, options, env): token = tokens[idx] aIndex = token.attrIndex('src') if (vimeoRE.match(token.attrs[aIndex][1])): ident = vimeoRE.match(token.attrs[aIndex][1])[2] return ('<div class="embed-responsive embed-responsive-16by9">\n' + ' <iframe class="embed-responsive-item" src="//player.vimeo.com/video/' + ident + '"></iframe>\n' + '</div>\n') return self.image(tokens, idx, options, env) md = MarkdownIt("commonmark") md.add_render_rule("image", render_vimeo) print(md.render(""))
Here is another example, how to add target="_blank"
to all links:
from markdown_it import MarkdownIt def render_blank_link(self, tokens, idx, options, env): aIndex = tokens[idx].attrIndex('target') if (aIndex < 0): tokens[idx].attrPush(['target', '_blank']) # add new attribute else: tokens[idx].attrs[aIndex][1] = '_blank' # replace value of existing attr # pass token to default renderer. return self.renderToken(tokens, idx, options, env) md = MarkdownIt("commonmark") md.add_render_rule("link_open", render_blank_link) print(md.render("[a]\n\n[a]: b"))