Cross Reference Simplifications using Markdown Links

Authors
Affiliations
Curvenote Inc.
Curvenote Inc.

Summary

We propose a cross-reference syntax that uses CommonMark links to support all use cases of cross-referencing content internal to a project. The syntax aims to be familiar and work across different rendering platforms. Most internal content can be referenced using a hash-link, [](#my-id), which is the recommended replacement for the multiple role options that can do this in MyST currently (e.g. {ref}`my-id`, {eq}`my-id`, {numref}`my-id`). We provide options for increasing specificity for these links in all cases to deal with duplicate references across pages in a project.

| Existing Syntax | New Syntax |
| --- | --- |
| [](my-id) [1] | [](#my-id) |
| {ref}`my-id` | [](#my-id) |
| {eq}`my-equation` | [](#my-equation) |
| {ref}`Custom Text <my-id>` | [Custom Text](#my-id) |
| {numref}`See "{name}" <my-id>` | [See "{name}"](#my-id) |
| {numref}`Custom Number %s <my-id>` | [Custom Number {number}](#my-id) |
| {numref}`Custom Number {number} <my-id>` | [Custom Number {number}](#my-id) |
| {doc}`my-doc` | [](my-doc.md) |
| {doc}`my-doc` | [](../examples/my-doc.md) |
| {download}`my-doc.zip` | [](my-doc.zip) |

Context

In MyST (and Sphinx) there are many ways to cross-reference content:

These are all powerful roles, encoding semantic meaning and providing rich inter-linked content. These links can also be used to power rich user-interfaces, such as sphinx-hoverref. There are also simple configuration options for adding new external links in Sphinx. However, the breadth, verbosity, and overlapping functionality of these roles can be confusing and unfamiliar to new users.

For example:

Additionally, there is currently no overlap with CommonMark syntax, which can, for example, reference a section header with a hash: [header](#context). The CommonMark form has the advantage of working across multiple platforms and of being a familiar pattern from website links.

Design Goals

Our goal with this MEP is to provide a simplified syntax to make use of markdown links, and tap into rich cross-referencing capabilities. In this MEP we aim to balance:

Reuse Existing Standards
where possible syntax should reuse existing practices and standards, for example, CommonMark compliance
Graceful Degradation
syntax should aim to render with reduced functionality in places that don’t support MyST
Memorability
having a syntax that is easy to remember
Readability
having a syntax which people can understand at a glance
Terseness
limiting “boilerplate” syntax
Extensibility
having syntaxes that will not limit us from adding features in the future

Specifically for this MEP, our proposal aims to:

These link improvements are completed in the context of supporting (1) academic citations; and (2) intersphinx cross-references. However, this MEP does not specifically support the intricacies of intersphinx, bibliographies or referencing. We encourage future MEPs to address these concerns.

Background

The MEP aims to build on the existing CommonMark link format, which comes in three forms (see spec). In the current MEP we are not proposing any changes to CommonMark; we are designing a cross-referencing syntax that works with existing links. For context, the three CommonMark link types are:

  1. Inline links with optional text or titles:

    [Explicit *Markdown* text](destination "optional explicit title")
    
    or, if the destination contains spaces,
    
    [text](<a destination>)
  2. Reference links, which define the destination separately in the document and can be used multiple times:

    [Explicit *Markdown* text][label]
    
    [label]: destination "optional explicit title"
  3. Autolinks are URIs surrounded by < and >:

    <scheme:path?query#fragment>

In most cases, the scheme[2] (e.g. https:, mailto:, or ftp:) is optional and assumed to be a web URL (http:). For autolinks, however, the scheme is required; this is designed to disambiguate inline HTML elements (i.e. <b> is not a link, but <https://executablebooks.org/> is).

Current MyST Markup

Currently MyST supports external URLs (e.g. http:, https:, ftp:, mailto:) and uses Sphinx or Docutils for cross-references. The current supported syntax is listed for each component below:

External Links

Figures, Sections

Equations

Documents

Intersphinx

Multiple other Sphinx documentation sites can be referenced in MyST syntax (Sphinx documentation). For example, the Python documentation can be referenced from a configuration (e.g. the intersphinx_mapping in conf.py), which points to the appropriate intersphinx inventory (e.g. https://docs.python.org/3) containing a *.inv file.

Styling

All link syntax supports styling inside of the reference (e.g. [A **bolded _reference_** to a page](./myst.md)). The reference role syntax currently does not support styling of the inner content.

Proposal

We propose a cross-reference syntax that uses CommonMark links in all three forms. The goal is to support all use cases of cross-referencing with the most common use cases of referencing a document, file, section or element being simple, terse and familiar.

Overview:

| Existing Syntax | New Syntax |
| --- | --- |
| [](my-id) [1] | [](#my-id) |
| {ref}`my-id` | [](#my-id) |
| {eq}`my-equation` | [](#my-equation) |
| {ref}`Custom Text <my-id>` | [Custom Text](#my-id) |
| {numref}`See "{name}" <my-id>` | [See "{name}"](#my-id) |
| {numref}`Custom Number %s <my-id>` | [Custom Number {number}](#my-id) |
| {numref}`Custom Number {number} <my-id>` | [Custom Number {number}](#my-id) |
| {doc}`my-doc` | [](my-doc.md) |
| {doc}`my-doc` | [](../examples/my-doc.md) |
| {download}`my-doc.zip` | [](my-doc.zip) |

All of the above link examples can be complemented by adding Title Link syntax or by using Reference Link syntax (i.e. [Explicit *Markdown* text][label]). We have omitted the auto-link syntax from the overview for brevity; it is shown in detail below. In all cases, the existing role syntax should continue to work and receive ongoing support from the parser(s).
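The mapping in the overview table can be illustrated with a small migration sketch. This is not part of the proposal: `roleToLink` is a hypothetical helper, it only handles the roles listed in the table, and it assumes that a `{doc}` target is written without its `.md` extension and lives next to the current file (it does not compute relative paths such as `../examples/my-doc.md`).

```javascript
// Hypothetical migration helper: rewrite role syntax as Markdown links.
// Only the roles from the overview table are handled.
function roleToLink(markup) {
  const rolePattern = /\{(ref|eq|numref|doc|download)\}`([^`]+)`/g;
  return markup.replace(rolePattern, (match, role, body) => {
    // "Custom Text <my-id>" splits into text and target; "my-id" is target only.
    const withText = /^(.*?)\s*<([^<>]+)>$/.exec(body);
    let text = withText ? withText[1] : '';
    const target = withText ? withText[2] : body;
    // The %s placeholder of numref becomes the {number} template value.
    if (role === 'numref') text = text.replace(/%s/g, '{number}');
    // Assumption: doc targets omit the extension, downloads keep their path.
    if (role === 'doc') return `[${text}](${target}.md)`;
    if (role === 'download') return `[${text}](${target})`;
    return `[${text}](#${target})`;
  });
}

console.log(roleToLink('See {ref}`my-id` and {eq}`my-equation`.'));
console.log(roleToLink('{numref}`Custom Number %s <my-id>`'));
console.log(roleToLink('{doc}`my-doc` and {download}`my-doc.zip`'));
```

Because the existing roles remain supported, a conversion like this would be optional rather than required.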

Syntax

The parts of the link are [text](link "title") with an optional scheme ([text](scheme:link "title")). The "title" is not modified in our proposed syntax. Auto Link syntax, <scheme:link>, requires the scheme to be present; we follow the CommonMark definition of a scheme.

text

If the text is not included, it will be filled in by the default of the target.

If the text is included, it will be used as is, with two additional template values ({number} and {name}) that are replaced by the number and the name of the target.

In both cases, the template can be escaped with a preceding backslash, that is \{number} or \{name}, and the text will not be replaced.
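The text rules above can be sketched as follows. The function name `fillLinkText` and the shape of the `target` object (`number`, `name`, `defaultText`) are assumptions made for illustration, as is the choice to drop the backslash when an escaped placeholder is kept literal.

```javascript
// Hypothetical helper: fill {number} and {name} in the link text.
// A backslash before the brace (\{number}) keeps the placeholder literal.
function fillLinkText(text, target) {
  // Empty link text falls back to the default text of the target.
  if (text === '') return target.defaultText;
  return text.replace(/(\\?)\{(number|name)\}/g, (match, escape, key) => {
    // Escaped placeholder: keep the braces, drop the backslash (assumption).
    if (escape) return `{${key}}`;
    return String(target[key]);
  });
}

// Illustrative target; the field names are not defined by the proposal.
const figure = { number: 3, name: 'Sea surface temperature', defaultText: 'Figure 3' };

console.log(fillLinkText('See "{name}"', figure)); // See "Sea surface temperature"
console.log(fillLinkText('Custom Number {number}', figure)); // Custom Number 3
console.log(fillLinkText('Literal \\{number}', figure)); // Literal {number}
console.log(fillLinkText('', figure)); // Figure 3
```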

The links are defined by a scheme, which can be standard protocols (http:, mailto:). Here we propose two new schemes, project and path: the project scheme allows for cross-referencing pages, sections, equations, figures, or other components of a MyST project; the path scheme allows for referencing files outside of the project or explicitly downloading the source of a document in the project. The schemes are an extensibility point specifically described by CommonMark and are used to indicate that the link should be resolved by MyST specific logic. They follow standard URI syntax:

URI = scheme ":" pathname ["?" query] ["#" fragment]

In most cases, as seen in the summary above, the scheme is optional and can be inferred safely from the context. The exception is when explicitly referring to an external MyST site, Jupyter Book or Sphinx documentation site. These URIs can be safely and easily parsed by any common URL parser. For example, in JavaScript:

const url = new URL('project:target.md#my-ref');
url.protocol; // "project:"
url.pathname; // "target.md"
url.hash; // "#my-ref"

The following links and references are supported:

| Link Type | Auto Link | Inline |
| --- | --- | --- |
| External URL | <https://example.com> | [](https://example.com) |
| Local file download | <path:file.txt> | [](file.txt) |
| File download (explicit) | <path:file.md> | [](path:file.md) |
| Project document | <project:file.md> | [](file.md) |
| Target in a document | <project:file.md#target> | [](file.md#target) |
| Target in project | <project:#target> | [](#target) |
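One way to read the table is as a classification step that runs before resolution. The sketch below is illustrative only: `classifyLink`, the `kind` labels, and the `projectDocs` set of files in the table of contents are hypothetical names. The scheme pattern follows the CommonMark definition (2 to 32 characters, starting with an ASCII letter).

```javascript
// Hypothetical classifier for a link destination.
// projectDocs: Set of file paths that are part of the table of contents.
function classifyLink(url, projectDocs = new Set()) {
  // CommonMark scheme: a letter followed by 1-31 letters, digits, "+", "." or "-".
  const schemeMatch = /^([A-Za-z][A-Za-z0-9+.-]{1,31}):/.exec(url);
  const scheme = schemeMatch ? schemeMatch[1].toLowerCase() : null;

  // An explicit path: scheme always means a file download.
  if (scheme === 'path') return { kind: 'download', file: url.slice('path:'.length) };
  // Any other scheme except project: is treated as an external URL.
  if (scheme !== null && scheme !== 'project') return { kind: 'external', url };

  const rest = scheme === 'project' ? url.slice('project:'.length) : url;
  const hashIndex = rest.indexOf('#');
  const file = hashIndex === -1 ? rest : rest.slice(0, hashIndex);
  const target = hashIndex === -1 ? null : rest.slice(hashIndex + 1);

  // "#target" alone refers to a target somewhere in the project.
  if (file === '') return { kind: 'project-target', target };
  // Without a scheme, a file outside the table of contents is a download.
  if (scheme === null && !projectDocs.has(file)) return { kind: 'download', file };
  return target === null
    ? { kind: 'document', file }
    : { kind: 'document-target', file, target };
}

const docs = new Set(['file.md']);
console.log(classifyLink('https://example.com', docs).kind);
console.log(classifyLink('file.txt', docs).kind);
console.log(classifyLink('path:file.md', docs).kind);
console.log(classifyLink('file.md#target', docs));
console.log(classifyLink('#target', docs));
```

The scheme-less case is the only one that needs project knowledge, which is why the sketch takes the table of contents as an argument.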

Search Order and Specificity

All references search the local document first[3], then the local project in the order of the table of contents.

In large documentation sites, a referenced target can be present in multiple documents. In that case, the parser emits an xref_ambiguous warning to report that there are multiple matches for the intended target. If a link cannot be resolved, it should be rendered as an external link, for example, <a href="#target">#target</a>.
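The search order can be sketched as a small resolver. The data model here is hypothetical (each document is an object with a `file` and a list of `targets`), and the sketch assumes that a match in the local document wins without a warning, which the text above does not state explicitly.

```javascript
// Hypothetical resolver: local document first, then the project in
// table-of-contents order.
function resolveTarget(id, currentDoc, projectDocsInTocOrder) {
  const warnings = [];
  // 1. A match in the local document takes precedence (assumption: no warning).
  if (currentDoc.targets.includes(id)) {
    return { doc: currentDoc.file, id, warnings };
  }
  // 2. Otherwise collect matches across the project, keeping TOC order.
  const matches = projectDocsInTocOrder.filter((doc) => doc.targets.includes(id));
  if (matches.length === 0) {
    // Unresolved: the caller can fall back to rendering a plain link.
    warnings.push('xref_missing');
    return { doc: null, id, warnings };
  }
  if (matches.length > 1) warnings.push('xref_ambiguous');
  // The first match in TOC order is used.
  return { doc: matches[0].file, id, warnings };
}

const intro = { file: 'intro.md', targets: ['overview'] };
const methods = { file: 'methods.md', targets: ['setup', 'results'] };
const appendix = { file: 'appendix.md', targets: ['results'] };
const toc = [intro, methods, appendix];

console.log(resolveTarget('overview', intro, toc));
console.log(resolveTarget('results', intro, toc));
console.log(resolveTarget('unknown', intro, toc));
```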

Implicit Section Headers

We suggest a configuration option to create anchor “slugs” for section headers that stay close to the GitHub implementation and produce references that:

For example, ## Links and Referencing can be referenced as [](#links-and-referencing). Every heading level in a document should have an anchor. However, these are implicit references, and referring to them can raise an xref_implicit warning, which users can optionally suppress.

Implicit references are not available project-wide and are only accessible in the current document, because many documents follow similar structures (Abstract, Introduction, Methods, Summary). Adding two sections of the same name does not raise a duplicate identifier warning (xref_duplicate); section identifiers are scoped to their document.
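A rough sketch of GitHub-style slug creation is shown below. It is an approximation for illustration, not the exact algorithm of GitHub or of any MyST parser: `slugifyHeading` is a hypothetical name, and the rules used (lowercase, drop punctuation, turn each whitespace character into a hyphen) are assumptions about what "close to the GitHub implementation" means.

```javascript
// Hypothetical slug function approximating GitHub heading anchors.
function slugifyHeading(heading) {
  return heading
    .trim()
    .toLowerCase()
    // Drop everything except letters, digits, whitespace, "_" and "-".
    .replace(/[^\p{L}\p{N}\s_-]/gu, '')
    // Each whitespace character becomes a hyphen.
    .replace(/\s/g, '-');
}

console.log(slugifyHeading('Links and Referencing')); // links-and-referencing
console.log(slugifyHeading('Search Order and Specificity')); // search-order-and-specificity
```

Because removed punctuation can leave two adjacent spaces, a heading such as "UX implications & migration" yields a double hyphen under these rules.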

Paths

Downloads

Files that are outside of the table of contents of the project and are referenced directly are downloads.

Warnings and Errors

xref_missing
Raised when a referenced target cannot be found.
xref_implicit
You are referencing an implicit reference which could change easily in the future, consider making this explicit.
xref_unsupported
Raised if the current environment does not support the reference lookup, for example, in single-page builds.
xref_ambiguous
Raised when multiple conflicting targets are matched.
xref_legacy
Raised when a [](ref) is used in place of [](#ref).
For example, "Legacy syntax used for link target, please prepend a ‘#’ to your link url: “{link.url}” in “{document}”."

Specification AST

The links should follow the link AST for external links. For internal project cross-references, these should be resolved to a crossReference node (spec).

For external project links, we will extend the link object with additional data that includes the url source (urlSource), the scheme name (e.g. project or download), whether the link is internal (e.g. false), and additional optional metadata about the page that may be helpful to a renderer.

Extensibility

We hope that this syntax will be helpful in simplifying the cross-reference experience in MyST. Additionally, we believe that the scheme/protocol extension point is a powerful way to add rich cross-referencing ability to other types of structured data sources. We expect a future MEP to introduce additional logic to resolve intersphinx references, and other structured data. For example, one could imagine a <wiki:Gravitational_Waves> extension that cross-references pages in Wikipedia, or a <doi:10.5281/zenodo.6476040> extension that adds additional information about DOIs. For simple link replacements, this syntax could also be extended with simple configuration options, similar to the extlinks feature in Sphinx (see documentation).

UX implications & migration

All of the syntax is CommonMark compliant and introduces new capabilities to resolve cross-references. All existing roles are being maintained indefinitely to ensure compatibility with existing content as well as long-term compatibility with Sphinx. We suggest that documentation be updated to highlight the new, consistent markdown-link references, with the old styles moved to compatibility sections.

There is a single deprecation of the existing markdown link syntax that references a target and does not have a #. When parsers encounter a legacy linked reference, they should raise an xref_legacy warning.
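A parser could detect the legacy form along the following lines. This is a sketch under stated assumptions: `checkLegacyLink` and the `knownTargets` set are hypothetical, and the rule used (no scheme, no `#`, and the destination matches a known target identifier) is one possible heuristic rather than the behaviour of an existing parser. The message text follows the xref_legacy example in the warnings list.

```javascript
// Hypothetical check for the deprecated [](ref) form.
// knownTargets: Set of target identifiers defined in the project.
function checkLegacyLink(url, document, knownTargets) {
  // A destination with a scheme or a "#" is never the legacy form.
  const hasScheme = /^[A-Za-z][A-Za-z0-9+.-]{1,31}:/.test(url);
  if (hasScheme || url.includes('#') || !knownTargets.has(url)) return null;
  return {
    kind: 'xref_legacy',
    message:
      `Legacy syntax used for link target, please prepend a '#' ` +
      `to your link url: "${url}" in "${document}"`,
    // The replacement destination the author should switch to.
    suggestion: `#${url}`,
  };
}

const targets = new Set(['my-id', 'my-equation']);
console.log(checkLegacyLink('my-id', 'index.md', targets));
console.log(checkLegacyLink('#my-id', 'index.md', targets)); // null
console.log(checkLegacyLink('my-doc.md', 'index.md', targets)); // null
```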

Questions or objections

File Protocol
We want to minimize the additional syntax, and it was suggested that we use the file: protocol. The file: protocol is a security concern in markdown parsing, and was not chosen.


Footnotes
  1. This is backward compatible; however, the old syntax now raises an xref_legacy warning.

  2. The URL scheme is also known as the URL protocol.

  3. With the exception of an explicit reference to a specific page, i.e. [](./examples/my-doc.md#explicit-reference)