Per-Output AST Representation for Code Cells

Summary

We propose extending the MyST AST with a new node type that represents each code-cell output as a separate AST node. In the existing schema, the Output node type represents a collection of outputs and has a one-to-one correspondence with a code-cell. This structure cannot associate a distinct AST subtree with each output, so code-cell outputs cannot be considered during subsequent transformations of the document. Under this proposal, a new Outputs node, a unist Parent container, replaces the Output node as the direct child of a code-cell. The Outputs node may contain several Output child nodes, each holding a single IOutput bundle returned by code execution. The new structure enables each Output to carry its own AST subtree. Subsequent enhancements may build on this proposal to process code-cell outputs and build derived ASTs, e.g. parsing Markdown or LaTeX outputs into MyST AST that may reference, and be referenced by, other content.

Context

Current Limitations

Code-cells from Jupyter Notebooks can produce rich outputs such as Markdown, LaTeX, and tables. At present, each output is effectively treated as a black box that is interpreted only at export time (to PDF, web, etc.), so every static export and web build must interpret the MIME bundle outputs itself. As a result, the output content does not participate in a MyST build, i.e. Markdown or LaTeX output cannot define or consume reference labels.

This limitation prevents programmatic generation of content that integrates with the rest of the document. For example, if a code-cell generates Markdown output containing a figure with a label, that label cannot be referenced from other parts of the document, and the figure cannot appear in cross-reference resolution.

Use Cases

This enhancement lays the foundation to enable several important use cases in future MEPs:

Programmatic Content Generation: Users can generate MyST Markdown from code-cells and have the MyST Document Engine parse the results.
Integrated Outputs: Generated Markdown can define and consume reference targets that are visible to the rest of the project, enabling richer integration between computational content and documentation.
MyST-aware Kernels: Libraries running in Jupyter Kernels may output AST via MIME bundles with a MyST-aware MIME type, e.g. application/vnd.mystmd.ast+json;version=1. This would enable richer integrations than simple Markdown code generation.
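
To illustrate the MyST-aware kernel use case, the sketch below shows how a build step might look for an AST payload inside a single output's MIME bundle. It uses the example MIME type mentioned above; the payload shape (a node or an array of nodes) and the helper names `isAstNode` and `astFromOutput` are assumptions made for illustration, not part of this proposal or of any existing MyST API.

```typescript
// Example MIME type taken from the use case above.
const MYST_AST_MIME = "application/vnd.mystmd.ast+json;version=1";

// Simplified stand-ins for a Jupyter output and an AST node (assumptions).
type DisplayOutput = { output_type: string; data?: { [mime: string]: unknown } };
type AstNode = { type: string; value?: string; children?: AstNode[] };

// Loose structural check: anything with a string `type` is treated as a node.
function isAstNode(value: unknown): value is AstNode {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { type?: unknown }).type === "string"
  );
}

// Hypothetical helper: return the AST nodes shipped in the bundle, if any.
// Assumes the payload is either one node or an array of nodes.
function astFromOutput(output: DisplayOutput): AstNode[] {
  const payload = output.data?.[MYST_AST_MIME];
  const candidates: unknown[] = Array.isArray(payload) ? payload : [payload];
  return candidates.filter(isAstNode);
}

const richOutput: DisplayOutput = {
  output_type: "display_data",
  data: {
    [MYST_AST_MIME]: [{ type: "paragraph", children: [{ type: "text", value: "Hi" }] }],
  },
};
console.log(astFromOutput(richOutput).length);
```

Outputs without the MIME type, such as plain stream outputs, yield an empty array in this sketch, so callers can fall back to the existing rendering path.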

This proposal addresses the requirements described in #1026, which tracks the need to associate AST subtrees with individual cell outputs.

Proposal

AST Structure Changes

The MyST AST will be extended to support per-output AST representation through the following changes:

Previous Structure (Version 2):

type OutputV2 = {
  type: "output";
  data?: any[]; // Array of IOutput bundles
  visibility?: any;
};

In this structure, a single Output node contained an array of output data bundles in the data field.

New Structure (Version 3):

// Outputs contains one or more Output nodes (below)
type Outputs = {
  type: "outputs";
  children: (Output | FlowContent | ListContent | PhrasingContent)[]; // Support placeholders in addition to outputs
  visibility?: Visibility; // `show`, `hide`, or `remove`
  scroll?: boolean;
};

type Output = {
  type: "output";
  children: (FlowContent | ListContent | PhrasingContent)[];
  jupyter_data: IOutput; // Single IOutput bundle from Jupyter Notebooks
};

The key changes are:

Each Output node now represents a single output, with its own AST subtree in children and a single jupyter_data field (instead of an array).
An Outputs node is introduced as a container for Output children, with a 1:1 correspondence between Output children and IOutput bundles.
Each Output child can have its own AST subtree, enabling per-output parsing and reference resolution.
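
To make the new shape concrete, the following sketch walks an Outputs node and lists the output_type of each Output child while skipping placeholder children. The type names mirror the definitions above, but `IOutputLike` is a reduced stand-in for Jupyter's IOutput, and the helpers `isOutputNode` and `outputTypes` are hypothetical names chosen for this example.

```typescript
// Minimal stand-ins for the node types above. IOutputLike is a simplified
// version of Jupyter's IOutput, assumed here to keep the sketch self-contained.
type IOutputLike = { output_type: string; [key: string]: unknown };
type AstNode = {
  type: string;
  value?: string;
  children?: AstNode[];
  jupyter_data?: IOutputLike;
};
type OutputNode = AstNode & { type: "output"; jupyter_data: IOutputLike };
type OutputsNode = { type: "outputs"; children: AstNode[] };

// Type guard: an Output child carries exactly one bundle in `jupyter_data`.
function isOutputNode(node: AstNode): node is OutputNode {
  return node.type === "output" && node.jupyter_data !== undefined;
}

// Hypothetical helper: list the output_type of every Output child,
// skipping placeholders and any other non-output children.
function outputTypes(outputs: OutputsNode): string[] {
  return outputs.children
    .filter(isOutputNode)
    .map((output) => output.jupyter_data.output_type);
}

const cellOutputs: OutputsNode = {
  type: "outputs",
  children: [
    { type: "output", jupyter_data: { output_type: "stream", text: "Hello" }, children: [] },
    { type: "paragraph", children: [{ type: "text", value: "placeholder" }] },
    {
      type: "output",
      jupyter_data: { output_type: "display_data", data: { "text/markdown": "**Bold**" } },
      children: [],
    },
  ],
};
console.log(outputTypes(cellOutputs));
```

Because each Output holds exactly one bundle, a transform can address outputs individually instead of indexing into a shared data array.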

Migration Strategy

The version migration is handled through upgrade and downgrade functions that transform between V2 and V3 representations. The migration preserves:

Identifiers (id, label, identifier, html_id)
Target properties
Placeholder nodes
Visibility settings

Existing identifiers for the Output / Outputs nodes are not modified, as these are considered “content”. The migration path includes:

Upgrade (V2 → V3):

Convert each Output node with a data array into an Outputs node
Create one Output child per item in the original data array
Assign the original children to the Output child only if there is exactly one output
Preserve placeholders and other special children
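
The upgrade steps above can be sketched roughly as follows. This is an illustrative outline, not the actual migration code: the node types are trimmed to the fields needed here, the function name `upgradeOutput` is hypothetical, and we assume that placeholders are marked with a `placeholder: true` flag and are kept as direct children of the Outputs node.

```typescript
// Simplified node shapes with only the fields this sketch needs (assumptions).
type Bundle = { output_type: string; [key: string]: unknown };
type AstNode = { type: string; value?: string; placeholder?: boolean; children?: AstNode[] };
type OutputV2 = {
  type: "output";
  data?: Bundle[];
  children?: AstNode[];
  identifier?: string;
  visibility?: string;
};
type OutputV3 = { type: "output"; jupyter_data: Bundle; children: AstNode[] };
type OutputsV3 = {
  type: "outputs";
  children: (OutputV3 | AstNode)[];
  identifier?: string;
  visibility?: string;
};

function upgradeOutput(node: OutputV2): OutputsV3 {
  const bundles = node.data ?? [];
  const original = node.children ?? [];
  // Assumption: placeholders are recognisable by a `placeholder: true` flag.
  const placeholders = original.filter((child) => child.placeholder === true);
  const content = original.filter((child) => child.placeholder !== true);

  // One Output child per bundle. The old subtree can only be attributed
  // unambiguously when there is exactly one output.
  const outputs = bundles.map((bundle): OutputV3 => ({
    type: "output",
    jupyter_data: bundle,
    children: bundles.length === 1 ? content : [],
  }));

  const upgraded: OutputsV3 = { type: "outputs", children: [...outputs, ...placeholders] };
  // Identifiers and visibility are carried over unchanged.
  if (node.identifier !== undefined) upgraded.identifier = node.identifier;
  if (node.visibility !== undefined) upgraded.visibility = node.visibility;
  return upgraded;
}

const upgraded = upgradeOutput({
  type: "output",
  identifier: "cell-1",
  data: [
    { output_type: "stream", text: "Hello" },
    { output_type: "display_data", data: { "text/markdown": "**Bold**" } },
  ],
  children: [{ type: "text", value: "Shared content" }],
});
console.log(JSON.stringify(upgraded, null, 2));
```

With two bundles, as in this call, neither Output child receives the original children, because the sketch cannot tell which output they belong to.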

Downgrade (V3 → V2):

Convert Outputs nodes back to Output nodes
Collect all jupyter_data fields from Output children into a data array
Merge the AST subtrees of all Output children into a single children array
Preserve identifiers and other metadata
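
A matching sketch of the downgrade steps is shown below. As before, it is a simplified outline under stated assumptions rather than the real migration: the types are reduced, `downgradeOutputs` is a hypothetical name, and non-output children of the Outputs node (such as placeholders) are assumed to be carried over into the merged children array.

```typescript
// Simplified node shapes with only the fields this sketch needs (assumptions).
type Bundle = { output_type: string; [key: string]: unknown };
type AstNode = { type: string; value?: string; children?: AstNode[]; jupyter_data?: Bundle };
type OutputsV3 = { type: "outputs"; children: AstNode[]; identifier?: string; visibility?: string };
type OutputV2 = {
  type: "output";
  data: Bundle[];
  children: AstNode[];
  identifier?: string;
  visibility?: string;
};

function downgradeOutputs(node: OutputsV3): OutputV2 {
  const data: Bundle[] = [];
  const children: AstNode[] = [];
  for (const child of node.children) {
    if (child.type === "output" && child.jupyter_data !== undefined) {
      // Collect the single bundle and merge the per-output subtree.
      data.push(child.jupyter_data);
      children.push(...(child.children ?? []));
    } else {
      // Placeholders and other non-output children are kept as they are.
      children.push(child);
    }
  }
  const downgraded: OutputV2 = { type: "output", data, children };
  // Identifiers and visibility are carried over unchanged.
  if (node.identifier !== undefined) downgraded.identifier = node.identifier;
  if (node.visibility !== undefined) downgraded.visibility = node.visibility;
  return downgraded;
}

const downgraded = downgradeOutputs({
  type: "outputs",
  identifier: "cell-1",
  children: [
    { type: "output", jupyter_data: { output_type: "stream", text: "Hello" }, children: [] },
    {
      type: "output",
      jupyter_data: { output_type: "display_data", data: { "text/markdown": "**Bold**" } },
      children: [{ type: "paragraph", children: [{ type: "text", value: "Bold" }] }],
    },
  ],
});
console.log(JSON.stringify(downgraded, null, 2));
```

Note that merging subtrees loses the association between a child node and the output it came from, so a round trip through V2 is not guaranteed to restore per-output subtrees in this sketch.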

Example Transformation

Before (V2):

{
  "type": "output",
  "data": [
    { "output_type": "stream", "text": "Hello" },
    { "output_type": "display_data", "data": { "text/markdown": "**Bold**" } }
  ],
  "children": [{ "type": "text", "value": "Shared content" }]
}

After (V3):

{
  "type": "outputs",
  "children": [
    {
      "type": "output",
      "jupyter_data": { "output_type": "stream", "text": "Hello" },
      "children": []
    },
    {
      "type": "output",
      "jupyter_data": {
        "output_type": "display_data",
        "data": { "text/markdown": "**Bold**" }
      },
      "children": []
    }
  ]
}

The text/markdown output could later be parsed into native MyST AST nodes stored in children; however, that is out of scope for this proposal.

Implementation Details

Backward Compatibility

Existing content in the V2 format can be automatically upgraded during processing (e.g. #2551). The downgrade path ensures that tools expecting the V2 format can still work with V3 content when needed.

UX Implications & Migration

Migration Impact

Automatic: Existing documents will be automatically upgraded during processing. No manual migration is required.
Transparent: The change is primarily internal to the AST representation. Users working with parsed ASTs will need to handle the new structure, but the upgrade/downgrade functions provide a clear migration path.
Non-Breaking: For users writing MyST Markdown or notebooks, this change is transparent. The syntax and behavior remain the same.

Theme Considerations

As a direct consequence of this MEP, themes and renderers will need to be updated to handle the new Outputs container node and iterate over its Output children. In addition, there are several ways in which AST renderers, such as web themes and templates, can encounter incompatible ASTs:

New AST renderer provided with old AST content.
Dynamic loading of new AST in old AST renderers (web only).
Dynamic loading of old AST in new AST renderers (web only).
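
One way a renderer might tolerate both generations of the schema is to normalise whichever node it receives into a flat list of bundles before rendering. The sketch below is an assumption-laden illustration, not the approach taken by any existing theme: the node type is simplified and `bundlesToRender` is a hypothetical helper. It relies on the fact that a V2 Output has a data array, whereas a V3 Output has a single jupyter_data field.

```typescript
// Simplified node shape covering both schema generations (assumption).
type Bundle = { output_type: string; [key: string]: unknown };
type AstNode = {
  type: string;
  children?: AstNode[];
  data?: Bundle[]; // present on V2 Output nodes
  jupyter_data?: Bundle; // present on V3 Output nodes
};

// Hypothetical helper: collect the bundles a renderer should display.
function bundlesToRender(node: AstNode): Bundle[] {
  if (node.type === "outputs") {
    // V3 container: gather one bundle per Output child, skipping placeholders.
    const bundles: Bundle[] = [];
    for (const child of node.children ?? []) {
      if (child.type === "output" && child.jupyter_data !== undefined) {
        bundles.push(child.jupyter_data);
      }
    }
    return bundles;
  }
  if (node.type === "output") {
    if (Array.isArray(node.data)) return node.data; // V2 Output
    if (node.jupyter_data !== undefined) return [node.jupyter_data]; // lone V3 Output
  }
  return [];
}

const legacyNode: AstNode = {
  type: "output",
  data: [{ output_type: "stream", text: "Hello" }],
};
const currentNode: AstNode = {
  type: "outputs",
  children: [{ type: "output", jupyter_data: { output_type: "stream", text: "Hello" } }],
};
console.log(bundlesToRender(legacyNode).length, bundlesToRender(currentNode).length);
```

This kind of normalisation only covers rendering of raw bundles; a renderer that wants to use per-output AST subtrees would still need to understand the V3 structure.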

Content-Renderer Separation

The MyST Document Engine enforces a strong separation between the MyST AST production engine (the MyST Document Engine itself) and the MyST AST rendering engine (e.g. MyST Theme for web applications). This separation is typically invisible, but it is entirely possible (and indeed desired) to perform the project build and project rendering steps at different times. For example, one may produce AST and push it to a CDN, where a MyST Theme application can render it.

Because content generation and content rendering are separate, it is important to ensure that deployed AST rendering applications are prevented from attempting to render projects with unsupported AST versions. Whilst there is tooling to perform bidirectional AST migrations, this tooling is not widely deployed in existing web themes. As such, users deploying MyST rendering engines must take care to anticipate AST mismatches and perform AST migration steps themselves.
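
As a rough illustration of such a safeguard, a renderer could compare the version of the content it receives against the version it supports before rendering anything. The sketch below assumes that content arrives with a numeric `version` field alongside the tree; that envelope shape, the `Compatibility` labels, and the `checkCompatibility` helper are all hypothetical and only meant to show the decision involved.

```typescript
// Hypothetical envelope: we assume the AST ships with a numeric `version`
// field. The field name and shape are assumptions of this sketch.
type VersionedAst = { version: number; mdast: { type: string } };
type Compatibility = "ok" | "upgrade-content" | "downgrade-content";

// Decide whether content can be rendered directly or must be migrated first.
function checkCompatibility(content: VersionedAst, rendererVersion: number): Compatibility {
  if (content.version === rendererVersion) return "ok";
  // Older content must be upgraded; newer content must be downgraded.
  return content.version < rendererVersion ? "upgrade-content" : "downgrade-content";
}

const oldContent: VersionedAst = { version: 2, mdast: { type: "root" } };
const newContent: VersionedAst = { version: 3, mdast: { type: "root" } };

console.log(checkCompatibility(oldContent, 3)); // new renderer, old content
console.log(checkCompatibility(newContent, 2)); // old renderer, new content
console.log(checkCompatibility(newContent, 3)); // matching versions
```

A deployment could use the result to run a migration step or to refuse rendering with a clear error, rather than failing partway through an unsupported tree.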

Although typical usage of the MyST Document Engine as a static site generator hides this separation, the existing lack of constraint between the MyST Theme and MyST Document Engine versions can result in users encountering incompatible versions at build time. We anticipate a future MEP that attempts to address this UX problem.

Dynamic loading of AST

Outside of deploying AST renderers in a CDN context, MyST Themes can also pull in incompatible AST via the external cross-reference (xref) mechanism. External AST is fetched both at build time, where it is automatically migrated, and at read time, where it presently is not. This may affect users who build sites with myst build --site and deploy them to static web servers like GitHub Pages. Sites that use external xrefs fetch AST dynamically at read time, not at page build time, so xrefs may pull in incompatible (future or past) versions of the AST that fail to render properly. Future work may address AST upgrading and/or downgrading in these contexts.

References

Jupyter display protocol - Standard way for kernels to output rich content
MyST AST Specification - Current AST structure and versioning
GitHub Discussion on AST Output - Original discussion thread

This MEP addresses requirements from:
