XSS

MEDIUM

rails/rails

Commit: 96aeac6c08a2

Affected: 8.1.3 and earlier

2026-04-05 12:21 UTC

Description

The commit adds escaping of CommonMark markdown metacharacters in Action Text markdown generation to ensure text nodes are treated as literal text by Markdown renderers, reducing the risk of XSS via injected markdown formatting. It introduces ActionText::MarkdownConversion.escape_markdown_text, updates Content#to_markdown to wrap attachment markdown in an action-text-markdown element (so content is passed through without escaping), and adjusts how markdown links are rendered to escape their text. Overall, it hardens markdown rendering against user-supplied content that could otherwise be interpreted as formatting (e.g., headings, links, or other markdown constructs) and potentially lead to XSS when rendered by a Markdown processor.

Proof of Concept

PoC (conceptual and executable steps):

- Rationale: Before this patch, text nodes inside HTML that contain CommonMark metacharacters could be treated as markdown formatting when converted to Markdown, potentially enabling markdown-driven injection or visual-delimiter-based manipulation in rendered output. The patch escapes these metacharacters so they render as literal text instead of formatting.

- PoC code (Rails console):

  # Demonstrates escaping of common metacharacters in a string that would be interpreted by a Markdown renderer
  html_with_script = "<p>Hello</p><script>alert('XSS')</script>"

  # Post-patch: escaping is applied to text nodes, so the script tag would be escaped rather than executed.
  escaped = ActionText::MarkdownConversion.escape_markdown_text("<script>alert('XSS')</script>")
  puts escaped
  # Expected output (shown as Ruby string):
  # \<script\>alert('XSS')\</script\>

  # To illustrate the render effect, if you pass the original HTML through the markdown converter in a post-patch app,
  # the resulting markdown would contain the escaped characters, rendering literally as "<script>alert('XSS')</script>" in HTML,
  # rather than executing a script.

- Expected outcome: After the patch, rendering the same content via to_markdown would escape the < and > characters in text nodes,
  preventing a Markdown renderer from interpreting the content as HTML/formatting and mitigating XSS risks.

Note: The exact pre-patch behavior would depend on the Markdown renderer in use; the PoC focuses on demonstrating that escaping occurs for metacharacters and prevents execution when rendered.

Commit Details

Author: Mike Dalessio

Date: 2026-02-25 21:51 UTC

Message:

Improve character escaping in Action Text markdown generation (#56873)

Text nodes in HTML that contain CommonMark metacharacters (like `*`,
`[`, `]`, `#`, `<`, `>`, etc.) now have those characters backslash-escaped
so markdown renderers treat them as literal text rather than
formatting.

So, for example, Rich Text content like this:

    <p>## Look at *this*</p>

does not become a markdown-formatted heading, instead it renders as
ordinary text:

    \## Look at \*this\*

To preserve markdown generated by attachments'
`attachable_markdown_representation` method, `Content#to_markdown`
wraps the representation in <action-text-markdown> elements. The tree
walker recognizes these and passes their content through without
escaping.

Captions and filenames in `ActiveStorage::Blob` and `RemoteImage`
attachment representations are also escaped, and
`MarkdownConversion#escape_markdown_text` is now public for use by
application custom attachables.

This PR is likely escaping Markdown metacharacters where it's not
absolutely necessary. We tried to avoid escaping in some obvious
cases, but we can continue to refine this in the future by making the
regular expression more complex.

Co-authored-by: Jeremy Daer <jeremy@rubyonrails.org>

Triage Assessment

Vulnerability Type: XSS

Confidence: MEDIUM

Reasoning:

The commit adds escaping for CommonMark metacharacters in markdown output and adjusts how attachments are rendered to ensure markdown content is treated as literal text rather than interpreted as formatting. This reduces the risk that user-provided rich-text content could execute or render unintended HTML/markdown, which can lead to XSS or injection-like issues in Markdown renderers.

Verification Assessment

Vulnerability Type: XSS

Confidence: MEDIUM

Affected Versions: 8.1.3 and earlier

Code Diff

diff --git a/actiontext/lib/action_text/attachment.rb b/actiontext/lib/action_text/attachment.rb
index afbe151db5809..971e6872587bb 100644
--- a/actiontext/lib/action_text/attachment.rb
+++ b/actiontext/lib/action_text/attachment.rb
@@ -139,7 +139,7 @@ def to_plain_text
     #       include ActionText::Attachable
     #
     #       def attachable_markdown_representation(caption)
-    #         "[@#{name}](#{Rails.application.routes.url_helpers.person_url(self)})"
+    #         MarkdownConversion.markdown_link("@#{name}", Rails.application.routes.url_helpers.person_url(self))
     #       end
     #     end
     #

diff --git a/actiontext/lib/action_text/content.rb b/actiontext/lib/action_text/content.rb
index 5490d0948069b..6fb9becf1b3df 100644
--- a/actiontext/lib/action_text/content.rb
+++ b/actiontext/lib/action_text/content.rb
@@ -143,7 +143,11 @@ def to_plain_text
     # NOTE: that the returned string is not HTML safe and should not be rendered in
     # browsers without additional sanitization.
     def to_markdown
-      render_attachments(with_full_attributes: false, &:to_markdown).fragment.to_markdown
+      render_attachments(with_full_attributes: false) { |attachment|
+        ActionText::HtmlConversion.create_element("action-text-markdown").tap do |node|
+          node.content = attachment.to_markdown
+        end
+      }.fragment.to_markdown
     end
 
     def to_trix_html

diff --git a/actiontext/lib/action_text/engine.rb b/actiontext/lib/action_text/engine.rb
index 86b38589efdd9..64061c8311ddb 100644
--- a/actiontext/lib/action_text/engine.rb
+++ b/actiontext/lib/action_text/engine.rb
@@ -61,7 +61,7 @@ def attachable_plain_text_representation(caption = nil)
         end
 
         def attachable_markdown_representation(caption = nil)
-          "[#{caption || filename}]"
+          "[#{MarkdownConversion.escape_markdown_text((caption || filename).to_s)}]"
         end
 
         def to_trix_content_attachment_partial_path

diff --git a/actiontext/lib/action_text/markdown_conversion.rb b/actiontext/lib/action_text/markdown_conversion.rb
index 762c0139d5aea..3fce10ef272a3 100644
--- a/actiontext/lib/action_text/markdown_conversion.rb
+++ b/actiontext/lib/action_text/markdown_conversion.rb
@@ -9,6 +9,12 @@ module ActionText
   #
   # Converts an HTML fragment into a Markdown string. Used by `ActionText::Content#to_markdown`
   # and `ActionText::Fragment#to_markdown` to produce Markdown representations of rich text.
+  #
+  # Example: `<h1>Release Notes</h1>` => `# Release Notes`, a markdown heading.
+  #
+  # Note that this converter escapes text nodes so it won't render as markdown.
+  #
+  # Example: `<p># Release Notes</p>` => `\# Release Notes`, not a heading.
   module MarkdownConversion
     extend self
 
@@ -29,7 +35,16 @@ def node_to_markdown(node)
     #     MarkdownConversion.markdown_link("photo", "https://example.com/photo_(large).png")
     #     # => "[photo](https://example.com/photo_%28large%29.png)"
     def markdown_link(title, url)
-      "[#{escape_link_text(title)}](#{encode_href(url)})"
+      "[#{escape_markdown_text(title)}](#{encode_href(url)})"
+    end
+
+    # Backslash-escapes CommonMark metacharacters in +text+ so they are treated
+    # as literal characters by Markdown renderers.
+    #
+    #     MarkdownConversion.escape_markdown_text("**Important**")
+    #     # => "\\*\\*Important\\*\\*"
+    def escape_markdown_text(text)
+      text.gsub(MARKDOWN_METACHARACTERS) { |c| "\\#{c}" }
     end
 
     private
@@ -38,14 +53,31 @@ def markdown_link(title, url)
       LIST_BULLET = /\A(-|\d+\.) /
       LIST_INDENT = "  "
       ENCODE_HREF_CHARS = /[() <>\n\r\t]/
-      private_constant :BOLD_TAGS, :ITALIC_TAGS, :LIST_BULLET, :LIST_INDENT, :ENCODE_HREF_CHARS
+      MARKDOWN_METACHARACTERS = /
+        [\\`*_{}\[\]|~<>]     # metacharacters that should be escaped generally
+        | \A\#(?=[\s\#]|\z)   # leading hash before space or another hash: ATX heading
+        | \A=(?=[=\s]|\z)     # leading equals before space or another equals: setext heading
+        | \A-                 # leading hyphen: list item, thematic break, or setext heading
+        | \A\+(?=\s|\z)       # leading plus before space: list item
+        | \A\d+\K\.(?=\s|\z)  # leading "1." with trailing space: ordered list item (only the dot is matched)
+      /x
+      SKIP_ESCAPING_PARENTS = %w[ action-text-markdown code pre ].freeze
+      INLINE_ELEMENTS = %w[
+        action-text-markdown
+        a abbr b bdi bdo cite code data dfn em i kbd mark q
+        rp rt ruby s samp small span strong sub sup time u var
+      ].freeze
+      private_constant :BOLD_TAGS, :ITALIC_TAGS, :LIST_BULLET, :LIST_INDENT, :ENCODE_HREF_CHARS,
+        :MARKDOWN_METACHARACTERS, :SKIP_ESCAPING_PARENTS, :INLINE_ELEMENTS
 
       def markdown_for_node(node, child_values)
         if node.text?
           if node.content.blank? && !significant_whitespace?(node)
             ""
-          else
+          elsif skip_markdown_escaping?(node)
             node.content
+          else
+            escape_markdown_text(node.content)
           end
         elsif node.element?
           method_name = :"visit_#{node.name.tr("-", "_")}"
@@ -134,7 +166,7 @@ def visit_ol(node, child_values)
       def visit_a(node, child_values)
         inner = join_children(child_values)
         if (href = node["href"]) && Rails::HTML::Sanitizer.allowed_uri?(href)
-          markdown_link(inner, href)
+          "[#{inner}](#{encode_href(href)})"
         else
           inner
         end
@@ -162,6 +194,12 @@ def visit_hr(_node, _child_values)
         "---\n\n"
       end
 
+      # Attachment markdown is wrapped in <action-text-markdown> by Content#to_markdown so it passes
+      # through without text escaping.
+      def visit_action_text_markdown(_node, child_values)
+        join_children(child_values)
+      end
+
       # Avoid including content from elements that aren't meaningful for markdown output
       def visit__unsupported(_node, _child_values)
         ""
@@ -276,7 +314,12 @@ def inline_code(content)
       end
 
       def significant_whitespace?(node)
-        node.previous_sibling&.text? && node.next_sibling&.text?
+        significant_inline_sibling?(node.previous_sibling) &&
+          significant_inline_sibling?(node.next_sibling)
+      end
+
+      def significant_inline_sibling?(sibling)
+        sibling&.text? || sibling&.name&.in?(INLINE_ELEMENTS)
       end
 
       def ancestor_named?(node, names, max_depth:)
@@ -293,8 +336,8 @@ def encode_href(href)
         URI::RFC2396_PARSER.escape(href, ENCODE_HREF_CHARS)
       end
 
-      def escape_link_text(text)
-        text.gsub(/[\\\[\]]/) { |c| "\\#{c}" }
+      def skip_markdown_escaping?(node)
+        node.parent&.name.in?(SKIP_ESCAPING_PARENTS)
       end
   end
 end

diff --git a/actiontext/test/unit/markdown_conversion_test.rb b/actiontext/test/unit/markdown_conversion_test.rb
index 959bec6836ef5..409010fafd7fe 100644
--- a/actiontext/test/unit/markdown_conversion_test.rb
+++ b/actiontext/test/unit/markdown_conversion_test.rb
@@ -3,13 +3,144 @@
 require "test_helper"
 
 class ActionText::MarkdownConversionTest < ActiveSupport::TestCase
+  # --- Text tests ---
+
   test "plain text passes through unchanged" do
     assert_converted_to(
       "hello world",
       "hello world"
     )
+    assert_converted_to(
+      "hello world",
+      "<p>hello world</p>"
+    )
+  end
+
+  test "empty content" do
+    assert_converted_to("", "")
+  end
+
+  test "leading and trailing whitespace is stripped" do
+    assert_converted_to("hello", "<p>  hello  </p>")
+  end
+
+  test "HTML entities are decoded" do
+    assert_converted_to(
+      'asdf \< asdf & asdf \> asdf',
+      "<p>asdf &lt; asdf &amp; asdf &gt; asdf</p>"
+    )
+  end
+
+  test "text escaping: angle bracket (HTML block)" do
+    assert_converted_to('\<!-- hidden --\>', "<p>&lt;!-- hidden --&gt;</p>")
+  end
+
+  test "text escaping: backslash (escape character)" do
+    assert_converted_to('a \\\\ b', '<p>a \\ b</p>')
+  end
+
+  test "text escaping: backtick (code span)" do
+    assert_converted_to('a \\` b', "<p>a ` b</p>")
+  end
+
+  test "text escaping: asterisk (emphasis)" do
+    assert_converted_to('\\*not italic\\*', "<p>*not italic*</p>")
+    assert_converted_to('\\*\\*not bold\\*\\*', "<p>**not bold**</p>")
+  end
+
+  test "text escaping: underscore (emphasis)" do
+    assert_converted_to('\\_not italic\\_', "<p>_not italic_</p>")
+    assert_converted_to('\\_\\_not bold\\_\\_', "<p>__not bold__</p>")
+  end
+
+  test "text escaping: curly braces (reserved for extensions)" do
+    assert_converted_to('\\{attr\\}', "<p>{attr}</p>")
+  end
+
+  test "text escaping: square brackets (link text)" do
+    assert_converted_to('\\[not a link\\]', "<p>[not a link]</p>")
+  end
+
+  test "text escaping: parentheses are not escaped because bracket escaping prevents link syntax" do
+    assert_converted_to("(aside)", "<p>(aside)</p>")
+    assert_converted_to("f(x)", "<p>f(x)</p>")
+  end
+
+  test "text escaping: hash (ATX heading)" do
+    assert_converted_to('\# not a heading', "<p># not a heading</p>")
+    assert_converted_to('\## not a heading', "<p>## not a heading</p>")
+    assert_converted_to('\#', "<p>#</p>")
+    assert_converted_to("> \\# not a heading", "<blockquote># not a heading</blockquote>")
+  end
+
+  test "text escaping: hash is not escaped in normal prose" do
+    assert_converted_to("C# is a language", "<p>C# is a language</p>")
+    assert_converted_to("issue #123", "<p>issue #123</p>")
+    assert_converted_to("#hashtag", "<p>#hashtag</p>")
   end
 
+  test "text escaping: plus (unordered list item)" do
+    assert_converted_to('\+ not a list', "<p>+ not a list</p>")
+  end
+
+  test "text escaping: plus is not escaped in normal prose" do
+    assert_converted_to("C++ is a language", "<p>C++ is a language</p>")
+    assert_converted_to("a + b = c", "<p>a + b = c</p>")
+    assert_converted_to("+1 for this", "<p>+1 for this</p>")
+  end
+
+  test "text escaping: hyphen (list item, thematic break, setext heading)" do
+    assert_converted_to('\- not a list', "<p>- not a list</p>")
+    assert_converted_to('\---', "<p>---</p>")
+    assert_converted_to("> \\- not a list", "<blockquote>- not a list</blockquote>")
+    assert_converted_to('> \---', "<blockquote>---</blockquote>")
+  end
+
+  test "text escaping: equals (setext heading)" do
+    assert_converted_to('\===', "<p>===</p>")
+    assert_converted_to('\=', "<p>=</p>")
+    assert_converted_to('> \===', "<blockquote>===</blockquote>")
+  end
+
+  test "text escaping: equals is not escaped in normal prose" do
+    assert_converted_to("a = b", "<p>a = b</p>")
+    assert_converted_to("a == b", "<p>a == b</p>")
+    assert_converted_to("=foo", "<p>=foo</p>")
+  end
+
+  test "text escaping: hyphen is not escaped in normal prose" do
+    assert_converted_to("well-known fact", "<p>well-known fact</p>")
+    assert_converted_to("foo -- bar", "<p>foo -- bar</p>")
+    assert_converted_to("foo --- bar", "<p>foo --- bar</p>")
+  end
+
+  test "text escaping: dot (ordered list item)" do
+    assert_converted_to('1\. not a list', "<p>1. not a list</p>")
+    assert_converted_to('1\.', "<p>1.</p>")
+  end
+
+  test "text escaping: dot is not escaped in normal prose" do
+    assert_converted_to("end of sentence. start of next", "<p>end of sentence. start of next</p>")
+    assert_converted_to("e.g. for example", "<p>e.g. for example</p>")
+    assert_converted_to("Agent 007. Licensed to kill.", "<p>Agent 007. Licensed to kill.</p>")
+    assert_converted_to("3.1 this is not a list item", "<p>3.1 this is not a list item</p>")
+  end
+
+  test "text escaping: pipe (table delimiter)" do
+    assert_converted_to('a \\| b', "<p>a | b</p>")
+  end
+
+  test "text escaping: tilde (strikethrough)" do
+    assert_converted_to('\\~\\~not struck\\~\\~', "<p>~~not struck~~</p>")
+  end
+
+  test "text escaping: greater-than (block quote)" do
+    assert_converted_to('\> not a quote', "<p>> not a quote</p>")
+    assert_converted_to('\> not a quote', "<p>&gt; not a quote</p>")
+  end
+
+  # --- Tag tests ---
+
   test "<strong> tags are converted to bold" do
     assert_converted_to("**hello**", "<strong>hello</strong>")
   end
@@ -295,14 +426,14 @@ class ActionText::MarkdownConversionTest < ActiveSupport::TestCase
 
   test "<a> tags with bracket injection in link text are escaped" do
     assert_converted_to(
-      "[text\\](https://evil.com)](https://example.com)",
+      '[text\](https://evil.com)](https://example.com)',
       '<a href="https://example.com">text](https://evil.com)</a>'
     )
   end
 
   test "<a> tags with backslash+bracket injection in link text are escaped" do
     assert_converted_to(
-      "[text\\\\\\](https://evil.com)](https://example.com)",
+      '[text\\\\\](https://evil.com)](https://example.com)',
       '<a href="https://example.com">text\](https://evil.com)</a>'
     )
   end
@@ -402,21 +533,6 @@ class ActionText::MarkdownConversionTest < ActiveSupport::TestCase
     )
   end
 
-  test "empty content" do
-    assert_converted_to("", "")
-  end
-
-  test "leading and trailing whitespace is stripped" do
-    assert_converted_to("hello", "<p>  hello  </p>")
-  end
-
-  test "HTML entities are decoded" do
-    assert_converted_to(
-      "asdf < asdf & asdf > asdf",
-      "<p>asdf &lt; asdf &amp; asdf &gt; asdf</p>"
-    )
-  end
-
   test "unknown elements pass through their content" do
     assert_converted_to("hello", "<asdf>hello</asdf>")
   end
@@ -453,6 +569,57 @@ class ActionText::MarkdownConversionTest < ActiveSupport::TestCase
     )
   end
 
+  test "attachment with surrounding text" do
+    assert_converted_to(
+      "Hello world! ![Cat](http://example.com/cat.jpg)",
+      'Hello world! <action-text-attachment url="http://example.com/cat.jpg" content-type="image/jpeg" caption="Cat"></action-text-attachment>'
+    )
+  end
+
+  test "multiple attachments separated by whitespace preserve the whitespace" do
+    assert_converted_to(
+      "![A](https://example.com/a.jpg) ![B](https://example.com/b.jpg)",
+      '<action-text-attachment content-type="image/jpeg" url="https://example.com/a.jpg" caption="A"></action-text-attachment> <action-text-attachment content-type="image/jpeg" url="https://example.com/b.jpg" caption="B"></action-text-attachment>'
+    )
+  end
+
+  test "multiple attachments and text separated by whitespace preserve the whitespace" do
+    assert_converted_to(
+      "![A](https://example.com/a.jpg) and ![B](https://example.com/b.jpg)",
+      '<action-text-attachment content-type="image/jpeg" url="https://example.com/a.jpg" caption="A"></action-text-attachment> and <action-text-attachment content-type="image/jpeg" url="https://example.com/b.jpg" caption="B"></action-text-attachment>'
+    )
+  end
+
+  test "attachment with whitespace before inline element preserves the whitesp
... [truncated]

← Back to Alerts View on GitHub →