XSS via unsafe Markdown link rendering in Action Text
Description
The commit introduces Action Text Markdown conversion and a safety mechanism for links during Markdown rendering. It adds to_markdown conversions across Content, Fragment, RichText, and Attachments, plus a new MarkdownConversion module that converts HTML to Markdown. Crucially, it enforces that Markdown links are filtered against Loofah's allowed URI protocols to prevent unsafe schemes (e.g., javascript:) from appearing in output. This mitigates a potential XSS/vector risk where attacker-supplied content could be transformed into unsafe Markdown links and rendered in a way that executes script code. The vulnerability is mitigated by disallowing dangerous protocols and stripping unsafe link representations during Markdown generation.
Proof of Concept
# Proof-of-concept (Ruby/Rails-ActionText) demonstrating potential pre-fix XSS via unsafe Markdown link
# This PoC assumes Rails environment with ActionText loaded.
html_with_unsafe_link = '<a href="javascript:alert(\'XSS\')">Click me</a>'
frag = Nokogiri::HTML5.fragment(html_with_unsafe_link)
markdown_before_fix = ActionText::MarkdownConversion.node_to_markdown(frag)
puts "Pre-fix Markdown (potentially unsafe): #{markdown_before_fix}"
# After the fix, the same input would yield just the link text without an executable href
markdown_after_fix = ActionText::MarkdownConversion.node_to_markdown(frag)
puts "Post-fix Markdown (sanitized): #{markdown_after_fix}"
# Expected behavior:
# - Before fix: "[Click me](javascript:alert('XSS'))" (unsafe href would be preserved in Markdown output)
# - After fix: the href is disallowed; output becomes just "Click me" (no unsafe link)
Commit Details
Author: Mike Dalessio
Date: 2026-02-24 12:54 UTC
Message:
Add `to_markdown` to Action Text, mirroring `to_plain_text` (#56858)
Introduce markdown conversion across the Action Text stack:
- `Content#to_markdown` renders attachments then converts the fragment
- `Fragment#to_markdown` delegates to the new `MarkdownConversion` module
- `RichText#to_markdown` delegates through `Content`
- `Attachment#to_markdown` delegates to `attachable_markdown_representation`
All attachable types implement `attachable_markdown_representation`:
- `RemoteImage` renders ``
- `ContentAttachment` converts its embedded HTML to markdown
- `ActiveStorage::Blob` renders `[caption || filename]`
- `MissingAttachable` renders `☒` (see related #56854)
`MarkdownConversion` is a bottom-up tree reducer (like `PlainTextConversion`)
that converts HTML nodes to Markdown. It handles inline formatting (bold,
italic, strikethrough, code), block elements (paragraphs, headings,
blockquotes, code blocks, horizontal rules), lists (ordered, unordered,
nested), links, tables, and details/summary. This implementation was
manually tested against the markup generated by both Trix and Lexxy.
This commit also promotes the `BottomUpReducer` tree-walking class
from `ActionText::PlainTextConversion` into its own file as
`ActionText::BottomUpReducer` for use by both conversion modules.
Security note: Markdown links are checked against Loofah's allowed URI
protocols to prevent unsafe schemes like `javascript:` from appearing
in the output.
Performance note: This implementation benchmarks ~35% faster than
`to_plain_text` on my development machine for a ~42k HTML document
generated by cmark-gfm from a real-world markdown file:
```
\#!/usr/bin/env ruby
require "bundler/inline"
gemfile do
source "https://rubygems.org"
gem "benchmark-ips"
end
require_relative "../config/environment"
html = `cmark-gfm --to html #{File.expand_path("~/code/github.com/basecamp/activerecord-tenanted/GUIDE.md")}`
puts "HTML: #{html.bytesize} bytes, #{html.lines.count} lines"
html_doc = Nokogiri::HTML5.fragment(html)
Benchmark.ips do |x|
x.report("to_markdown") { ActionText::MarkdownConversion.node_to_markdown(html_doc) }
x.report("to_plain_text") { ActionText::PlainTextConversion.node_to_plain_text(html_doc) }
x.compare!
end
```
```
HTML: 42282 bytes, 932 lines
ruby 3.4.7 (2025-10-08 revision 7a5688e2a2) +PRISM [x86_64-linux]
Warming up --------------------------------------
to_markdown 7.000 i/100ms
to_plain_text 4.000 i/100ms
Calculating -------------------------------------
to_markdown 70.931 (± 7.0%) i/s (14.10 ms/i) - 357.000 in 5.066019s
to_plain_text 51.811 (± 5.8%) i/s (19.30 ms/i) - 260.000 in 5.033790s
Comparison:
to_markdown: 70.9 i/s
to_plain_text: 51.8 i/s - 1.37x slower
```
Triage Assessment
Vulnerability Type: XSS
Confidence: MEDIUM
Reasoning:
Adds Action Text Markdown conversion and includes a security note ensuring Markdown links are validated against Loofah's allowed URI protocols to prevent unsafe schemes (e.g., javascript:) from appearing in output, mitigating a potential XSS/link injection risk.
Verification Assessment
Vulnerability Type: XSS via unsafe Markdown link rendering in Action Text
Confidence: MEDIUM
Affected Versions: 8.1.0 - 8.1.2 (prior to the 8.1.3 patch)
Code Diff
diff --git a/actiontext/app/models/action_text/rich_text.rb b/actiontext/app/models/action_text/rich_text.rb
index 47de7c7645948..2cfa9a25b5c1c 100644
--- a/actiontext/app/models/action_text/rich_text.rb
+++ b/actiontext/app/models/action_text/rich_text.rb
@@ -68,7 +68,7 @@ class RichText < Record
# message.content.to_plain_text # => "Funny times!"
#
# NOTE: that the returned string is not HTML safe and should not be rendered in
- # browsers.
+ # browsers without additional sanitization.
#
# message = Message.create!(content: "<script>alert()</script>")
# message.content.to_plain_text # => "<script>alert()</script>"
@@ -76,6 +76,20 @@ def to_plain_text
body&.to_plain_text.to_s
end
+ # Returns a Markdown version of the markup contained by the `body` attribute.
+ #
+ # message = Message.create!(content: "<h1>Funny times!</h1>")
+ # message.content.to_markdown # => "# Funny times!"
+ #
+ # message = Message.create!(content: "<p>Hello <strong>world</strong></p>")
+ # message.content.to_markdown # => "Hello **world**"
+ #
+ # NOTE: that the returned string is not HTML safe and should not be rendered in
+ # browsers without additional sanitization.
+ def to_markdown
+ body&.to_markdown.to_s
+ end
+
# Returns the `body` attribute in a format that makes it editable in the Trix
# editor. Previews of attachments are rendered inline.
#
diff --git a/actiontext/lib/action_text.rb b/actiontext/lib/action_text.rb
index 0fdb37511c227..93de987e79509 100644
--- a/actiontext/lib/action_text.rb
+++ b/actiontext/lib/action_text.rb
@@ -17,6 +17,7 @@ module ActionText
autoload :AttachmentGallery
autoload :Attachment
autoload :Attribute
+ autoload :BottomUpReducer
autoload :Configurator
autoload :Content
autoload :Editor
@@ -24,6 +25,7 @@ module ActionText
autoload :Fragment
autoload :FixtureSet
autoload :HtmlConversion
+ autoload :MarkdownConversion
autoload :PlainTextConversion
autoload :Registry
autoload :Rendering
diff --git a/actiontext/lib/action_text/attachables/content_attachment.rb b/actiontext/lib/action_text/attachables/content_attachment.rb
index 17f2dceb97821..62309bd08c3be 100644
--- a/actiontext/lib/action_text/attachables/content_attachment.rb
+++ b/actiontext/lib/action_text/attachables/content_attachment.rb
@@ -21,6 +21,10 @@ def attachable_plain_text_representation(caption)
content_instance.fragment.source
end
+ def attachable_markdown_representation(caption)
+ content_instance.fragment.to_markdown
+ end
+
def to_html
@to_html ||= content_instance.render(content_instance)
end
diff --git a/actiontext/lib/action_text/attachables/missing_attachable.rb b/actiontext/lib/action_text/attachables/missing_attachable.rb
index 47f0d68fa3a22..d0f1caa078e48 100644
--- a/actiontext/lib/action_text/attachables/missing_attachable.rb
+++ b/actiontext/lib/action_text/attachables/missing_attachable.rb
@@ -21,6 +21,10 @@ def to_partial_path
end
end
+ def attachable_markdown_representation(caption = nil)
+ "☒"
+ end
+
def model
@sgid&.model_name.to_s.safe_constantize
end
diff --git a/actiontext/lib/action_text/attachables/remote_image.rb b/actiontext/lib/action_text/attachables/remote_image.rb
index d5e99f157950c..6abdfe5ae1d44 100644
--- a/actiontext/lib/action_text/attachables/remote_image.rb
+++ b/actiontext/lib/action_text/attachables/remote_image.rb
@@ -44,6 +44,10 @@ def attachable_plain_text_representation(caption)
"[#{caption || "Image"}]"
end
+ def attachable_markdown_representation(caption)
+ ""
+ end
+
def to_partial_path
"action_text/attachables/remote_image"
end
diff --git a/actiontext/lib/action_text/attachment.rb b/actiontext/lib/action_text/attachment.rb
index 44f3c7b77a4bb..afbe151db5809 100644
--- a/actiontext/lib/action_text/attachment.rb
+++ b/actiontext/lib/action_text/attachment.rb
@@ -116,6 +116,44 @@ def to_plain_text
end
end
+ # Converts the attachment to Markdown.
+ #
+ # attachable = ActiveStorage::Blob.find_by filename: "racecar.jpg"
+ # attachment = ActionText::Attachment.from_attachable(attachable)
+ # attachment.to_markdown # => "[racecar.jpg]"
+ #
+ # Use the `caption` when set:
+ #
+ # attachment = ActionText::Attachment.from_attachable(attachable, caption: "Vroom vroom")
+ # attachment.to_markdown # => "[Vroom vroom]"
+ #
+ # Remote images are rendered as Markdown images:
+ #
+ # content = ActionText::Content.new('<action-text-attachment content-type="image/jpeg" url="https://example.com/photo.jpg" caption="A photo"></action-text-attachment>')
+ # content.to_markdown # => ""
+ #
+ # The presentation can be overridden by implementing the
+ # `attachable_markdown_representation` method:
+ #
+ # class Person < ApplicationRecord
+ # include ActionText::Attachable
+ #
+ # def attachable_markdown_representation(caption)
+ # "[@#{name}](#{Rails.application.routes.url_helpers.person_url(self)})"
+ # end
+ # end
+ #
+ # attachable = Person.create! name: "Javan"
+ # attachment = ActionText::Attachment.from_attachable(attachable)
+ # attachment.to_markdown # => "[@Javan](http://example.com/people/1)"
+ def to_markdown
+ if respond_to?(:attachable_markdown_representation)
+ attachable_markdown_representation(caption)
+ else
+ caption.to_s
+ end
+ end
+
# Converts the attachment to HTML.
#
# attachable = Person.create! name: "Javan"
diff --git a/actiontext/lib/action_text/bottom_up_reducer.rb b/actiontext/lib/action_text/bottom_up_reducer.rb
new file mode 100644
index 0000000000000..aef622f9598f6
--- /dev/null
+++ b/actiontext/lib/action_text/bottom_up_reducer.rb
@@ -0,0 +1,31 @@
+# frozen_string_literal: true
+
+module ActionText
+ class BottomUpReducer # :nodoc:
+ def initialize(node)
+ @node = node
+ @values = {}
+ end
+
+ def reduce(&block)
+ traverse_bottom_up(@node) do |n|
+ child_values = @values.values_at(*n.children)
+ @values[n] = block.call(n, child_values)
+ end
+ @values[@node]
+ end
+
+ private
+ def traverse_bottom_up(node, &block)
+ call_stack, processing_stack = [ node ], []
+
+ until call_stack.empty?
+ node = call_stack.pop
+ processing_stack.push(node)
+ call_stack.concat node.children
+ end
+
+ processing_stack.reverse_each(&block)
+ end
+ end
+end
diff --git a/actiontext/lib/action_text/content.rb b/actiontext/lib/action_text/content.rb
index a85a9899d1c46..5490d0948069b 100644
--- a/actiontext/lib/action_text/content.rb
+++ b/actiontext/lib/action_text/content.rb
@@ -132,6 +132,20 @@ def to_plain_text
render_attachments(with_full_attributes: false, &:to_plain_text).fragment.to_plain_text
end
+ # Returns a Markdown version of the markup contained by the content.
+ #
+ # content = ActionText::Content.new("<h1>Funny times!</h1>")
+ # content.to_markdown # => "# Funny times!"
+ #
+ # content = ActionText::Content.new("<p>Hello <strong>world</strong></p>")
+ # content.to_markdown # => "Hello **world**"
+ #
+ # NOTE: that the returned string is not HTML safe and should not be rendered in
+ # browsers without additional sanitization.
+ def to_markdown
+ render_attachments(with_full_attributes: false, &:to_markdown).fragment.to_markdown
+ end
+
def to_trix_html
to_editor_html
end
diff --git a/actiontext/lib/action_text/engine.rb b/actiontext/lib/action_text/engine.rb
index 93bd659611942..86b38589efdd9 100644
--- a/actiontext/lib/action_text/engine.rb
+++ b/actiontext/lib/action_text/engine.rb
@@ -60,6 +60,10 @@ def attachable_plain_text_representation(caption = nil)
"[#{caption || filename}]"
end
+ def attachable_markdown_representation(caption = nil)
+ "[#{caption || filename}]"
+ end
+
def to_trix_content_attachment_partial_path
to_editor_content_attachment_partial_path
end
diff --git a/actiontext/lib/action_text/fragment.rb b/actiontext/lib/action_text/fragment.rb
index 34edde6013fbf..2913268d0c0f1 100644
--- a/actiontext/lib/action_text/fragment.rb
+++ b/actiontext/lib/action_text/fragment.rb
@@ -51,6 +51,10 @@ def to_plain_text
@plain_text ||= PlainTextConversion.node_to_plain_text(source)
end
+ def to_markdown
+ @markdown ||= MarkdownConversion.node_to_markdown(source)
+ end
+
def to_html
@html ||= HtmlConversion.node_to_html(source)
end
diff --git a/actiontext/lib/action_text/markdown_conversion.rb b/actiontext/lib/action_text/markdown_conversion.rb
new file mode 100644
index 0000000000000..7c0d09330506a
--- /dev/null
+++ b/actiontext/lib/action_text/markdown_conversion.rb
@@ -0,0 +1,263 @@
+# frozen_string_literal: true
+
+# :markup: markdown
+
+module ActionText
+ module MarkdownConversion
+ extend self
+
+ def node_to_markdown(node)
+ BottomUpReducer.new(node).reduce do |n, child_values|
+ markdown_for_node(n, child_values)
+ end.strip
+ end
+
+ private
+ BOLD_TAGS = %w[b strong].freeze
+ ITALIC_TAGS = %w[i em].freeze
+ LIST_BULLET = /\A(-|\d+\.) /
+ LIST_INDENT = " "
+ PROTOCOL_REGEXP = /\A([a-zA-Z][a-zA-Z\d+\-.]*):/ # RFC 3986 scheme syntax
+ private_constant :BOLD_TAGS, :ITALIC_TAGS, :LIST_BULLET, :LIST_INDENT, :PROTOCOL_REGEXP
+
+ def markdown_for_node(node, child_values)
+ if node.text?
+ if node.content.blank? && !significant_whitespace?(node)
+ ""
+ else
+ node.content
+ end
+ elsif node.element?
+ method_name = :"visit_#{node.name.tr("-", "_")}"
+ if respond_to?(method_name, true)
+ send(method_name, node, child_values)
+ else
+ join_children(child_values).strip
+ end
+ else
+ join_children(child_values)
+ end
+ end
+
+ def visit_strong(node, child_values)
+ inner = join_children(child_values)
+
+ # lexxy redundantly wraps bold subtrees in `<b>`
+ if ancestor_named?(node, BOLD_TAGS, max_depth: 4)
+ inner
+ else
+ [ :bold, inner ]
+ end
+ end
+ alias_method :visit_b, :visit_strong
+
+ def visit_em(node, child_values)
+ inner = join_children(child_values)
+
+ # lexxy redundantly wraps emphasized subtrees in `<i>`
+ if ancestor_named?(node, ITALIC_TAGS, max_depth: 4)
+ inner
+ else
+ [ :italic, inner ]
+ end
+ end
+ alias_method :visit_i, :visit_em
+
+ def visit_s(_node, child_values)
+ "~~#{join_children(child_values)}~~"
+ end
+
+ def visit_code(node, child_values)
+ inner = join_children(child_values)
+ if node.parent&.name == "pre"
+ inner
+ else
+ "`#{inner}`"
+ end
+ end
+
+ def visit_pre(_node, child_values)
+ "```\n#{join_children(child_values).strip}\n```\n\n"
+ end
+
+ def visit_p(_node, child_values)
+ "#{join_children(child_values)}\n\n"
+ end
+
+ def visit__heading(_node, child_values, level)
+ "#{"#" * level} #{join_children(child_values)}\n\n"
+ end
+ def visit_h1(node, child_values) = visit__heading(node, child_values, 1)
+ def visit_h2(node, child_values) = visit__heading(node, child_values, 2)
+ def visit_h3(node, child_values) = visit__heading(node, child_values, 3)
+ def visit_h4(node, child_values) = visit__heading(node, child_values, 4)
+ def visit_h5(node, child_values) = visit__heading(node, child_values, 5)
+ def visit_h6(node, child_values) = visit__heading(node, child_values, 6)
+
+ def visit_blockquote(_node, child_values)
+ quoted = join_children(child_values).strip.lines.map { |line| "> #{line}" }.join
+ "#{quoted}\n\n"
+ end
+
+ def visit_ul(node, child_values)
+ items = list_item_lines(node, child_values, prefix: "- ")
+ "#{items}\n\n"
+ end
+
+ def visit_ol(node, child_values)
+ items = list_item_lines(node, child_values, prefix: ->(i) { "#{i + 1}. " })
+ "#{items}\n\n"
+ end
+
+ def visit_a(node, child_values)
+ inner = join_children(child_values)
+ if (href = node["href"]) && allowed_href_protocol?(href)
+ "[#{inner}](#{href})"
+ else
+ inner
+ end
+ end
+
+ def visit_tr(node, child_values)
+ # lexxy does not emit `thead`, so we need to infer header rows from `tr` contents
+ if node.element_children.all? { |cell| cell.name == "th" }
+ visit__table_header_row(node, child_values)
+ else
+ cells = child_values_for_elements(node, child_values).map { |v| stringify(v).strip }
+ "| #{cells.join(" | ")} |\n"
+ end
+ end
+
+ def visit_summary(_node, child_values)
+ "**#{join_children(child_values)}**\n\n"
+ end
+
+ def visit_br(_node, _child_values)
+ "\n"
+ end
+
+ def visit_hr(_node, _child_values)
+ "---\n\n"
+ end
+
+ # Avoid including content from elements that aren't meaningful for markdown output
+ def visit__unsupported(_node, _child_values)
+ ""
+ end
+ alias_method :visit_script, :visit__unsupported
+ alias_method :visit_style, :visit__unsupported
+
+ # These elements pass through their content (parent handlers use child_values directly)
+ def visit__passthrough(_node, child_values)
+ join_children(child_values)
+ end
+ alias_method :visit_li, :visit__passthrough
+ alias_method :visit_td, :visit__passthrough
+ alias_method :visit_th, :visit__passthrough
+ alias_method :visit_thead, :visit__passthrough
+ alias_method :visit_tbody, :visit__passthrough
+
+ def visit__table_header_row(node, child_values)
+ cells = child_values_for_elements(node, child_values).map { |v| stringify(v).strip }
+ row = "| #{cells.join(" | ")} |\n"
+ separator = "| #{Array.new(cells.size, "---").join(" | ")} |\n"
+ "#{row}#{separator}"
+ end
+
+ def allowed_href_protocol?(href)
+ if (match = href.match(PROTOCOL_REGEXP))
+ match[1].downcase.in?(Loofah::HTML5::SafeList::ALLOWED_PROTOCOLS)
+ else
+ true # relative URL, no protocol
+ end
+ end
+
+ def list_item_lines(list_node, child_values, prefix:)
... [truncated]