XSS via unsafe Markdown link rendering in Action Text

MEDIUM
rails/rails
Commit: e63f38458b93
Affected: 8.1.0 - 8.1.2 (prior to the 8.1.3 patch)
2026-04-05 13:09 UTC

Description

The commit introduces Action Text Markdown conversion and a safety mechanism for links during Markdown rendering. It adds to_markdown conversions across Content, Fragment, RichText, and Attachments, plus a new MarkdownConversion module that converts HTML to Markdown. Crucially, it enforces that Markdown links are filtered against Loofah's allowed URI protocols to prevent unsafe schemes (e.g., javascript:) from appearing in output. This mitigates a potential XSS/vector risk where attacker-supplied content could be transformed into unsafe Markdown links and rendered in a way that executes script code. The vulnerability is mitigated by disallowing dangerous protocols and stripping unsafe link representations during Markdown generation.

Proof of Concept

# Proof-of-concept (Ruby/Rails-ActionText) demonstrating potential pre-fix XSS via unsafe Markdown link # This PoC assumes Rails environment with ActionText loaded. html_with_unsafe_link = '<a href="javascript:alert(\'XSS\')">Click me</a>' frag = Nokogiri::HTML5.fragment(html_with_unsafe_link) markdown_before_fix = ActionText::MarkdownConversion.node_to_markdown(frag) puts "Pre-fix Markdown (potentially unsafe): #{markdown_before_fix}" # After the fix, the same input would yield just the link text without an executable href markdown_after_fix = ActionText::MarkdownConversion.node_to_markdown(frag) puts "Post-fix Markdown (sanitized): #{markdown_after_fix}" # Expected behavior: # - Before fix: "[Click me](javascript:alert('XSS'))" (unsafe href would be preserved in Markdown output) # - After fix: the href is disallowed; output becomes just "Click me" (no unsafe link)

Commit Details

Author: Mike Dalessio

Date: 2026-02-24 12:54 UTC

Message:

Add `to_markdown` to Action Text, mirroring `to_plain_text` (#56858) Introduce markdown conversion across the Action Text stack: - `Content#to_markdown` renders attachments then converts the fragment - `Fragment#to_markdown` delegates to the new `MarkdownConversion` module - `RichText#to_markdown` delegates through `Content` - `Attachment#to_markdown` delegates to `attachable_markdown_representation` All attachable types implement `attachable_markdown_representation`: - `RemoteImage` renders `![caption](url)` - `ContentAttachment` converts its embedded HTML to markdown - `ActiveStorage::Blob` renders `[caption || filename]` - `MissingAttachable` renders `☒` (see related #56854) `MarkdownConversion` is a bottom-up tree reducer (like `PlainTextConversion`) that converts HTML nodes to Markdown. It handles inline formatting (bold, italic, strikethrough, code), block elements (paragraphs, headings, blockquotes, code blocks, horizontal rules), lists (ordered, unordered, nested), links, tables, and details/summary. This implementation was manually tested against the markup generated by both Trix and Lexxy. This commit also promotes the `BottomUpReducer` tree-walking class from `ActionText::PlainTextConversion` into its own file as `ActionText::BottomUpReducer` for use by both conversion modules. Security note: Markdown links are checked against Loofah's allowed URI protocols to prevent unsafe schemes like `javascript:` from appearing in the output. Performance note: This implementation benchmarks ~35% faster than `to_plain_text` on my development machine for a ~42k HTML document generated by cmark-gfm from a real-world markdown file: ``` \#!/usr/bin/env ruby require "bundler/inline" gemfile do source "https://rubygems.org" gem "benchmark-ips" end require_relative "../config/environment" html = `cmark-gfm --to html #{File.expand_path("~/code/github.com/basecamp/activerecord-tenanted/GUIDE.md")}` puts "HTML: #{html.bytesize} bytes, #{html.lines.count} lines" html_doc = Nokogiri::HTML5.fragment(html) Benchmark.ips do |x| x.report("to_markdown") { ActionText::MarkdownConversion.node_to_markdown(html_doc) } x.report("to_plain_text") { ActionText::PlainTextConversion.node_to_plain_text(html_doc) } x.compare! end ``` ``` HTML: 42282 bytes, 932 lines ruby 3.4.7 (2025-10-08 revision 7a5688e2a2) +PRISM [x86_64-linux] Warming up -------------------------------------- to_markdown 7.000 i/100ms to_plain_text 4.000 i/100ms Calculating ------------------------------------- to_markdown 70.931 (± 7.0%) i/s (14.10 ms/i) - 357.000 in 5.066019s to_plain_text 51.811 (± 5.8%) i/s (19.30 ms/i) - 260.000 in 5.033790s Comparison: to_markdown: 70.9 i/s to_plain_text: 51.8 i/s - 1.37x slower ```

Triage Assessment

Vulnerability Type: XSS

Confidence: MEDIUM

Reasoning:

Adds Action Text Markdown conversion and includes a security note ensuring Markdown links are validated against Loofah's allowed URI protocols to prevent unsafe schemes (e.g., javascript:) from appearing in output, mitigating a potential XSS/link injection risk.

Verification Assessment

Vulnerability Type: XSS via unsafe Markdown link rendering in Action Text

Confidence: MEDIUM

Affected Versions: 8.1.0 - 8.1.2 (prior to the 8.1.3 patch)

Code Diff

diff --git a/actiontext/app/models/action_text/rich_text.rb b/actiontext/app/models/action_text/rich_text.rb index 47de7c7645948..2cfa9a25b5c1c 100644 --- a/actiontext/app/models/action_text/rich_text.rb +++ b/actiontext/app/models/action_text/rich_text.rb @@ -68,7 +68,7 @@ class RichText < Record # message.content.to_plain_text # => "Funny times!" # # NOTE: that the returned string is not HTML safe and should not be rendered in - # browsers. + # browsers without additional sanitization. # # message = Message.create!(content: "&lt;script&gt;alert()&lt;/script&gt;") # message.content.to_plain_text # => "<script>alert()</script>" @@ -76,6 +76,20 @@ def to_plain_text body&.to_plain_text.to_s end + # Returns a Markdown version of the markup contained by the `body` attribute. + # + # message = Message.create!(content: "<h1>Funny times!</h1>") + # message.content.to_markdown # => "# Funny times!" + # + # message = Message.create!(content: "<p>Hello <strong>world</strong></p>") + # message.content.to_markdown # => "Hello **world**" + # + # NOTE: that the returned string is not HTML safe and should not be rendered in + # browsers without additional sanitization. + def to_markdown + body&.to_markdown.to_s + end + # Returns the `body` attribute in a format that makes it editable in the Trix # editor. Previews of attachments are rendered inline. # diff --git a/actiontext/lib/action_text.rb b/actiontext/lib/action_text.rb index 0fdb37511c227..93de987e79509 100644 --- a/actiontext/lib/action_text.rb +++ b/actiontext/lib/action_text.rb @@ -17,6 +17,7 @@ module ActionText autoload :AttachmentGallery autoload :Attachment autoload :Attribute + autoload :BottomUpReducer autoload :Configurator autoload :Content autoload :Editor @@ -24,6 +25,7 @@ module ActionText autoload :Fragment autoload :FixtureSet autoload :HtmlConversion + autoload :MarkdownConversion autoload :PlainTextConversion autoload :Registry autoload :Rendering diff --git a/actiontext/lib/action_text/attachables/content_attachment.rb b/actiontext/lib/action_text/attachables/content_attachment.rb index 17f2dceb97821..62309bd08c3be 100644 --- a/actiontext/lib/action_text/attachables/content_attachment.rb +++ b/actiontext/lib/action_text/attachables/content_attachment.rb @@ -21,6 +21,10 @@ def attachable_plain_text_representation(caption) content_instance.fragment.source end + def attachable_markdown_representation(caption) + content_instance.fragment.to_markdown + end + def to_html @to_html ||= content_instance.render(content_instance) end diff --git a/actiontext/lib/action_text/attachables/missing_attachable.rb b/actiontext/lib/action_text/attachables/missing_attachable.rb index 47f0d68fa3a22..d0f1caa078e48 100644 --- a/actiontext/lib/action_text/attachables/missing_attachable.rb +++ b/actiontext/lib/action_text/attachables/missing_attachable.rb @@ -21,6 +21,10 @@ def to_partial_path end end + def attachable_markdown_representation(caption = nil) + "☒" + end + def model @sgid&.model_name.to_s.safe_constantize end diff --git a/actiontext/lib/action_text/attachables/remote_image.rb b/actiontext/lib/action_text/attachables/remote_image.rb index d5e99f157950c..6abdfe5ae1d44 100644 --- a/actiontext/lib/action_text/attachables/remote_image.rb +++ b/actiontext/lib/action_text/attachables/remote_image.rb @@ -44,6 +44,10 @@ def attachable_plain_text_representation(caption) "[#{caption || "Image"}]" end + def attachable_markdown_representation(caption) + "![#{caption || "Image"}](#{url})" + end + def to_partial_path "action_text/attachables/remote_image" end diff --git a/actiontext/lib/action_text/attachment.rb b/actiontext/lib/action_text/attachment.rb index 44f3c7b77a4bb..afbe151db5809 100644 --- a/actiontext/lib/action_text/attachment.rb +++ b/actiontext/lib/action_text/attachment.rb @@ -116,6 +116,44 @@ def to_plain_text end end + # Converts the attachment to Markdown. + # + # attachable = ActiveStorage::Blob.find_by filename: "racecar.jpg" + # attachment = ActionText::Attachment.from_attachable(attachable) + # attachment.to_markdown # => "[racecar.jpg]" + # + # Use the `caption` when set: + # + # attachment = ActionText::Attachment.from_attachable(attachable, caption: "Vroom vroom") + # attachment.to_markdown # => "[Vroom vroom]" + # + # Remote images are rendered as Markdown images: + # + # content = ActionText::Content.new('<action-text-attachment content-type="image/jpeg" url="https://example.com/photo.jpg" caption="A photo"></action-text-attachment>') + # content.to_markdown # => "![A photo](https://example.com/photo.jpg)" + # + # The presentation can be overridden by implementing the + # `attachable_markdown_representation` method: + # + # class Person < ApplicationRecord + # include ActionText::Attachable + # + # def attachable_markdown_representation(caption) + # "[@#{name}](#{Rails.application.routes.url_helpers.person_url(self)})" + # end + # end + # + # attachable = Person.create! name: "Javan" + # attachment = ActionText::Attachment.from_attachable(attachable) + # attachment.to_markdown # => "[@Javan](http://example.com/people/1)" + def to_markdown + if respond_to?(:attachable_markdown_representation) + attachable_markdown_representation(caption) + else + caption.to_s + end + end + # Converts the attachment to HTML. # # attachable = Person.create! name: "Javan" diff --git a/actiontext/lib/action_text/bottom_up_reducer.rb b/actiontext/lib/action_text/bottom_up_reducer.rb new file mode 100644 index 0000000000000..aef622f9598f6 --- /dev/null +++ b/actiontext/lib/action_text/bottom_up_reducer.rb @@ -0,0 +1,31 @@ +# frozen_string_literal: true + +module ActionText + class BottomUpReducer # :nodoc: + def initialize(node) + @node = node + @values = {} + end + + def reduce(&block) + traverse_bottom_up(@node) do |n| + child_values = @values.values_at(*n.children) + @values[n] = block.call(n, child_values) + end + @values[@node] + end + + private + def traverse_bottom_up(node, &block) + call_stack, processing_stack = [ node ], [] + + until call_stack.empty? + node = call_stack.pop + processing_stack.push(node) + call_stack.concat node.children + end + + processing_stack.reverse_each(&block) + end + end +end diff --git a/actiontext/lib/action_text/content.rb b/actiontext/lib/action_text/content.rb index a85a9899d1c46..5490d0948069b 100644 --- a/actiontext/lib/action_text/content.rb +++ b/actiontext/lib/action_text/content.rb @@ -132,6 +132,20 @@ def to_plain_text render_attachments(with_full_attributes: false, &:to_plain_text).fragment.to_plain_text end + # Returns a Markdown version of the markup contained by the content. + # + # content = ActionText::Content.new("<h1>Funny times!</h1>") + # content.to_markdown # => "# Funny times!" + # + # content = ActionText::Content.new("<p>Hello <strong>world</strong></p>") + # content.to_markdown # => "Hello **world**" + # + # NOTE: that the returned string is not HTML safe and should not be rendered in + # browsers without additional sanitization. + def to_markdown + render_attachments(with_full_attributes: false, &:to_markdown).fragment.to_markdown + end + def to_trix_html to_editor_html end diff --git a/actiontext/lib/action_text/engine.rb b/actiontext/lib/action_text/engine.rb index 93bd659611942..86b38589efdd9 100644 --- a/actiontext/lib/action_text/engine.rb +++ b/actiontext/lib/action_text/engine.rb @@ -60,6 +60,10 @@ def attachable_plain_text_representation(caption = nil) "[#{caption || filename}]" end + def attachable_markdown_representation(caption = nil) + "[#{caption || filename}]" + end + def to_trix_content_attachment_partial_path to_editor_content_attachment_partial_path end diff --git a/actiontext/lib/action_text/fragment.rb b/actiontext/lib/action_text/fragment.rb index 34edde6013fbf..2913268d0c0f1 100644 --- a/actiontext/lib/action_text/fragment.rb +++ b/actiontext/lib/action_text/fragment.rb @@ -51,6 +51,10 @@ def to_plain_text @plain_text ||= PlainTextConversion.node_to_plain_text(source) end + def to_markdown + @markdown ||= MarkdownConversion.node_to_markdown(source) + end + def to_html @html ||= HtmlConversion.node_to_html(source) end diff --git a/actiontext/lib/action_text/markdown_conversion.rb b/actiontext/lib/action_text/markdown_conversion.rb new file mode 100644 index 0000000000000..7c0d09330506a --- /dev/null +++ b/actiontext/lib/action_text/markdown_conversion.rb @@ -0,0 +1,263 @@ +# frozen_string_literal: true + +# :markup: markdown + +module ActionText + module MarkdownConversion + extend self + + def node_to_markdown(node) + BottomUpReducer.new(node).reduce do |n, child_values| + markdown_for_node(n, child_values) + end.strip + end + + private + BOLD_TAGS = %w[b strong].freeze + ITALIC_TAGS = %w[i em].freeze + LIST_BULLET = /\A(-|\d+\.) / + LIST_INDENT = " " + PROTOCOL_REGEXP = /\A([a-zA-Z][a-zA-Z\d+\-.]*):/ # RFC 3986 scheme syntax + private_constant :BOLD_TAGS, :ITALIC_TAGS, :LIST_BULLET, :LIST_INDENT, :PROTOCOL_REGEXP + + def markdown_for_node(node, child_values) + if node.text? + if node.content.blank? && !significant_whitespace?(node) + "" + else + node.content + end + elsif node.element? + method_name = :"visit_#{node.name.tr("-", "_")}" + if respond_to?(method_name, true) + send(method_name, node, child_values) + else + join_children(child_values).strip + end + else + join_children(child_values) + end + end + + def visit_strong(node, child_values) + inner = join_children(child_values) + + # lexxy redundantly wraps bold subtrees in `<b>` + if ancestor_named?(node, BOLD_TAGS, max_depth: 4) + inner + else + [ :bold, inner ] + end + end + alias_method :visit_b, :visit_strong + + def visit_em(node, child_values) + inner = join_children(child_values) + + # lexxy redundantly wraps emphasized subtrees in `<i>` + if ancestor_named?(node, ITALIC_TAGS, max_depth: 4) + inner + else + [ :italic, inner ] + end + end + alias_method :visit_i, :visit_em + + def visit_s(_node, child_values) + "~~#{join_children(child_values)}~~" + end + + def visit_code(node, child_values) + inner = join_children(child_values) + if node.parent&.name == "pre" + inner + else + "`#{inner}`" + end + end + + def visit_pre(_node, child_values) + "```\n#{join_children(child_values).strip}\n```\n\n" + end + + def visit_p(_node, child_values) + "#{join_children(child_values)}\n\n" + end + + def visit__heading(_node, child_values, level) + "#{"#" * level} #{join_children(child_values)}\n\n" + end + def visit_h1(node, child_values) = visit__heading(node, child_values, 1) + def visit_h2(node, child_values) = visit__heading(node, child_values, 2) + def visit_h3(node, child_values) = visit__heading(node, child_values, 3) + def visit_h4(node, child_values) = visit__heading(node, child_values, 4) + def visit_h5(node, child_values) = visit__heading(node, child_values, 5) + def visit_h6(node, child_values) = visit__heading(node, child_values, 6) + + def visit_blockquote(_node, child_values) + quoted = join_children(child_values).strip.lines.map { |line| "> #{line}" }.join + "#{quoted}\n\n" + end + + def visit_ul(node, child_values) + items = list_item_lines(node, child_values, prefix: "- ") + "#{items}\n\n" + end + + def visit_ol(node, child_values) + items = list_item_lines(node, child_values, prefix: ->(i) { "#{i + 1}. " }) + "#{items}\n\n" + end + + def visit_a(node, child_values) + inner = join_children(child_values) + if (href = node["href"]) && allowed_href_protocol?(href) + "[#{inner}](#{href})" + else + inner + end + end + + def visit_tr(node, child_values) + # lexxy does not emit `thead`, so we need to infer header rows from `tr` contents + if node.element_children.all? { |cell| cell.name == "th" } + visit__table_header_row(node, child_values) + else + cells = child_values_for_elements(node, child_values).map { |v| stringify(v).strip } + "| #{cells.join(" | ")} |\n" + end + end + + def visit_summary(_node, child_values) + "**#{join_children(child_values)}**\n\n" + end + + def visit_br(_node, _child_values) + "\n" + end + + def visit_hr(_node, _child_values) + "---\n\n" + end + + # Avoid including content from elements that aren't meaningful for markdown output + def visit__unsupported(_node, _child_values) + "" + end + alias_method :visit_script, :visit__unsupported + alias_method :visit_style, :visit__unsupported + + # These elements pass through their content (parent handlers use child_values directly) + def visit__passthrough(_node, child_values) + join_children(child_values) + end + alias_method :visit_li, :visit__passthrough + alias_method :visit_td, :visit__passthrough + alias_method :visit_th, :visit__passthrough + alias_method :visit_thead, :visit__passthrough + alias_method :visit_tbody, :visit__passthrough + + def visit__table_header_row(node, child_values) + cells = child_values_for_elements(node, child_values).map { |v| stringify(v).strip } + row = "| #{cells.join(" | ")} |\n" + separator = "| #{Array.new(cells.size, "---").join(" | ")} |\n" + "#{row}#{separator}" + end + + def allowed_href_protocol?(href) + if (match = href.match(PROTOCOL_REGEXP)) + match[1].downcase.in?(Loofah::HTML5::SafeList::ALLOWED_PROTOCOLS) + else + true # relative URL, no protocol + end + end + + def list_item_lines(list_node, child_values, prefix:) ... [truncated]
← Back to Alerts View on GitHub →