Denial of Service (DoS) / performance issue in XML deserialization
Description
The commit fixes CVE-2025-64460: a Denial of Service (DoS) / performance vulnerability in Django's XML deserializer. Previously, the inner text collection in the XML deserializer used getInnerText() which recursively extended a list with the results from child nodes. This approach caused quadratic time complexity on crafted XML with deep nesting, enabling CPU consumption attacks. The patch splits collection of inner texts from the join operation by introducing a separate getInnerTextList() that collects per-subtree text, and only joins once at the end, reducing the overall complexity to linear in input size. It also includes a minor performance mitigation for minidom checks and wraps expandNode calls with a fast-path cache clearing. The change is targeted at the XML deserialization path used by fixtures/XML serialization.
Proof of Concept
# PoC: Demonstrate potential DoS in pre-patch XML deserializer (Python)
import time
from django.core.serializers.xml_serializer import Deserializer as XMLDeserializer
def build_crafted_xml(depth, leaf_text_len):
nested_open = "<nested>" * depth
nested_close = "</nested>" * depth
leaf = "x" * leaf_text_len
field_content = f"{nested_open}{leaf}{nested_close}"
return f"""
<django-objects version="1.0">
<object model="contenttypes.contenttype" pk="1">
<field name="app_label">{field_content}</field>
<field name="model">m</field>
</object>
</django-objects>
"""
def time_to_deserialize(depth, leaf_len):
crafted_xml = build_crafted_xml(depth, leaf_len)
start = time.perf_counter()
list(XMLDeserializer(crafted_xml))
end = time.perf_counter()
return end - start
if __name__ == "__main__":
for d in [50, 100, 200, 400]:
t = time_to_deserialize(d, 2000)
print(f"depth={d}, time={t:.6f}s")
Commit Details
Author: Shai Berger
Date: 2025-10-11 18:42 UTC
Message:
Fixed CVE-2025-64460 -- Corrected quadratic inner text accumulation in XML serializer.
Previously, `getInnerText()` recursively used `list.extend()` on strings,
which added each character from child nodes as a separate list element.
On deeply nested XML content, this caused the overall deserialization
work to grow quadratically with input size, potentially allowing
disproportionate CPU consumption for crafted XML.
The fix separates collection of inner texts from joining them, so that
each subtree is joined only once, reducing the complexity to linear in
the size of the input. These changes also include a mitigation for a
xml.dom.minidom performance issue.
Thanks Seokchan Yoon (https://ch4n3.kr/) for report.
Co-authored-by: Jacob Walls <jacobtylerwalls@gmail.com>
Co-authored-by: Natalia <124304+nessita@users.noreply.github.com>
Triage Assessment
Vulnerability Type: Denial of Service (DoS) / performance issue
Confidence: HIGH
Reasoning:
The commit addresses CVE-2025-64460 by correcting quadratic time complexity in the XML deserializer caused by inefficient inner text accumulation. By restructuring getInnerText to accumulate text per element and reducing pathological input handling, it mitigates potential denial-of-service via crafted XML inputs.
Verification Assessment
Vulnerability Type: Denial of Service (DoS) / performance issue in XML deserialization
Confidence: HIGH
Affected Versions: Django 5.1.x before 5.1.15 (i.e., 5.1.0 through 5.1.14)
Code Diff
diff --git a/django/core/serializers/xml_serializer.py b/django/core/serializers/xml_serializer.py
index 0557af395498..f8ec0865a76b 100644
--- a/django/core/serializers/xml_serializer.py
+++ b/django/core/serializers/xml_serializer.py
@@ -3,7 +3,8 @@
"""
import json
-from xml.dom import pulldom
+from contextlib import contextmanager
+from xml.dom import minidom, pulldom
from xml.sax import handler
from xml.sax.expatreader import ExpatParser as _ExpatParser
@@ -15,6 +16,25 @@
from django.utils.xmlutils import SimplerXMLGenerator, UnserializableContentError
+@contextmanager
+def fast_cache_clearing():
+ """Workaround for performance issues in minidom document checks.
+
+ Speeds up repeated DOM operations by skipping unnecessary full traversal
+ of the DOM tree.
+ """
+ module_helper_was_lambda = False
+ if original_fn := getattr(minidom, "_in_document", None):
+ module_helper_was_lambda = original_fn.__name__ == "<lambda>"
+ if not module_helper_was_lambda:
+ minidom._in_document = lambda node: bool(node.ownerDocument)
+ try:
+ yield
+ finally:
+ if original_fn and not module_helper_was_lambda:
+ minidom._in_document = original_fn
+
+
class Serializer(base.Serializer):
"""Serialize a QuerySet to XML."""
@@ -210,7 +230,8 @@ def _make_parser(self):
def __next__(self):
for event, node in self.event_stream:
if event == "START_ELEMENT" and node.nodeName == "object":
- self.event_stream.expandNode(node)
+ with fast_cache_clearing():
+ self.event_stream.expandNode(node)
return self._handle_object(node)
raise StopIteration
@@ -397,20 +418,26 @@ def _get_model_from_node(self, node, attr):
def getInnerText(node):
"""Get all the inner text of a DOM node (recursively)."""
+ inner_text_list = getInnerTextList(node)
+ return "".join(inner_text_list)
+
+
+def getInnerTextList(node):
+ """Return a list of the inner texts of a DOM node (recursively)."""
# inspired by
# https://mail.python.org/pipermail/xml-sig/2005-March/011022.html
- inner_text = []
+ result = []
for child in node.childNodes:
if (
child.nodeType == child.TEXT_NODE
or child.nodeType == child.CDATA_SECTION_NODE
):
- inner_text.append(child.data)
+ result.append(child.data)
elif child.nodeType == child.ELEMENT_NODE:
- inner_text.extend(getInnerText(child))
+ result.extend(getInnerTextList(child))
else:
pass
- return "".join(inner_text)
+ return result
# Below code based on Christian Heimes' defusedxml
diff --git a/docs/releases/4.2.27.txt b/docs/releases/4.2.27.txt
index e95dc63f74ef..b843f6a4436e 100644
--- a/docs/releases/4.2.27.txt
+++ b/docs/releases/4.2.27.txt
@@ -15,6 +15,16 @@ using a suitably crafted dictionary, with dictionary expansion, as the
``**kwargs`` passed to :meth:`.QuerySet.annotate` or :meth:`.QuerySet.alias` on
PostgreSQL.
+CVE-2025-64460: Potential denial-of-service vulnerability in XML ``Deserializer``
+=================================================================================
+
+:ref:`XML Serialization <serialization-formats-xml>` was subject to a potential
+denial-of-service attack due to quadratic time complexity when deserializing
+crafted documents containing many nested invalid elements. The internal helper
+``django.core.serializers.xml_serializer.getInnerText()`` previously
+accumulated inner text inefficiently during recursion. It now collects text per
+element, avoiding excessive resource usage.
+
Bugfixes
========
diff --git a/docs/releases/5.1.15.txt b/docs/releases/5.1.15.txt
index f55623ea9684..63ff22732eeb 100644
--- a/docs/releases/5.1.15.txt
+++ b/docs/releases/5.1.15.txt
@@ -15,6 +15,16 @@ using a suitably crafted dictionary, with dictionary expansion, as the
``**kwargs`` passed to :meth:`.QuerySet.annotate` or :meth:`.QuerySet.alias` on
PostgreSQL.
+CVE-2025-64460: Potential denial-of-service vulnerability in XML ``Deserializer``
+=================================================================================
+
+:ref:`XML Serialization <serialization-formats-xml>` was subject to a potential
+denial-of-service attack due to quadratic time complexity when deserializing
+crafted documents containing many nested invalid elements. The internal helper
+``django.core.serializers.xml_serializer.getInnerText()`` previously
+accumulated inner text inefficiently during recursion. It now collects text per
+element, avoiding excessive resource usage.
+
Bugfixes
========
diff --git a/docs/releases/5.2.9.txt b/docs/releases/5.2.9.txt
index 08c298999a56..ba235d05c69b 100644
--- a/docs/releases/5.2.9.txt
+++ b/docs/releases/5.2.9.txt
@@ -15,6 +15,16 @@ using a suitably crafted dictionary, with dictionary expansion, as the
``**kwargs`` passed to :meth:`.QuerySet.annotate` or :meth:`.QuerySet.alias` on
PostgreSQL.
+CVE-2025-64460: Potential denial-of-service vulnerability in XML ``Deserializer``
+=================================================================================
+
+:ref:`XML Serialization <serialization-formats-xml>` was subject to a potential
+denial-of-service attack due to quadratic time complexity when deserializing
+crafted documents containing many nested invalid elements. The internal helper
+``django.core.serializers.xml_serializer.getInnerText()`` previously
+accumulated inner text inefficiently during recursion. It now collects text per
+element, avoiding excessive resource usage.
+
Bugfixes
========
diff --git a/docs/topics/serialization.txt b/docs/topics/serialization.txt
index f0ac0811be98..2b28f5e15a84 100644
--- a/docs/topics/serialization.txt
+++ b/docs/topics/serialization.txt
@@ -173,6 +173,8 @@ Identifier Information
.. _jsonl: https://jsonlines.org/
.. _PyYAML: https://pyyaml.org/
+.. _serialization-formats-xml:
+
XML
---
diff --git a/tests/serializers/test_deserialization.py b/tests/serializers/test_deserialization.py
index 0bbb46b7ce1c..a718a990385a 100644
--- a/tests/serializers/test_deserialization.py
+++ b/tests/serializers/test_deserialization.py
@@ -1,11 +1,15 @@
import json
+import time
import unittest
from django.core.serializers.base import DeserializationError, DeserializedObject
from django.core.serializers.json import Deserializer as JsonDeserializer
from django.core.serializers.jsonl import Deserializer as JsonlDeserializer
from django.core.serializers.python import Deserializer
+from django.core.serializers.xml_serializer import Deserializer as XMLDeserializer
+from django.db import models
from django.test import SimpleTestCase
+from django.test.utils import garbage_collect
from .models import Author
@@ -133,3 +137,53 @@ def test_yaml_bytes_input(self):
self.assertEqual(first_item.object, self.jane)
self.assertEqual(second_item.object, self.joe)
+
+ def test_crafted_xml_performance(self):
+ """The time to process invalid inputs is not quadratic."""
+
+ def build_crafted_xml(depth, leaf_text_len):
+ nested_open = "<nested>" * depth
+ nested_close = "</nested>" * depth
+ leaf = "x" * leaf_text_len
+ field_content = f"{nested_open}{leaf}{nested_close}"
+ return f"""
+ <django-objects version="1.0">
+ <object model="contenttypes.contenttype" pk="1">
+ <field name="app_label">{field_content}</field>
+ <field name="model">m</field>
+ </object>
+ </django-objects>
+ """
+
+ def deserialize(crafted_xml):
+ iterator = XMLDeserializer(crafted_xml)
+ garbage_collect()
+
+ start_time = time.perf_counter()
+ result = list(iterator)
+ end_time = time.perf_counter()
+
+ self.assertEqual(len(result), 1)
+ self.assertIsInstance(result[0].object, models.Model)
+ return end_time - start_time
+
+ def assertFactor(label, params, factor=2):
+ factors = []
+ prev_time = None
+ for depth, length in params:
+ crafted_xml = build_crafted_xml(depth, length)
+ elapsed = deserialize(crafted_xml)
+ if prev_time is not None:
+ factors.append(elapsed / prev_time)
+ prev_time = elapsed
+
+ with self.subTest(label):
+ # Assert based on the average factor to reduce test flakiness.
+ self.assertLessEqual(sum(factors) / len(factors), factor)
+
+ assertFactor(
+ "varying depth, varying length",
+ [(50, 2000), (100, 4000), (200, 8000), (400, 16000), (800, 32000)],
+ 2,
+ )
+ assertFactor("constant depth, varying length", [(100, 1), (100, 1000)], 2)