Race condition
Description
The commit fixes a race condition in the kernel's HSR (High-availability Seamless Redundancy) networking code. During node merging in net/hsr/hsr_framereg.c, hsr_handle_sup_frame() walked node_curr->seq_blocks to update node_real while holding only node_real->seq_out_lock, not node_curr->seq_out_lock. Duplicate registration paths could therefore mutate node_curr's sequence state while the merge was reading it, risking inconsistent state or corruption of the underlying XArray/bitmap structures. This could cause instability; information leakage through timing or state discrepancies is conceivable but not demonstrated. The fix holds both nodes' seq_out_lock for the duration of the merge and acquires the two locks in a well-defined order (by memory address) to avoid ABBA deadlocks.
In short: before the patch, the merge read node_curr's sequence data without its lock, which risks memory or state corruption; after the patch, the merge runs with both seq_out_locks held, always taken in the same order. The vulnerability is a race condition in the network subsystem that could potentially be exploited to cause memory corruption or DoS through crafted concurrent operations on HSR frames.
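The address-ordered pair locking described above can be illustrated in userspace. The following is a minimal sketch, not the kernel code: struct demo_node, demo_lock_pair() and the other demo_ names are hypothetical, the lock is a C11 atomic_flag spinlock standing in for seq_out_lock, and the lock_order field exists only so the acquisition order can be observed. It does not model bottom-half disabling (spin_lock_bh) or lockdep nesting annotations (spin_lock_nested).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for struct hsr_node; only the lock is modelled. */
struct demo_node {
	atomic_flag lock;	/* plays the role of seq_out_lock */
	int lock_order;		/* instrumentation: global acquisition counter value */
};

/* Instrumentation only: counts lock acquisitions across all nodes. */
static atomic_int demo_order_counter;

static void demo_lock(struct demo_node *n)
{
	/* Spin until the flag was previously clear, i.e. we own the lock. */
	while (atomic_flag_test_and_set_explicit(&n->lock, memory_order_acquire))
		;
	n->lock_order = atomic_fetch_add(&demo_order_counter, 1) + 1;
}

static void demo_unlock(struct demo_node *n)
{
	atomic_flag_clear_explicit(&n->lock, memory_order_release);
}

/* Lock both nodes, always taking the lower address first. */
static void demo_lock_pair(struct demo_node *a, struct demo_node *b)
{
	if (a == b) {
		/* Same node on both sides: take the lock once to avoid self-deadlock. */
		demo_lock(a);
		return;
	}
	if ((uintptr_t)a < (uintptr_t)b) {
		demo_lock(a);
		demo_lock(b);
	} else {
		demo_lock(b);
		demo_lock(a);
	}
}

/* Release in the reverse order of acquisition. */
static void demo_unlock_pair(struct demo_node *a, struct demo_node *b)
{
	if (a == b) {
		demo_unlock(a);
		return;
	}
	if ((uintptr_t)a < (uintptr_t)b) {
		demo_unlock(b);
		demo_unlock(a);
	} else {
		demo_unlock(a);
		demo_unlock(b);
	}
}
```

Because the order depends only on the two addresses, callers that pass the same pair as (A, B) and as (B, A) still acquire the locks in the same sequence, which is what rules out the ABBA pattern.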
Proof of Concept
Note: This PoC is conceptual and non-executable. It is intended to illustrate how a race could be triggered and why the fix is necessary, not to provide an exploit.
Conceptual PoC (high level):
- Environment: Linux kernel with HSR enabled, vulnerable to this race prior to the fix. Two threads race during a merge of HSR nodes via hsr_handle_sup_frame.
- Setup: Create two hsr_node instances A and B with seq_out_lock initialized. Ensure there is an ongoing duplicate-discard/merge path that touches node_real and node_curr concurrently.
- Thread T1: Triggers hsr_handle_sup_frame(frame1), which reaches the merge path. It reads node_curr's state (seq_blocks, time_in) to update node_real while holding only node_real->seq_out_lock, not node_curr->seq_out_lock.
- Thread T2: Simultaneously triggers hsr_handle_sup_frame(frame2) or another duplicate registration path for the same pair of nodes, mutating node_curr's seq_out state (e.g., time_in arrays, seq_nrs, or related frames) under node_curr->seq_out_lock, which T1 does not hold.
- Race window: T1's reads of node_curr are not serialized against T2's writes, because the two threads hold different locks. This is a data race on the shared structures (seq_blocks, time_in, etc.).
- Expected result (vulnerable prior to fix): Inconsistent state between node_real and node_curr, possible corruption of XArray/bitmap data structures, leading to memory corruption, kernel crash, or subtle information leakage via inconsistent state.
- Expected result (post-fix): Both seq_out_locks are held for the whole merge, which closes the race and prevents data structure corruption. Acquiring them in a defined order (by memory address) prevents ABBA deadlock between concurrent merges.
Note: A real exploit would require crafting sustained concurrent frames or timing to hit the race during node merge, which is highly environment- and traffic-dependent. The PoC above focuses on demonstrating the race scenario conceptually and why the double-lock ordering is required.
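To make the post-fix behaviour concrete, the merge step can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel implementation: struct sketch_node, sketch_merge(), and SKETCH_BLOCK_WORDS are hypothetical names, a plain array of 64-bit words stands in for the XArray of sequence blocks, and a C11 atomic_flag spinlock stands in for seq_out_lock. Unlike the kernel helper, which takes the single lock when both nodes are the same, this sketch simply returns early in that case.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_BLOCK_WORDS 2	/* illustrative size, unrelated to HSR_SEQ_BLOCK_SIZE */

/* Hypothetical stand-in for struct hsr_node. */
struct sketch_node {
	atomic_flag lock;			/* plays the role of seq_out_lock */
	uint64_t seen[SKETCH_BLOCK_WORDS];	/* bitmap of sequence numbers seen */
};

static void sketch_lock(struct sketch_node *n)
{
	while (atomic_flag_test_and_set_explicit(&n->lock, memory_order_acquire))
		;	/* spin until the lock is free */
}

static void sketch_unlock(struct sketch_node *n)
{
	atomic_flag_clear_explicit(&n->lock, memory_order_release);
}

/*
 * Merge src's bitmap into dst with BOTH locks held, so that neither the
 * reads of src nor the writes to dst can race with another lock holder.
 */
static void sketch_merge(struct sketch_node *dst, struct sketch_node *src)
{
	struct sketch_node *first = dst, *second = src;
	int i;

	if (dst == src)
		return;	/* nothing to merge (simplification, see lead-in) */

	/* Lower address first, regardless of which node is dst or src. */
	if ((uintptr_t)src < (uintptr_t)dst) {
		first = src;
		second = dst;
	}

	sketch_lock(first);
	sketch_lock(second);
	for (i = 0; i < SKETCH_BLOCK_WORDS; i++)
		dst->seen[i] |= src->seen[i];	/* union of both bitmaps */
	sketch_unlock(second);
	sketch_unlock(first);
}
```

In the pre-fix shape, the equivalent of sketch_merge() would lock only dst, so a thread updating src under src's own lock would not be excluded from the loop body.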
Commit Details
Author: Luka Gejak
Date: 2026-04-01 09:22 UTC
Message:
net: hsr: serialize seq_blocks merge across nodes
During node merging, hsr_handle_sup_frame() walks node_curr->seq_blocks
to update node_real without holding node_curr->seq_out_lock. This
allows concurrent mutations from duplicate registration paths, risking
inconsistent state or XArray/bitmap corruption.
Fix this by locking both nodes' seq_out_lock during the merge.
To prevent ABBA deadlocks, locks are acquired in order of memory
address.
Reviewed-by: Felix Maurer <fmaurer@redhat.com>
Fixes: 415e6367512b ("hsr: Implement more robust duplicate discard for PRP")
Signed-off-by: Luka Gejak <luka.gejak@linux.dev>
Link: https://patch.msgid.link/20260401092243.52121-2-luka.gejak@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Triage Assessment
Vulnerability Type: Race condition
Confidence: MEDIUM
Reasoning:
The commit fixes a race condition during node merging by acquiring locks on both nodes' seq_out_lock to prevent concurrent mutations that could lead to inconsistent state or corruption of data structures (XArray/bitmap). Such races can be exploitable as a security vulnerability (e.g., corruption, information leakage via timing or state). The change is specifically addressing a synchronization issue with potential security impact, not just a general bug.
Verification Assessment
Vulnerability Type: Race condition
Confidence: MEDIUM
Affected Versions: v7.0-rc6 and earlier
Code Diff
diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c
index 50996f4de7f9e7..d418635936743a 100644
--- a/net/hsr/hsr_framereg.c
+++ b/net/hsr/hsr_framereg.c
@@ -123,6 +123,40 @@ static void hsr_free_node_rcu(struct rcu_head *rn)
hsr_free_node(node);
}
+static void hsr_lock_seq_out_pair(struct hsr_node *node_a,
+				  struct hsr_node *node_b)
+{
+	if (node_a == node_b) {
+		spin_lock_bh(&node_a->seq_out_lock);
+		return;
+	}
+
+	if (node_a < node_b) {
+		spin_lock_bh(&node_a->seq_out_lock);
+		spin_lock_nested(&node_b->seq_out_lock, SINGLE_DEPTH_NESTING);
+	} else {
+		spin_lock_bh(&node_b->seq_out_lock);
+		spin_lock_nested(&node_a->seq_out_lock, SINGLE_DEPTH_NESTING);
+	}
+}
+
+static void hsr_unlock_seq_out_pair(struct hsr_node *node_a,
+				    struct hsr_node *node_b)
+{
+	if (node_a == node_b) {
+		spin_unlock_bh(&node_a->seq_out_lock);
+		return;
+	}
+
+	if (node_a < node_b) {
+		spin_unlock(&node_b->seq_out_lock);
+		spin_unlock_bh(&node_a->seq_out_lock);
+	} else {
+		spin_unlock(&node_a->seq_out_lock);
+		spin_unlock_bh(&node_b->seq_out_lock);
+	}
+}
+
void hsr_del_nodes(struct list_head *node_db)
{
struct hsr_node *node;
@@ -432,7 +466,7 @@ void hsr_handle_sup_frame(struct hsr_frame_info *frame)
}
ether_addr_copy(node_real->macaddress_B, ethhdr->h_source);
- spin_lock_bh(&node_real->seq_out_lock);
+ hsr_lock_seq_out_pair(node_real, node_curr);
for (i = 0; i < HSR_PT_PORTS; i++) {
if (!node_curr->time_in_stale[i] &&
time_after(node_curr->time_in[i], node_real->time_in[i])) {
@@ -455,7 +489,7 @@ void hsr_handle_sup_frame(struct hsr_frame_info *frame)
src_blk->seq_nrs[i], HSR_SEQ_BLOCK_SIZE);
}
}
- spin_unlock_bh(&node_real->seq_out_lock);
+ hsr_unlock_seq_out_pair(node_real, node_curr);
node_real->addr_B_port = port_rcv->type;
spin_lock_bh(&hsr->list_lock);