NULL pointer dereference / memory safety issue (kernel crash risk)

HIGH
torvalds/linux
Commit: 42ea37b07742
Affected: v7.0-rc6 (and earlier branches carrying the mana driver)
2026-04-25 14:14 UTC

Description

The commit merges a five-patch series that fixes error-path handling in mana_probe() and mana_remove() in the MANA NIC driver. The issues are: warnings when mana_remove() cancels work structs that were never initialized, a NULL pointer dereference when mana_remove() runs twice, a port probe error masked by a later add_adev() call, and an EQ leak when the port loop in mana_remove() exits early. The fixes are: initialize link_change_work and gf_stats_work before any error path that can reach mana_remove(); return early from mana_remove() when gdma_context or driver_data is NULL, so that a second invocation is harmless; call add_adev() only when no port probe or attach error is pending; and replace 'goto out' with 'break' in the port loop so that mana_destroy_eq() is always reached. The most severe case is the double mana_remove() after a failed PM resume, which the commit message describes as causing a kernel panic.
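The NULL guard can be illustrated with a small userspace model. This is a minimal sketch, not driver code: struct model_ctx, struct model_dev, and model_remove() are hypothetical stand-ins for the kernel structures and for mana_remove(), and the teardown work is reduced to a counter.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver structures; not the kernel types. */
struct model_ctx {
	int eq_count;
};

struct model_dev {
	struct model_ctx *gdma_context; /* not owned by the device in this model */
	struct model_ctx *driver_data;  /* heap-allocated, freed on removal */
	int teardown_runs;              /* how many times real teardown ran */
};

/*
 * Models the guarded removal path: returns 1 if teardown ran,
 * 0 if the call was skipped because the device was already torn down.
 */
int model_remove(struct model_dev *gd)
{
	struct model_ctx *gc = gd->gdma_context;
	struct model_ctx *ac = gd->driver_data;

	/* Same shape as the guard added by the patch. */
	if (!gc || !ac)
		return 0;

	/* Without the guard, a second call would dereference NULL here. */
	ac->eq_count = 0;
	gd->teardown_runs++;

	free(ac);
	gd->driver_data = NULL;
	gd->gdma_context = NULL;
	return 1;
}
```

Calling model_remove() twice on the same device runs the teardown once; the second call returns 0 without touching any pointer, which is the behavior the patch aims for when an unbind follows a failed resume.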

Proof of Concept

PoC (high-level reproduction steps, not a working exploit):

1) Build and boot a kernel with the MANA driver and a test device.

2) Force a failure in mana_probe() early enough that mana_remove() runs as part of error cleanup before the work structs are set up (for example, a failure in an early step such as EQ creation). On a pre-fix kernel, the synchronous cancellation in mana_remove() then operates on uninitialized work structs and can trigger a WARN_ON in __flush_work() or debug-object warnings. The patch initializes link_change_work and gf_stats_work before any such error path.

3) Simulate a PM resume failure, in which mana_probe() calls mana_remove() and gdma_context and driver_data are set to NULL. Because a failed resume does not unbind the driver, later unbind the device so that mana_remove() is called a second time.

4) On a pre-fix kernel, the second mana_remove() call dereferences NULL (gdma_context/driver_data) and crashes the kernel. With the patch, mana_remove() returns early when either pointer is NULL.

5) To validate the fix, repeat the failed-resume-then-unbind sequence and confirm that there is no kernel oops, no WARN_ON, and no leaked resources.

Note: This is a controlled reproduction of a kernel memory-safety issue in the MANA driver's error paths. It requires the ability to force probe/remove failures or a failed PM resume, so it is not a remote exploit; it demonstrates the crash scenario that the patch mitigates.

Commit Details

Author: Paolo Abeni

Date: 2026-04-23 10:49 UTC

Message:

Merge branch 'net-mana-fix-probe-remove-error-path-bugs'

Erni Sri Satya Vennela says:

====================
net: mana: Fix probe/remove error path bugs

Fix five bugs in mana_probe()/mana_remove() error handling that can cause warnings on uninitialized work structs, NULL pointer dereferences, masked errors, and resource leaks when early probe steps fail.

Patches 1-2 move work struct initialization (link_change_work and gf_stats_work) to before any error path that could trigger mana_remove(), preventing WARN_ON in __flush_work() or debug object warnings when sync cancellation runs on uninitialized work structs.

Patch 3 guards mana_remove() against double invocation. If PM resume fails, mana_probe() calls mana_remove() which sets gdma_context and driver_data to NULL. A failed resume does not unbind the driver, so when the device is eventually unbound, mana_remove() is called again and dereferences NULL, causing a kernel panic. An early return on NULL gdma_context or driver_data makes the second call harmless.

Patch 4 prevents add_adev() from overwriting a port probe error, which could leave the driver in a broken state with NULL ports while reporting success.

Patch 5 changes 'goto out' to 'break' in mana_remove()'s port loop so that mana_destroy_eq() is always reached, preventing EQ leaks when a NULL port is encountered.
====================

Link: https://patch.msgid.link/20260420124741.1056179-1-ernis@linux.microsoft.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
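The error masking that Patch 4 addresses can be shown with a reduced model. This is a hedged sketch: model_probe_tail() and its parameters are hypothetical, and the function only mirrors the shape of the tail of mana_probe(), where the result of the port loop is followed by the add_adev() call.

```c
/*
 * Hypothetical model of the tail of the probe function.
 *   port_err: result of the port probe/attach loop (0 or negative errno)
 *   adev_err: result the add_adev() step would return
 *   guarded:  0 models the pre-fix code, nonzero models the patched code
 * Returns the error code the probe function would report.
 */
int model_probe_tail(int port_err, int adev_err, int guarded)
{
	int err = port_err;

	if (!guarded) {
		/* Pre-fix shape: the assignment always overwrites err. */
		err = adev_err;
	} else if (!err) {
		/* Patched shape: only run the next step if nothing failed. */
		err = adev_err;
	}

	return err;
}
```

With a failed port probe and a successful add_adev() step, the unguarded variant reports 0 even though a port is missing, while the guarded variant preserves the original error.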

Triage Assessment

Vulnerability Type: NULL pointer dereference / memory safety

Confidence: HIGH

Reasoning:

The patch fixes several error-path and NULL/uninitialized-access issues in mana_probe() and mana_remove(). It initializes work structs earlier, guards mana_remove() against NULL gdma_context/driver_data so that a double invocation is harmless, stops add_adev() from masking a port probe error, and ensures the EQ cleanup is always reached. The commit message describes a kernel panic from the NULL dereference, so the primary impact is a kernel crash (denial of service), along with kernel warnings and resource leaks, rather than a demonstrated privilege escalation.
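The cleanup-path point can be made concrete with a reduced model of the port loop. This is a sketch under assumptions: struct model_ac and both functions are hypothetical, and the resources are flags rather than real EQs or workqueues. It only mirrors the control flow in which a label placed after the EQ teardown lets an early exit skip it.

```c
#include <stddef.h>

#define MODEL_MAX_PORTS 4

/* Hypothetical stand-in for the driver context. */
struct model_ac {
	void *ports[MODEL_MAX_PORTS];
	int num_ports;
	int eq_alive; /* 1 while the modeled EQs exist */
	int wq_alive; /* 1 while the modeled workqueue exists */
};

/* Pre-fix shape: a NULL port jumps past the EQ teardown. */
void model_remove_goto(struct model_ac *ac)
{
	int i;

	for (i = 0; i < ac->num_ports; i++) {
		if (!ac->ports[i])
			goto out;
		ac->ports[i] = NULL;
	}

	ac->eq_alive = 0;
out:
	ac->wq_alive = 0;
}

/* Patched shape: a NULL port only ends the loop. */
void model_remove_break(struct model_ac *ac)
{
	int i;

	for (i = 0; i < ac->num_ports; i++) {
		if (!ac->ports[i])
			break;
		ac->ports[i] = NULL;
	}

	ac->eq_alive = 0;
	ac->wq_alive = 0;
}
```

With one valid port followed by a NULL port, the goto variant leaves eq_alive set (the leak), while the break variant clears both flags.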

Verification Assessment

Vulnerability Type: NULL pointer dereference / memory safety issue (kernel crash risk)

Confidence: HIGH

Affected Versions: v7.0-rc6 (and earlier branches carrying the mana driver)

Code Diff

diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 6302432b9bf66c..98e2fcc797cac9 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3631,8 +3631,12 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
 
 		ac->gdma_dev = gd;
 		gd->driver_data = ac;
+
+		INIT_WORK(&ac->link_change_work, mana_link_state_handle);
 	}
 
+	INIT_DELAYED_WORK(&ac->gf_stats_work, mana_gf_stats_work_handler);
+
 	err = mana_create_eq(ac);
 	if (err) {
 		dev_err(dev, "Failed to create EQs: %d\n", err);
@@ -3648,8 +3652,6 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
 
 	if (!resuming) {
 		ac->num_ports = num_ports;
-
-		INIT_WORK(&ac->link_change_work, mana_link_state_handle);
 	} else {
 		if (ac->num_ports != num_ports) {
 			dev_err(dev, "The number of vPorts changed: %d->%d\n",
@@ -3678,10 +3680,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
 	if (!resuming) {
 		for (i = 0; i < ac->num_ports; i++) {
 			err = mana_probe_port(ac, i, &ac->ports[i]);
-			/* we log the port for which the probe failed and stop
-			 * probes for subsequent ports.
-			 * Note that we keep running ports, for which the probes
-			 * were successful, unless add_adev fails too
+			/* Log the port for which the probe failed, stop probing
+			 * subsequent ports, and skip add_adev.
+			 * mana_remove() will clean up already-probed ports.
 			 */
 			if (err) {
 				dev_err(dev, "Probe Failed for port %d\n", i);
@@ -3695,10 +3696,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
 			enable_work(&apc->queue_reset_work);
 			err = mana_attach(ac->ports[i]);
 			rtnl_unlock();
-			/* we log the port for which the attach failed and stop
-			 * attach for subsequent ports
-			 * Note that we keep running ports, for which the attach
-			 * were successful, unless add_adev fails too
+			/* Log the port for which the attach failed, stop
+			 * attaching subsequent ports, and skip add_adev.
+			 * mana_remove() will clean up already-attached ports.
 			 */
 			if (err) {
 				dev_err(dev, "Attach Failed for port %d\n", i);
@@ -3707,9 +3707,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
 		}
 	}
 
-	err = add_adev(gd, "eth");
+	if (!err)
+		err = add_adev(gd, "eth");
 
-	INIT_DELAYED_WORK(&ac->gf_stats_work, mana_gf_stats_work_handler);
 	schedule_delayed_work(&ac->gf_stats_work, MANA_GF_STATS_PERIOD);
 
 out:
@@ -3730,11 +3730,16 @@ void mana_remove(struct gdma_dev *gd, bool suspending)
 	struct gdma_context *gc = gd->gdma_context;
 	struct mana_context *ac = gd->driver_data;
 	struct mana_port_context *apc;
-	struct device *dev = gc->dev;
+	struct device *dev;
 	struct net_device *ndev;
 	int err;
 	int i;
 
+	if (!gc || !ac)
+		return;
+
+	dev = gc->dev;
+
 	disable_work_sync(&ac->link_change_work);
 	cancel_delayed_work_sync(&ac->gf_stats_work);
 
@@ -3747,7 +3752,7 @@ void mana_remove(struct gdma_dev *gd, bool suspending)
 		if (!ndev) {
 			if (i == 0)
 				dev_err(dev, "No net device to remove\n");
-			goto out;
+			break;
 		}
 
 		apc = netdev_priv(ndev);
@@ -3778,7 +3783,7 @@ void mana_remove(struct gdma_dev *gd, bool suspending)
 	}
 
 	mana_destroy_eq(ac);
-out:
+
 	if (ac->per_port_queue_reset_wq) {
 		destroy_workqueue(ac->per_port_queue_reset_wq);
 		ac->per_port_queue_reset_wq = NULL;