Bypass / Privilege escalation (corrupt-object deletion bypass)

HIGH

kubernetes/kubernetes

Commit: 650b8e7f2934

Affected: v1.36.0-beta.0 and earlier in the 1.36.x line

2026-05-26 19:40 UTC

Description

The commit changes the unsafe-delete path for corrupt objects so that corruption is verified against the latest stored revision at the moment of deletion. Previously, the corrupt-object deleter first read the object to confirm it was corrupt and then issued a storage delete with IgnoreStoreReadError, which skipped transform/decode entirely and relied on the caller's earlier check. If the object became decodable in between (e.g., a concurrent write fixed the corruption), a valid object could be deleted without admission checks or finalizers being applied. The new option (ExpectTransformOrDecodeError) always attempts to transform and decode, and proceeds with the delete only if that attempt fails; if the object turns out to be decodable after all, the delete is rejected with an InvalidObj error. This closes an edge case in which the corrupt-object deletion mechanism could be used to bypass admission and finalizers.
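The difference between the two storage options can be illustrated with a toy model. This is a minimal sketch, not the Kubernetes implementation: toyStore, decodable, both delete methods, and the sentinel errors are hypothetical names, and "decodable" is approximated as "valid JSON".

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Hypothetical sentinel errors standing in for the storage layer's error types.
var (
	errNotFound   = errors.New("key not found")
	errInvalidObj = errors.New("invalid object: stored value is decodable")
)

// toyStore is a hypothetical in-memory stand-in for the storage backend.
type toyStore struct {
	data map[string][]byte
}

// decodable stands in for transform + decode: here it means "valid JSON".
func decodable(raw []byte) bool {
	return json.Valid(raw)
}

// deleteIgnoringReadError models the pre-fix option: the stored bytes are
// never decoded, so the key is removed whether or not the value is corrupt.
func (s *toyStore) deleteIgnoringReadError(key string) error {
	if _, ok := s.data[key]; !ok {
		return errNotFound
	}
	delete(s.data, key)
	return nil
}

// deleteExpectingDecodeError models the post-fix option: decoding is always
// attempted, and the delete proceeds only when that attempt fails.
func (s *toyStore) deleteExpectingDecodeError(key string) error {
	raw, ok := s.data[key]
	if !ok {
		return errNotFound
	}
	if decodable(raw) {
		return errInvalidObj
	}
	delete(s.data, key)
	return nil
}

func main() {
	s := &toyStore{data: map[string][]byte{
		"/pods/healthy": []byte(`{"kind":"Pod"}`),
		"/pods/corrupt": []byte("\x00not-json"),
	}}
	fmt.Println(s.deleteExpectingDecodeError("/pods/healthy")) // rejected: value is decodable
	fmt.Println(s.deleteExpectingDecodeError("/pods/corrupt")) // <nil>: value is corrupt
	fmt.Println(s.deleteIgnoringReadError("/pods/healthy"))    // <nil>: no decode attempted
}
```

In this model the post-fix method cannot remove a healthy entry, whereas the pre-fix method removes whatever is stored under the key.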

Proof of Concept

PoC overview (exploit before fix):
- Preconditions: A stored object for a resource (e.g., a Pod) fails transform/decode when the corrupt-object deleter first reads it, and the delete request sets IgnoreStoreReadErrorWithClusterBreakingPotential to true.
- Attack vector: The object becomes decodable again after the deleter's corruption check but before the storage delete (e.g., a concurrent write fixes the corruption, or a delete+create replaces the object). The pre-fix storage delete uses IgnoreStoreReadError, skips transform/decode, and does not re-check corruption.
- Expected outcome before fix: The delete succeeds and removes an object that is valid at the latest revision, bypassing admission checks and finalizers.

PoC outline (illustrative, not runnable against current code):
1) Prepare a test environment in which the storage entry for "pods/foo" is corrupt (fails transform/decode).
2) Invoke the corrupt-object deleter with metav1.DeleteOptions.IgnoreStoreReadErrorWithClusterBreakingPotential set to true. Before the fix, the deleter passes storage.DeleteOptions{IgnoreStoreReadError: true} to the storage backend.
3) Between the deleter's initial corruption check and the storage delete, have a concurrent writer make the entry decodable. Observe that the storage delete still runs and the object is removed without going through the standard admission/validation path or finalizer handling.
4) After the fix, the storage delete uses ExpectTransformOrDecodeError and requires an actual transform/decode failure; if the object is decodable, the delete is rejected with an InvalidObj error, preventing the bypass.
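The timing in steps 2 to 4 can be sketched as a deterministic check-then-act model. This is an illustrative sketch under stated assumptions, not code from the commit: entry, raceStore, repair, preFixDelete, and postFixDelete are hypothetical names, and the between callback stands in for a concurrent writer.

```go
package main

import "fmt"

// entry is a hypothetical stored value: a revision counter plus a corruption flag.
type entry struct {
	revision int
	corrupt  bool
}

// raceStore is a hypothetical in-memory store keyed by storage path.
type raceStore struct {
	entries map[string]entry
}

// repair simulates a concurrent writer replacing the corrupt value with a valid one.
func (s *raceStore) repair(key string) {
	e := s.entries[key]
	s.entries[key] = entry{revision: e.revision + 1, corrupt: false}
}

// preFixDelete models check-then-act: corruption is checked once up front, then
// the delete runs without inspecting the value again. The between callback runs
// in the window separating the check from the delete.
func preFixDelete(s *raceStore, key string, between func()) (deletedValid bool, err error) {
	if e, ok := s.entries[key]; !ok || !e.corrupt {
		return false, fmt.Errorf("%s: not corrupt, unsafe delete refused", key)
	}
	between()
	e, ok := s.entries[key]
	if !ok {
		return false, fmt.Errorf("%s: not found", key)
	}
	delete(s.entries, key)
	return !e.corrupt, nil
}

// postFixDelete models the fix: corruption is evaluated on the value as it exists
// at deletion time, so a write that landed earlier is always seen by the check.
func postFixDelete(s *raceStore, key string, between func()) (deletedValid bool, err error) {
	between()
	e, ok := s.entries[key]
	if !ok {
		return false, fmt.Errorf("%s: not found", key)
	}
	if !e.corrupt {
		return false, fmt.Errorf("%s: decodable at revision %d, delete rejected", key, e.revision)
	}
	delete(s.entries, key)
	return false, nil
}

func main() {
	const key = "/pods/foo"
	newStore := func() *raceStore {
		return &raceStore{entries: map[string]entry{key: {revision: 1, corrupt: true}}}
	}

	a := newStore()
	deletedValid, err := preFixDelete(a, key, func() { a.repair(key) })
	fmt.Println("pre-fix:", deletedValid, err)

	b := newStore()
	deletedValid, err = postFixDelete(b, key, func() { b.repair(key) })
	fmt.Println("post-fix:", deletedValid, err)
}
```

In the pre-fix model the repaired (valid) entry is deleted; in the post-fix model the same interleaving leaves the entry in place and returns an error.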

Illustrative (conceptual) Go-like pseudo-code for pre-fix behavior (not runnable against current code):
// Pseudo-code illustrating the vulnerability path prior to the fix.
// store is a *Store whose storage entry for "foo" is corrupt; its construction is elided.
store := NewStore(...)
deleter := NewCorruptObjectDeleter(store)
// The request context carries the namespace, as in the commit's tests.
ctx := genericapirequest.WithNamespace(genericapirequest.NewContext(), "default")
// Request the unsafe path via the API-level delete option.
deleteOpts := &metav1.DeleteOptions{IgnoreStoreReadErrorWithClusterBreakingPotential: ptr.To(true)}
_, deleted, err := deleter.Delete(ctx, "foo", rest.ValidateAllObjectFunc, deleteOpts)
// Expected (pre-fix): if the object became decodable after the deleter's corruption check,
// the delete may still succeed (deleted == true, err == nil), bypassing admission checks and finalizers.
// After fix: the delete fails if the object is decodable at deletion time, preventing the bypass.

Commit Details

Author: Ben Luddy

Date: 2026-03-04 20:31 UTC

Message:

Ensure corruption at latest revision for "unsafe delete" path.

The original option (IgnoreStoreReadError) would skip transform/decode entirely, relying on the
caller to ensure that the stored object is corrupt at the time of deletion. The new
implementation (ExpectTransformOrDecodeError) always attempts to transform and decode, and only
proceeds with the delete if that attempt fails. If the object turns out to be successfully
decodable (e.g., a concurrent write fixed the corruption), the delete is rejected with an InvalidObj
error. This addresses an edge case that could have allowed admission and finalizers to be bypassed
using the corrupt object deletion mechanism.

Co-authored-by: Krzysztof Ostrowski <kostrows@redhat.com>

Triage Assessment

Vulnerability Type: Bypass / Privilege escalation

Confidence: HIGH

Reasoning:

The commit changes the unsafe-delete path for corrupt objects so that corruption is verified at the latest revision. This prevents the corrupt-object deletion mechanism from being used to delete a decodable object without admission checks and finalizers, an edge case the commit message explicitly calls out.

Verification Assessment

Vulnerability Type: Bypass / Privilege escalation (corrupt-object deletion bypass)

Confidence: HIGH

Affected Versions: v1.36.0-beta.0 and earlier in the 1.36.x line

Code Diff

diff --git a/staging/src/k8s.io/apiserver/pkg/registry/generic/registry/corrupt_obj_deleter.go b/staging/src/k8s.io/apiserver/pkg/registry/generic/registry/corrupt_obj_deleter.go
index 2c37b0b5d353f..8be475c7f2036 100644
--- a/staging/src/k8s.io/apiserver/pkg/registry/generic/registry/corrupt_obj_deleter.go
+++ b/staging/src/k8s.io/apiserver/pkg/registry/generic/registry/corrupt_obj_deleter.go
@@ -19,13 +19,10 @@ package registry
 import (
 	"context"
 	"errors"
-	"strings"
 
 	apierrors "k8s.io/apimachinery/pkg/api/errors"
 	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
 	"k8s.io/apimachinery/pkg/runtime"
-	"k8s.io/apimachinery/pkg/runtime/schema"
-	"k8s.io/apimachinery/pkg/util/validation/field"
 	genericapirequest "k8s.io/apiserver/pkg/endpoints/request"
 	"k8s.io/apiserver/pkg/registry/rest"
 	"k8s.io/apiserver/pkg/storage"
@@ -38,13 +35,12 @@ import (
 // the corrupt object deleter has the same interface as rest.GracefulDeleter
 var _ rest.GracefulDeleter = &corruptObjectDeleter{}
 
-// NewCorruptObjectDeleter returns a deleter that can perform unsafe deletion
-// of corrupt objects, it makes an attempt to perform a normal deletion flow
-// first, and if the normal deletion flow fails with a corrupt object error
-// then it performs the unsafe delete of the object.
+// NewCorruptObjectDeleter returns a deleter that can perform unsafe deletion of corrupt
+// objects. The deletion will be rejected if the object turns out to be decodable (i.e. not actually
+// corrupt).
 //
-// NOTE: it skips precondition checks, finalizer constraints, and any
-// post deletion hook defined in 'AfterDelete' of the registry.
+// NOTE: it skips precondition checks, finalizer constraints, and any post deletion hook defined in
+// 'AfterDelete' of the registry.
 //
 // WARNING: This may break the cluster if the resource being deleted has dependencies.
 func NewCorruptObjectDeleter(store *Store) rest.GracefulDeleter {
@@ -72,41 +68,12 @@ func (d *corruptObjectDeleter) Delete(ctx context.Context, name string, deleteVa
 	if err != nil {
 		return nil, false, err
 	}
-	obj := d.store.NewFunc()
 	qualifiedResource := d.store.qualifiedResourceFromContext(ctx)
 	// use the storage implementation directly, bypass the dryRun layer
 	storageBackend := d.store.Storage.Storage
-	// we leave ResourceVersion as empty in the GetOptions so the
-	// object is retrieved from the underlying storage directly
-	err = storageBackend.Get(ctx, key, storage.GetOptions{}, obj)
-	if err == nil || !storage.IsCorruptObject(err) {
-		// TODO: The Invalid error should have a field for Resource.
-		// After that field is added, we should fill the Resource and
-		// leave the Kind field empty. See the discussion in #18526.
-		qualifiedKind := schema.GroupKind{Group: qualifiedResource.Group, Kind: qualifiedResource.Resource}
-		fieldErrList := field.ErrorList{
-			field.Invalid(field.NewPath("ignoreStoreReadErrorWithClusterBreakingPotential"), true, "is exclusively used to delete corrupt object(s), try again by removing this option"),
-		}
-		return nil, false, apierrors.NewInvalid(qualifiedKind, name, fieldErrList)
-	}
-
-	// try normal deletion anyway, it is expected to fail
-	obj, deleted, err := d.store.Delete(ctx, name, deleteValidation, opts)
-	if err == nil {
-		return obj, deleted, err
-	}
-	// TODO: unfortunately we can't do storage.IsCorruptObject(err),
-	// conversion to API error drops the inner error chain
-	if !strings.Contains(err.Error(), "corrupt object") {
-		return obj, deleted, err
-	}
-
-	// TODO: at this instant, some actor may have a) managed to recreate this
-	// object by doing a delete+create, or b) the underlying error has resolved
-	// since the last time we checked, and the object is readable now.
 	klog.FromContext(ctx).V(1).Info("Going to perform unsafe object deletion", "object", klog.KRef(genericapirequest.NamespaceValue(ctx), name))
 	out := d.store.NewFunc()
-	storageOpts := storage.DeleteOptions{IgnoreStoreReadError: true}
+	storageOpts := storage.DeleteOptions{ExpectTransformOrDecodeError: true}
 	// we don't have the old object in the cache, neither can it be
 	// retrieved from the storage and decoded into an object
 	// successfully, so we do the following:
@@ -115,14 +82,9 @@ func (d *corruptObjectDeleter) Delete(ctx context.Context, name string, deleteVa
 	var nilPreconditions *storage.Preconditions = nil
 	var nilCachedExistingObject runtime.Object = nil
 	if err := storageBackend.Delete(ctx, key, out, nilPreconditions, rest.ValidateAllObjectFunc, nilCachedExistingObject, storageOpts); err != nil {
-		if storage.IsNotFound(err) {
-			// the DELETE succeeded, but we don't have the object since it's
-			// not retrievable from the storage, so we send a nil object
-			return nil, false, nil
-		}
 		return nil, false, storeerr.InterpretDeleteError(err, qualifiedResource, name)
 	}
-	// the DELETE succeeded, but we don't have the object sine it's
-	// not retrievable from the storage, so we send a nil objct
+	// the DELETE succeeded, but we don't have the object since it's
+	// not retrievable from the storage, so we send a nil object
 	return nil, true, nil
 }

diff --git a/staging/src/k8s.io/apiserver/pkg/registry/generic/registry/corrupt_obj_deleter_test.go b/staging/src/k8s.io/apiserver/pkg/registry/generic/registry/corrupt_obj_deleter_test.go
index 3a74e609c6dcb..351cb94839298 100644
--- a/staging/src/k8s.io/apiserver/pkg/registry/generic/registry/corrupt_obj_deleter_test.go
+++ b/staging/src/k8s.io/apiserver/pkg/registry/generic/registry/corrupt_obj_deleter_test.go
@@ -19,314 +19,138 @@ package registry
 import (
 	"context"
 	"fmt"
-	"strings"
+	"path"
 	"testing"
 
 	"k8s.io/apimachinery/pkg/api/errors"
 	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
 	"k8s.io/apimachinery/pkg/runtime"
+	"k8s.io/apimachinery/pkg/runtime/schema"
 	"k8s.io/apiserver/pkg/apis/example"
 	genericapirequest "k8s.io/apiserver/pkg/endpoints/request"
-	"k8s.io/apiserver/pkg/registry/rest"
 	"k8s.io/apiserver/pkg/storage"
-
 	"k8s.io/utils/ptr"
 )
 
-type result struct {
-	deleted bool
-	err     error
-}
+// fakeStorage is a test double for storage.Interface that intercepts Delete calls, records the
+// arguments, and returns a pre-configured error.
+type fakeStorage struct {
+	storage.Interface
+	deleteErr error
 
-type deleteWant struct {
-	deleted  bool
-	checkErr func(err error) bool
+	deleteCalled            bool
+	key                     string
+	expectTransformOrDecode bool
+	preconditions           *storage.Preconditions
+	cachedExistingObject    runtime.Object
 }
 
-var (
-	wantNoError     = func(err error) bool { return err == nil }
-	wantErrContains = func(shouldContain string) func(error) bool {
-		return func(err error) bool {
-			return err != nil && strings.Contains(err.Error(), shouldContain)
-		}
-	}
-)
-
-func (w deleteWant) verify(t *testing.T, got result) {
-	t.Helper()
-
-	if !w.checkErr(got.err) {
-		t.Errorf("Unexpected failure with the deletion operation, got: %v", got.err)
-	}
-	if w.deleted != got.deleted {
-		t.Errorf("Expected deleted to be: %t, but got: %t", w.deleted, got.deleted)
-	}
-
+func (s *fakeStorage) Delete(ctx context.Context, key string, out runtime.Object, preconditions *storage.Preconditions, deleteValidation storage.ValidateObjectFunc, cachedExistingObject runtime.Object, opts storage.DeleteOptions) error {
+	s.deleteCalled = true
+	s.key = key
+	s.expectTransformOrDecode = opts.ExpectTransformOrDecodeError
+	s.preconditions = preconditions
+	s.cachedExistingObject = cachedExistingObject
+	return s.deleteErr
 }
 
-func TestUnsafeDeletePrecondition(t *testing.T) {
-	option := func(enabled bool) *metav1.DeleteOptions {
-		return &metav1.DeleteOptions{
-			IgnoreStoreReadErrorWithClusterBreakingPotential: ptr.To[bool](enabled),
-		}
-	}
-
-	const (
-		unsafeDeleteNotAllowed = "ignoreStoreReadErrorWithClusterBreakingPotential: Invalid value: true: is exclusively used to delete corrupt object(s), try again by removing this option"
-		internalErr            = "Internal error occurred: initialization error, expected normal deletion flow to be used"
-	)
-
-	tests := []struct {
-		name    string
-		err     error
-		opts    *metav1.DeleteOptions
-		invoked int
-		want    deleteWant
+func TestCorruptObjectDeleterDelete(t *testing.T) {
+	for _, test := range []struct {
+		name               string
+		deleteErr          error
+		opts               *metav1.DeleteOptions
+		expectDeleteCalled bool
+		wantDeleted        bool
+		wantErr            func(error) bool
 	}{
 		{
-			name: "option nil, should throw internal error",
-			opts: nil,
-			want: deleteWant{checkErr: wantErrContains(internalErr)},
+			name:    "options nil, should return internal error",
+			opts:    nil,
+			wantErr: errors.IsInternalError,
 		},
 		{
-			name: "option empty, should throw internal error",
-			opts: &metav1.DeleteOptions{},
-			want: deleteWant{checkErr: wantErrContains(internalErr)},
+			name:    "options empty, should return internal error",
+			opts:    &metav1.DeleteOptions{},
+			wantErr: errors.IsInternalError,
 		},
 		{
-			name: "option false, should throw internal error",
-			opts: option(false),
-			want: deleteWant{checkErr: wantErrContains(internalErr)},
+			name:    "option false, should return internal error",
+			opts:    &metav1.DeleteOptions{IgnoreStoreReadErrorWithClusterBreakingPotential: ptr.To(false)},
+			wantErr: errors.IsInternalError,
 		},
 		{
-			name: "option true, object readable, should throw invalid error",
-			opts: option(true),
-			want: deleteWant{
-				checkErr: wantErrContains(unsafeDeleteNotAllowed),
-			},
+			name:               "option true, object decodable, store returns InvalidObj",
+			opts:               &metav1.DeleteOptions{IgnoreStoreReadErrorWithClusterBreakingPotential: ptr.To(true)},
+			deleteErr:          storage.NewInvalidObjError("/pods/foo", "object is decodable"),
+			expectDeleteCalled: true,
+			wantErr:            errors.IsConflict,
 		},
 		{
-			name: "option true, object not readable with unexpected error, should throw invalid error",
-			opts: option(true),
-			err:  fmt.Errorf("unexpected error"),
-			want: deleteWant{
-				checkErr: wantErrContains(unsafeDeleteNotAllowed),
-			},
+			name:               "option true, object not decodable, delete succeeds",
+			opts:               &metav1.DeleteOptions{IgnoreStoreReadErrorWithClusterBreakingPotential: ptr.To(true)},
+			deleteErr:          nil,
+			expectDeleteCalled: true,
+			wantDeleted:        true,
+			wantErr:            func(err error) bool { return err == nil },
 		},
 		{
-			name: "option true, object not readable with storage internal error, should throw invalid error",
-			opts: option(true),
-			err:  storage.NewInternalError(fmt.Errorf("unexpected error")),
-			want: deleteWant{
-				checkErr: wantErrContains(unsafeDeleteNotAllowed),
-			},
+			name:               "option true, object not found, store returns NotFound",
+			opts:               &metav1.DeleteOptions{IgnoreStoreReadErrorWithClusterBreakingPotential: ptr.To(true)},
+			deleteErr:          storage.NewKeyNotFoundError("/pods/foo", 0),
+			expectDeleteCalled: true,
+			wantErr:            errors.IsNotFound,
 		},
-		{
-			name: "option true, object not readable with corrupt object error, unsafe-delete should trigger",
-			opts: option(true),
-			err:  storage.NewCorruptObjError("foo", fmt.Errorf("object not decodable")),
-			want: deleteWant{
-				deleted:  true,
-				checkErr: wantNoError,
-			},
-			invoked: 1,
-		},
-	}
-
-	for _, test := range tests {
+	} {
 		t.Run(test.name, func(t *testing.T) {
 			ctx := genericapirequest.WithNamespace(genericapirequest.NewContext(), "test")
-			destroyFunc, registry := NewTestGenericStoreRegistry(t)
-			defer destroyFunc()
+			fs := &fakeStorage{deleteErr: test.deleteErr}
+			const podPrefix = "/pods/"
+			store := &Store{
+				NewFunc:                   func() runtime.Object { return &example.Pod{} },
+				DefaultQualifiedResource:  schema.GroupResource{Resource: "pods"},
+				SingularQualifiedResource: schema.GroupResource{Resource: "pod"},
+				KeyRootFunc:               func(ctx context.Context) string { return podPrefix },
+				KeyFunc: func(ctx context.Context, name string) (string, error) {
+					if _, ok := genericapirequest.NamespaceFrom(ctx); !ok {
+						return "", fmt.Errorf("namespace is required")
+					}
+					return path.Join(podPrefix, name), nil
+				},
+				Storage: DryRunnableStorage{Storage: fs},
+			}
+			deleter := NewCorruptObjectDeleter(store)
 
-			object := &example.Pod{
-				ObjectMeta: metav1.ObjectMeta{Name: "foo"},
-				Spec:       example.PodSpec{NodeName: "machine"},
+			obj, deleted, err := deleter.Delete(ctx, "foo", func(context.Context, runtime.Object) error {
+				t.Fatal("caller-provided admission was invoked")
+				return nil
+			}, test.opts)
+
+			if obj != nil {
+				t.Errorf("Expected nil object, but got %v", obj)
 			}
-			_, err := registry.Create(ctx, object, rest.ValidateAllObjectFunc, &metav1.CreateOptions{})
-			if err != nil {
-				t.Fatalf("Unexpected error from Create: %v", err)
+			if !test.wantErr(err) {
+				t.Errorf("Unexpected error: %v", err)
 			}
-
-			// wrap the storage so it returns the expected error
-			cs := &corruptStorage{
-				Interface: registry.Storage.Storage,
-				err:       test.err,
+			if test.wantDeleted != deleted {
+				t.Errorf("Expected deleted to be %t, but got %t", test.wantDeleted, deleted)
 			}
-			registry.Storage.Storage = cs
-			deleter := NewCorruptObjectDeleter(registry)
-
-			_, deleted, err := deleter.Delete(ctx, "foo", rest.ValidateAllObjectFunc, test.opts)
-
-			got := result{deleted: deleted, err: err}
-			test.want.verify(t, got)
-			if want, got := test.invoked, cs.unsafeDeleteInvoked; want != got {
-				t.Errorf("Expected unsafe-delete to be invoked %d time(s), but got: %d", want, got)
+			if test.expectDeleteCalled != fs.deleteCalled {
+				t.Errorf("Expected storage Delete called=%t, but got %t", test.expectDeleteCalled, fs.deleteCalled)
+			}
+			if fs.deleteCalled {
+				if want, got := "/pods/foo", fs.key; want != got {
+					t.Errorf("Expected storage Delete to be called with key %q, but got %q", want, got)
+				}
+				if !fs.expectTransformOrDecode {
+					t.Error("Expected storage Delete to be called with ExpectTransformOrDecodeError=true")
+				}
+				if fs.preconditions != nil {
+					t.Error("Expected storage Delete to be called with nil preconditions")
+				}
+				if fs.cachedExistingObject != nil {
+					t.Error("Expected storage Delete to be called with nil cachedExistingObject")
+				}
 			}
 		})
 	}
 }
-
-func TestUnsafeDeleteWithCorruptObject(t *testing.T) {
-	ctx := genericapirequest.WithNamespace(genericapirequest.NewContext(), "test")
-	destroyFunc, registry := NewTestGenericStoreRegistry(t)
-	defer destroyFunc()
-
-	object := &example.Pod{
-		ObjectMeta: metav1.ObjectMeta{Name: "foo"},
-		Spec:       example.PodSpec{NodeName: "machine"},
-	}
-	// a) prerequisite: try deleting th
... [truncated]
