

Conditional Authorization for Kubernetes, SIG Auth presentation

Presented to Kubernetes Special Interest Group (SIG) Auth on June 4, 2025.

A recording will be added soon; SIG meetings are recorded.

Meeting notes are available at https://docs.google.com/document/d/1woLGRoONE3EBVx-wTb4pvp4CI7tmLZ6lS26VTbosLKM/edit?tab=t.0#heading=h.hophvu703yb0

Information about SIG Auth is in https://github.com/kubernetes/community/tree/master/sig-auth


Lucas Käldström

June 04, 2025



Transcript

  1. TL;DR; Unified UX across authz & admission Unified UX across

    admission reads & writes ⇒ Conditional Authorization @luxas.dev
  2. Fast Safe Analyzable Expressive Correct RBAC Webhooks VAP UX gap

    in authorization Turing-completeness wall
  3. Fast Safe Analyzable Expressive Correct RBAC Webhooks VAP UX gap

    in authorization Turing-completeness wall In cloud providers Not dependable upon
  4. “Over-grant” in RBAC, deny in VAP RBAC Role Allow in

    authorization RBAC Role Binding Allow in authorization CEL Policy Deny in admission create, update, delete gateways — object=* oldobject=* .class != ‘test-gateway’ .class == ‘test-gateway’ Kubernetes RBAC CEL Rule Amount of permissions for lucas Desired permissions
  10. Examples of logically unified authz + admission

    - DRA Admin Access: allow, when old/new object has adminAccess == false
    - Ingress/Gateway: Only listwatch .type=tls Secrets and/or labelled ones
    - Class: Only some users are allowed to use certain classes
    - Node bounds: Agent to only see CRD with .nodeName=<bound node>
    - CSR signers: Some users can only use certain signers (today compound authz)
    - Impersonation: Predicate on full UserInfo shape that can be impersonated
    @luxas.dev
  11. These all boil down to label and field predicates @luxas.dev

    The solution I present here, conditional authorization, is a superset of the “selectors for all verbs” KEP idea
  12. Fast Safe Analyzable Expressive RBAC Webhooks VAP UX gap in

    authorization Turing-completeness wall Correct In cloud providers Not dependable upon
  13. Fast Safe Analyzable Expressive Correct RBAC Webhooks VAP Turing-completeness wall

    Analyzability wall In cloud providers Not dependable upon
  14. Fast Safe Analyzable Expressive Correct RBAC Webhooks VAP Turing-completeness wall

    Analyzability wall NEW In cloud providers Not dependable upon
  15. Analyzability wall? @luxas.dev n variables, m statements

    Propositional Logic in SAT / First-order Logic in SMT
    - Variable data types: Booleans only / Booleans, Strings, Ints, Objects, Arrays, Sets, Functions, etc.
    - Operators: =, ¬, ⋀, ⋁, ⟹, ⟺ / PL and e.g. +, -, <, has, etc.
    - Quantifiers: None / forall (∀) and exists (∃)
    - Decidable: Yes / Without quantifiers only*
    * Church-Turing theorems 1937
    ⇒ No loops and quantifiers to keep analyzability
  16. Why keep analyzability? Partially order permissiveness of policy set ⇒

    Check for logical inconsistencies in a policy set (shadowing, no-ops) ⇒ Check for equality (help refactors) ⇒ Prevent privilege escalation (in kube-api-server) ← No effect ← Allow shadows allow ← Deny shadows allow Allow policy Deny policy @luxas.dev
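
The permissiveness ordering and equality checks on this slide can be sketched in Python. This is only an illustration under stated assumptions: the label names, values, and policies below are hypothetical, and plain enumeration over a tiny finite universe stands in for the SMT solver that a real analysis would use.

```python
from itertools import product

# Hypothetical finite universe of label values (an assumption for this sketch);
# a solver would reason symbolically instead of enumerating.
ENVS = ["prod", "test", "dev"]
OWNERS = ["team-1", "team-2", "team-3"]
UNIVERSE = [{"env": e, "owner": o} for e, o in product(ENVS, OWNERS)]

def allowed_set(policy):
    """Return the indices of all objects in UNIVERSE that the policy permits."""
    return {i for i, obj in enumerate(UNIVERSE) if policy(obj)}

def at_most_as_permissive(a, b):
    """True if every object allowed by policy a is also allowed by policy b.

    When both are allow policies, this also means a is shadowed by b.
    """
    return allowed_set(a) <= allowed_set(b)

def equivalent(a, b):
    """True if both policies permit exactly the same objects (useful for refactors)."""
    return allowed_set(a) == allowed_set(b)

# Hypothetical example policies.
narrow = lambda o: o["env"] == "test" and o["owner"] == "team-1"
broad = lambda o: o["env"] != "prod"
refactored = lambda o: o["env"] in ("test", "dev")
```

Here `narrow` is shadowed by `broad`, and `refactored` is equivalent to `broad` only because the universe was restricted to three `env` values; a symbolic check over all strings would tell them apart.
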
  17. Thankfully, CEL should mostly map to analyzable SMT CEL Analyzable

    SMT e.all e.exists e.exists_one e.map e.filter str.matches str.contains str.startsWith str.endsWith uint int bool string bytes* null map** list double has size*** * not sure, didn’t dig very deep ** map of string values should work, but is a bit of work *** might need some extra work () . [] {} - (unary) ! * / % + - (binary) == != < > <= >= in && || ?: uninterpreted functions + probably more Some SMT solvers also support more advanced features, like regexps, but they might not always be analyzable for all cases, or standardized in the SMT-LIB standard. Disclaimer: I’m not an expert
  21. Strive to keep it simple The logic analyzability constraint offers

    a guideline on how much expressiveness is reasonable at the authorization layer, and what is not. Loops, quantifiers, full-blown regexps, and such would be “overkill” anyway. But even all of decidable SMT might not be needed, e.g. we probably don’t want floats as part of an authz decision. If we maintain an encoding into decidable SMT, we get analyzability “for free”*. This subset of CEL might also map “just right” into the efficient-ish SQL encoding we want for the new unified selector syntax. * From an SMT solver like Z3 or cvc5 @luxas.dev
  22. Fast Safe Analyzable Expressive Correct RBAC Webhooks VAP Turing-completeness wall

    Analyzability wall NEW In cloud providers Not dependable upon
  23. Unify policy authoring for both authorization and admission Authorization RequestInfo

    UserInfo Kubernetes Path for write request Project Partial Evaluation Yes, No, Maybe 403 Policies Webhook Authentication @luxas.dev
  24. Partial Evaluation: Work with incomplete data Even though the request

    object is not decoded at authorization time, we can resolve everything else based on request attributes. We might get an unconditional allow/deny already! Otherwise, we get a “residual” expression over the unknown object. CEL might be able to do this to some degree, but I haven’t tested it yet.
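
A minimal Python sketch of the partial-evaluation idea, under stated assumptions: the tuple-based expression format, the attribute names (`request.verb`, `user.name`, `object.class`), and the example policy are all hypothetical, and this is not how CEL or any real engine implements it. It only shows how known request attributes can collapse a condition to yes, no, or a residual over the not-yet-decoded object.

```python
UNKNOWN = object()  # marker for values not available at authorization time

def partial_eval(expr, env):
    """Simplify expr using the known values in env.

    Returns True, False, or a residual expression mentioning only unknowns.
    Expressions are tuples: ("eq", name, const), ("and", a, b), ("or", a, b).
    """
    op = expr[0]
    if op == "eq":
        _, name, const = expr
        value = env.get(name, UNKNOWN)
        # Unknown attribute: keep the comparison as a residual.
        return expr if value is UNKNOWN else value == const
    left = partial_eval(expr[1], env)
    right = partial_eval(expr[2], env)
    if op == "and":
        if left is False or right is False:
            return False
        if left is True:
            return right
        if right is True:
            return left
        return ("and", left, right)
    if op == "or":
        if left is True or right is True:
            return True
        if left is False:
            return right
        if right is False:
            return left
        return ("or", left, right)
    raise ValueError(f"unknown operator: {op}")

# Hypothetical policy: updates are allowed for "admin",
# or for anyone when the object's class is "test-gateway".
POLICY = ("and",
          ("eq", "request.verb", "update"),
          ("or",
           ("eq", "user.name", "admin"),
           ("eq", "object.class", "test-gateway")))
```

With only `request.verb` and `user.name` known, an update by `admin` evaluates to an unconditional yes, a delete evaluates to an unconditional no, and an update by another user leaves the residual `("eq", "object.class", "test-gateway")` to be checked once the object is available.
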
  25. Unify policy authoring for both authorization and admission Authorization RequestInfo

    UserInfo Kubernetes Path for write request Project Partial Evaluation Yes, No, Maybe 403 Policies Webhook Authentication @luxas.dev
  26. Unify policy authoring for both authorization and admission Authorization RequestInfo

    UserInfo Admission Control RequestInfo UserInfo Body Body Kubernetes Path for write request Project Partial Evaluation Yes, No, Maybe Full Evaluation Yes, No 403 403 Policies Webhooks Authentication Storage @luxas.dev
  27. Condition addition to SubjectAccessReview Allow a SAR client to opt-in

    to conditional authz through feature flags and e.g. an annotation on the SAR. The SAR server responds either “Allow + condition” or “Conditional + condition” ⇒ This means that the SAR server might want to loop until the end, to try to find an unconditional allow, instead of short-circuiting on a conditional. Moreover, multiple conditions might be ORed. Note: Today, this can all be handled out of core! But it’d be neater to have: a) the Kube API server enforce the condition, instead of requiring catch-all admission b) (possibly further down the road) some core API that can be “depended upon” @luxas.dev
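
The "loop until the end, OR the conditions" behaviour can be sketched as follows. Everything here is an assumption made for illustration: the `Decision` type, its `kind` strings, and the textual condition format are hypothetical and not part of the SubjectAccessReview API, and deny handling is deliberately left out.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

@dataclass(frozen=True)
class Decision:
    kind: str                        # "allow", "conditional", or "no_opinion"
    condition: Optional[str] = None  # condition expression text, for "conditional"

def combine(decisions: Iterable[Decision]) -> Decision:
    """Combine per-authorizer decisions without short-circuiting on a conditional.

    An unconditional allow anywhere in the chain wins. Otherwise all collected
    conditions are ORed together into a single conditional decision.
    """
    conditions = []
    for d in decisions:
        if d.kind == "allow":
            return Decision("allow")
        if d.kind == "conditional":
            conditions.append(d.condition)
    if conditions:
        return Decision("conditional", " || ".join(f"({c})" for c in conditions))
    return Decision("no_opinion")
```

For example, a chain that yields a conditional followed by an unconditional allow resolves to a plain allow, while two conditionals resolve to one conditional whose condition is the disjunction of both.
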
  28. (Cluster)RoleBinding (Cluster)Role ValidatingAdmissionPolicy Input Username, Group (Namespace) RoleRef APIGroup CombinedResource

    (Name) (Namespace) Username, Group UID, User Extra GVR Subresource Name Namespace GVK New + Old Object Ns Object Authorizer Operators == ==, In ==, != In, NotIn Prefix, Suffix Expression Fixed Fixed Arbitrary Scope Subject Object Object Applicability Reads, Writes, SAR, Custom Reads, Writes, SAR, Custom Writes Current state
  29. (Cluster)RoleBinding (Cluster)Role ValidatingAdmissionPolicy UnifiedAuthorization Input Username, Group (Namespace) RoleRef APIGroup

    CombinedResource (Name) (Namespace) Username, Group UID, User Extra GVR Subresource Name Namespace GVK New + Old Object Ns Object Authorizer Username, Group UID, User Extra APIGroup CombinedResource Name (!) (Namespace) (GVK) (New + Old Object) (Ns Object) Operators == ==, In ==, != In, NotIn Prefix, Suffix ==, != In, NotIn Prefix, Suffix Expression Fixed Fixed Arbitrary Arbitrary Scope Subject Object Object Object Applicability Reads, Writes, SAR, Custom Reads, Writes, SAR, Custom Writes Reads, Writes, SAR, Custom Example with one unified authz + admission API
  30. ConditionalRoleBinding ConditionalRole ValidatingAdmissionPolicy Input Username, Group UID, User Extra (Namespace)

    RoleRef Username, Group UID, User Extra APIGroup CombinedResource Name (Namespace) (GVK) (New + Old Object) (Ns Object) Username, Group UID, User Extra GVR GVK Subresource Name Namespace New + Old Object Ns Object Authorizer Operators ==, != In, NotIn Prefix, Suffix ==, != In, NotIn Prefix, Suffix ==, != In, NotIn Prefix, Suffix Expression Arbitrary Arbitrary Arbitrary Scope Subject Object Object Applicability Reads, Writes, SAR, Custom Reads, Writes, SAR, Custom Writes Example if we want a two-layer model
  31. UX for matching label and field selectors today Example from

    RBAC++: request.resourceAttributes.fieldSelector.requirements.exists(r, r.key == "type" && r.operator == "=" && sets.equivalent(r.values, ["mytype"])) How it would be written for admission: object.type == "mytype" The former is request-scoped, the latter is object-scoped. The latter is clearly more user-friendly. @luxas.dev
  32. Selector dimensionality Visual of two ORed allow rule conditions: (labels.env

    != “prod” && labels.owner in [“team-1”, “team-2”]) || (labels.env == “test”) There are 22 possible label selectors that would be allowed by these policies. How can we check that every object that could be returned from storage is authorized? @luxas.dev
  33. Selector dimensionality The naive way would be to perform one

    check per object that could be matched. E.g. “owner in (‘team-1’, ‘team-2’), env in (‘test’, ‘dev’)” selectors match 4 “archetypes” of objects In this case, authorized! @luxas.dev
  34. The naive way would be to perform one check per

    object that could be matched. E.g. “owner in (‘team-2’, ‘team-3’), env in (‘test’, ‘dev’)” selectors match 4 “archetypes” of objects In this case, not authorized! Selector dimensionality @luxas.dev Concrete counterexample:
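
The naive per-archetype check from the last two slides can be sketched in Python. The helper names are made up for this illustration; the policy is the pair of ORed allow conditions from the "Selector dimensionality" slide, and only `in`-style selectors are handled.

```python
from itertools import product

def is_authorized(labels):
    # The two ORed allow-rule conditions from the earlier slide.
    return (labels["env"] != "prod" and labels["owner"] in ("team-1", "team-2")) \
        or labels["env"] == "test"

def archetypes(selector):
    """Expand an `in`-only selector {key: [values]} into every label combination."""
    keys = sorted(selector)
    return [dict(zip(keys, combo)) for combo in product(*(selector[k] for k in keys))]

def first_counterexample(selector):
    """Return the first selected archetype that is not authorized, or None."""
    for labels in archetypes(selector):
        if not is_authorized(labels):
            return labels
    return None

# "owner in ('team-1', 'team-2'), env in ('test', 'dev')"
SELECTOR_OK = {"owner": ["team-1", "team-2"], "env": ["test", "dev"]}
# "owner in ('team-2', 'team-3'), env in ('test', 'dev')"
SELECTOR_BAD = {"owner": ["team-2", "team-3"], "env": ["test", "dev"]}
```

Both selectors expand to 4 archetypes. For the first, no counterexample exists, so the request is authorized. For the second, this sketch finds the archetype with `owner=team-3` and `env=dev`, which neither allow condition covers.
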
  35. However, explicit enumeration doesn’t work with NotExists, !=, NotIn @luxas.dev

    Because then the number of possibly selected objects is infinite
  36. Unify policy authoring targeting selectors for reads and writes Authorization

    Kubernetes Path for read request Project Full Evaluation Yes, No 403 Authentication Storage Example selectors: “Label owner in (‘team-1’, ‘team-2’), env in (‘test’, ‘dev’)” “Field .spec.gatewayClassName != ‘production’” @luxas.dev Policies RequestInfo UserInfo Selectors
  37. Unify policy authoring targeting selectors for reads and writes Authorization

    RequestInfo UserInfo Selectors Kubernetes Path for read request Project 1. Partial Evaluation => Yes, No, Maybe 2. If Maybe, turn Selectors and Residual into SMT => Yield Yes or No 403 Policies Authentication Storage Authorize IFF: ∀o : objectSelected(o) ⇒ isAuthorized(o) ≡ ∃o : objectSelected(o) ∧ ¬isAuthorized(o) = UNSAT @luxas.dev
  38. Example encoding isAuthorized(o) = (o.labels.env != “prod” && o.labels.owner in

    [“team-1”, “team-2”]) || (o.labels.env == “test”) objectSelected(o) = o.labels.env in [“test“, “dev”] && o.labels.owner in [“team-1”, “team-2”] ∀o: objectSelected(o) ⇒ isAuthorized(o) IFF ∃o: objectSelected(o) ∧ ¬isAuthorized(o) = UNSAT Result is UNSAT => Request authorized!
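
The counterexample formulation, ∃o: objectSelected(o) ∧ ¬isAuthorized(o), can be imitated in Python without a solver. This is a stand-in technique, not the SMT encoding the talk proposes: it assumes predicates only compare labels against named constants, so one placeholder value (`FRESH`) can represent every unnamed string and `None` can represent an absent label. The selector tuple format and the treatment of an absent `env` label as "not authorized" are also assumptions of this sketch.

```python
from itertools import product

FRESH = "<any-other-value>"  # stands for every string not named in policy or selector

def is_authorized(labels):
    env, owner = labels.get("env"), labels.get("owner")
    # Assumption: a missing env label does not satisfy `env != "prod"`.
    return (env is not None and env != "prod" and owner in ("team-1", "team-2")) \
        or env == "test"

def object_selected(labels, selector):
    """Label requirements as (key, "in" | "notin", values); notin also matches a missing key."""
    for key, op, values in selector:
        value = labels.get(key)
        if op == "in" and value not in values:
            return False
        if op == "notin" and value in values:
            return False
    return True

def find_counterexample(selector, constants):
    """Search for an object that is selected but not authorized.

    constants maps each label key to the values named by the policy or selector.
    Returning None plays the role of the solver answering UNSAT.
    """
    keys = sorted(constants)
    domains = [list(constants[k]) + [FRESH, None] for k in keys]
    for combo in product(*domains):
        labels = {k: v for k, v in zip(keys, combo) if v is not None}
        if object_selected(labels, selector) and not is_authorized(labels):
            return labels
    return None

CONSTANTS = {"env": ["prod", "test", "dev"], "owner": ["team-1", "team-2"]}
# The selector from the example encoding above.
SLIDE_SELECTOR = [("env", "in", ["test", "dev"]), ("owner", "in", ["team-1", "team-2"])]
# A selector that explicit enumeration cannot handle directly.
NOTIN_SELECTOR = [("env", "notin", ["prod"])]
```

For the selector from the slide the search finds nothing, matching the UNSAT result. For the hypothetical `env notin (prod)` selector it finds an object with `env=dev` and an unnamed owner, so that request would not be authorized.
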
  41. Maintains a decidable encoding into Satisfiability Modulo Theories Open Source

    Authorization Engine @luxas.dev Aims to be expressive, fast, safe, and analyzable AWS is donating Cedar to the CNCF Supports RBAC, ReBAC and ABAC paradigms
  42. Kubernetes API Server /openapi/v3/<group> API Discovery Document /apis/<group>/<version> 1. Improve

    policy authoring usability with typed schema Project Schema IDE Dev loop @luxas.dev
  43. 2. Unify policy authoring for both authorization and admission Previous

    example shown in the project’s proposed syntax. Only one policy object is needed, not three like before. @luxas.dev
  44. 3. Unify policy authoring targeting selectors for reads and writes

    The last example, but for any action, including reads. Predicates targeting resource.stored determine if a concrete object is allowed to be read from storage. @luxas.dev
  45. Use CEL x Cedar intersection to allow both “frontends” CEL

    Cedar str.contains str.startsWith str.endsWith uint int bool string bytes* null map** list double has size*** () . [] {} - (unary) ! * / % + - (binary) == != < > <= >= in && || ?: Cedar is reducible to SMT in open source, thus AuthzCEL → Cedar → SMT
  46. 5. Write backend once, use for multiple “frontends” Kubernetes CEL

    (portion w/o loops) Kubernetes RBAC New Selector-based Authorization paradigm? New Multi-cluster Policies? Project SMT Solvers @luxas.dev Policies Engine
  47. Takeaways I think (happy to be proven wrong now

    rather than later) that conditional authorization is feasible to move forward upstream (to KEP) in a form like this. If we restrict ourselves to SMT-analyzable expressions, users can interface with either CEL or Cedar, or something else that they prefer. Privilege escalation analysis (policy ordering) might help us catch unexpected things like “I can edit an object such that it becomes readable for me”. Cedar policies are nice to write in that they provide an instant IDE validation flow. Users can be provided with a uniform experience across reads/writes and authorization/admission through this primitive. Cedar could offload some of the complexities here, like analysis, if we want / need. @luxas.dev
  48. Future Work / Ideas Is it worth integrating some of

    the parameter ideas of VAP into this, or does that get too complex? Sometimes, a semantic property is deeply nested in the object, and the conditional authorization layer (without loops) is unable to check it. Should we recommend that API authors in this case add some kind of “top-level” flag/enum on the object, which allows enabling the privileged behavior, and then enforce this in validation?
  49. Ready-made answers to assumed questions This does NOT make authorization

    decisions dependent on object state. Authorization here is still dependent only on policies. This should work for API aggregation use-cases as well, even though the Kube API server doesn’t have access to the request body. We should still recommend that people design their APIs for specific personas, and not subdivide namespaces. We might want some way to enforce immutability once set for selectable fields and labels. Cedar released a CLI for policy analysis. Cedar is written in Rust, but provides FFI and Wasm bindings.