Purpose
kleym is a Kubernetes operator that makes inference workloads legible to workload identity by translating inference intent into deterministic SPIFFE identities. It compiles that intent into SPIRE Controller Manager resources, primarily ClusterSPIFFEID.
Scope boundary: kleym is an identity registration compiler. It stops at identity registration and selector provenance. Inference deployment, traffic routing, and policy evaluation stay in the inference stack, gateway, mesh, or external policy engines. kleym does not configure Envoy, Envoy Gateway, kgateway, ext authz, ext proc, OPA, Cedar, OAuth, or OIDC.
Core Problem
Inference stacks can be deployed reliably, but identity registration remains manual, inconsistent, and hard to standardize across teams. GAIE defines inference specific objects and clearer responsibility boundaries, but it does not define identity semantics. kleym bridges this gap by deriving stable SPIFFE ID templates and tenant safe selectors from GAIE resources.
Core Value
- Deterministic identities derived from GAIE metadata rather than ad hoc labels.
- A single namespaced control surface for identity intent that works across heterogeneous inference stacks that share GAIE semantics.
- Low operational risk by delegating issuance and rotation to SPIRE Controller Manager.
Dependencies
- SPIRE Server and SPIRE Agent.
- SPIRE Controller Manager and its ClusterSPIFFEID CRD. kleym writes ClusterSPIFFEID and does not write SPIRE entries directly.
Supported Downstream Pattern
- SPIRE issues X.509 SVIDs and/or JWT SVIDs.
- Envoy consumes identity through SDS or through components adjacent to the SPIRE Workload API.
- External auth policy maps SPIFFE ID to route or model authorization.
- Audit logging happens at the gateway or policy layer, not in kleym.
Preferred Inference Signal
GAIE v1 objects are the primary signal.
InferenceObjective is the primary model level object and references an InferencePool via poolRef. InferencePool defines the serving pod pool for inference traffic. InferenceModel is treated as legacy.
Identity Model
- Pool identity (PoolOnly). One SPIFFE identity representing the serving pool pods.
- Objective identity (PerObjective). One SPIFFE identity per InferenceObjective, representing the model endpoint at the GAIE layer even when multiple objectives share the same pool.
PoolOnly and PerObjective are the only identity boundaries in kleym.
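As a rough illustration of the two boundaries, the sketch below derives one ID for a pool and one ID per objective. The path layout (/ns/<namespace>/pool/<name> and /ns/<namespace>/objective/<name>), the function names, and the example trust domain are assumptions made for illustration only; this document states just that the default template includes trust domain, namespace, and objective name.

```python
def pool_spiffe_id(trust_domain: str, namespace: str, pool: str) -> str:
    """PoolOnly: one identity for the serving pool pods (assumed path layout)."""
    return f"spiffe://{trust_domain}/ns/{namespace}/pool/{pool}"


def objective_spiffe_id(trust_domain: str, namespace: str, objective: str) -> str:
    """PerObjective: one identity per InferenceObjective (assumed path layout)."""
    return f"spiffe://{trust_domain}/ns/{namespace}/objective/{objective}"


# Two hypothetical objectives that share one pool still get distinct IDs,
# and the same inputs always yield the same ID.
chat_id = objective_spiffe_id("example.org", "team-a", "chat")
embed_id = objective_spiffe_id("example.org", "team-a", "embed")
```

Because the ID is a pure function of its inputs, recomputing it on every resync cannot drift.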
Container Level Enforcement
One model per container makes model identity enforceable. SPIRE Kubernetes workload attestation supports container scoped selectors such as container name and container image, so kleym can bind an objective identity to a specific container inside a pod rather than the pod as a whole. This is the mechanism that gives PerObjective mode meaningful discrimination when multiple objectives share a pool.
Constraint
Multiple ClusterSPIFFEID resources can select the same pod set, which can result in multiple identities applying to the same pods. Some workloads only support one SVID reliably. Clusters that require per objective identities may need to restrict or disable any default identity that would collide.
Some downstream consumers and sidecars behave as single identity consumers. Multiple matching ClusterSPIFFEID objects may be valid from SPIRE's perspective but still operationally unsafe for a given serving stack. kleym only prevents deterministic collision cases it can prove. Cluster operators remain responsible for disabling overlapping default identities outside kleym.
MVP API Surface
External CRDs consumed
- GAIE InferencePool
- GAIE v1 InferenceObjective
kleym CRD
InferenceIdentityBinding expresses identity intent for a single InferenceObjective.
InferenceIdentityBinding spec
- targetRef references an InferenceObjective in the same namespace.
- spiffeIDTemplate optionally overrides the computed template. Default is deterministic and includes trust domain, namespace, and objective name.
- selectorSource is "DerivedFromPool". kleym derives pod selection from the objective poolRef and validates it.
- workloadSelectorTemplates are required safety constraints. Every rendered ClusterSPIFFEID must include at minimum the k8s namespace selector (k8s:ns:<namespace>) and k8s service account selector (k8s:sa:<service-account>). These safety selectors are always present, then intersected with the derived pool selection and, in PerObjective mode, the container discriminator.
- mode is "PoolOnly" or "PerObjective". Default is "PerObjective".
- containerDiscriminator (required when mode is "PerObjective").
  - type is "ContainerName" (preferred) or "ContainerImage" (fallback). ContainerName maps to a SPIRE k8s workload selector k8s:container-name:<value>. ContainerImage maps to k8s:container-image:<value> and is weaker because a single image may serve multiple models.
  - value is the container name or image reference to match.
  - The container discriminator narrows the selected workload set so that each objective identity targets exactly one container within the pod.
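The selector rules above can be sketched as a small pure function. This is a minimal illustration, not kleym's implementation: the names render_selectors and ContainerDiscriminator and the example values are hypothetical, and the derived pool selection is left out because this document does not say how it is encoded.

```python
from dataclasses import dataclass
from typing import List, Optional

# Mapping from discriminator type to SPIRE k8s selector prefix, as described above.
DISCRIMINATOR_PREFIX = {
    "ContainerName": "k8s:container-name",
    "ContainerImage": "k8s:container-image",
}


@dataclass(frozen=True)
class ContainerDiscriminator:
    type: str   # "ContainerName" (preferred) or "ContainerImage" (fallback)
    value: str  # container name or image reference to match


def render_selectors(
    namespace: str,
    service_account: str,
    mode: str = "PerObjective",
    discriminator: Optional[ContainerDiscriminator] = None,
) -> List[str]:
    """Return the safety selectors plus, in PerObjective mode, the discriminator."""
    # The namespace and service account selectors are always present.
    selectors = [f"k8s:ns:{namespace}", f"k8s:sa:{service_account}"]
    if mode == "PoolOnly":
        return selectors
    if mode != "PerObjective":
        raise ValueError(f"unknown mode: {mode!r}")
    if discriminator is None:
        raise ValueError("containerDiscriminator is required in PerObjective mode")
    prefix = DISCRIMINATOR_PREFIX.get(discriminator.type)
    if prefix is None:
        raise ValueError(f"unknown discriminator type: {discriminator.type!r}")
    selectors.append(f"{prefix}:{discriminator.value}")
    return selectors
```

For example, render_selectors("team-a", "vllm", "PerObjective", ContainerDiscriminator("ContainerName", "llama")) yields the two safety selectors followed by k8s:container-name:llama.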
InferenceIdentityBinding status
- computedSpiffeIDs lists identities created, including pool and objective identities.
- renderedSelectors shows the final selectors applied to ClusterSPIFFEID.
- conditions include Ready, Conflict, InvalidRef, UnsafeSelector, RenderFailure. The Conflict condition uses reason IdentityCollision when two objectives in PerObjective mode resolve to the same pod set and the same container name.
Controller Behavior
- Watch InferenceIdentityBinding, InferenceObjective, and InferencePool.
- Resolve targetRef to InferenceObjective, then resolve poolRef to InferencePool.
- Derive pod selection from InferencePool, then intersect with the mandatory safety selectors (namespace and service account) and, in PerObjective mode, the container discriminator.
- Detect identity collisions: if two InferenceIdentityBinding resources in PerObjective mode would match the same pod set and the same container-name value, set the Conflict condition with reason IdentityCollision on both resources and refuse to reconcile either until the collision is resolved.
- Reconcile one or more ClusterSPIFFEID resources in spire.spiffe.io using the computed SPIFFE IDs and validated selectors.
- Update status and emit events for conflicts, unsafe selection, identity collisions, and render failures.
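The final reconcile step can be pictured as rendering a ClusterSPIFFEID manifest from already validated inputs. The sketch below is illustrative only. The spec field names follow the ClusterSPIFFEID CRD in SPIRE Controller Manager, but the resource name, the use of the kubernetes.io/metadata.name namespace label for scoping, and the helper names are assumptions rather than documented kleym behavior.

```python
import json
from typing import Dict, List


def render_cluster_spiffe_id(
    name: str,
    namespace: str,
    spiffe_id: str,
    pool_match_labels: Dict[str, str],
    selectors: List[str],
) -> dict:
    """Build a ClusterSPIFFEID manifest as a plain dict (hypothetical helper)."""
    return {
        "apiVersion": "spire.spiffe.io/v1alpha1",
        "kind": "ClusterSPIFFEID",
        "metadata": {"name": name},
        "spec": {
            "spiffeIDTemplate": spiffe_id,
            # Assumption: pin the namespace through its well-known name label.
            "namespaceSelector": {
                "matchLabels": {"kubernetes.io/metadata.name": namespace}
            },
            # Assumption: pod selection derived from the InferencePool selector.
            "podSelector": {"matchLabels": dict(pool_match_labels)},
            "workloadSelectorTemplates": list(selectors),
        },
    }


def canonical(manifest: dict) -> str:
    """Serialize with sorted keys so identical intent yields identical bytes."""
    return json.dumps(manifest, sort_keys=True)
```

Comparing the canonical form of the desired manifest with the one already in the cluster is one way to make resync a no-op when nothing has changed.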
Multi Tenant Safety
- InferenceIdentityBinding is namespaced and only references objects in the same namespace.
- Derived selectors must be proven to stay within the namespace. If they can match outside, reconciliation is refused and UnsafeSelector is set.
- Ambiguous bindings where the derived selection cannot be proven to correspond to the referenced pool are refused.
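One simple way to picture the namespace check is a guard over the rendered selectors. The sketch below is a deliberately conservative approximation under stated assumptions: the function name is hypothetical, and the actual proof that a selection stays within the namespace is likely more involved than string matching.

```python
from typing import List, Optional

NS_PREFIX = "k8s:ns:"
SA_PREFIX = "k8s:sa:"


def unsafe_selector_reason(namespace: str, selectors: List[str]) -> Optional[str]:
    """Return a human-readable reason if the selectors are unsafe, else None."""
    ns_values = {s[len(NS_PREFIX):] for s in selectors if s.startswith(NS_PREFIX)}
    # Exactly one namespace may be pinned, and it must be the binding's own.
    if ns_values != {namespace}:
        return "selectors must pin exactly the binding namespace"
    # A non-empty service account selector is also mandatory.
    if not any(s.startswith(SA_PREFIX) and len(s) > len(SA_PREFIX) for s in selectors):
        return "selectors must pin a service account"
    return None
```

A non-None result would correspond to refusing reconciliation and setting the UnsafeSelector condition.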
Multiple Objectives to One Pod Set
GAIE commonly maps multiple objectives to one pool. In kleym, the pool defines where a model runs and the objective defines what the model is. kleym can produce distinct objective identities while selectors still target the same pods.
When two objectives share a pool, the container discriminator is what keeps their identities distinct. Each objective must point to a different container name within the pod. If two objectives resolve to the same pod set and the same container-name, reconciliation is refused on both with reason IdentityCollision until the conflict is corrected, for example by assigning each model to its own container.
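The collision rule can be sketched as grouping bindings by pod set and container name. This is an illustration under assumptions: ResolvedBinding and find_identity_collisions are hypothetical names, "same pod set" is approximated by an identical namespace and pool label selector, and only ContainerName discriminators are considered.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, NamedTuple, Tuple


class ResolvedBinding(NamedTuple):
    name: str
    namespace: str
    mode: str                               # "PoolOnly" or "PerObjective"
    pool_labels: FrozenSet[Tuple[str, str]]  # stand-in for the resolved pod set
    container_name: str


def find_identity_collisions(bindings: Iterable[ResolvedBinding]) -> Dict[str, str]:
    """Map each colliding binding name to the reason IdentityCollision."""
    groups = defaultdict(list)
    for b in bindings:
        if b.mode == "PerObjective":
            key = (b.namespace, b.pool_labels, b.container_name)
            groups[key].append(b.name)
    conflicts = {}
    for names in groups.values():
        if len(names) > 1:
            # Every member of a colliding group is refused, not just the latest.
            for name in names:
                conflicts[name] = "IdentityCollision"
    return conflicts
```

Bindings that share a pool but name different containers fall into different groups and are left untouched.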
Acceptance Criteria
- In a cluster with existing GAIE InferencePool and InferenceObjective resources, creating an InferenceIdentityBinding creates stable ClusterSPIFFEID resources and remains stable under resync.
- Multiple objectives referencing one pool produce distinct SPIFFE IDs without unsafe selector expansion.
- Overly broad or malicious selector expansion is rejected with clear status conditions.
- kleym does not create or modify inference deployments, pools, routes, Gateway, HTTPRoute, or schedulers.
Packaging
- Helm chart includes kleym CRDs and controller deployment.
- License is Apache 2.0.