Operator Spec

kleym-operator is an identity registration compiler. It translates inference intent into deterministic Secure Production Identity Framework for Everyone (SPIFFE) identities and writes SPIFFE Runtime Environment (SPIRE) Controller Manager ClusterSPIFFEID resources.

Kleym stops at identity registration. It does not deploy inference workloads, route traffic, configure gateways, evaluate request policy, issue credentials, or write SPIRE registration entries directly.

Scope

Kleym owns InferenceIdentityBinding, GAIE input resolution, SPIFFE ID rendering, selector safety, managed ClusterSPIFFEID reconciliation, status, events, and finalizer cleanup.

Kleym does not own inference workloads, schedulers, routes, gateways, serving behavior, Envoy, OPA, OAuth, OIDC, SPIRE Server, SPIRE Agent, credential issuance, authorization, or audit decisions.

Dependency facts live in Dependencies. Supported GAIE inputs live in GAIE Compatibility.

Operator Configuration

kleym-operator requires install-level identity configuration at startup:

| Flag | Required | Behavior |
| --- | --- | --- |
| --trust-domain=<value> | yes | Sets the SPIRE Server trust domain used when rendering every SPIFFE ID. The value must not include spiffe://, must not contain /, and must not include leading or trailing whitespace. |
| --clusterspiffeid-class-name=<value> | no | Sets spec.className on every managed ClusterSPIFFEID. When empty, Kleym omits spec.className and keeps classless output. |
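
The three --trust-domain constraints can be checked with a few string tests. The sketch below is illustrative only: the function name validate_trust_domain is hypothetical, and the treatment of an empty value as an error is an assumption drawn from the flag being required. It is not Kleym's actual startup code.

```python
def validate_trust_domain(value: str) -> str:
    """Return value unchanged if it passes the documented checks.

    Hypothetical helper: the name and the error messages are
    assumptions, only the three rules come from the flag table.
    """
    if value == "":
        # The flag is required, so an empty value is rejected here.
        raise ValueError("--trust-domain is required")
    if value != value.strip():
        raise ValueError("trust domain must not have leading or trailing whitespace")
    if "spiffe://" in value:
        raise ValueError("trust domain must not include spiffe://")
    if "/" in value:
        raise ValueError("trust domain must not contain /")
    return value
```

Under these assumptions, validate_trust_domain("example.org") passes, while "spiffe://example.org", "example.org/ns", and " example.org" are each rejected.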

trustDomain and ClusterSPIFFEID class are deployment concerns, not per-binding inference identity intent. They are not fields on InferenceIdentityBinding.

When --clusterspiffeid-class-name is empty, SPIRE Controller Manager must be configured to watch classless ClusterSPIFFEID resources, for example by enabling its watchClassless setting. When a class name is set, SPIRE Controller Manager must be configured to watch that class.
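
The class-name rule amounts to writing spec.className when the flag is set and leaving it out otherwise. A minimal sketch, assuming the rendered spec is held as a plain dictionary; the helper name apply_class_name is hypothetical:

```python
def apply_class_name(spec: dict, class_name: str) -> dict:
    """Return a copy of spec with className set or omitted.

    Hypothetical helper: only the className field and the
    omit-when-empty rule come from the spec text.
    """
    out = dict(spec)
    if class_name:
        out["className"] = class_name
    else:
        # Classless output: the key is absent, not set to "".
        out.pop("className", None)
    return out
```

Omitting the key instead of writing an empty string keeps classless output distinguishable from a resource that names a class.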

API Contract

InferenceIdentityBinding is namespaced. Pool and objective references stay in that namespace.

  1. poolRef references one InferencePool. The pool is the required workload anchor and selector provenance source.
  2. objectiveRef references one InferenceObjective. It is required for PerObjective; the objective must reference the same pool as poolRef.
  3. mode is PoolOnly or PerObjective. These are the only identity boundaries. The default is PerObjective.
  4. serviceAccountName is required. Kleym renders safety selectors internally as k8s:ns:<binding namespace> and k8s:sa:<serviceAccountName>.
  5. SPIFFE IDs are always deterministic under the configured trust domain:
    • PoolOnly: spiffe://<trustDomain>/ns/<namespace>/pool/<pool-name>
    • PerObjective: spiffe://<trustDomain>/ns/<namespace>/objective/<objective-name>
  6. containerName is required for PerObjective and must be empty for PoolOnly.
  7. Status records computedSpiffeIDs, renderedSelectors, and conditions. Conditions include Ready, Conflict, InvalidRef, UnsafeSelector, and RenderFailure.
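
The mode rules and the two SPIFFE ID templates above can be sketched as follows. BindingSpec, validate, and render_spiffe_id are hypothetical names, and the flat field layout is a simplification of the real resource. Only the rules and path templates come from the contract.

```python
from dataclasses import dataclass
from typing import List, Optional

MODES = ("PoolOnly", "PerObjective")


@dataclass
class BindingSpec:
    # Simplified, hypothetical view of an InferenceIdentityBinding.
    namespace: str
    pool_name: str
    service_account_name: str
    mode: str = "PerObjective"  # documented default
    objective_name: Optional[str] = None
    container_name: str = ""


def validate(spec: BindingSpec) -> List[str]:
    """Collect violations of the documented mode rules."""
    errors = []
    if spec.mode not in MODES:
        errors.append("mode must be PoolOnly or PerObjective")
    if not spec.service_account_name:
        errors.append("serviceAccountName is required")
    if spec.mode == "PerObjective":
        if not spec.objective_name:
            errors.append("objectiveRef is required for PerObjective")
        if not spec.container_name:
            errors.append("containerName is required for PerObjective")
    elif spec.mode == "PoolOnly" and spec.container_name:
        errors.append("containerName must be empty for PoolOnly")
    return errors


def render_spiffe_id(trust_domain: str, spec: BindingSpec) -> str:
    """Render the deterministic SPIFFE ID for a validated spec."""
    base = f"spiffe://{trust_domain}/ns/{spec.namespace}"
    if spec.mode == "PoolOnly":
        return f"{base}/pool/{spec.pool_name}"
    return f"{base}/objective/{spec.objective_name}"
```

For example, a PoolOnly binding in namespace team-a for pool llama under trust domain example.org renders spiffe://example.org/ns/team-a/pool/llama.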

Field details live in API Reference. Condition details live in Conditions Reference.

Required Behavior

  1. Discover supported GAIE pool and objective GVKs served by the cluster and watch only that subset.
  2. Fail startup when no supported InferencePool GVK is available. Objective GVKs are optional for PoolOnly.
  3. Resolve poolRef and objectiveRef only to documented supported GAIE groups.
  4. Derive pod selection from the referenced pool, then combine it with internal namespace and service-account safety selectors and, in PerObjective mode, k8s:container-name:<containerName>.
  5. Refuse unsafe selectors. If the selector set cannot be proven to stay within the binding namespace and required service account boundary, set UnsafeSelector and produce no managed output.
  6. Render the SPIFFE ID and managed ClusterSPIFFEID shape deterministically. Rendered output fields are documented in Managed Resources.
  7. Refuse identity collisions. If two PerObjective bindings would match the same pod set and the same container-name value, set Conflict=True with reason IdentityCollision on both resources and reconcile neither until the collision is resolved.
  8. Treat missing required CRDs and infrastructure-not-ready states as transient by retrying reconciliation on a timer.
  9. On deletion, delete managed ClusterSPIFFEID children first and keep the binding finalizer until a follow-up list confirms no managed children remain.
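
Steps 4 and 5 can be illustrated with a small selector sketch. The selector strings follow the formats given in this spec. The function names, and the specific check that exactly one matching namespace value and one matching service-account value are present, are assumptions made for illustration. The proof Kleym actually performs is described in Selector Safety.

```python
from typing import List, Optional


def render_selectors(namespace: str, service_account_name: str,
                     mode: str, container_name: str = "") -> List[str]:
    """Build the internal safety selectors for one binding."""
    selectors = [f"k8s:ns:{namespace}", f"k8s:sa:{service_account_name}"]
    if mode == "PerObjective":
        selectors.append(f"k8s:container-name:{container_name}")
    return selectors


def _values(selectors: List[str], prefix: str) -> set:
    # Collect the value part of every selector with this prefix.
    return {s[len(prefix):] for s in selectors if s.startswith(prefix)}


def unsafe_reason(selectors: List[str], namespace: str,
                  service_account_name: str) -> Optional[str]:
    """Return a reason string if the set escapes the boundary, else None.

    Assumed check: the set must pin exactly the binding namespace and
    exactly the required service account.
    """
    if _values(selectors, "k8s:ns:") != {namespace}:
        return "selectors do not pin the binding namespace"
    if _values(selectors, "k8s:sa:") != {service_account_name}:
        return "selectors do not pin the required service account"
    return None
```

A non-None result corresponds to setting UnsafeSelector and producing no managed output.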

Collision behavior is expanded in Collision Detection. Selector rationale is expanded in Selector Safety.
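
One way to picture the collision rule is to group PerObjective bindings by a key and flag every group with more than one member. The sketch below approximates "same pod set" as identical pod-selector labels within one namespace. That approximation, the dictionary layout, and the name find_collisions are all assumptions. The actual comparison is described in Collision Detection.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def find_collisions(bindings: Iterable[Dict]) -> Set[str]:
    """Return names of PerObjective bindings that collide.

    Each binding is a dict with keys: name, namespace, mode,
    pod_selector (label dict), container_name. Hypothetical shape.
    """
    groups = defaultdict(list)
    for b in bindings:
        if b["mode"] != "PerObjective":
            continue  # the rule is stated for PerObjective bindings
        key = (
            b["namespace"],
            frozenset(b["pod_selector"].items()),
            b["container_name"],
        )
        groups[key].append(b["name"])

    colliding = set()
    for names in groups.values():
        if len(names) > 1:
            # Every member is flagged, so neither side is reconciled.
            colliding.update(names)
    return colliding
```

Each returned name would receive Conflict=True with reason IdentityCollision.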

Safety Invariants

  1. InferenceIdentityBinding is namespaced.
  2. Pool and objective references stay in the binding namespace.
  3. PoolOnly and PerObjective are the only identity boundaries.
  4. Unsafe selectors are refused.
  5. Identity collisions are refused with Conflict reason IdentityCollision.
  6. Deletion keeps the finalizer until managed children are gone.
  7. kleym-operator does not create or modify inference deployments, pools, routes, gateways, schedulers, or policy resources.
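
The deletion invariant (delete children, list again, and only then release the finalizer) can be sketched against injected list and delete callables. The finalizer string and the function name are hypothetical, and a real controller would use a Kubernetes client in place of these callables.

```python
from typing import Callable, List, Set

# Hypothetical finalizer name, for illustration only.
FINALIZER = "kleym.example/cleanup"


def reconcile_deletion(finalizers: Set[str],
                       list_children: Callable[[], List[str]],
                       delete_child: Callable[[str], None]) -> bool:
    """Run one deletion pass. Return True once the finalizer is removed."""
    # Delete managed ClusterSPIFFEID children first.
    for child in list_children():
        delete_child(child)

    # Follow-up list: release the finalizer only when nothing remains.
    if list_children():
        return False  # keep the finalizer and retry later

    finalizers.discard(FINALIZER)
    return True
```

Returning False while children remain keeps the binding in place, so a delete that has been requested but not yet completed cannot orphan a managed ClusterSPIFFEID.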
