OKE Workload Identity, from the Wire Up (in Rust)

In part one I covered instance principals: the metadata service hands your VM a leaf certificate, you federate it with Auth, and you sign requests with a fresh session keypair. The mechanism is elegant. The sharp edges are real. It works.

It also authenticates the compute instance. In Kubernetes, that is too coarse. Ten pods on the same worker node would all share the node’s identity.

OKE workload identity moves the principal boundary down to the Kubernetes service account. Each workload is identified by cluster, namespace, and service account, then authorized by OCI IAM policy. Same dance as instance principals: bootstrap, exchange, sign. Different bootstrap credential, different exchange endpoint.

I needed this for a Rust service using OCI Queue from OKE, which is part of why I built the SDK in the first place.

The OKE setup

I am not going to repeat Oracle’s setup docs. The short version:

Use an enhanced OKE cluster.
Create a Kubernetes service account.
Grant OCI permissions with an IAM policy constrained to that workload identity.
Run the pod with serviceAccountName set.

The policy shape is the important bit. There is no dynamic group here; OCI evaluates the workload identity directly:

Allow any-user to <verb> <resource> in compartment apps where all {
  request.principal.type = 'workload',
  request.principal.namespace = 'default',
  request.principal.service_account = 'queue-sender',
  request.principal.cluster_id = 'ocid1.cluster.oc1.iad...'
}
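To make the mapping from workload to policy explicit, here is a small sketch that renders that statement from the three identity fields. `WorkloadIdentity` and `policy_statement` are hypothetical helpers for illustration only; they are not part of the SDK or of any OCI tooling.

```rust
/// Hypothetical helper: the three fields OCI IAM matches on for a workload.
struct WorkloadIdentity<'a> {
    cluster_id: &'a str,
    namespace: &'a str,
    service_account: &'a str,
}

/// Renders a policy statement in the shape shown above.
/// `verb`, `resource`, and `compartment` are passed through untouched.
fn policy_statement(
    id: &WorkloadIdentity,
    verb: &str,
    resource: &str,
    compartment: &str,
) -> String {
    format!(
        "Allow any-user to {} {} in compartment {} where all {{\n  \
         request.principal.type = 'workload',\n  \
         request.principal.namespace = '{}',\n  \
         request.principal.service_account = '{}',\n  \
         request.principal.cluster_id = '{}'\n}}",
        verb, resource, compartment, id.namespace, id.service_account, id.cluster_id,
    )
}
```

Generating the statement from the same values you put in the Deployment is one way to keep the namespace and service account names from drifting apart.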

The pod spec is ordinary Kubernetes:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: queue-sender
  namespace: default
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: queue-sender
spec:
  template:
    spec:
      serviceAccountName: queue-sender
      automountServiceAccountToken: true
      containers:
      - name: app
        image: your-image
        env:
        - name: OCI_QUEUE_ID
          value: "ocid1.queue.oc1..example"
        - name: OCI_RESOURCE_PRINCIPAL_REGION
          value: "us-ashburn-1"

After that, the kubelet does the bootstrap work. It projects a signed service account JWT into the pod filesystem at /var/run/secrets/kubernetes.io/serviceaccount/token, and it mounts the cluster CA at /var/run/secrets/kubernetes.io/serviceaccount/ca.crt.
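Reading the bootstrap credential is plain file I/O. A minimal std-only sketch, assuming the default path above; `read_sa_token` and `read_default_sa_token` are illustrative names, not the SDK's API:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Default projected token path named in the text above.
const SA_TOKEN_PATH: &str = "/var/run/secrets/kubernetes.io/serviceaccount/token";

/// Reads the projected token from disk and trims surrounding whitespace.
/// Call this before every exchange instead of caching the string,
/// because the kubelet rotates the file in place.
fn read_sa_token(path: &Path) -> io::Result<String> {
    let raw = fs::read_to_string(path)?;
    let token = raw.trim();
    if token.is_empty() {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "service account token file is empty",
        ));
    }
    Ok(token.to_string())
}

/// Convenience wrapper for the default in-pod location.
fn read_default_sa_token() -> io::Result<String> {
    read_sa_token(Path::new(SA_TOKEN_PATH))
}
```

Taking the path as a parameter keeps the function testable outside a pod.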

The dance, in three movements

┌──────────┐    ┌──────────────┐    ┌──────────────┐    ┌─────────────┐
│ Kubelet  │───▶│ Pod FS       │───▶│ Proxymux     │───▶│ Any OCI     │
│          │    │ /var/run/... │    │ :12250       │    │ service     │
└──────────┘    └──────────────┘    └──────────────┘    └─────────────┘
 projected       SA JWT read         session token       signed call
 SA JWT          by app              bound to            (session key,
                                      session public key   ST$ token keyId)

Bootstrap. Read the Kubernetes-projected service account JWT from disk.
Exchange. POST that JWT plus a fresh session public key to proxymux on port 12250.
Sign. Use the returned session token and the matching session private key for OCI service calls.

Movement 1: read the projected token

sequenceDiagram
    autonumber
    participant App as Your Rust process
    participant FS as Pod filesystem

    App->>FS: read /var/run/secrets/kubernetes.io/<br/>serviceaccount/token
    FS-->>App: service account JWT<br/>(rotated by kubelet)
    Note over App: Check JWT expiry before<br/>calling proxymux

No network call yet. The kubelet already mounted the token. The app reads it, checks that it has three dot-separated JWT parts and an exp claim in the future, then moves on.

That local expiry check matters:

fn validate_sa_token(token: &str) -> Result<(), AuthError> {
    let parts: Vec<&str> = token.split('.').collect();
    if parts.len() != 3 {
        return Err(AuthError::MetadataError(
            "SA token is not a valid JWT (expected 3 parts)".into(),
        ));
    }

    let payload = BASE64_URL
        .decode(parts[1])
        .or_else(|_| BASE64.decode(parts[1]))
        .map_err(|e| AuthError::MetadataError(
            format!("failed to decode SA token payload: {e}")
        ))?;

    let claims: serde_json::Value = serde_json::from_slice(&payload)?;
    let exp = claims["exp"].as_u64()
        .ok_or_else(|| AuthError::MetadataError(
            "SA token has no 'exp' claim".into()
        ))?;

    let now = SystemTime::now()
        .duration_since(SystemTime::UNIX_EPOCH)
        .map_err(|e| AuthError::MetadataError(format!("system clock error: {e}")))?
        .as_secs();

    if now >= exp {
        return Err(AuthError::MetadataError(format!(
            "Kubernetes service account token has expired (exp: {exp}, now: {now}). \
             The kubelet may not be refreshing projected tokens."
        )));
    }

    Ok(())
}

Without this check, an expired service account token gets sent to proxymux and proxymux rejects it. The error does not say “your Kubernetes token is expired.” It just fails. The useful behavior is to fail locally, before the exchange, while the error can still point at the real problem.
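A possible refinement, offered as a sketch rather than a description of what the SDK does: treat a token that is about to expire as already expired, so the exchange does not race the clock. `SKEW_SECS` and `is_usable` are hypothetical names, and the 30-second margin is an arbitrary assumption.

```rust
/// Hypothetical safety margin; the right value depends on exchange latency.
const SKEW_SECS: u64 = 30;

/// Returns true when a token with expiry `exp` (Unix seconds) is still
/// usable at time `now` with at least `SKEW_SECS` of headroom left.
fn is_usable(exp: u64, now: u64) -> bool {
    // saturating_add avoids overflow if `now` is close to u64::MAX.
    now.saturating_add(SKEW_SECS) < exp
}
```

With this shape, a token that expires exactly at `now`, or within the margin, is rejected locally just like an expired one.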

Movement 2: exchange through proxymux

sequenceDiagram
    autonumber
    participant App as Your Rust process
    participant Proxy as Proxymux<br/>KUBERNETES_SERVICE_HOST:12250

    Note over App: Generate fresh RSA-2048<br/>session keypair
    Note over App: Encode session public key<br/>as SPKI PEM

    App->>Proxy: POST /resourcePrincipalSessionTokens<br/>Authorization: Bearer {SA_JWT}<br/>Content-Type: application/json<br/>Body: {"podKey": "<session_pub_pem>"}
    Proxy-->>App: resource principal session token

    Note over App: Parse response, normalize ST$ prefix,<br/>cache (session_keypair, token)

The proxymux runs on every OKE node. The SDK builds its endpoint from KUBERNETES_SERVICE_HOST, port 12250, and /resourcePrincipalSessionTokens:

let url = format!(
    "https://{}:{}{}",
    self.service_host,
    self.service_port,
    Self::PROXYMUX_PATH,
);

let body = serde_json::json!({
    "podKey": sanitized_key,
});

let result = self.http_client
    .post(&url)
    .header("Authorization", format!("Bearer {sa_token}"))
    .header("Content-Type", "application/json")
    .header("opc-request-id", &opc_request_id)
    .json(&body)
    .send()
    .await;

The podKey is the public half of a session keypair generated inside the process. The service account JWT proves the pod is allowed to register that key. The private half never leaves the process.
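To show the wire shape without any HTTP or JSON crates, here is a std-only sketch of the URL and body construction. The function names are hypothetical, the IPv6 bracket handling is my assumption, and the escaping covers only the characters a PEM string can reasonably contain. It also assumes the key is sent as a PEM string with its newlines intact; whatever the SDK does to produce `sanitized_key` is not shown here.

```rust
/// Path and port from the text above; the host comes from KUBERNETES_SERVICE_HOST.
const PROXYMUX_PATH: &str = "/resourcePrincipalSessionTokens";
const PROXYMUX_PORT: u16 = 12250;

/// Builds the exchange URL for a given service host.
fn proxymux_url(host: &str) -> String {
    // Assumption: a bare IPv6 literal needs brackets to form a valid authority.
    if host.contains(':') && !host.starts_with('[') {
        format!("https://[{}]:{}{}", host, PROXYMUX_PORT, PROXYMUX_PATH)
    } else {
        format!("https://{}:{}{}", host, PROXYMUX_PORT, PROXYMUX_PATH)
    }
}

/// Builds the JSON request body, escaping the PEM so it is a valid JSON string.
fn pod_key_body(public_key_pem: &str) -> String {
    let mut escaped = String::with_capacity(public_key_pem.len() + 16);
    for c in public_key_pem.chars() {
        match c {
            '"' => escaped.push_str("\\\""),
            '\\' => escaped.push_str("\\\\"),
            '\n' => escaped.push_str("\\n"),
            '\r' => escaped.push_str("\\r"),
            '\t' => escaped.push_str("\\t"),
            other => escaped.push(other),
        }
    }
    format!("{{\"podKey\":\"{}\"}}", escaped)
}
```

In real code a JSON library does the escaping for you, as the `serde_json::json!` call above does. The point of spelling it out is that only the public key ever appears in the request.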

TLS is verified against the in-cluster CA, not the public trust store:

let ca_cert_pem = std::fs::read(&sa_cert_path)?;
let ca_cert = reqwest::Certificate::from_pem(&ca_cert_pem)?;

let http_client = reqwest::Client::builder()
    .add_root_certificate(ca_cert)
    .connect_timeout(Duration::from_secs(10))
    .timeout(Duration::from_secs(60))
    .build()?;

I retry 5xx responses with exponential backoff and fail fast on 4xx. One 4xx case still fails fast, but its generic status code is worth translating into a specific error:

if status.as_u16() == 403 {
    return Err(AuthError::MetadataError(
        "Proxymux returned 403. Please ensure the cluster type is enhanced (OKE).".into()
    ));
}

OKE workload identity only works on enhanced clusters. Basic clusters can still put a proxymux-shaped thing in your path, but it will not issue workload identity tokens. The response is just 403.

If you are staring at 403 from port 12250, check the cluster type before spending an hour on IAM policy syntax.
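The retry policy for the exchange (retry 5xx with exponential backoff, fail fast on 4xx) can be sketched with std alone. The `Action` enum, function names, and the base and cap values used in the usage note are assumptions for illustration; the SDK's actual constants may differ, and jitter is left out for brevity.

```rust
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum Action {
    Accept,
    Retry,
    FailFast,
}

/// Maps an HTTP status to what the exchange loop should do next.
fn classify(status: u16) -> Action {
    match status {
        200..=299 => Action::Accept,
        500..=599 => Action::Retry,
        // 4xx (including the 403 case above) and anything unexpected.
        _ => Action::FailFast,
    }
}

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `cap`.
fn backoff_delay(attempt: u32, base: Duration, cap: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.saturating_mul(factor).min(cap)
}
```

With a 200 ms base and a 5 s cap, the delays run 200 ms, 400 ms, 800 ms, and so on until they hit the cap.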

The response format was the sharpest edge in the implementation.

The same proxymux endpoint can return the same logical payload in three different wire formats, depending on the proxymux version running on the node:

#[derive(Debug)]
enum ParseStrategy {
    QuotedBase64,    // "eyJhbG..." -> unquote -> base64-decode -> JSON
    RawBase64,       //  eyJhbG...  -> base64-decode -> JSON
    DirectJson,      // {"token": "..."} -> parse JSON directly
}

fn parse_proxymux_response(
    body: &str,
) -> Result<(ProxymuxResponse, ParseStrategy), AuthError> {
    let trimmed = body.trim();

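    // The length guard keeps a body consisting of a single `"` from
    // producing an inverted slice range (1..0), which would panic.
    if trimmed.len() >= 2 && trimmed.starts_with('"') && trimmed.ends_with('"') {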
        let unquoted = &trimmed[1..trimmed.len() - 1];
        if let Ok(decoded) = BASE64.decode(unquoted.as_bytes()) {
            if let Ok(response) = serde_json::from_slice::<ProxymuxResponse>(&decoded) {
                return Ok((response, ParseStrategy::QuotedBase64));
            }
        }
    }

    if !trimmed.starts_with('{') {
        if let Ok(decoded) = BASE64.decode(trimmed.as_bytes()) {
            if let Ok(response) = serde_json::from_slice::<ProxymuxResponse>(&decoded) {
                return Ok((response, ParseStrategy::RawBase64));
            }
        }
    }

    if trimmed.starts_with('{') {
        if let Ok(response) = serde_json::from_str::<ProxymuxResponse>(trimmed) {
            return Ok((response, ParseStrategy::DirectJson));
        }
    }

    Err(AuthError::InvalidKeyFormat(
        "failed to parse proxymux response using quoted_base64, raw_base64, or direct_json".into()
    ))
}

No content negotiation. No useful Content-Type. Just parse what came back.

The quoted base64 variant is what the Go SDK expects. I found it after a parser that handled raw base64 and JSON failed during a cluster upgrade. The Rust code tries all three, returns the strategy that worked for debug logging, and treats the endpoint as hostile to assumptions.
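For testing a parser like this, it helps to generate all three wire formats from one payload. The following is a std-only sketch with a hand-rolled standard-alphabet base64 encoder, so it runs without extra crates; `base64_encode` and `sample_bodies` are hypothetical test helpers, and the only payload field assumed is the `token` key shown in the comments above.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Minimal standard base64 encoder with `=` padding.
fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        let b0 = chunk[0] as u32;
        let b1 = chunk.get(1).copied().unwrap_or(0) as u32;
        let b2 = chunk.get(2).copied().unwrap_or(0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[((n >> 18) & 63) as usize] as char);
        out.push(ALPHABET[((n >> 12) & 63) as usize] as char);
        // Missing input bytes become padding rather than encoded zeros.
        out.push(if chunk.len() > 1 { ALPHABET[((n >> 6) & 63) as usize] as char } else { '=' });
        out.push(if chunk.len() > 2 { ALPHABET[(n & 63) as usize] as char } else { '=' });
    }
    out
}

/// Returns (quoted_base64, raw_base64, direct_json) bodies for one payload.
fn sample_bodies(json: &str) -> (String, String, String) {
    let raw = base64_encode(json.as_bytes());
    (format!("\"{}\"", raw), raw, json.to_string())
}
```

Feeding each of the three bodies to `parse_proxymux_response` should yield the same token and a different `ParseStrategy` each time, which makes a compact regression test for the next proxymux upgrade.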

The token inside the proxymux response may include the ST$ prefix, or it may not. The signer wants exactly one prefix.

let raw_token = proxymux_response.token;
let token = raw_token
    .strip_prefix("ST$")
    .unwrap_or(&raw_token)
    .to_string();

// Later, when signing:
let key_id = format!("ST${}", creds.security_token);

Double-prefix it and you get a 401. Forget the prefix and you get a 401. Neither error is useful.
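The invariant is easier to hold if it lives in two small functions: strip on the way in, add exactly once on the way out. A sketch, with `normalize_token` and `key_id` as illustrative names rather than the SDK's:

```rust
const PREFIX: &str = "ST$";

/// Stores the token without the prefix, whichever form proxymux returned.
fn normalize_token(raw: &str) -> &str {
    raw.strip_prefix(PREFIX).unwrap_or(raw)
}

/// Adds the prefix back exactly once when building the signature keyId.
fn key_id(stored_token: &str) -> String {
    format!("{}{}", PREFIX, stored_token)
}
```

Both `key_id(normalize_token("ST$abc"))` and `key_id(normalize_token("abc"))` produce `ST$abc`, so the signer never has to know which variant came back.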

Movement 3: signed service call

This is exactly the same as instance principals. The AuthProvider trait hides the bootstrap path from the service client. Once the provider has a session keypair and a resource principal session token, signing is the same HTTP Signature flow from part one.

let auth = Arc::new(OkeWorkloadIdentityAuth::new()?);
let queue = QueueClient::new(auth.clone(), &queue_id, None, None).await?;
queue.put_message("Hello from OKE".into(), None).await?;

Three lines. The consumer does not know whether the credentials came from IMDS federation or the proxymux exchange, and it should not have to.
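The abstraction can be illustrated with a deliberately simplified, synchronous stand-in. This is not the SDK's actual `AuthProvider` definition; the trait method, struct fields, and `describe_signer` are all hypothetical and exist only to show that the consumer depends on the trait and nothing else.

```rust
/// Hypothetical, simplified stand-in for the SDK's AuthProvider trait.
trait AuthProvider {
    /// The keyId placed in the HTTP Signature header.
    fn key_id(&self) -> String;
}

/// Credentials obtained via IMDS federation (part one).
struct InstancePrincipal {
    token: String,
}

/// Credentials obtained via the proxymux exchange (this post).
struct OkeWorkloadIdentity {
    token: String,
}

impl AuthProvider for InstancePrincipal {
    fn key_id(&self) -> String {
        format!("ST${}", self.token)
    }
}

impl AuthProvider for OkeWorkloadIdentity {
    fn key_id(&self) -> String {
        format!("ST${}", self.token)
    }
}

/// A consumer that only sees the trait, never the bootstrap path.
fn describe_signer(auth: &dyn AuthProvider) -> String {
    format!("signing with keyId {}", auth.key_id())
}
```

The two implementations are identical on purpose: by the time signing happens, the only difference between the providers is how the token was obtained.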

When to use which

Use instance principals when the workload runs on a compute instance and the instance is the security boundary. Bare VM, instance pool, self-managed Kubernetes on OCI compute: all of those fit. Every process on the instance shares the same OCI identity.

Use OKE workload identity when the workload runs on OKE and the pod’s Kubernetes service account should be the security boundary. Each service account can have different IAM permissions. The setup is more specific: enhanced cluster, service account, and IAM policies constrained by namespace, service account, and cluster OCID. The isolation is worth it when services need different permissions.

If you run Kubernetes on OCI but not OKE, there is no proxymux path. You are back to instance principals unless you bring your own identity layer.

What I left out

Region detection has a three-level fallback: explicit config, JWT claims, then IMDS. The builder exposes overrides for the token path, certificate path, host, port, and region. The retry loop attaches opc-request-id so Oracle support can trace the proxymux call. All of that is in src/auth.rs in the SDK.
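The region fallback is a natural fit for `Option::or_else`, which only runs the next source when the previous one came up empty. A sketch under that assumption; `resolve_region` and its closure parameters are hypothetical, and the real lookups (claim parsing, the IMDS call) are stubbed out as closures:

```rust
/// Resolves a region from explicit config, then JWT claims, then IMDS.
/// Later sources are lazy: IMDS is not consulted if an earlier one succeeds.
fn resolve_region(
    explicit: Option<String>,
    from_claims: impl FnOnce() -> Option<String>,
    from_imds: impl FnOnce() -> Option<String>,
) -> Option<String> {
    explicit
        // Treat an empty override as "not set" (an assumption).
        .filter(|r| !r.is_empty())
        .or_else(from_claims)
        .or_else(from_imds)
}
```

Keeping the network-backed source last and lazy means a pod with an explicit region never pays for an IMDS round trip.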

The important part is the shape of the trust exchange: the Kubernetes token does not become the signing credential. It authorizes registering a session public key. After that, every OCI call is signed with the private half of a keypair generated inside your process.

That is the useful property across both posts. You are signing with a key Oracle has never seen in private form. The IMDS certificate and the Kubernetes service account token are bootstrap credentials. They prove you are allowed to bind a new public key to a short-lived OCI session token. Once that exchange succeeds, they are off the critical path.

Code is at github.com/GEverding/oci-rust-sdk. Still a work in progress. Issues and forks welcome.

This is the fourth post in a series about building real infrastructure on OCI. Previous: The Cloud Egress Tax, DRGs: Dual-Hub, Dual-Home Networking, and OCI Instance Principal Auth, from the Wire Up (in Rust).