
Data Enclaves

Data enclaves are the core execution environments in Alien Giraffe. They are short-lived runtimes created on demand after policy evaluation and approval. Their job is to pull an approved slice of data into an isolated in-memory runtime, prepare that runtime for query access, and expose inbound access only after the approved data has been loaded and the pull credentials have been cleared.

A data enclave exists to:

  • execute the approved access path for a request
  • resolve scoped secret references only inside the execution boundary
  • pull data only from explicitly allowed data sources and endpoints
  • provision a request-bounded in-memory query runtime from the approved pull
  • terminate when the request or access window ends

This means the platform does not grant access by handing out broad standing permissions. Instead, it provisions a temporary runtime that matches the approved policy outcome.

Data enclaves are designed to be:

  • ephemeral and created on demand
  • isolated from other request runtimes
  • non-persistent by default
  • bound to least-privilege identity and network policies
  • loaded through a single request-bounded ingestion step
  • fully auditable across startup, execution, and teardown

They are intended to reduce lateral movement risk and to make access behavior easier to reason about per request.

The exact profile is configurable, but a common baseline is:

  • CPU: 1-4 vCPU
  • memory: 2-32 GB RAM
  • storage: 4-128 GB ephemeral disk

These ranges allow the platform to support both lightweight interactive access and heavier analysis or extraction jobs without assuming a single fixed workload shape.
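One way to express that baseline is as a validated profile object. This is an illustrative sketch only: the class and field names below are hypothetical and not part of any published Alien Giraffe schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EnclaveProfile:
    """Hypothetical resource profile for a single data enclave."""
    vcpus: int = 2               # baseline range: 1-4 vCPU
    memory_gb: int = 8           # baseline range: 2-32 GB RAM
    ephemeral_disk_gb: int = 16  # baseline range: 4-128 GB ephemeral disk

    def __post_init__(self):
        # Reject profiles outside the documented baseline ranges.
        if not 1 <= self.vcpus <= 4:
            raise ValueError("vcpus outside baseline range 1-4")
        if not 2 <= self.memory_gb <= 32:
            raise ValueError("memory_gb outside baseline range 2-32")
        if not 4 <= self.ephemeral_disk_gb <= 128:
            raise ValueError("ephemeral_disk_gb outside baseline range 4-128")
```

A deployment that needs a heavier extraction job would simply request a larger profile within these bounds, e.g. `EnclaveProfile(vcpus=4, memory_gb=32, ephemeral_disk_gb=128)`.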

The intended enclave lifecycle is:

  1. A request is approved and the platform provisions a new enclave.
  2. The enclave receives the approved request contract, secret references, and the minimum secret-resolution inputs needed for that request.
  3. Inside the enclave boundary, datasource credentials are resolved just in time for the pull.
  4. The enclave starts one atomic ingestion phase and pulls all approved datasets in parallel.
  5. Pulled datasets are materialized into the enclave’s in-memory query database.
  6. Temporary credentials, secret-resolution material, and transient pull artifacts are cleared.
  7. Only after the pull completes does the enclave become ready for inbound API and tunnel access.
  8. Interactive and API queries operate on the prepared in-memory data until the enclave is torn down.

In this model, the enclave is not a live pass-through proxy to upstream systems. It prepares a request-scoped data plane first and exposes access only after that preparation step succeeds.
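The lifecycle above can be sketched as a single orchestration function. All names here (`resolve`, `pull`, `materialize`, the contract shape) are hypothetical stand-ins, not the platform's actual API; the point is the ordering: resolve credentials just in time, pull in parallel within one ingestion phase, clear credentials, and only then mark the enclave ready.

```python
from concurrent.futures import ThreadPoolExecutor

def run_enclave(contract, secret_refs, resolve, pull, materialize):
    """Illustrative enclave lifecycle: ingest first, expose access after."""
    db = {}        # stands in for the in-memory query database
    ready = False  # inbound access is gated on this flag
    creds = resolve(secret_refs)  # step 3: just-in-time credential resolution
    try:
        # Step 4: one atomic ingestion phase, all approved pulls in parallel.
        with ThreadPoolExecutor() as pool:
            slices = list(pool.map(lambda ds: pull(ds, creds),
                                   contract["datasets"]))
        # Step 5: materialize pulled data into the query runtime.
        for name, data in zip(contract["datasets"], slices):
            db[name] = materialize(data)
    finally:
        creds = None  # step 6: clear credentials before any exposure
    ready = True      # step 7: only now accept inbound queries
    return db, ready
```

Note that `ready` flips only after the `finally` block has cleared the pull credentials, which mirrors the ordering guarantee in steps 6 and 7.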

```mermaid
sequenceDiagram
    autonumber
    participant CP as Control Plane
    participant DE as Data Enclave
    participant SB as Secrets Backend
    participant DS as Datasources
    participant DB as In-Memory Query DB
    participant IN as Inbound API and Tunnel Edge

    CP->>DE: Provision enclave for approved request
    CP->>DE: Attach request contract and secret references
    CP->>DE: Attach secret-resolution inputs
    DE->>SB: Resolve scoped pull credentials
    par Parallel approved pulls
        DE->>DS: Pull dataset group A
    and
        DE->>DS: Pull dataset group B
    and
        DE->>DS: Pull dataset group N
    end
    DS-->>DE: Return approved data slices
    DE->>DB: Materialize pulled data into in-memory query runtime
    DE->>DE: Clear credentials and temporary pull artifacts
    DE->>IN: Enable readiness and inbound access
    IN->>DB: Serve explorer and API queries from prepared data
```

The security posture of a data enclave should be stricter than that of a general application container:

  • no inbound access is exposed until the approved ingestion phase has completed
  • outbound access is limited to explicitly allowed endpoints
  • IAM or service identity is scoped to least privilege
  • secret references and secret-resolution inputs are attached only for the approved pull path
  • resolved credentials exist only for the duration of the ingestion window
  • lifecycle events are logged so creation, execution, and teardown are traceable
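The second point, outbound access limited to explicitly allowed endpoints, can be illustrated with a minimal egress check. This is a sketch of the idea, not the platform's enforcement mechanism (which would typically live in network policy rather than application code); the hostnames are invented.

```python
from urllib.parse import urlparse

def egress_allowed(url: str, allowed_hosts: set) -> bool:
    """Permit an outbound pull only to an explicitly allowlisted host."""
    host = urlparse(url).hostname
    return host is not None and host in allowed_hosts

# Hypothetical allowlist derived from the approved request contract.
allowed = {"warehouse.internal", "api.partner.example"}
```

Anything not named in the approved contract is denied by default, so a compromised enclave cannot be repurposed to reach arbitrary endpoints.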

By default, data enclaves should not depend on persistent storage:

  • temporary working data can exist for the lifetime of the enclave
  • transient pull artifacts can exist only for the duration of ingestion
  • the primary query surface is backed by in-memory prepared data
  • long-lived state should remain outside the enclave unless a deployment explicitly requires otherwise

This helps keep the runtime disposable and reduces the amount of residual state left behind after access ends.
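The "temporary working data for the lifetime of the enclave" rule maps naturally onto scoped, self-cleaning storage. The sketch below assumes nothing beyond the Python standard library: a working directory that exists only while the enclave runs and is removed unconditionally at teardown.

```python
import shutil
import tempfile
from contextlib import contextmanager

@contextmanager
def ephemeral_workdir():
    """Working storage scoped to the enclave's lifetime.

    The directory is created on entry and removed unconditionally on
    exit, so no transient pull artifacts survive the access window.
    """
    path = tempfile.mkdtemp(prefix="enclave-")
    try:
        yield path
    finally:
        shutil.rmtree(path, ignore_errors=True)
```

Structuring teardown this way means cleanup happens even if ingestion fails partway through, which is exactly the disposability property the text describes.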

Because enclaves are created on demand, operators should think of them as policy-scoped workers rather than long-running application services.

That model supports:

  • stronger per-request isolation
  • parallel ingestion across approved datasources without exposing those datasources directly to users
  • credential clearing before interactive access begins
  • easier teardown after access expires
  • better alignment between policy outcomes and actual runtime permissions
  • cleaner auditability for incident review and compliance reporting