
Data Enclaves

Data enclaves are the core execution environments in Alien Giraffe. They are short-lived runtimes created on demand after policy evaluation and approval. Their job is to pull an approved slice of data into an isolated in-memory runtime, prepare that runtime for query access, and expose inbound access only after the approved data has been loaded and the pull credentials have been cleared.

A data enclave exists to:

  • execute the approved access path for a request
  • resolve scoped secret references only inside the execution boundary
  • pull data only from explicitly allowed data sources and endpoints
  • provision a request-bounded in-memory query runtime from the approved pull
  • terminate when the request or access window ends

This means the platform does not grant access by handing out broad standing permissions. Instead, it provisions a temporary runtime that matches the approved policy outcome.

Data enclaves are designed to be:

  • ephemeral and created on demand
  • isolated from other request runtimes
  • non-persistent by default
  • bound to least-privilege identity and network policies
  • loaded through a single request-bounded ingestion step
  • fully auditable across startup, execution, and teardown

They are intended to reduce lateral movement risk and to make access behavior easier to reason about per request.

The exact profile is configurable, but a common baseline is:

  • CPU: 1-4 vCPU
  • memory: 2-32 GB RAM
  • storage: 4-128 GB ephemeral disk

These ranges allow the platform to support both lightweight interactive access and heavier analysis or extraction jobs without assuming a single fixed workload shape.
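One way to express that baseline is as a validated profile object. This is an illustrative sketch only: the class and field names below are hypothetical and not part of any published Alien Giraffe schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EnclaveProfile:
    """Hypothetical resource profile for a single data enclave."""
    vcpus: int = 2               # baseline range: 1-4 vCPU
    memory_gb: int = 8           # baseline range: 2-32 GB RAM
    ephemeral_disk_gb: int = 16  # baseline range: 4-128 GB ephemeral disk

    def __post_init__(self):
        # Reject profiles outside the documented baseline ranges.
        if not 1 <= self.vcpus <= 4:
            raise ValueError("vcpus outside baseline range 1-4")
        if not 2 <= self.memory_gb <= 32:
            raise ValueError("memory_gb outside baseline range 2-32")
        if not 4 <= self.ephemeral_disk_gb <= 128:
            raise ValueError("ephemeral_disk_gb outside baseline range 4-128")
```

A deployment that needs a heavier extraction job would simply request a larger profile within these bounds, e.g. `EnclaveProfile(vcpus=4, memory_gb=32, ephemeral_disk_gb=128)`.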

The intended enclave lifecycle is:

  1. A request is approved and the platform provisions a new enclave.
  2. The enclave receives the approved request contract, secret references, and the minimum secret-resolution inputs needed for that request.
  3. Inside the enclave boundary, datasource credentials are resolved just in time for the pull.
  4. The enclave starts one atomic ingestion phase and pulls all approved datasets in parallel.
  5. Pulled datasets are materialized into the enclave’s in-memory query database.
  6. Temporary credentials, secret-resolution material, and transient pull artifacts are cleared.
  7. Only after the pull completes does the enclave become ready for inbound API and tunnel access.
  8. Interactive and API queries operate on the prepared in-memory data until the enclave is torn down.

In this model, the enclave is not a live pass-through proxy to upstream systems. It prepares a request-scoped data plane first and exposes access only after that preparation step succeeds.
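The lifecycle above can be sketched as a single orchestration function. All names here (`resolve`, `pull`, `materialize`, the contract shape) are hypothetical stand-ins, not the platform's actual API; the point is the ordering: resolve credentials just in time, pull in parallel within one ingestion phase, clear credentials, and only then mark the enclave ready.

```python
from concurrent.futures import ThreadPoolExecutor

def run_enclave(contract, secret_refs, resolve, pull, materialize):
    """Illustrative enclave lifecycle: ingest first, expose access after."""
    db = {}        # stands in for the in-memory query database
    ready = False  # inbound access is gated on this flag
    creds = resolve(secret_refs)  # step 3: just-in-time credential resolution
    try:
        # Step 4: one atomic ingestion phase, all approved pulls in parallel.
        with ThreadPoolExecutor() as pool:
            slices = list(pool.map(lambda ds: pull(ds, creds),
                                   contract["datasets"]))
        # Step 5: materialize pulled data into the query runtime.
        for name, data in zip(contract["datasets"], slices):
            db[name] = materialize(data)
    finally:
        creds = None  # step 6: clear credentials before any exposure
    ready = True      # step 7: only now accept inbound queries
    return db, ready
```

Note that `ready` flips only after the `finally` block has cleared the pull credentials, which mirrors the ordering guarantee in steps 6 and 7.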

```mermaid
sequenceDiagram
    autonumber
    participant CP as Control Plane
    participant DE as Data Enclave
    participant SB as Secrets Backend
    participant DS as Datasources
    participant DB as In-Memory Query DB
    participant IN as Inbound API and Tunnel Edge

    CP->>DE: Provision enclave for approved request
    CP->>DE: Attach request contract and secret references
    CP->>DE: Attach secret-resolution inputs
    DE->>SB: Resolve scoped pull credentials
    par Parallel approved pulls
        DE->>DS: Pull dataset group A
    and
        DE->>DS: Pull dataset group B
    and
        DE->>DS: Pull dataset group N
    end
    DS-->>DE: Return approved data slices
    DE->>DB: Materialize pulled data into in-memory query runtime
    DE->>DE: Clear credentials and temporary pull artifacts
    DE->>IN: Enable readiness and inbound access
    IN->>DB: Serve explorer and API queries from prepared data
```

The security posture of a data enclave should be stricter than that of a general application container:

  • no inbound access is exposed until the approved ingestion phase has completed
  • outbound access is limited to explicitly allowed endpoints
  • IAM or service identity is scoped to least privilege
  • secret references and secret-resolution inputs are attached only for the approved pull path
  • resolved credentials exist only for the duration of the ingestion window
  • lifecycle events are logged so creation, execution, and teardown are traceable
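The second point, outbound access limited to explicitly allowed endpoints, can be illustrated with a minimal egress check. This is a sketch of the idea, not the platform's enforcement mechanism (which would typically live in network policy rather than application code); the hostnames are invented.

```python
from urllib.parse import urlparse

def egress_allowed(url: str, allowed_hosts: set) -> bool:
    """Permit an outbound pull only to an explicitly allowlisted host."""
    host = urlparse(url).hostname
    return host is not None and host in allowed_hosts

# Hypothetical allowlist derived from the approved request contract.
allowed = {"warehouse.internal", "api.partner.example"}
```

Anything not named in the approved contract is denied by default, so a compromised enclave cannot be repurposed to reach arbitrary endpoints.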

By default, data enclaves should not depend on persistent storage:

  • temporary working data can exist for the lifetime of the enclave
  • transient pull artifacts can exist only for the duration of ingestion
  • the primary query surface is backed by in-memory prepared data
  • long-lived state should remain outside the enclave unless a deployment explicitly requires otherwise

This helps keep the runtime disposable and reduces the amount of residual state left behind after access ends.
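The "temporary working data for the lifetime of the enclave" rule maps naturally onto scoped, self-cleaning storage. The sketch below assumes nothing beyond the Python standard library: a working directory that exists only while the enclave runs and is removed unconditionally at teardown.

```python
import shutil
import tempfile
from contextlib import contextmanager

@contextmanager
def ephemeral_workdir():
    """Working storage scoped to the enclave's lifetime.

    The directory is created on entry and removed unconditionally on
    exit, so no transient pull artifacts survive the access window.
    """
    path = tempfile.mkdtemp(prefix="enclave-")
    try:
        yield path
    finally:
        shutil.rmtree(path, ignore_errors=True)
```

Structuring teardown this way means cleanup happens even if ingestion fails partway through, which is exactly the disposability property the text describes.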

Because enclaves are created on demand, operators should think of them as policy-scoped workers rather than long-running application services.

That model supports:

  • stronger per-request isolation
  • parallel ingestion across approved datasources without exposing those datasources directly to users
  • credential clearing before interactive access begins
  • easier teardown after access expires
  • better alignment between policy outcomes and actual runtime permissions
  • cleaner auditability for incident review and compliance reporting