Skip to content

Benchmarks

Alien Giraffe benchmarks measure two things separately:

  • Request lifecycle performance: how fast the platform can create, approve, activate, and warm a request-scoped environment.
  • Query and load performance: how fast the active environment can execute dataset queries, including joins across datasets from different sources.

The latest artifacts referenced on this page come from the runs generated on March 30, 2026:

  • Request benchmark: artifacts/request-bench/request-20260331-021935
  • Load benchmark: artifacts/request-bench/load-20260331-021939

The request benchmark creates a template-style request, approves it, activates it in benchmark mode, waits until the first query succeeds, and then records per-query timings.

The benchmark sequence is:

  1. Obtain requester and admin credentials.
  2. Build a request payload from all currently catalogued datasets.
  3. Create the request.
  4. Approve the request.
  5. Activate the request-scoped environment with benchmark resources.
  6. Poll until the first rendered query succeeds.
  7. Execute the benchmark query set and persist the run report.

This makes the benchmark intentionally end-to-end. It does not only measure SQL execution. It includes request construction from the current catalog, approval, environment activation, warm-up, and the first usable query response.

The load benchmark reuses an already active request and runs concurrent query load against it. In the latest run it swept these concurrency levels:

1, 10, 25, 50, 100, 250, 500, 1000

Each concurrency level represents that many concurrent users issuing the full rendered query set against the same active request environment. For each level the benchmark records:

  • Total successful and failed queries
  • Average latency
  • P50 latency
  • P95 latency
  • Throughput in queries per second
  • Error rate

This load profile is meant to show saturation behavior rather than only best-case latency:

  • Low concurrency shows the base query cost once the environment is active.
  • Mid-range concurrency shows how quickly throughput scales up.
  • High concurrency shows when queueing and contention begin to dominate latency.
  • Error rate shows whether the system degrades by slowing down or by failing queries.

After measurement, the benchmark request environment is short-lived and can be revoked and removed. That keeps the benchmark aligned with the same temporary-access model used by the platform itself.

The benchmark request is built over the currently approved and catalogued datasets that expose schema-backed columns. The latest request benchmark assembled a single request across:

  • 3 datasources
  • 3 datasets
  • 19 columns

The latest query workload covered:

  • 3 distinct query scopes
  • 3 distinct datasets touched by the queries
  • 0 cross-source query scopes
  • 3 single-source query scopes

The broader request scope may include additional catalogued datasources, but the benchmark claims on this page are limited to the dataset scopes actually exercised by the rendered query workload.

That means the benchmark is not measuring a single connector in isolation. It is measuring end-to-end request provisioning over a mixed-source request scope and then executing SQL inside the active request environment.

This workload is deliberately small in query count and broad in system coverage:

  • It verifies single-source access to participating dataset families.
  • It verifies cross-source query execution inside the same temporary query surface.
  • It keeps the query text deterministic so that lifecycle and concurrency effects are easier to compare between runs.

Run timestamp: March 30, 2026 at 19:19 PDT

StageResult
Request create latency6 ms
Request approve latency14 ms
Environment ready latency3084 ms
Startup to first successful query3084 ms

All query scopes completed successfully.

MetricResult
Query scopes measured3
Total benchmark queries6
Average scope latency3.83 ms
Median scope latency4.00 ms
Success rate100.00%
Failed queries0

The main point from this run is not just low per-query latency. It is that the request became query-ready in about 3.1 seconds while spanning a broad mixed-source catalog, and the benchmark query set completed immediately once the environment was warm.

This splits the performance story into two phases:

  • Provisioning cost: request creation, approval, activation, and warm-up
  • Steady-state query cost: repeated SQL execution after the environment is ready

Run timestamp: March 30, 2026 at 19:19 PDT

The load test executed the same rendered queries at each concurrency level against the active benchmark request.

ConcurrencyAvg latencyP95 latencyThroughputError rate
14.00 ms4 ms229.56 qps0.00%
1044.13 ms150 ms172.59 qps0.00%
2574.94 ms207 ms262.73 qps0.00%
50146.11 ms248 ms301.73 qps0.00%
100294.21 ms572 ms293.18 qps0.00%
250700.11 ms1447 ms305.26 qps0.00%
5001404.24 ms3004 ms301.19 qps0.00%
10003239.71 ms7097 ms257.42 qps0.00%

Key takeaways from the latest run:

  • Peak measured throughput was 305.26 qps at 250 concurrent users.
  • Error rate remained 0% through 1000 concurrent users.
  • The highest measured concurrency was 1000 users with 257.42 qps throughput.
  • Latency increased as concurrency rose, which shows the platform staying available under heavier load while moving into the saturation region.

The current benchmark results show that Alien Giraffe can:

  • Build a request-scoped environment from a multi-source catalog automatically
  • Query datasets from different backends through one active request
  • Execute cross-source joins across those datasets when the approved scope allows it
  • Sustain high query throughput on the benchmark workload with stable error behavior in the latest run

That combination is the important capability: the platform is not just brokering isolated point queries, it is standing up a unified temporary query surface over approved datasets from multiple systems.