Benchmarks
Alien Giraffe benchmarks measure two things separately:
- Request lifecycle performance: how fast the platform can create, approve, activate, and warm a request-scoped environment.
- Query and load performance: how fast the active environment can execute dataset queries, including joins across datasets from different sources.
The latest artifacts referenced on this page come from the runs generated on March 30, 2026:
- Request benchmark: artifacts/request-bench/request-20260331-021935
- Load benchmark: artifacts/request-bench/load-20260331-021939
Benchmark design
Request lifecycle benchmark
The request benchmark creates a template-style request, approves it, activates it in benchmark mode, waits until the first query succeeds, and then records per-query timings.
The benchmark sequence is:
- Obtain requester and admin credentials.
- Build a request payload from all currently catalogued datasets.
- Create the request.
- Approve the request.
- Activate the request-scoped environment with benchmark resources.
- Poll until the first rendered query succeeds.
- Execute the benchmark query set and persist the run report.
This makes the benchmark intentionally end-to-end. It does not only measure SQL execution. It includes request construction from the current catalog, approval, environment activation, warm-up, and the first usable query response.
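For orientation, a minimal sketch of that sequence is shown below. It assumes a generic HTTP API; every endpoint path, payload field, and the "benchmark" resource profile are illustrative placeholders, not the platform's actual interface.

```python
# Illustrative sketch of the request-lifecycle benchmark flow.
# All endpoints, payload fields, and the "benchmark" profile are assumptions.
import time
import requests

BASE = "https://alien-giraffe.example.internal/api"  # hypothetical base URL

def run_request_benchmark(requester_token: str, admin_token: str) -> dict:
    requester = {"Authorization": f"Bearer {requester_token}"}
    admin = {"Authorization": f"Bearer {admin_token}"}
    timings = {}

    # Build a request payload from all currently catalogued datasets.
    catalog = requests.get(f"{BASE}/catalog/datasets", headers=requester).json()
    payload = {"datasets": [d["id"] for d in catalog]}

    # Create the request and record create latency.
    t0 = time.monotonic()
    req = requests.post(f"{BASE}/requests", json=payload, headers=requester).json()
    timings["create_ms"] = (time.monotonic() - t0) * 1000

    # Approve the request as an admin.
    t0 = time.monotonic()
    requests.post(f"{BASE}/requests/{req['id']}/approve", headers=admin)
    timings["approve_ms"] = (time.monotonic() - t0) * 1000

    # Activate the request-scoped environment with benchmark resources,
    # then poll until the first query succeeds (warm-up).
    t0 = time.monotonic()
    requests.post(f"{BASE}/requests/{req['id']}/activate",
                  json={"profile": "benchmark"}, headers=admin)
    while True:
        probe = requests.post(f"{BASE}/requests/{req['id']}/query",
                              json={"sql": "SELECT 1"}, headers=requester)
        if probe.status_code == 200:
            break
        time.sleep(0.25)
    timings["ready_ms"] = (time.monotonic() - t0) * 1000

    # The real benchmark then executes the full rendered query set and
    # persists a run report; that part is omitted from this sketch.
    return timings
```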
Load test
The load benchmark reuses an already active request and runs concurrent query load against it. In the latest run it swept these concurrency levels:
1, 10, 25, 50, 100, 250, 500, 1000
Each concurrency level represents that many concurrent users issuing the full rendered query set against the same active request environment. For each level the benchmark records:
- Total successful and failed queries
- Average latency
- P50 latency
- P95 latency
- Throughput in queries per second
- Error rate
This load profile is meant to show saturation behavior rather than only best-case latency:
- Low concurrency shows the base query cost once the environment is active.
- Mid-range concurrency shows how quickly throughput scales up.
- High concurrency shows when queueing and contention begin to dominate latency.
- Error rate shows whether the system degrades by slowing down or by failing queries.
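A minimal sketch of how such a sweep could be driven and its per-level metrics computed is shown below. The `run_query_set` callable stands in for issuing the full rendered query set against the active benchmark request; its implementation, and the exact metric definitions, are assumptions made for illustration.

```python
# Sketch of a concurrency sweep that records the per-level metrics listed above.
# `run_query_set` is a stand-in: it issues the full rendered query set against
# the active benchmark request and returns the number of queries it executed.
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

LEVELS = (1, 10, 25, 50, 100, 250, 500, 1000)

def sweep(run_query_set, levels=LEVELS):
    results = {}
    for users in levels:
        def one_user(_):
            t0 = time.monotonic()
            try:
                n_queries = run_query_set()
                return (time.monotonic() - t0) * 1000, n_queries
            except Exception:
                return None  # a failed query-set run

        start = time.monotonic()
        with ThreadPoolExecutor(max_workers=users) as pool:
            outcomes = list(pool.map(one_user, range(users)))
        elapsed = time.monotonic() - start

        ok = [o for o in outcomes if o is not None]
        latencies = [lat for lat, _ in ok]
        failures = len(outcomes) - len(ok)
        results[users] = {
            "avg_ms": statistics.fmean(latencies) if latencies else None,
            "p50_ms": statistics.median(latencies) if latencies else None,
            "p95_ms": statistics.quantiles(latencies, n=20)[-1] if len(latencies) > 1 else None,
            "qps": sum(n for _, n in ok) / elapsed if elapsed else 0.0,
            "error_rate": failures / len(outcomes) if outcomes else 0.0,
        }
    return results
```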
Teardown
After measurement, the benchmark request environment is short-lived and can be revoked and removed. That keeps the benchmark aligned with the same temporary-access model used by the platform itself.
Workload shape
The benchmark request is built over the currently approved and catalogued datasets that expose schema-backed columns. The latest request benchmark assembled a single request across:
- 3 datasources
- 3 datasets
- 19 columns
The latest query workload covered:
- 3 distinct query scopes
- 3 distinct datasets touched by the queries
- 0 cross-source query scopes
- 3 single-source query scopes
The broader request scope may include additional catalogued datasources, but the benchmark claims on this page are limited to the dataset scopes actually exercised by the rendered query workload.
That means the benchmark is not measuring a single connector in isolation. It is measuring end-to-end request provisioning over a mixed-source request scope and then executing SQL inside the active request environment.
This workload is deliberately small in query count and broad in system coverage:
- It verifies single-source access to participating dataset families.
- It exercises cross-source query execution inside the same temporary query surface whenever the rendered workload includes cross-source scopes (the latest run rendered none).
- It keeps the query text deterministic so that lifecycle and concurrency effects are easier to compare between runs (see the sketch below).
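As an illustration of that last point, query rendering can be reduced to a small, deterministic function of the catalogued scope. The sketch below assumes a hypothetical catalog shape (`name` and `columns` fields); it is not the platform's actual rendering code.

```python
# Sketch of rendering a deterministic query set from the catalogued datasets
# in the benchmark request. The catalog shape here is a hypothetical example.
def render_query_set(datasets: list[dict]) -> list[str]:
    queries = []
    for ds in sorted(datasets, key=lambda d: d["name"]):          # stable ordering
        cols = ", ".join(sorted(c["name"] for c in ds["columns"]))
        # One single-source scope per participating dataset; a fixed LIMIT
        # keeps query cost comparable between runs.
        queries.append(f"SELECT {cols} FROM {ds['name']} LIMIT 100")
    # A cross-source scope, when rendered, would join datasets from different
    # sources inside the same active request environment (none in the latest run).
    return queries
```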
Latest request benchmark results
Run timestamp: March 30, 2026 at 19:19 PDT
Request lifecycle
| Stage | Result |
|---|---|
| Request create latency | 6 ms |
| Request approve latency | 14 ms |
| Environment ready latency | 3084 ms |
| Startup to first successful query | 3084 ms |
Query timings
All query scopes completed successfully.
| Metric | Result |
|---|---|
| Query scopes measured | 3 |
| Total benchmark queries | 6 |
| Average scope latency | 3.83 ms |
| Median scope latency | 4.00 ms |
| Success rate | 100.00% |
| Failed queries | 0 |
The main point from this run is not just low per-query latency. It is that the request became query-ready in about 3.1 seconds while spanning a mixed-source request scope, and the benchmark query set completed immediately once the environment was warm.
This splits the performance story into two phases:
- Provisioning cost: request creation, approval, activation, and warm-up
- Steady-state query cost: repeated SQL execution after the environment is ready
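As a rough worked example over the published figures, the arithmetic below amortizes the provisioning phase across increasing numbers of steady-state queries. It treats the create, approve, and environment-ready stages as strictly sequential, which is an assumption.

```python
# Back-of-the-envelope amortization of provisioning cost over steady-state queries,
# using figures from the latest request benchmark run (stages treated as sequential).
provisioning_ms = 6 + 14 + 3084        # create + approve + environment ready
per_query_ms = 3.83                    # average scope latency once the environment is warm

for n_queries in (1, 100, 10_000):
    effective = (provisioning_ms + n_queries * per_query_ms) / n_queries
    print(f"{n_queries:>6} queries -> ~{effective:.1f} ms effective cost per query")
# 1 query   -> ~3107.8 ms per query (dominated by provisioning)
# 100       -> ~34.9 ms per query
# 10,000    -> ~4.1 ms per query (provisioning amortized away)
```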
Latest load benchmark results
Run timestamp: March 30, 2026 at 19:19 PDT
The load test executed the same rendered queries at each concurrency level against the active benchmark request.
| Concurrency | Avg latency | P95 latency | Throughput | Error rate |
|---|---|---|---|---|
| 1 | 4.00 ms | 4 ms | 229.56 qps | 0.00% |
| 10 | 44.13 ms | 150 ms | 172.59 qps | 0.00% |
| 25 | 74.94 ms | 207 ms | 262.73 qps | 0.00% |
| 50 | 146.11 ms | 248 ms | 301.73 qps | 0.00% |
| 100 | 294.21 ms | 572 ms | 293.18 qps | 0.00% |
| 250 | 700.11 ms | 1447 ms | 305.26 qps | 0.00% |
| 500 | 1404.24 ms | 3004 ms | 301.19 qps | 0.00% |
| 1000 | 3239.71 ms | 7097 ms | 257.42 qps | 0.00% |
Key takeaways from the latest run:
- Peak measured throughput was 305.26 qps at 250 concurrent users.
- Error rate remained 0% through 1000 concurrent users.
- The highest measured concurrency was 1000 users with 257.42 qps throughput.
- Latency increased as concurrency rose while throughput held near its plateau and the error rate stayed at 0%, which indicates the platform degrades by queueing rather than by failing queries as it moves into the saturation region.
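One way to sanity-check the table is Little's law for a closed-loop load generator: throughput ≈ concurrency / average latency. The sketch below applies it to the published rows; measured throughput sits somewhat below the expected value at most levels, which is consistent with per-user overhead outside the measured query, though that reading is an interpretation rather than a measured result.

```python
# Little's-law sanity check over the published load rows:
# expected qps ~= concurrency / average latency (closed-loop assumption).
rows = [  # (concurrent users, avg latency in ms, measured qps)
    (1, 4.00, 229.56), (10, 44.13, 172.59), (25, 74.94, 262.73),
    (50, 146.11, 301.73), (100, 294.21, 293.18), (250, 700.11, 305.26),
    (500, 1404.24, 301.19), (1000, 3239.71, 257.42),
]
for users, avg_ms, measured_qps in rows:
    expected_qps = users / (avg_ms / 1000.0)
    print(f"{users:>5} users: expected ~{expected_qps:6.0f} qps, measured {measured_qps:7.2f} qps")
```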
Load graphs
Average latency across the latest concurrency sweep.
Throughput across the latest concurrency sweep.
Error rate across the latest concurrency sweep.
What these results demonstrate
The current benchmark results show that Alien Giraffe can:
- Build a request-scoped environment from a multi-source catalog automatically
- Query datasets from different backends through one active request
- Execute cross-source joins across those datasets when the approved scope allows it (the latest rendered workload exercised single-source scopes only)
- Sustain high query throughput on the benchmark workload with stable error behavior in the latest run
That combination is the important capability: the platform is not just brokering isolated point queries, it is standing up a unified temporary query surface over approved datasets from multiple systems.