Monitoring & Auditing

This guide walks you through setting up monitoring and audit logging for Alien Giraffe. You’ll learn how to track data access, configure alerts, and integrate with common monitoring platforms.

Built-In Monitoring & Auditing

Alien Giraffe provides monitoring, alerting, and auditing capabilities out of the box, covering both security operations and compliance:

Security Operations: Real-time detection of unauthorized access attempts, suspicious patterns, and anomalous behavior
Compliance: Complete audit trails that meet regulatory requirements (SOC 2, GDPR, HIPAA, PCI-DSS)
Automated Alerting: Intelligent alerts for security events, policy violations, and operational issues
System Health: Built-in performance monitoring and health checks
Access Forensics: Detailed logging of every access request, approval, and data operation
Integration Ready: Export to your existing SIEM, log aggregation, and monitoring platforms

Quick Start: Audit Logging

Let’s start by enabling comprehensive audit logging.

Step 1: Configure Audit Logging

Edit your config.toml:

# Audit logging configuration
[audit]
enabled = true
level = "standard"  # minimal, standard, verbose

# What to log
events = [
  "access_request",
  "access_granted",
  "access_denied",
  "policy_evaluation",
  "authentication",
  "source_connection",
  "query_execution",
  "administrative_action"
]

# Include additional context
includeContext = ["user_ip", "user_agent", "session_info"]
includeSourceData = false  # Don't log actual data for privacy

# Retention
[audit.retention]
default = "365d"
compliance = "7y"  # For compliance-tagged events

# File destination
[[audit.destinations]]
type = "file"
path = "/var/log/a10e/audit.log"

[audit.destinations.rotation]
maxSize = "100MB"
maxAge = "90d"
maxBackups = 10

# Elasticsearch destination
[[audit.destinations]]
type = "elasticsearch"
endpoint = "https://elasticsearch.company.com:9200"
index = "a10e-audit-logs"
credentialsRef = "elasticsearch-creds"

# S3 destination
[[audit.destinations]]
type = "s3"
bucket = "company-audit-logs"
region = "us-west-2"
prefix = "a10e/"
credentialsRef = "s3-audit-creds"

Step 2: Test Audit Logging

# Restart Alien Giraffe to apply config
a10e restart

# Make a test access request
a10e access request \
  --source production-postgres \
  --datasets customers \
  --purpose testing-audit-logs

# View recent audit logs
a10e audit logs --since 5m

# Search for specific events
a10e audit search --event access_request --since 1h

Step 3: Verify Log Destinations

# Check file logs
tail -f /var/log/a10e/audit.log

# Check Elasticsearch (if configured)
curl -X GET "https://elasticsearch.company.com:9200/a10e-audit-logs/_search?pretty"

# Check S3 (if configured)
aws s3 ls s3://company-audit-logs/a10e/

Audit Log Format

Audit logs are structured JSON for easy parsing:

{
  "timestamp": "2025-11-18T14:32:10.123Z",
  "eventType": "access_granted",
  "eventId": "evt-abc123",
  "user": {
    "email": "alice@company.com",
    "userId": "usr-xyz789",
    "teams": ["customer-support", "team-alpha"]
  },
  "source": {
    "name": "production-postgres",
    "type": "postgresql"
  },
  "datasets": ["customers", "orders"],
  "policy": {
    "name": "customer-support-access",
    "namespace": "production"
  },
  "session": {
    "id": "sess-def456",
    "duration": "4h",
    "expiresAt": "2025-11-18T18:32:10Z"
  },
  "context": {
    "ip": "203.0.113.42",
    "userAgent": "a10e-cli/1.0.0",
    "purpose": "customer-inquiry",
    "ticket": "SUPPORT-1234"
  },
  "decision": {
    "granted": true,
    "reason": "Policy customer-support-access matched",
    "evaluatedPolicies": ["customer-support-access"]
  }
}
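Because each entry is plain JSON, a few lines of scripting are enough for ad hoc analysis. The sketch below assumes the file destination writes one JSON object per line (the example above is pretty-printed for readability; check your actual log file). The `summarize` helper and the sample events are illustrative, not part of Alien Giraffe.

```python
import json
from collections import Counter


def summarize(lines):
    """Count events by type and collect (timestamp, email, reason) for denials."""
    counts = Counter()
    denials = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(event, dict):
            continue
        event_type = event.get("eventType", "unknown")
        counts[event_type] += 1
        if event_type == "access_denied":
            denials.append((
                event.get("timestamp"),
                event.get("user", {}).get("email"),
                event.get("decision", {}).get("reason"),
            ))
    return counts, denials


# Made-up sample events using the field names from the format above.
SAMPLE = [
    '{"timestamp": "2025-11-18T14:32:10.123Z", "eventType": "access_granted", '
    '"user": {"email": "alice@company.com"}}',
    '{"timestamp": "2025-11-18T14:35:02.000Z", "eventType": "access_denied", '
    '"user": {"email": "bob@company.com"}, '
    '"decision": {"granted": false, "reason": "No matching policy"}}',
    "not json",
]

counts, denials = summarize(SAMPLE)
print(dict(counts))
print(denials)
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `summarize(open("/var/log/a10e/audit.log"))`.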

Metrics and Monitoring

Configure Prometheus Metrics

Alien Giraffe exposes Prometheus metrics for monitoring:

[monitoring]
enabled = true
metricsPort = 9090  # Prometheus itself also defaults to 9090; pick another port if both run on one host
metricsPath = "/metrics"

# Metrics to collect
metrics = [
  "access_requests_total",
  "access_granted_total",
  "access_denied_total",
  "policy_evaluation_duration_seconds",
  "source_connections_active",
  "source_query_duration_seconds",
  "authentication_attempts_total",
  "session_duration_seconds"
]

Prometheus Configuration

Add Alien Giraffe as a scrape target:

scrape_configs:
  - job_name: 'alien-giraffe'
    static_configs:
      - targets: ['a10e.company.com:9090']
    scrape_interval: 30s
    metrics_path: /metrics
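Once the endpoint is being scraped, you may want to confirm that the metrics you expect are actually exposed. This is a minimal sketch that parses Prometheus text exposition output and lists expected metrics that are absent. It handles only simple sample lines, the sample text is made up, and `parse_sample` and `missing_metrics` are hypothetical helper names.

```python
def parse_sample(line):
    """Split one exposition sample line into (name, labels, value)."""
    if "{" in line:
        name, _, rest = line.partition("{")
        labels, _, tail = rest.rpartition("}")
    else:
        name, _, tail = line.partition(" ")
        labels = ""
    # The first token after the series is the value; a timestamp may follow.
    return name.strip(), labels, float(tail.split()[0])


def metric_names(text):
    """Return the set of sample names, ignoring comments and blank lines."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            names.add(parse_sample(line)[0])
    return names


def missing_metrics(text, expected):
    """Return expected metrics with no samples (histogram suffixes count as present)."""
    present = metric_names(text)
    suffixes = ("", "_bucket", "_sum", "_count")
    return [m for m in expected if not any(m + s in present for s in suffixes)]


# Made-up /metrics output for illustration.
SAMPLE = """\
# HELP a10e_access_requests_total Total access requests
# TYPE a10e_access_requests_total counter
a10e_access_requests_total{source="production-postgres"} 1027
a10e_active_sessions 12
a10e_policy_evaluation_duration_seconds_bucket{le="0.1"} 50
"""

print(missing_metrics(SAMPLE, [
    "a10e_access_requests_total",
    "a10e_policy_evaluation_duration_seconds",
    "a10e_access_denied_total",
]))
```

In practice you would feed it the body returned by the `/metrics` endpoint rather than a hard-coded string.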

Key Metrics to Monitor

Access Metrics:

a10e_access_requests_total - Total access requests
a10e_access_granted_total - Successful access grants
a10e_access_denied_total - Denied access attempts
a10e_active_sessions - Currently active sessions

Performance Metrics:

a10e_policy_evaluation_duration_seconds - Policy evaluation time
a10e_source_query_duration_seconds - Query execution time
a10e_api_request_duration_seconds - API response times

System Health:

a10e_source_connections_active - Active database connections
a10e_source_connection_errors_total - Connection failures
a10e_identity_sync_last_success_timestamp - Last successful IdP sync

Security Metrics:

a10e_authentication_failures_total - Failed login attempts
a10e_mfa_required_total - MFA challenges issued
a10e_policy_violations_total - Policy violation attempts
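The `_total` metrics above are counters: they only increase, and reset to zero when the process restarts. Dashboards and alerts therefore work with their per-second rate rather than the raw value. The sketch below shows the core idea behind PromQL's `rate()`, including counter-reset handling. It is a simplification: the real function also extrapolates to the edges of the time window. The function name and sample values are illustrative.

```python
def per_second_rate(samples):
    """Average per-second increase of a counter.

    samples: list of (unix_time, counter_value) pairs, oldest first.
    Returns None when there are too few samples to compute a rate.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter was reset; count the new value from zero.
        increase += (cur - prev) if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# Three scrapes, 30 seconds apart.
steady = [(0, 100), (30, 400), (60, 700)]
print(per_second_rate(steady))

# The same window with a restart between the first and second scrape.
with_reset = [(0, 100), (30, 20), (60, 50)]
print(per_second_rate(with_reset))
```

This is why a threshold such as `> 10` on a rate expression means "more than 10 events per second, averaged over the window", not "more than 10 events in total".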

Grafana Dashboards

Import Pre-Built Dashboard

# Download official Grafana dashboard (-o names the file; -O would save it as "download")
curl -L -o a10e-dashboard.json https://grafana.com/api/dashboards/18274/revisions/1/download

# Import via Grafana UI:
# 1. Go to Dashboards → Import
# 2. Upload the downloaded JSON file
# 3. Select your Prometheus data source
# 4. Click Import

Dashboard Panels

The official dashboard includes:

Overview:

Total access requests (24h)
Active sessions
Top users by access volume
Top accessed sources

Access Patterns:

Access requests over time
Granted vs denied ratio
Access by team
Access by source

Performance:

Policy evaluation latency (p50, p95, p99)
Query execution time
API response times

Security:

Failed authentication attempts
Policy violations
After-hours access
Anomalous access patterns

Create Custom Dashboard

{
  "dashboard": {
    "title": "Alien Giraffe Custom Dashboard",
    "panels": [
      {
        "title": "Access Requests by User",
        "targets": [
          {
            "expr": "sum(rate(a10e_access_requests_total[5m])) by (user)",
            "legendFormat": "{{user}}"
          }
        ]
      },
      {
        "title": "Denied Access Attempts",
        "targets": [
          {
            "expr": "sum(rate(a10e_access_denied_total[5m])) by (reason)",
            "legendFormat": "{{reason}}"
          }
        ]
      }
    ]
  }
}
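If you maintain many panels, generating this JSON from a short list can be less error-prone than editing it by hand. The following is a minimal sketch that reproduces the structure above; `panel` and `build_dashboard` are hypothetical helpers, and a real Grafana dashboard usually needs more fields per panel (type, data source, grid position).

```python
import json


def panel(title, expr, legend):
    """Build one panel with a single Prometheus query target."""
    return {"title": title, "targets": [{"expr": expr, "legendFormat": legend}]}


def build_dashboard(title, panels):
    """Wrap (title, expr, legend) tuples in the dashboard structure shown above."""
    return {"dashboard": {"title": title, "panels": [panel(*p) for p in panels]}}


dashboard = build_dashboard(
    "Alien Giraffe Custom Dashboard",
    [
        ("Access Requests by User",
         "sum(rate(a10e_access_requests_total[5m])) by (user)", "{{user}}"),
        ("Denied Access Attempts",
         "sum(rate(a10e_access_denied_total[5m])) by (reason)", "{{reason}}"),
    ],
)
print(json.dumps(dashboard, indent=2))
```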

Alerting

Configure Alerts

Create alerts/alien-giraffe.yaml:

groups:
  - name: alien-giraffe-alerts
    interval: 30s
    rules:
      # High rate of access denials
      - alert: HighAccessDenialRate
        expr: |
          rate(a10e_access_denied_total[5m]) > 10
        for: 5m
        labels:
          severity: warning
          component: access-control
        annotations:
          summary: "High rate of access denials"
          description: "{{ $value }} access denials per second"

      # Failed authentication attempts
      - alert: AuthenticationFailureSpike
        expr: |
          rate(a10e_authentication_failures_total[5m]) > 5
        for: 2m
        labels:
          severity: critical
          component: authentication
        annotations:
          summary: "Unusual authentication failures"
          description: "Possible brute force attack"

      # Source connection failures
      - alert: SourceConnectionFailure
        expr: |
          rate(a10e_source_connection_errors_total[5m]) > 1
        for: 5m
        labels:
          severity: critical
          component: data-sources
        annotations:
          summary: "Data source connection failures"
          description: "Source {{ $labels.source }} is failing"

      # Slow policy evaluation
      - alert: SlowPolicyEvaluation
        expr: |
          histogram_quantile(0.95,
            rate(a10e_policy_evaluation_duration_seconds_bucket[5m])
          ) > 1.0
        for: 10m
        labels:
          severity: warning
          component: performance
        annotations:
          summary: "Policy evaluation is slow"
          description: "P95 latency: {{ $value }}s"

      # IdP sync failures
      - alert: IdentitySyncFailure
        expr: |
          time() - a10e_identity_sync_last_success_timestamp > 3600
        for: 5m
        labels:
          severity: warning
          component: identity
        annotations:
          summary: "Identity provider sync failing"
          description: "No successful sync in over 1 hour"

      # Unusual after-hours access
      - alert: AfterHoursAccess
        expr: |
          sum(rate(a10e_access_granted_total{time_window="after-hours"}[5m])) > 5
        labels:
          severity: info
          component: security
        annotations:
          summary: "Unusual after-hours access detected"
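The SlowPolicyEvaluation rule relies on `histogram_quantile`, which estimates a percentile from cumulative bucket counts by interpolating linearly inside the bucket that contains the target rank. The sketch below illustrates that calculation on made-up bucket values. It is a simplified model for positive bucket bounds, not Prometheus's actual implementation, and in the alert the inputs are per-second bucket rates rather than raw counts (the interpolation is the same).

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count) sorted by upper_bound,
             ending with the (math.inf, total) bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    if len(buckets) < 2 or buckets[-1][1] == 0:
        return math.nan  # not enough data to estimate anything
    rank = q * buckets[-1][1]
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Rank falls in the +Inf bucket: report the highest finite bound.
                return buckets[-2][0]
            # Linear interpolation within the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan


# Made-up data: 100 evaluations, 98 of them finished within 1 second.
buckets = [(0.1, 50), (0.5, 90), (1.0, 98), (math.inf, 100)]
print(histogram_quantile(0.95, buckets))
```

Because the result is interpolated, its accuracy depends on the bucket boundaries. An alert threshold of `1.0` is only meaningful if the histogram has a bucket boundary at or near one second.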

Alertmanager Configuration

Configure alert routing and notifications:

route:
  group_by: ['alertname', 'severity']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 12h
  receiver: 'default'

  routes:
    # Critical security alerts
    - match:
        severity: critical
        component: authentication
      receiver: 'security-team'
      continue: true

    # Source connection issues
    - match:
        component: data-sources
      receiver: 'infrastructure-team'

receivers:
  - name: 'default'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/...'
        channel: '#alien-giraffe-alerts'

  - name: 'security-team'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/...'
        channel: '#security-alerts'
    pagerduty_configs:
      - service_key: 'your-pagerduty-key'

  - name: 'infrastructure-team'
    email_configs:
      - to: 'infrastructure@company.com'
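Routing trees are easy to misread, so it can help to trace which receiver an alert ends up at. The sketch below models Alertmanager's documented matching behavior for the tree above: child routes are tried in order, a matching child stops the search unless it sets `continue: true`, and the parent's receiver is used only when no child matches. It is a simplified model (equality matchers only), not Alertmanager code, and the dictionary layout is an assumption made for the example.

```python
# The routing tree above, expressed as nested dictionaries.
ROUTE_TREE = {
    "receiver": "default",
    "routes": [
        {"match": {"severity": "critical", "component": "authentication"},
         "receiver": "security-team", "continue": True},
        {"match": {"component": "data-sources"},
         "receiver": "infrastructure-team"},
    ],
}


def matches(route, labels):
    """A route matches when every label in its match block is equal."""
    return all(labels.get(k) == v for k, v in route.get("match", {}).items())


def resolve_receivers(route, labels, inherited=None):
    """Return the receivers that would be notified for an alert's labels."""
    if not matches(route, labels):
        return []
    receiver = route.get("receiver", inherited)
    found = []
    for child in route.get("routes", []):
        child_found = resolve_receivers(child, labels, receiver)
        found.extend(child_found)
        if child_found and not child.get("continue", False):
            break  # first match wins unless the child asks to continue
    # The parent's own receiver applies only when no child matched.
    return found or [receiver]


for labels in (
    {"alertname": "AuthenticationFailureSpike", "severity": "critical",
     "component": "authentication"},
    {"alertname": "SourceConnectionFailure", "severity": "critical",
     "component": "data-sources"},
    {"alertname": "SlowPolicyEvaluation", "severity": "warning",
     "component": "performance"},
):
    print(labels["alertname"], "->", resolve_receivers(ROUTE_TREE, labels))
```

One consequence worth noting: in this tree `continue: true` has no visible effect, because the only later sibling requires `component: data-sources` and so can never match an authentication alert as well.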

Log Analysis

Search Audit Logs

# Search by user
a10e audit search --user alice@company.com --since 7d

# Search by source
a10e audit search --source production-postgres --since 24h

# Search by event type
a10e audit search --event access_denied --since 1w

# Complex search
a10e audit search \
  --user alice@company.com \
  --source production-postgres \
  --event access_granted \
  --since 30d \
  --format json

Generate Reports

# Access summary report
a10e audit report access-summary \
  --start 2025-11-01 \
  --end 2025-11-30 \
  --output report.pdf

# User activity report
a10e audit report user-activity \
  --user alice@company.com \
  --since 90d \
  --format csv

# Compliance report
a10e audit report compliance \
  --type sox \
  --quarter Q4-2025 \
  --output sox-q4-2025.pdf

Anomaly Detection

Configure anomaly detection rules:

[audit.anomalyDetection]
enabled = true

[[audit.anomalyDetection.rules]]
name = "unusual-time-access"
condition = "access outside user's typical hours"
action = "alert"
notify = ["security@company.com"]

[[audit.anomalyDetection.rules]]
name = "unusual-location"
condition = "access from new geographic region"
action = "require_approval"
notify = ["user", "manager"]

[[audit.anomalyDetection.rules]]
name = "high-volume-access"
condition = "queries > 10x user's average"
action = "alert"
notify = ["security@company.com", "dpo@company.com"]

[[audit.anomalyDetection.rules]]
name = "sensitive-dataset-first-access"
condition = "first-time access to PII dataset"
action = "require_approval"
notify = ["manager", "data-privacy-officer"]
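The `condition` values above are descriptive, and this guide does not specify how Alien Giraffe evaluates them. As one plausible reading of the high-volume-access rule, the sketch below compares today's query count with a user's historical daily average. The function name, the 10x factor as a parameter, and the minimum-history guard are all assumptions made for illustration.

```python
from statistics import mean


def is_high_volume(daily_history, today, factor=10, min_days=7):
    """Flag today's query count if it exceeds factor times the user's average.

    daily_history: past daily query counts for one user.
    min_days: require some history first, so new users are not flagged
              against a baseline of one or two days.
    """
    if len(daily_history) < min_days:
        return False
    baseline = mean(daily_history)
    return baseline > 0 and today > factor * baseline


history = [12, 9, 15, 11, 10, 14, 13]  # made-up counts, average 12 per day
print(is_high_volume(history, 95))    # below 10x the average
print(is_high_volume(history, 400))   # well above 10x the average
```

A simple mean is sensitive to outliers in the history. A median or a rolling window may be a better baseline if your users' activity is bursty.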

Integration Examples

Elasticsearch Integration

[[audit.destinations]]
type = "elasticsearch"
endpoint = "https://elasticsearch.company.com:9200"
index = "a10e-audit-logs"
credentialsRef = "elasticsearch-creds"

# Index template settings
[audit.destinations.template.settings]
number_of_shards = 3
number_of_replicas = 2

# Index template mappings
[audit.destinations.template.mappings.properties.timestamp]
type = "date"

[audit.destinations.template.mappings.properties."user.email"]
type = "keyword"

[audit.destinations.template.mappings.properties."source.name"]
type = "keyword"

[audit.destinations.template.mappings.properties.eventType]
type = "keyword"

Query logs in Elasticsearch:

# Search for access denials
curl -X GET "https://elasticsearch.company.com:9200/a10e-audit-logs/_search" \
  -H 'Content-Type: application/json' \
  -d '{
    "query": {
      "bool": {
        "must": [
          { "term": { "eventType": "access_denied" }},
          { "range": { "timestamp": { "gte": "now-24h" }}}
        ]
      }
    }
  }'
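If you run such searches from scripts, building the query body in code avoids shell-quoting problems. The sketch below constructs an equivalent query as a Python dictionary. The `denied_events_query` helper is hypothetical, and it assumes the field names from the index template above.

```python
import json


def denied_events_query(since="now-24h", user_email=None):
    """Build an Elasticsearch query body for access_denied events."""
    filters = [
        {"term": {"eventType": "access_denied"}},
        {"range": {"timestamp": {"gte": since}}},
    ]
    if user_email:
        # Exact match works because user.email is mapped as a keyword.
        filters.append({"term": {"user.email": user_email}})
    return {"query": {"bool": {"filter": filters}}}


print(json.dumps(denied_events_query("now-7d", "alice@company.com"), indent=2))
```

The sketch uses `filter` where the curl example uses `must`. Both return the same documents, but filter clauses skip relevance scoring, which suits exact-match log searches.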

Splunk Integration

[[audit.destinations]]
type = "splunk"
endpoint = "https://splunk.company.com:8088"
token = "${SPLUNK_HEC_TOKEN}"
index = "a10e_audit"
sourcetype = "alien_giraffe:audit"

Splunk query examples:

# Access requests by user
index=a10e_audit eventType=access_request
| stats count by user.email
| sort -count

# Denied access attempts
index=a10e_audit eventType=access_denied
| timechart span=1h count by source.name

# Policy violations
index=a10e_audit eventType=policy_violation
| table timestamp, user.email, source.name, decision.reason

Datadog Integration

[monitoring.datadog]
enabled = true
apiKey = "${DATADOG_API_KEY}"
site = "datadoghq.com"
tags = ["env:production", "service:alien-giraffe"]

Compliance Reporting

SOC 2 Compliance

Generate SOC 2 audit reports:

# Access control evidence
a10e audit report compliance \
  --type soc2 \
  --control CC6.1 \
  --period 2025-Q4 \
  --output soc2-cc6-1-q4.pdf

# User access reviews
a10e audit report access-review \
  --reviewers managers \
  --period quarterly \
  --output access-review-q4.csv

GDPR Compliance

# Data subject access request (DSAR)
a10e audit dsar \
  --email alice@company.com \
  --start 2025-01-01 \
  --end 2025-12-31 \
  --output alice-dsar-2025.pdf

# Right to be forgotten - check data access
a10e audit search \
  --datasets users_pii \
  --containing "alice@company.com" \
  --since 90d

HIPAA Compliance

# PHI access audit trail
a10e audit report compliance \
  --type hipaa \
  --datasets patient_records,medical_data \
  --period 2025-Q4 \
  --output hipaa-audit-q4.pdf

Best Practices

Audit Logging

Log to multiple destinations: File + centralized (Elasticsearch/Splunk)
Retain logs appropriately: 1-7 years depending on regulations
Encrypt audit logs: Both in transit and at rest
Immutable storage: Use write-once storage for compliance
Regular reviews: Review audit logs weekly/monthly

Monitoring

Set appropriate thresholds: Avoid alert fatigue
Create runbooks: Document response procedures for alerts
Monitor the monitors: Ensure monitoring systems are healthy
Test alerts: Regularly verify alerts fire correctly
Dashboard hygiene: Keep dashboards up to date

Performance

Index audit logs: Ensure fast searching
Archive old logs: Move logs to cheaper storage once they are rarely queried but still within the retention period
Optimize queries: Use indexed fields in searches
Batch log shipping: Don’t send logs one by one
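To make the batching advice concrete for anyone writing their own log shipper, here is a minimal sketch that groups events into fixed-size batches and serializes each batch in the newline-delimited format that Elasticsearch's `_bulk` endpoint accepts. The helper names and the default batch size are assumptions, and the sketch only builds request bodies without sending them.

```python
import json
from itertools import islice


def batches(events, size=500):
    """Yield lists of up to `size` events from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(events)
    while chunk := list(islice(iterator, size)):
        yield chunk


def to_bulk_body(chunk, index="a10e-audit-logs"):
    """Serialize one batch as an Elasticsearch _bulk request body (NDJSON)."""
    lines = []
    for event in chunk:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(event))                         # document line
    return "\n".join(lines) + "\n"  # the bulk API requires a trailing newline


events = [{"eventType": "access_denied", "eventId": f"evt-{i}"} for i in range(5)]
for chunk in batches(events, size=2):
    print(len(chunk), "events,", len(to_bulk_body(chunk)), "bytes")
```

A production shipper would typically also flush on a timer, so a quiet period does not leave a partial batch unsent.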

Troubleshooting

Logs Not Appearing

# Check audit logging is enabled
a10e config get audit.enabled

# Check log destinations
a10e config get audit.destinations

# Test log destination connectivity
a10e audit test-destination elasticsearch

# View recent errors
a10e logs --component audit --level error --since 1h

Metrics Not Collected

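# Verify metrics endpoint (run on the Alien Giraffe host; metricsPort is 9090 in the config above)
curl http://localhost:9090/metrics

# Check Prometheus can scrape (run on the Prometheus host, which also defaults to port 9090)
curl http://localhost:9090/api/v1/targets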

# Test metrics collection
a10e metrics test

Alert Not Firing

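# Check alert rules are loaded (rules live in Prometheus, default port 9090, not Alertmanager)
curl http://localhost:9090/api/v1/rules

# Check alert is pending or firing in Prometheus
curl http://localhost:9090/api/v1/alerts

# Check the alert reached Alertmanager (default port 9093, v2 API)
curl http://localhost:9093/api/v2/alerts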

# Test alert manually
a10e alert test HighAccessDenialRate

Next Steps

Now that monitoring is configured:

Security Best Practices - Harden your deployment
Access Component - Understand what’s being logged
Compliance Guide - Meet regulatory requirements
Performance Tuning - Optimize based on metrics

For integration guides: