Subjects

Subjects represent who can access data in the Alien Giraffe access control model. This component implements identity management, authentication, and team-based access control. Policies reference subjects in their subjects: field to specify which users or teams are granted access.

Relationship to Policies

Subjects are one of the five core components that policies coordinate. When you define a policy, the subjects: field specifies who is authorized to access data. This component provides the infrastructure for managing those identities—verifying who users are, organizing them into teams, and enforcing authentication requirements.

Overview

Subjects encompass all entities that can request access to data—individual users, teams, service accounts, and applications. The subject system handles identity verification, team membership management, and authentication, ensuring that access requests come from legitimate, authorized entities.

Core Principles:

Identity Verification - Authenticate users through SSO, MFA, and identity providers
Team-Based Organization - Group users into teams with inherited permissions
Role-Based Access Control (RBAC) - Assign permissions based on organizational roles
Comprehensive Auditing - Track every authentication and access attempt

Key Concepts

Authentication vs Authorization

Authentication answers “Who are you?”

Verifies user identity through IAM integration
Supports Okta, Microsoft Entra ID, and Google Workspace
Authenticates via OIDC, SAML, or LDAP protocols
Supports multi-factor authentication (MFA)
Issues session tokens after successful login

Authorization answers “What can you do?”

Evaluates policies against user identity and context
Checks permissions for specific data sources and datasets
Enforces time-based and purpose-based restrictions
Returns approved/denied decision with reasoning

Access Decision Flow

Every access request follows this evaluation flow:

graph LR
    A[User Request] --> B[Authentication]
    B --> C[Context Gathering]
    C --> D[Policy Evaluation]
    D --> E{Decision}
    E -->|Approved| F[Grant Access]
    E -->|Denied| G[Deny Access]
    F --> H[Audit Log]
    G --> H


User authenticates - Proves their identity
Context gathered - Current time, location, requested data, stated purpose
Policies evaluated - All matching policies checked
Decision made - Approve with temporary credentials or deny
Action logged - Full audit trail recorded
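The five steps above can be sketched in a few lines of Python. This is only an illustration of the flow, not Alien Giraffe's implementation: `Policy`, `Decision`, `handle_request`, and the in-memory `audit_log` are hypothetical names, and policy matching is reduced to a team and source lookup.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Policy:
    name: str
    teams: set      # teams named in the policy's subjects: field
    sources: set    # data sources the policy covers


@dataclass
class Decision:
    approved: bool
    reason: str
    policies_applied: list = field(default_factory=list)


audit_log = []  # hypothetical stand-in for a real audit sink


def handle_request(user, teams, source, purpose, policies, authenticated=True):
    # Context is gathered up front so that denied requests are logged with it too.
    context = {"time": datetime.now(timezone.utc).isoformat(), "purpose": purpose}
    if not authenticated:
        decision = Decision(False, "authentication failed")
    else:
        # Policy evaluation: every policy matching one of the user's teams and the source.
        matched = [p.name for p in policies
                   if p.teams & set(teams) and source in p.sources]
        if matched:
            decision = Decision(True, "granted by matching policy", matched)
        else:
            decision = Decision(False, "no matching policy")
    # Both outcomes reach the audit log, mirroring the F --> H and G --> H edges.
    audit_log.append({"user": user, "source": source, "context": context,
                      "decision": "approved" if decision.approved else "denied",
                      "reason": decision.reason})
    return decision
```

Approved and denied requests both append to the log, so a denial leaves the same trail as a grant.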

Role-Based Access Control (RBAC)

Organize users into roles with predefined permissions:

Teams/Roles:

customer-support - Access customer data during business hours
engineering - Access development and staging databases
data-analysts - Read-only access to analytics databases
sre - Emergency production access with approval
contractors - Limited scope with strict time bounds

Users inherit permissions from every team they belong to. Permissions are additive: access is allowed if any applicable policy grants it, unless another policy explicitly denies it.
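As a rough illustration of additive permissions with an explicit-deny override, the sketch below models grants and denies as plain dictionaries keyed by team. The function name `is_allowed` and the data layout are assumptions made for this example, not the product's API.

```python
def is_allowed(user_teams, source, grants, denies):
    """Return True if any team grants access and no team explicitly denies it.

    grants and denies map a team name to the set of sources it covers.
    """
    teams = set(user_teams)
    # An explicit deny from any of the user's teams wins over every grant.
    if any(source in denies.get(team, set()) for team in teams):
        return False
    # Otherwise a single grant from any team is enough.
    return any(source in grants.get(team, set()) for team in teams)


grants = {"engineering": {"staging-db"}, "data-analysts": {"analytics-db"}}
denies = {"contractors": {"analytics-db"}}
```

With this data, a user in `engineering` and `data-analysts` can reach both `staging-db` and `analytics-db`, while adding `contractors` to the same user removes `analytics-db`.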

Identity Provider Integration

Subjects are sourced from your organization’s Identity and Access Management (IAM) systems. Instead of manually managing users in Alien Giraffe, subjects are automatically synchronized from your existing identity providers.

Supported IAM Systems

Enterprise Identity Providers:

Okta - Enterprise SSO and identity management
Microsoft Entra ID (Azure AD) - Microsoft’s cloud identity platform
Google Workspace - Google’s identity and directory services

On-Premises/Hybrid:

Active Directory - Windows domain services (via LDAP)
LDAP - Lightweight Directory Access Protocol

Authentication Protocols:

OIDC (OpenID Connect) - Modern OAuth 2.0-based authentication
SAML 2.0 - Enterprise federated identity standard
LDAP - Directory services protocol for on-premises systems

How It Works

User Authentication: Users authenticate via SSO using your IAM system
Identity Verification: IAM system confirms user identity and attributes
Subject Creation: Alien Giraffe creates/updates subject record
Team Synchronization: User’s IAM groups map to Alien Giraffe teams
Policy Evaluation: Subject is matched against policy subjects: field

Configuration Example: Okta Integration

apiVersion: v1
kind: SubjectSync
metadata:
  name: okta-subject-sync
  namespace: production
spec:
  identityProvider:
    type: oidc
    provider: okta
    issuer: https://company.okta.com
    clientId: alien-giraffe-prod
    clientSecretRef: okta-client-secret

  subjectMapping:
    # Map Okta user attributes to subjects
    - claim: email
      target: subject.email
      required: true

    - claim: given_name
      target: subject.firstName

    - claim: family_name
      target: subject.lastName

    - claim: groups
      target: subject.teams
      transform: lowercase

  teamSync:
    enabled: true
    filter: "a10e-*"              # Only sync groups starting with a10e-
    schedule: "*/15 * * * *"      # Sync every 15 minutes

  authentication:
    ssoEnabled: true
    mfaRequired: true             # Enforce MFA for all subjects
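
To make the subjectMapping rules more concrete, here is a minimal Python sketch of how claims from an ID token might be turned into a subject record. The `build_subject` function is hypothetical, and it assumes that `target` paths always start with `subject.` and that `lowercase` is the only transform; the real sync engine may behave differently.

```python
SUBJECT_MAPPING = [
    {"claim": "email", "target": "subject.email", "required": True},
    {"claim": "given_name", "target": "subject.firstName"},
    {"claim": "family_name", "target": "subject.lastName"},
    {"claim": "groups", "target": "subject.teams", "transform": "lowercase"},
]


def build_subject(claims, mapping=SUBJECT_MAPPING):
    subject = {}
    for rule in mapping:
        value = claims.get(rule["claim"])
        if value is None:
            # Only claims marked required abort the mapping when missing.
            if rule.get("required"):
                raise ValueError(f"missing required claim: {rule['claim']}")
            continue
        if rule.get("transform") == "lowercase":
            # Group claims arrive as lists; scalar claims are plain strings.
            if isinstance(value, list):
                value = [item.lower() for item in value]
            else:
                value = value.lower()
        # "subject.email" -> "email"
        key = rule["target"].split(".", 1)[1]
        subject[key] = value
    return subject
```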

Configuration Example: Microsoft Entra ID

apiVersion: v1
kind: SubjectSync
metadata:
  name: entra-subject-sync
  namespace: production
spec:
  identityProvider:
    type: saml
    provider: azure-ad
    entityId: https://a10e.company.com
    ssoURL: https://login.microsoftonline.com/tenant-id/saml2

  subjectMapping:
    - claim: http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress
      target: subject.email

    - claim: http://schemas.microsoft.com/ws/2008/06/identity/claims/groups
      target: subject.teams

  teamSync:
    enabled: true
    groupMapping:
      # Map Entra ID group IDs to team names
      - entraGroupId: "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
        team: engineering
      - entraGroupId: "b2c3d4e5-f6a7-8901-bcde-f12345678901"
        team: data-science

Configuration Example: Google Workspace

apiVersion: v1
kind: SubjectSync
metadata:
  name: google-workspace-sync
  namespace: production
spec:
  identityProvider:
    type: oidc
    provider: google
    hostedDomain: company.com       # Restrict to company domain
    clientId: google-oauth-client-id
    clientSecretRef: google-oauth-secret

  subjectMapping:
    - claim: email
      target: subject.email
      required: true
      validate: '@company\.com$'    # Ensure company email (dot escaped so it matches literally)

    - claim: hd
      target: subject.domain

    - claim: given_name
      target: subject.firstName

    - claim: family_name
      target: subject.lastName

  teamSync:
    enabled: true
    apiCredentialsRef: google-service-account
    orgUnit: /                      # Sync all organizational units
    schedule: "0 * * * *"           # Hourly sync
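
The email check in this example can be approximated in Python as follows. This sketch assumes that `validate` is a regular expression searched against the claim value, which this page does not state explicitly; `is_company_email` is a hypothetical helper.

```python
import re

# The dot is escaped so that it matches a literal "." rather than any character.
# \Z anchors at the very end of the string, so a trailing newline is rejected too.
COMPANY_EMAIL = re.compile(r"@company\.com\Z")


def is_company_email(email):
    return COMPANY_EMAIL.search(email) is not None
```

An unescaped dot would also accept addresses such as `alice@companyxcom`, which is why the pattern escapes it.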

Team Membership Synchronization

IAM groups automatically map to Alien Giraffe teams:

Direct Mapping:

Okta group engineering → Team engineering
Entra ID group data-scientists → Team data-science
Google Workspace group customer-support@company.com → Team customer-support

Pattern-Based Mapping:

Prefix filtering: Only sync groups matching a10e-*
Transform: a10e-backend-team → backend
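
A pattern-based mapping like the one above could be sketched as below. The function `team_for_group` is hypothetical. It assumes that the filter is a shell-style glob and that the transform strips the `a10e-` prefix and a trailing `-team` suffix, which is one reading of the `a10e-backend-team` → `backend` example.

```python
from fnmatch import fnmatchcase


def team_for_group(group, pattern="a10e-*", prefix="a10e-", suffix="-team"):
    # Groups that do not match the filter are not synced at all.
    if not fnmatchcase(group, pattern):
        return None
    name = group[len(prefix):]
    if name.endswith(suffix):
        name = name[:-len(suffix)]
    return name
```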

Near-Real-Time Sync:

Team membership changes propagate on the next scheduled sync (every 15 minutes in the Okta example, hourly in the Google Workspace example)
User joins IAM group → Automatically added to team → Policies apply
User leaves IAM group → Removed from team → Access revoked
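
Conceptually, each sync run compares the teams a subject currently has with the groups reported by the IAM system and applies the difference. A minimal sketch, with the hypothetical helper `membership_changes`:

```python
def membership_changes(current_teams, iam_groups):
    """Return (teams_to_add, teams_to_remove) for one subject."""
    current, desired = set(current_teams), set(iam_groups)
    to_add = sorted(desired - current)      # joined an IAM group: policies start to apply
    to_remove = sorted(current - desired)   # left an IAM group: access is revoked
    return to_add, to_remove
```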

For detailed IAM integration configuration, including HR system synchronization and lifecycle management, see the Context component.

Features

Multi-Factor Authentication (MFA)

Require additional verification for sensitive access:

MFA Methods:

TOTP - Time-based one-time passwords (Google Authenticator, Authy)
Hardware Keys - YubiKey, USB security keys
Push Notifications - Mobile app approval (Duo, Okta Verify)
SMS/Email - Backup verification codes

When to Require MFA:

Production database access
PII or financial data
Emergency break-glass access
Privileged operations (DDL, schema changes)

Session Management

Access sessions are time-bound and monitored:

Session Properties:

Duration - Automatic expiration (typically 1-8 hours)
Idle Timeout - Revoke if inactive for specified time
Concurrent Sessions - Limit multiple active sessions
Renewal - Optional extension with re-authentication

Session Lifecycle:

Create Session → Active → Idle Warning → Expired → Revoked
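
The lifecycle can be read as a function of three timestamps. The sketch below is one possible interpretation, not the product's logic: `session_state` and the `warn_before` window are assumptions, and the default durations are borrowed from the `maxDuration: 4h` and `idleTimeout: 30m` values used in the configuration examples on this page.

```python
from datetime import datetime, timedelta


def session_state(created, last_activity, now,
                  max_duration=timedelta(hours=4),
                  idle_timeout=timedelta(minutes=30),
                  warn_before=timedelta(minutes=5)):
    # Hard expiry applies regardless of activity.
    if now - created >= max_duration:
        return "expired"
    idle = now - last_activity
    # Sessions idle for the full timeout are revoked.
    if idle >= idle_timeout:
        return "revoked"
    # Shortly before the idle timeout the user is warned.
    if idle >= idle_timeout - warn_before:
        return "idle-warning"
    return "active"


start = datetime(2025, 11, 18, 14, 0)
print(session_state(start, start, start + timedelta(minutes=10)))
```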

Context-Aware Policies

Access decisions consider rich contextual information:

Context Dimensions:

Time - Current time, day of week, business hours
Location - IP address, geographic region, VPN status
Device - Managed vs unmanaged, OS version, compliance status
Purpose - Stated reason (customer support, debugging, analytics)
Risk Score - Anomaly detection, unusual patterns

Example: Production database access might be denied outside business hours unless the user is on-call and accessing from a managed device on the corporate VPN.
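
The example rule can be written as a small predicate. This is only a sketch: the function names are hypothetical, and business hours are assumed to be Monday to Friday, 09:00 to 17:00, in the timestamp's own time zone.

```python
from datetime import datetime


def in_business_hours(ts, start_hour=9, end_hour=17):
    # weekday() is 0 for Monday through 6 for Sunday.
    return ts.weekday() < 5 and start_hour <= ts.hour < end_hour


def allow_production_access(ts, on_call, managed_device, on_vpn):
    if in_business_hours(ts):
        return True
    # Outside business hours all three conditions must hold.
    return on_call and managed_device and on_vpn


late_evening = datetime(2025, 11, 18, 23, 0)
print(allow_production_access(late_evening, on_call=True, managed_device=True, on_vpn=False))
```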

Audit Logging

Comprehensive tracking of all access activity:

What’s Logged:

Access Requests - User, time, requested data, purpose
Policy Decisions - Which policies applied, approval/denial reasoning
Data Operations - Queries executed, data retrieved, modifications made
Authentication Events - Login attempts, MFA challenges, session creation
Administrative Actions - Policy changes, configuration updates

Audit Log Format:

{
  "timestamp": "2025-11-18T14:32:10Z",
  "event_type": "access_request",
  "user": "alice@company.com",
  "teams": ["customer-support"],
  "source": "production-db",
  "datasets": ["customers", "orders"],
  "channel": "sql",
  "purpose": "customer-support",
  "decision": "approved",
  "policies_applied": ["customer-support-access"],
  "session_id": "sess-abc123",
  "session_duration": "4h",
  "context": {
    "ip": "203.0.113.42",
    "location": "US-CA",
    "device": "managed-laptop",
    "time": "business-hours"
  }
}
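
For teams that want to produce or test against records in this shape, the following Python sketch builds a subset of the fields shown above. `audit_record` and `to_log_line` are hypothetical helpers, and the function expects a timezone-aware `now`.

```python
import json
from datetime import datetime, timezone


def audit_record(user, teams, source, datasets, decision, policies, now=None):
    now = now or datetime.now(timezone.utc)
    return {
        # Normalize to UTC so the trailing "Z" is accurate.
        "timestamp": now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "event_type": "access_request",
        "user": user,
        "teams": list(teams),
        "source": source,
        "datasets": list(datasets),
        "decision": decision,
        "policies_applied": list(policies),
    }


def to_log_line(record):
    # One JSON object per line keeps the log easy to ship and search.
    return json.dumps(record, sort_keys=True)
```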

Temporary Credentials

Access is granted through short-lived credentials:

Credential Types:

Database Credentials - Temporary username/password pairs
API Tokens - Bearer tokens with limited scope
Signed URLs - Pre-signed S3/GCS URLs for object storage
Service Account Keys - Temporary cloud provider credentials

Credential Properties:

Auto-generated, never reused
Expires with session
Scoped to approved datasets only
Rotated on policy changes
Revoked on session termination
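
The properties above can be illustrated with Python's `secrets` module. This is a sketch under stated assumptions: `issue_credential` and `is_valid` are hypothetical names, the credential is a plain dictionary, and the 4-hour default session length is illustrative.

```python
import secrets
from datetime import datetime, timedelta, timezone


def issue_credential(datasets, issued_at, session_length=timedelta(hours=4)):
    return {
        # Fresh random values on every call, so credentials are never reused.
        "username": f"tmp_{secrets.token_hex(4)}",
        "password": secrets.token_urlsafe(32),
        # Scoped to the approved datasets only.
        "datasets": sorted(datasets),
        # The credential expires together with the session.
        "expires_at": issued_at + session_length,
    }


def is_valid(credential, now, revoked=False):
    return not revoked and now < credential["expires_at"]


issued = datetime(2025, 11, 18, 14, 0, tzinfo=timezone.utc)
credential = issue_credential({"customers", "orders"}, issued)
```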

Configuration Examples

Basic Access Configuration

Define access requirements for a data source:

apiVersion: v1
kind: AccessConfig
metadata:
  name: production-db-access
  namespace: production
spec:
  source: production-db
  authentication:
    required: true
    methods: [oidc, saml]
  authorization:
    engine: policy-based
    defaultDeny: true           # Deny by default, require explicit policy
  session:
    maxDuration: 4h
    idleTimeout: 30m
    renewalAllowed: true
  audit:
    level: verbose
    retention: 90d
    destinations:
      - type: s3
        bucket: audit-logs
      - type: elasticsearch
        index: access-logs
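
Duration strings such as `4h`, `30m`, and `90d` appear throughout these examples. A minimal parser for that shorthand might look like the following; `parse_duration` is a hypothetical helper, and it assumes that only whole numbers with a single `m`, `h`, or `d` suffix are used.

```python
import re
from datetime import timedelta

_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text):
    match = re.fullmatch(r"(\d+)([mhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return timedelta(**{_UNITS[unit]: amount})
```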

MFA Requirements

Enforce multi-factor authentication for sensitive data:

apiVersion: v1
kind: MFAPolicy
metadata:
  name: production-mfa-requirement
  namespace: production
spec:
  applies_to:
    sources: [production-db, production-s3]
    dataClassifications: [pii, financial]
  mfa:
    required: true
    methods: [totp, hardware-key]
    graceperiod: 0              # No grace period for production
  exceptions:
    - application: monitoring-system  # Service accounts exempt

IP Allowlist

Restrict access to specific network locations:

apiVersion: v1
kind: NetworkPolicy
metadata:
  name: corporate-network-only
  namespace: production
spec:
  sources: [production-db, production-redis]
  allowlist:
    - cidr: 203.0.113.0/24      # Corporate office
    - cidr: 198.51.100.0/24     # VPN gateway
    - cidr: 192.0.2.10/32       # Specific trusted server
  vpnRequired: true
  message: "Production access requires corporate VPN connection"
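
The allowlist check itself is straightforward with Python's standard `ipaddress` module. In this sketch the CIDR ranges are copied from the example above, while `ip_allowed` and its `on_vpn` flag are hypothetical.

```python
import ipaddress

ALLOWLIST = [
    ipaddress.ip_network("203.0.113.0/24"),   # Corporate office
    ipaddress.ip_network("198.51.100.0/24"),  # VPN gateway
    ipaddress.ip_network("192.0.2.10/32"),    # Specific trusted server
]


def ip_allowed(ip, on_vpn, vpn_required=True):
    # With vpnRequired set, a non-VPN connection fails before the CIDR check.
    if vpn_required and not on_vpn:
        return False
    address = ipaddress.ip_address(ip)
    return any(address in network for network in ALLOWLIST)
```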

Anomaly Detection

Alert on unusual access patterns:

apiVersion: v1
kind: AnomalyDetection
metadata:
  name: production-anomaly-detection
spec:
  sources: [production-db]
  rules:
    - name: unusual-time
      condition: access outside user's typical hours
      action: require_approval
      notify: [security@company.com]

    - name: unusual-location
      condition: access from new geographic region
      action: require_mfa
      notify: [user, manager]

    - name: unusual-volume
      condition: queries > 10x user's average
      action: alert
      notify: [security@company.com, dpo@company.com]

    - name: sensitive-dataset
      condition: first-time access to PII dataset
      action: require_approval
      notify: [manager, data-privacy-officer]
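
The conditions above are written in prose. As one possible reading, the `unusual-volume` rule could be evaluated as shown below. `unusual_volume` is a hypothetical function, and treating the "user's average" as the arithmetic mean of past daily query counts is an assumption.

```python
from statistics import mean


def unusual_volume(daily_query_history, queries_today, factor=10):
    # Without a baseline there is nothing to compare against.
    if not daily_query_history:
        return False
    return queries_today > factor * mean(daily_query_history)
```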

Best Practices

Implement Defense in Depth

Layer multiple security controls:

Network Security - VPN, IP allowlists, private networks
Authentication - Strong passwords, SSO, MFA
Authorization - Least-privilege policies, time bounds
Monitoring - Audit logs, anomaly detection, alerts
Data Protection - Encryption in transit and at rest

Require MFA for Production

Always enforce MFA for production data:

Protects against credential theft
Adds human verification step
Satisfies compliance requirements (SOC 2, ISO 27001)
Provides stronger audit trail

Monitor Access Patterns

Regularly review access logs:

Identify unused permissions (candidates for removal)
Detect unusual access patterns
Track compliance with policies
Generate access reports for auditors

Set Appropriate Session Durations

Balance usability with security:

Short sessions (1-2h) - Sensitive data, PII, production writes
Medium sessions (4-8h) - Read-only analytics, development databases
Long sessions (24h) - Service accounts, automated pipelines

Implement Break-Glass Procedures

Plan for emergencies:

Define emergency access policies
Require strong justification
Enable verbose auditing
Notify security team immediately
Review emergency access weekly

Test Access Revocation

Verify that access is properly revoked:

Sessions expire at specified duration
Credentials become invalid after expiration
Employee offboarding triggers immediate revocation
Policy changes revoke affected sessions

Common Patterns

Read-Only Replica Access

Grant analysts access to read replicas, not production:

spec:
  subjects:
    - team: analytics
  resources:
    - source: analytics-replica  # Read replica, not production
      permissions: [SELECT]       # Explicitly read-only
  channels:
    - name: sql
      operation: r
  session:
    maxDuration: 8h               # Medium session for read-only analysis

Approval Workflow

Require manager approval for sensitive access:

spec:
  resources:
    - source: production-db
      datasets: [users_pii]
  channels:
    - name: sql
      operation: rw
  approval:
    required: true
    approvers: [manager, data-privacy-officer]
    ttl: 1h                       # Approval expires
    escalation:
      after: 30m
      to: [senior-manager]
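
The timing fields in this pattern can be sketched as follows. This is an interpretation, not documented behavior: `approval_status` is hypothetical, `ttl` is assumed to count from the moment of approval, and `escalation.after` is assumed to count from the original request.

```python
from datetime import datetime, timedelta


def approval_status(requested_at, approved_at, now,
                    ttl=timedelta(hours=1),
                    escalate_after=timedelta(minutes=30)):
    if approved_at is None:
        # Unanswered requests escalate to the next approver tier.
        if now - requested_at >= escalate_after:
            return "escalated"
        return "pending"
    # An approval is only usable until its TTL runs out.
    return "valid" if now - approved_at < ttl else "expired"


requested = datetime(2025, 11, 18, 14, 0)
print(approval_status(requested, None, requested + timedelta(minutes=45)))
```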

Emergency Access

Break-glass access with enhanced auditing:

spec:
  subjects:
    - team: on-call
  resources:
    - source: production-db
  channels:
    - name: sql
      operation: rw
  session:
    maxDuration: 1h               # Short emergency sessions
  audit:
    level: verbose
    notify:
      immediate: [security@company.com]
      slack: "#security-alerts"
  mfa:
    required: true


Related Components

Policies - Centralize subject definitions with other access control components
Resources - Define what data subjects can access
Constraints - Set temporal limits on subject access
Channels - Specify how subjects access data
Context - Provide identity and organizational context for subjects