---
title: "Sizing and Redundancy"
description: "Capacity planning, replica sizing, cluster configuration, and high-availability guidance for Aurora Enterprise."
icon: "server"
---
Overview
Aurora is designed to be horizontally scalable and stateless. This guide covers capacity planning, cluster configuration, database redundancy, and cache HA for production Enterprise deployments.
Capacity Planning
Aurora's resource requirements depend on request volume, model complexity, and enabled features. These are starting-point guidelines — benchmark with your actual workload.
Because Aurora is stateless, any replica can serve any request. To add capacity, add replicas behind a load balancer.
Cluster Configuration
Enterprise adds a cluster control-plane layer for managing multi-node deployments:
```yaml
cluster:
  enabled: true
  node_id: "enterprise-node-1"
  node_name: "enterprise-node-1"
  region: "us-east"
  zone: "us-east-a"
  advertise_url: "https://aurora-enterprise.example.com"
  heartbeat_interval_seconds: 30
  failover_mode: "active_passive"
```
Failover Modes
Node Metadata
The cluster service tracks node identity, region, zone, and heartbeat status. This metadata is available from the admin dashboard and can be used for:
- Multi-region routing decisions
- Zone-aware deployment patterns
- Health monitoring and alerting
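
To illustrate how the identity and placement fields might differ between nodes, here is a sketch of a second node's configuration in another zone. It reuses only the keys shown in the cluster configuration above. The node ID, zone, and URL values are hypothetical, and whether each node advertises its own URL or a shared one is an assumption to check against your deployment.

```yaml
# Hypothetical second node; keys are from the example above, values are illustrative.
cluster:
  enabled: true
  node_id: "enterprise-node-2"        # assumed to be unique per node
  node_name: "enterprise-node-2"
  region: "us-east"
  zone: "us-east-b"                   # a different zone from node 1
  advertise_url: "https://aurora-enterprise-2.example.com"  # placeholder URL
  heartbeat_interval_seconds: 30
  failover_mode: "active_passive"
```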
High Availability
For production deployments:
- Minimum 2 replicas across different availability zones
- Health checks: Aurora exposes `/health` for liveness and readiness probes
- Graceful shutdown: Aurora handles SIGTERM for zero-downtime rolling updates
- Stateless design: Aurora stores no session state in-process; scale horizontally freely
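
If you deploy with plain Kubernetes manifests, the points above could be expressed roughly as in the Deployment fragment below. This is a sketch, not the chart's actual template: the resource name, labels, image, container port (8080), and probe timings are all assumptions. Only the `/health` path, the two-replica minimum, and SIGTERM handling come from this guide.

```yaml
# Illustrative Deployment fragment; name, labels, image, and port are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aurora
spec:
  replicas: 2                              # minimum recommended above
  selector:
    matchLabels:
      app: aurora
  template:
    metadata:
      labels:
        app: aurora
    spec:
      terminationGracePeriodSeconds: 30    # time allowed after SIGTERM before SIGKILL
      topologySpreadConstraints:           # spread replicas across availability zones
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: aurora
      containers:
        - name: aurora
          image: example.com/aurora-enterprise:latest   # placeholder image
          ports:
            - containerPort: 8080                        # assumed port
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            periodSeconds: 5
```

With `whenUnsatisfiable: DoNotSchedule`, a replica stays pending rather than landing in an already-occupied zone. Switch to `ScheduleAnyway` if availability of the pod matters more than strict zone spread.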
Kubernetes (Helm)
The Helm chart includes production HA defaults:
```yaml
replicaCount: 2
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
podDisruptionBudget:
  enabled: true
  minAvailable: 1
```
Database Requirements
Aurora uses PostgreSQL for persistence (metadata, usage, audit logs, identity, budgets).
- Connection pooling — Use PgBouncer or similar for connection management
- Backups — Regular backups with point-in-time recovery
- MongoDB — Also supported as an alternative storage backend
Redis Cache
When caching is enabled:
- Redis 7+ with Sentinel for HA (standalone mode)
- Redis Cluster for larger deployments requiring horizontal scaling
- Memory sizing — Allocate based on cache TTL, request volume, and response sizes
- Regional cache — In multi-region setups, each region uses its own Redis (cache miss = acceptable latency hit)
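
One rough way to turn the memory-sizing inputs into a number is a steady-state estimate: entries written per second, times how long each lives, times their average size. The symbols and sample figures below are illustrative assumptions, not Aurora defaults or measured values.

$$
M \approx r_{\text{write}} \cdot T_{\text{TTL}} \cdot \bar{s} \cdot k
$$

Here $r_{\text{write}}$ is the rate of new cache entries per second, $T_{\text{TTL}}$ is the cache TTL in seconds, $\bar{s}$ is the average stored response size, and $k$ is an assumed overhead factor for keys, metadata, and fragmentation. For example, with 50 new entries per second, a one-hour TTL, 4 KiB average responses, and $k = 1.5$:

$$
M \approx 50 \cdot 3600 \cdot 4\,\text{KiB} \cdot 1.5 = 1{,}080{,}000\,\text{KiB} \approx 1.03\,\text{GiB}
$$

Treat the result as a starting point and compare it with the memory usage Redis actually reports under your workload.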
Cross-Region
For multi-region deployment patterns including active-passive and active-active, see Cross-Region Deployment.