---
title: "Sizing and Redundancy"
description: "Capacity planning, replica sizing, cluster configuration, and high-availability guidance for Aurora Enterprise."
icon: "server"
---

Overview

Aurora is designed to be horizontally scalable and stateless. This guide covers capacity planning, cluster configuration, database redundancy, and cache HA for production Enterprise deployments.

Capacity Planning

Aurora's resource requirements depend on request volume, model complexity, and enabled features. The figures below are starting points; benchmark with your actual workload before settling on a size.

Workload	Requests/s	Recommended Replicas	Memory per Replica	CPU per Replica
Light	< 100	2	256 MB	0.5 core
Moderate	100–1,000	2–4	512 MB	1 core
Heavy	1,000–5,000	4–8	1 GB	2 cores
Extreme	5,000+	8+	2 GB	4 cores

Because Aurora is stateless, scaling horizontally means adding replicas behind a load balancer; any replica can serve any request.
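The per-replica figures in the sizing table can be expressed as Kubernetes resource requests. The fragment below is a minimal sketch for the Moderate tier, written as a container-level `resources` block; whether the Aurora Helm chart exposes a `resources` value in exactly this shape is an assumption, so check the chart's values file before using it.

```yaml
# Container-level resources for the "Moderate" tier (512 MB, 1 core per replica).
# Assumption: applied to the Aurora container spec, or to a chart value of the same shape.
resources:
  requests:
    memory: "512Mi"   # Mi is binary (2^20 bytes), slightly more than 512 MB
    cpu: "1"          # one full core
  limits:
    memory: "512Mi"   # request == limit avoids memory overcommit on the node
```

Setting the memory limit equal to the request makes per-replica usage predictable; the CPU limit is left out here so that replicas can burst, which is a design choice rather than a requirement.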

Cluster Configuration

Enterprise adds a cluster control-plane layer for managing multi-node deployments:

yaml

cluster:
  enabled: true
  node_id: "enterprise-node-1"
  node_name: "enterprise-node-1"
  region: "us-east"
  zone: "us-east-a"
  advertise_url: "https://aurora-enterprise.example.com"
  heartbeat_interval_seconds: 30
  failover_mode: "active_passive"

Failover Modes

Mode	Description
`single_node`	No failover. Single instance.
`active_passive`	One active node handles traffic; standby nodes take over on failure.
`active_active`	All nodes handle traffic simultaneously. Requires shared database or cross-region replication.
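As an illustration of the modes above, a second node in an `active_active` cluster might be configured as follows. This sketch reuses only the keys shown in the cluster example; the node name, the zone value, and the idea that both nodes share one `advertise_url` are assumptions for illustration, not documented behavior.

```yaml
# Hypothetical second node, placed in a different zone from enterprise-node-1.
cluster:
  enabled: true
  node_id: "enterprise-node-2"        # assumed to be unique per node
  node_name: "enterprise-node-2"
  region: "us-east"
  zone: "us-east-b"                   # illustrative zone value
  advertise_url: "https://aurora-enterprise.example.com"  # assumption: shared URL
  heartbeat_interval_seconds: 30
  failover_mode: "active_active"      # requires a shared database or cross-region replication
```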

Node Metadata

The cluster service tracks node identity, region, zone, and heartbeat status. This metadata is available from the admin dashboard and can be used for:

Multi-region routing decisions
Zone-aware deployment patterns
Health monitoring and alerting

High Availability

For production deployments:

Minimum 2 replicas across different availability zones
Health checks: Aurora exposes /health for liveness and readiness probes
Graceful shutdown: Aurora handles SIGTERM for zero-downtime rolling updates
Stateless design: Aurora stores no session state in-process; scale horizontally freely
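On Kubernetes, the checklist above maps onto standard pod-spec fields. The following is a sketch of a Deployment pod template fragment (`spec.template.spec`), not the chart's actual template: the container name, port 8080, the `app: aurora` label, and all timing values are assumptions to adjust for your deployment.

```yaml
# Fragment of a Deployment pod template (spec.template.spec).
# Assumptions: container name, port 8080, label app: aurora, timing values.
terminationGracePeriodSeconds: 30     # time allowed after SIGTERM before SIGKILL
topologySpreadConstraints:
  - maxSkew: 1                        # keep replica counts per zone within 1 of each other
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: aurora
containers:
  - name: aurora
    ports:
      - containerPort: 8080
    livenessProbe:                    # restart the container if /health stops responding
      httpGet:
        path: /health
        port: 8080
      periodSeconds: 10
      failureThreshold: 3
    readinessProbe:                   # remove the pod from Service endpoints while unhealthy
      httpGet:
        path: /health
        port: 8080
      periodSeconds: 5
      failureThreshold: 2
```

`terminationGracePeriodSeconds` should be at least as long as the slowest in-flight request you expect Aurora to finish after receiving SIGTERM.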

Kubernetes (Helm)

The Helm chart includes production HA defaults:

yaml

replicaCount: 2
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
podDisruptionBudget:
  enabled: true
  minAvailable: 1

Database Requirements

In production, Aurora uses PostgreSQL for persistence (metadata, usage, audit logs, identity, budgets). Development defaults to SQLite.

Environment	Database Setup	Notes
Development	SQLite (default)	No separate database needed
Production (single region)	PostgreSQL 14+ primary + replica	Automatic failover recommended
Production (multi-region)	PostgreSQL with logical replication	Each region writes to its own primary

Connection pooling: Use PgBouncer or similar for connection management
Backups: Regular backups with point-in-time recovery
MongoDB: Also supported as an alternative storage backend

Redis Cache

When caching is enabled:

Redis 7+ with Sentinel for HA (non-clustered primary/replica mode)
Redis Cluster for larger deployments requiring horizontal scaling
Memory sizing: Allocate based on cache TTL, request volume, and response sizes
Regional cache: In multi-region setups, each region uses its own Redis (a cache miss is an acceptable latency hit)
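For the memory sizing item, a rough upper bound can be worked out from the three inputs it names. This is a back-of-the-envelope sketch, not an Aurora-specific formula: it assumes every write creates a distinct entry that lives for the full TTL, and the overhead factor k is an assumed allowance for keys and Redis bookkeeping.

```latex
% M: estimated Redis memory
% r_w: new cache entries written per second (cache misses that get stored)
% T: cache TTL in seconds
% s: average stored response size
% k: overhead factor for keys and bookkeeping (assumed, e.g. 1.5)
\[
  M \approx r_w \cdot T \cdot s \cdot k
\]
% Illustrative numbers only: r_w = 50/s, T = 600 s, s = 8 KB, k = 1.5
\[
  M \approx 50 \cdot 600 \cdot 8\,\text{KB} \cdot 1.5
    = 360{,}000\,\text{KB} \approx 360\,\text{MB}
\]
```

Treat the result as a starting point and confirm it against observed Redis memory usage under real traffic.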

Cross-Region

For multi-region deployment patterns including active-passive and active-active, see Cross-Region Deployment.
