Start the AI Gateway in One Command.

OpenAI & Anthropic compatible. One API for every LLM provider.

OpenAI-compatible API

Anthropic-compatible API

30+ LLM providers

Self-hosted OSS

$ npx -y iaurora

[ OUR NUMBERS AT A GLANCE ]

Success rate: 64.01%

Throughput: 3,130 req/s

Peak memory: 59 MB

[ 1,000+ Teams Use Aurora ]

Figma

Framer

Slack

Trello

Twitch

Dribbble

Shopify

Spotify

Vercel

Notion

Miro

GitHub

[ Engineering Performance ]

Built for raw speed & efficiency

Aurora Gateway is engineered in Go and validated with external side-by-side load tests against Bifrost using the same host, payload, and mocked upstream.

Metric	Aurora	Bifrost	LiteLLM	Aurora vs Bifrost
Success rate	64.01%	58.16%	-	+5.85 points
Memory efficiency	59 MB	113 MB	372 MB	47% less
Throughput	3,130/s	2,888/s	475/s	+242 RPS
P50 latency	709 ms	730 ms	38.65 s	20 ms lower

External Side-by-Side Results

Metric	Aurora	Bifrost	Winner
Successful requests	192,026	174,471	Aurora
Success rate	64.01%	58.16%	Aurora
Throughput	3,130.18/s	2,888.09/s	Aurora
Mean latency	833.14ms	912.47ms	Aurora
P50 latency	709.16ms	729.60ms	Aurora
P99 latency	2.155s	2.107s	Bifrost
Peak memory	59.47MB	113.11MB	Aurora

Test conditions: 5,000 requested RPS for 60 seconds, using the same external load generator and the same mocked OpenAI-compatible upstream for both gateways, with 25 warmup requests and a 256-request prewarm.
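The published success rates appear consistent with dividing successful requests by the total number of requested requests (5,000 RPS x 60 s = 300,000). That denominator is inferred from the figures in the table, not stated in the benchmark notes, so the sketch below is a sanity check under that assumption rather than the benchmark's own method.

```python
# Sanity check of the published success rates.
# Assumption (not stated in the source): denominator = requested RPS x duration.
REQUESTED_RPS = 5000
DURATION_S = 60
TOTAL_REQUESTED = REQUESTED_RPS * DURATION_S  # 300,000 requests


def success_rate(successful: int, total: int = TOTAL_REQUESTED) -> float:
    """Return the success rate as a percentage rounded to two decimals."""
    return round(100 * successful / total, 2)


aurora = success_rate(192_026)   # successful requests from the results table
bifrost = success_rate(174_471)
print(aurora, bifrost, round(aurora - bifrost, 2))  # prints: 64.01 58.16 5.85
```

Under this reading, roughly a third of the requested load did not complete successfully for either gateway, which suggests the 5,000 RPS target was above what this host could sustain.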

Tested Under Real Gateway Pressure

Aurora was measured side-by-side against Bifrost with the same host, traffic profile, model payload, and OpenAI-compatible upstream.

Same Traffic: 5000 requested RPS for 60 seconds

Same Upstream: Mocked OpenAI-compatible provider

Same Host: Equal conditions, no vendor shortcuts

Reproduction details live in the benchmark docs, so the homepage stays focused on the performance story.

Host: Windows 10

CPU: 8 logical cores

Duration: 60 seconds

Target: 5000 RPS

[ Enterprise controls ]

Core OSS Pillars, Enterprise Gates

Practical governance, routing, and observability controls for teams moving AI traffic into managed production environments.

Governance

Manage OSS-safe API keys, usage, workflows, and audit logs from the admin plane; add tenants, users, roles, and budgets with Enterprise capabilities.

Auth & Audit: Master keys, managed keys, and auditable request history in OSS; OIDC and RBAC in Enterprise.
Usage First: Usage tracking works in OSS; forecasts, resets, and scoped spend controls require Enterprise budgets.
Admin API: Configure providers, pools, auth keys, workflows, and guardrails; tenant routes are gated.

Routing & Caching

Route requests through provider pools, fallback chains, passthrough APIs, and exact or semantic response caching.

Provider Pools: Tune upstream selection and refresh runtime configuration.
Fallback Chains: Keep requests moving when a configured upstream degrades.
Built-in Cache: Inspect exact and semantic cache behavior from the dashboard.
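To make the fallback-chain idea concrete, here is a minimal, generic sketch of trying upstreams in order until one succeeds. It does not reflect Aurora's internal implementation or configuration format; `call_with_fallback`, `AllUpstreamsFailed`, and the two handler functions are hypothetical names used only for illustration.

```python
from typing import Callable, Sequence


class AllUpstreamsFailed(Exception):
    """Raised when every upstream in the chain has failed."""


def call_with_fallback(
    upstreams: Sequence[tuple[str, Callable[[dict], dict]]],
    request: dict,
) -> tuple[str, dict]:
    """Try each (name, handler) pair in order; return the first success."""
    errors = []
    for name, handler in upstreams:
        try:
            return name, handler(request)
        except Exception as exc:  # a real gateway would match specific error classes
            errors.append(f"{name}: {exc}")
    raise AllUpstreamsFailed("; ".join(errors))


# Hypothetical upstream handlers, standing in for real provider calls.
def degraded(_request: dict) -> dict:
    raise TimeoutError("upstream timed out")


def healthy(request: dict) -> dict:
    return {"echo": request["prompt"]}


name, result = call_with_fallback(
    [("primary", degraded), ("secondary", healthy)],
    {"prompt": "Hello"},
)
print(name, result)  # prints: secondary {'echo': 'Hello'}
```

A production chain would typically also distinguish retryable failures (timeouts, 5xx, rate limits) from errors that should be returned to the caller immediately.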

Security Controls

Combine authentication, scoped keys, audit logging, guardrail rules, and private deployment practices for safer AI traffic.

Guardrail Rules: Apply configured input/output policy through the gateway pipeline.
PII Controls: Use redaction-oriented guardrails where sensitive content must be reduced.
Deployment Control: Keep the gateway inside your own network and storage boundary.

OpenAI & Anthropic integration

One gateway surface for apps and agents.

Point OpenAI- and Anthropic-compatible clients, provider-native passthrough clients, and server automation at Aurora to centralize routing, managed keys, audit logs, usage analytics, caching, and admin controls.

OpenAI

Anthropic

Gemini

DeepSeek

Azure

Oracle

Ollama

vLLM

Groq

OpenRouter

Z.ai

xAI

MiniMax

openai-sdk.py

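from openai import OpenAI

# Point the standard OpenAI SDK at the local Aurora gateway.
client = OpenAI(
    api_key="change-me",  # placeholder; use your gateway key
    base_url="http://localhost:8080/v1",
)

response = client.chat.completions.create(
    model="<model id from /v1/models>",  # substitute an id returned by /v1/models
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)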
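The model placeholder in the snippet above points at `/v1/models`. As a sketch of how to look up a valid id using only the standard library, the code below assumes the gateway returns the usual OpenAI-style list body (`{"object": "list", "data": [{"id": ...}]}`) and accepts the key as a bearer token, which is how the OpenAI SDK sends it. The helper names are illustrative.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1"  # same base URL as the SDK example
API_KEY = "change-me"                  # placeholder key


def build_models_request(
    base_url: str = BASE_URL, api_key: str = API_KEY
) -> urllib.request.Request:
    """Build a GET request for the model list with a bearer token."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def extract_model_ids(payload: dict) -> list[str]:
    """Pull ids out of an OpenAI-style model list body (assumed shape)."""
    return [item["id"] for item in payload.get("data", [])]


def list_model_ids() -> list[str]:
    """Fetch model ids from the gateway; requires a running instance."""
    with urllib.request.urlopen(build_models_request(), timeout=10) as resp:
        return extract_model_ids(json.load(resp))
```

With the gateway running, `list_model_ids()` should return the ids that can be passed as `model` in the chat completion call.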

[ Quick Start ]

Deploy in Seconds

Start with npx, ship with Docker or native binaries, and keep Enterprise delivery private.

NPX quick start

npx -y iaurora

[ OSS that works + Enterprise gates ]

Core controls stay on when Enterprise features stay off.

Model Catalog

List, route, and manage models across configured providers, aliases, and provider-specific deployments.

Usage Analytics

Track requests, tokens, cost, models, providers, user paths, and time ranges without enabling Enterprise budgets.

Fallback & Pools

Use fallback chains and provider pools to route traffic across configured upstreams.

Response Caching

Use exact and semantic response caching with cache controls, overview metrics, and request debugging.
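Exact caching only hits when two requests are identical, whereas semantic caching is generally understood to match requests that are similar in meaning. The sketch below illustrates one common way to build an exact-match key by hashing a canonical JSON encoding of the request body. It is a generic technique, not a description of how Aurora derives its cache keys, and `exact_cache_key` is a hypothetical name.

```python
import hashlib
import json


def exact_cache_key(request_body: dict) -> str:
    """Hash a canonical JSON encoding so equal requests map to the same key."""
    canonical = json.dumps(request_body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = {"model": "m", "messages": [{"role": "user", "content": "Hello"}]}
b = {"messages": [{"content": "Hello", "role": "user"}], "model": "m"}
c = {"model": "m", "messages": [{"role": "user", "content": "Hello!"}]}

print(exact_cache_key(a) == exact_cache_key(b))  # True: key order does not matter
print(exact_cache_key(a) == exact_cache_key(c))  # False: the content differs
```

Sorting keys before hashing keeps the cache from missing on requests that differ only in field order.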

Native Passthrough

Forward provider-native requests through /p/{provider}/... when full upstream API flexibility is required.
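As a rough illustration of the `/p/{provider}/...` pattern, the helper below joins a gateway origin, the provider prefix, and an upstream path. It assumes passthrough is served from the same host and port as the `/v1` API shown on this page. The provider slug and upstream path in the example are illustrative values, not confirmed route names, and `passthrough_url` is a hypothetical helper.

```python
from urllib.parse import quote

GATEWAY = "http://localhost:8080"  # assumption: same origin as the /v1 API


def passthrough_url(provider: str, upstream_path: str, gateway: str = GATEWAY) -> str:
    """Join the gateway origin, the /p/{provider} prefix, and the upstream path."""
    provider_slug = quote(provider, safe="")
    return f"{gateway.rstrip('/')}/p/{provider_slug}/{upstream_path.lstrip('/')}"


# "anthropic" and "/v1/messages" are example values only.
print(passthrough_url("anthropic", "/v1/messages"))
# prints: http://localhost:8080/p/anthropic/v1/messages
```

Check the passthrough documentation for the actual provider identifiers and any required headers before relying on this shape.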

Admin Dashboard

Manage providers, pools, auth keys, workflows, guardrails, audit logs, usage, and OSS-safe settings.

Scoped API Keys

Create managed keys with user paths, rate limits, expiry, provider/model controls, and usage statistics.

OpenAI- & Anthropic-Format APIs

Expose chat, responses, embeddings, models, files, batches, and messages for OpenAI- and Anthropic-compatible clients.

Observability

Inspect usage, cost, audit logs, console streams, cache behavior, provider health, and Prometheus metrics.

Open Source

A Go-based gateway designed for local, self-hosted, and private AI infrastructure.

[ What Teams Say ]

Trusted by engineering teams worldwide.

Aurora gives teams one place to inspect usage, tune caching, and manage provider routing instead of scattering that logic across applications.

Sarah Chen

CTO, Synthwave Labs

Aurora's OpenAI- and Anthropic-compatible surface lets teams keep client code small while centralizing auth keys, audit logs, and provider configuration.

Marcus Rodriguez

Head of Engineering, DataForge AI

The dashboard brings usage, budgets, cache behavior, audit logs, and provider health into one operational view.

Priya Kapoor

VP of Infrastructure, Pulse Analytics

Self-hosting Aurora keeps the gateway, storage, and provider credentials inside the team's own operating boundary.

James Whitfield

Security Architect, NexGen Financial

[ FAQ ]

Frequently asked questions.
