Browse docs
--- title: "Guardrails" description: "Intercept and modify requests before they reach LLM providers." icon: "shield-check" ---
Overview
Guardrails are a pipeline of rules that run before a request reaches any LLM provider. They can inspect, modify, or reject requests — giving you centralized control over every prompt that flows through Aurora.
OSS guardrails include system_prompt rules and additional rule types: llm_based_altering for content rewriting via an auxiliary LLM, regex_block for pattern-based blocking, pii_redact for PII redaction, and length_limit for token/character caps. For Enterprise-only guardrail features (response-side rewriting, additional rule types in future releases), see Enterprise Guardrails.
Guardrails work across all text-based endpoints:
/v1/chat/completions/v1/responses
Quick Start
Add a guardrails section to your config/config.yaml:
guardrails:
enabled: true
rules:
- name: "safety-prompt"
type: "system_prompt"
order: 0
system_prompt:
mode: "decorator"
content: "Always respond safely and respectfully."That's it. Every request now gets the safety prompt prepended to its system instructions.
How It Works
- Messages are extracted from the incoming request into a normalized format
- The guardrails pipeline processes the messages (inject, modify, or reject)
- Modified messages are applied back to the original request
- The request continues to the LLM provider
Guardrails never see the raw API request types — they operate on a normalized message list. This means the same guardrail works identically for /chat/completions and /responses.
Execution Order
Each guardrail has an order value that controls when it runs:
- Same order → run in parallel (concurrently)
- Different order → run sequentially (ascending)
Each sequential group receives the output of the previous group. If any guardrail returns an error, the request is rejected and never reaches the provider.
Configuration
Full Structure
guardrails:
enabled: true # Master switch (default: false)
rules:
- name: "rule-name" # Unique identifier for this instance
type: "system_prompt" # Guardrail type
user_path: "/team/privacy" # Optional base path for internal auxiliary calls
order: 0 # Execution order
system_prompt: # Type-specific settings
mode: "decorator"
content: "Your prompt text here."Environment Variable
You can toggle guardrails without editing the config file:
export GUARDRAILS_ENABLED=trueRule Fields
Guardrail Types
OSS guardrails support the system_prompt, llm_based_altering, regex_block, pii_redact, and length_limit types. For Enterprise guardrail capabilities, see Enterprise Guardrails.
`system_prompt`
Adds, replaces, or decorates the system prompt on every request.
`llm_based_altering`
Rewrites content via an auxiliary LLM provider for content safety or style normalization. Configure the backing provider and system instruction for the rewriting model.
`regex_block`
Blocks or sanitizes content matching one or more regular expression patterns. Useful for preventing prompt injection, blocking specific formats, or enforcing content policies.
`pii_redact`
Automatically detects and redacts personally identifiable information (email addresses, phone numbers, SSNs, credit card numbers) from request and response content. Supports configurable redaction strategies.
`length_limit`
Enforces maximum character or estimated-token limits on request and response content. Configurable per direction (input, output, or both).
Settings
Modes
Adds a system message only if none exists. Existing system prompts are left untouched.
- name: "default-system"
type: "system_prompt"
order: 0
system_prompt:
mode: "inject"
content: "You are a helpful assistant."Behavior:
- Request has no system prompt → adds one
- Request already has a system prompt → no change
Examples
Single Safety Guardrail
The simplest setup — add a safety prefix to every request:
guardrails:
enabled: true
rules:
- name: "safety"
type: "system_prompt"
order: 0
system_prompt:
mode: "decorator"
content: "Always be safe, respectful, and helpful."Multiple Guardrails in Parallel
Two guardrails running at the same order execute concurrently:
guardrails:
enabled: true
rules:
- name: "safety-prompt"
type: "system_prompt"
order: 0
system_prompt:
mode: "decorator"
content: "Always be safe and respectful."
- name: "compliance-prompt"
type: "system_prompt"
order: 0
system_prompt:
mode: "inject"
content: "Follow all company compliance policies."Sequential Pipeline
Guardrails with different orders run one after another. Later groups see the output of earlier ones:
guardrails:
enabled: true
rules:
# Step 1: ensure a system prompt exists
- name: "default-system"
type: "system_prompt"
order: 0
system_prompt:
mode: "inject"
content: "You are a helpful assistant."
# Step 2: decorate whatever system prompt is now present
- name: "safety-prefix"
type: "system_prompt"
order: 1
system_prompt:
mode: "decorator"
content: "[SAFETY] Always respond within company guidelines."
# Step 3: add a final compliance check after decoration
- name: "compliance-stamp"
type: "system_prompt"
order: 2
system_prompt:
mode: "override"
content: "[COMPLIANCE REVIEW] Respond according to company policy."Mixed Parallel and Sequential
guardrails:
enabled: true
rules:
# Order 0: these two run in parallel
- name: "safety"
type: "system_prompt"
order: 0
system_prompt:
mode: "decorator"
content: "Be safe."
- name: "policy"
type: "system_prompt"
order: 0
system_prompt:
mode: "inject"
content: "Follow company policy."
# Order 1: runs after both order-0 guardrails complete
- name: "final-override"
type: "system_prompt"
order: 1
system_prompt:
mode: "decorator"
content: "[FINAL CHECK]"How It Works With Different Endpoints
For Enterprise guardrail capabilities (response-side rewriting, additional rule types), see Enterprise Guardrails.
Guardrails operate on a normalized message format internally. The adaptation between API-specific request types and this format happens automatically:
Errors and Rejection
If a guardrail returns an error, the request is rejected immediately. The error is returned to the client and the request never reaches the LLM provider.
This is useful for future guardrail types that validate content (e.g., PII detection, content filtering). The system prompt guardrail does not reject requests — it only modifies them.