OSS / Policy Controls

Guardrails

Browse docs

--- title: "Guardrails" description: "Intercept and modify requests before they reach LLM providers." icon: "shield-check" ---

Overview

Guardrails are a pipeline of rules that run before a request reaches any LLM provider. They can inspect, modify, or reject requests â€” giving you centralized control over every prompt that flows through Aurora.

OSS guardrails include system_prompt rules and additional rule types: llm_based_altering for content rewriting via an auxiliary LLM, regex_block for pattern-based blocking, pii_redact for PII redaction, and length_limit for token/character caps. For Enterprise-only guardrail features (response-side rewriting, additional rule types in future releases), see Enterprise Guardrails.

Guardrails work across all text-based endpoints:

/v1/chat/completions
/v1/responses

Quick Start

Add a guardrails section to your config/config.yaml:

yaml

guardrails:
  enabled: true
  rules:
    - name: "safety-prompt"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "decorator"
        content: "Always respond safely and respectfully."

That's it. Every request now gets the safety prompt prepended to its system instructions.

How It Works

Diagram

Messages are extracted from the incoming request into a normalized format
The guardrails pipeline processes the messages (inject, modify, or reject)
Modified messages are applied back to the original request
The request continues to the LLM provider

Guardrails never see the raw API request types â€” they operate on a normalized message list. This means the same guardrail works identically for /chat/completions and /responses.

Execution Order

Each guardrail has an order value that controls when it runs:

Same order â†’ run in parallel (concurrently)
Different order â†’ run sequentially (ascending)

Diagram

Each sequential group receives the output of the previous group. If any guardrail returns an error, the request is rejected and never reaches the provider.

Configuration

Full Structure

yaml

guardrails:
  enabled: true    # Master switch (default: false)
  rules:
    - name: "rule-name"         # Unique identifier for this instance
      type: "system_prompt"     # Guardrail type
      user_path: "/team/privacy" # Optional base path for internal auxiliary calls
      order: 0                  # Execution order
      system_prompt:            # Type-specific settings
        mode: "decorator"
        content: "Your prompt text here."

Environment Variable

You can toggle guardrails without editing the config file:

bash

export GUARDRAILS_ENABLED=true

Rule Fields

Field	Required	Description
`name`	Yes	Human-readable identifier. Supports spaces and unicode, but not `/`.
`type`	Yes	Guardrail type: `system_prompt`, `llm_based_altering`, `regex_block`, `pii_redact`, `length_limit`. See each type below.
`user_path`	No	Optional base user path for internal auxiliary guardrail requests.
`order`	No	Execution order. Default `0`. Same value = parallel, different = sequential.

Guardrail Types

OSS guardrails support the system_prompt, llm_based_altering, regex_block, pii_redact, and length_limit types. For Enterprise guardrail capabilities, see Enterprise Guardrails.

`system_prompt`

Adds, replaces, or decorates the system prompt on every request.

`llm_based_altering`

Rewrites content via an auxiliary LLM provider for content safety or style normalization. Configure the backing provider and system instruction for the rewriting model.

`regex_block`

Blocks or sanitizes content matching one or more regular expression patterns. Useful for preventing prompt injection, blocking specific formats, or enforcing content policies.

`pii_redact`

Automatically detects and redacts personally identifiable information (email addresses, phone numbers, SSNs, credit card numbers) from request and response content. Supports configurable redaction strategies.

`length_limit`

Enforces maximum character or estimated-token limits on request and response content. Configurable per direction (input, output, or both).

Settings

Field	Required	Description
`mode`	No	`inject`, `override`, or `decorator`. Default: `inject`.
`content`	Yes	The system prompt text to apply.

Modes

Adds a system message only if none exists. Existing system prompts are left untouched.

yaml

    - name: "default-system"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "inject"
        content: "You are a helpful assistant."

Behavior:

Request has no system prompt â†’ adds one
Request already has a system prompt â†’ no change

Examples

Single Safety Guardrail

The simplest setup â€” add a safety prefix to every request:

yaml

guardrails:
  enabled: true
  rules:
    - name: "safety"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "decorator"
        content: "Always be safe, respectful, and helpful."

Multiple Guardrails in Parallel

Two guardrails running at the same order execute concurrently:

yaml

guardrails:
  enabled: true
  rules:
    - name: "safety-prompt"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "decorator"
        content: "Always be safe and respectful."

    - name: "compliance-prompt"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "inject"
        content: "Follow all company compliance policies."

Sequential Pipeline

Guardrails with different orders run one after another. Later groups see the output of earlier ones:

yaml

guardrails:
  enabled: true
  rules:
    # Step 1: ensure a system prompt exists
    - name: "default-system"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "inject"
        content: "You are a helpful assistant."

    # Step 2: decorate whatever system prompt is now present
    - name: "safety-prefix"
      type: "system_prompt"
      order: 1
      system_prompt:
        mode: "decorator"
        content: "[SAFETY] Always respond within company guidelines."

    # Step 3: add a final compliance check after decoration
    - name: "compliance-stamp"
      type: "system_prompt"
      order: 2
      system_prompt:
        mode: "override"
        content: "[COMPLIANCE REVIEW] Respond according to company policy."

Mixed Parallel and Sequential

yaml

guardrails:
  enabled: true
  rules:
    # Order 0: these two run in parallel
    - name: "safety"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "decorator"
        content: "Be safe."

    - name: "policy"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "inject"
        content: "Follow company policy."

    # Order 1: runs after both order-0 guardrails complete
    - name: "final-override"
      type: "system_prompt"
      order: 1
      system_prompt:
        mode: "decorator"
        content: "[FINAL CHECK]"

How It Works With Different Endpoints

For Enterprise guardrail capabilities (response-side rewriting, additional rule types), see Enterprise Guardrails.

Guardrails operate on a normalized message format internally. The adaptation between API-specific request types and this format happens automatically:

Endpoint	System prompt source	User messages source
`/v1/chat/completions`	`messages` with `role: "system"`	`messages` array
`/v1/responses`	`instructions` field	`input` field

Errors and Rejection

If a guardrail returns an error, the request is rejected immediately. The error is returned to the client and the request never reaches the LLM provider.

This is useful for future guardrail types that validate content (e.g., PII detection, content filtering). The system prompt guardrail does not reject requests â€” it only modifies them.

← All docs

OSS / Policy Controls

Guardrails

Browse docs

--- title: "Guardrails" description: "Intercept and modify requests before they reach LLM providers." icon: "shield-check" ---

Overview

Guardrails work across all text-based endpoints:

/v1/chat/completions
/v1/responses

Quick Start

Add a guardrails section to your config/config.yaml:

yaml

guardrails:
  enabled: true
  rules:
    - name: "safety-prompt"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "decorator"
        content: "Always respond safely and respectfully."

That's it. Every request now gets the safety prompt prepended to its system instructions.

How It Works

Diagram

Messages are extracted from the incoming request into a normalized format
The guardrails pipeline processes the messages (inject, modify, or reject)
Modified messages are applied back to the original request
The request continues to the LLM provider

Guardrails never see the raw API request types â€” they operate on a normalized message list. This means the same guardrail works identically for /chat/completions and /responses.

Execution Order

Each guardrail has an order value that controls when it runs:

Same order â†’ run in parallel (concurrently)
Different order â†’ run sequentially (ascending)

Diagram

Each sequential group receives the output of the previous group. If any guardrail returns an error, the request is rejected and never reaches the provider.

Configuration

Full Structure

yaml

guardrails:
  enabled: true    # Master switch (default: false)
  rules:
    - name: "rule-name"         # Unique identifier for this instance
      type: "system_prompt"     # Guardrail type
      user_path: "/team/privacy" # Optional base path for internal auxiliary calls
      order: 0                  # Execution order
      system_prompt:            # Type-specific settings
        mode: "decorator"
        content: "Your prompt text here."

Environment Variable

You can toggle guardrails without editing the config file:

bash

export GUARDRAILS_ENABLED=true

Rule Fields

Field	Required	Description
`name`	Yes	Human-readable identifier. Supports spaces and unicode, but not `/`.
`type`	Yes	Guardrail type: `system_prompt`, `llm_based_altering`, `regex_block`, `pii_redact`, `length_limit`. See each type below.
`user_path`	No	Optional base user path for internal auxiliary guardrail requests.
`order`	No	Execution order. Default `0`. Same value = parallel, different = sequential.

Guardrail Types

OSS guardrails support the system_prompt, llm_based_altering, regex_block, pii_redact, and length_limit types. For Enterprise guardrail capabilities, see Enterprise Guardrails.

`system_prompt`

Adds, replaces, or decorates the system prompt on every request.

`llm_based_altering`

Rewrites content via an auxiliary LLM provider for content safety or style normalization. Configure the backing provider and system instruction for the rewriting model.

`regex_block`

Blocks or sanitizes content matching one or more regular expression patterns. Useful for preventing prompt injection, blocking specific formats, or enforcing content policies.

`pii_redact`

`length_limit`

Enforces maximum character or estimated-token limits on request and response content. Configurable per direction (input, output, or both).

Settings

Field	Required	Description
`mode`	No	`inject`, `override`, or `decorator`. Default: `inject`.
`content`	Yes	The system prompt text to apply.

Modes

Adds a system message only if none exists. Existing system prompts are left untouched.

yaml

    - name: "default-system"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "inject"
        content: "You are a helpful assistant."

Behavior:

Request has no system prompt â†’ adds one
Request already has a system prompt â†’ no change

Examples

Single Safety Guardrail

The simplest setup â€” add a safety prefix to every request:

yaml

guardrails:
  enabled: true
  rules:
    - name: "safety"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "decorator"
        content: "Always be safe, respectful, and helpful."

Multiple Guardrails in Parallel

Two guardrails running at the same order execute concurrently:

yaml

guardrails:
  enabled: true
  rules:
    - name: "safety-prompt"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "decorator"
        content: "Always be safe and respectful."

    - name: "compliance-prompt"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "inject"
        content: "Follow all company compliance policies."

Sequential Pipeline

Guardrails with different orders run one after another. Later groups see the output of earlier ones:

yaml

guardrails:
  enabled: true
  rules:
    # Step 1: ensure a system prompt exists
    - name: "default-system"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "inject"
        content: "You are a helpful assistant."

    # Step 2: decorate whatever system prompt is now present
    - name: "safety-prefix"
      type: "system_prompt"
      order: 1
      system_prompt:
        mode: "decorator"
        content: "[SAFETY] Always respond within company guidelines."

    # Step 3: add a final compliance check after decoration
    - name: "compliance-stamp"
      type: "system_prompt"
      order: 2
      system_prompt:
        mode: "override"
        content: "[COMPLIANCE REVIEW] Respond according to company policy."

Mixed Parallel and Sequential

yaml

guardrails:
  enabled: true
  rules:
    # Order 0: these two run in parallel
    - name: "safety"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "decorator"
        content: "Be safe."

    - name: "policy"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "inject"
        content: "Follow company policy."

    # Order 1: runs after both order-0 guardrails complete
    - name: "final-override"
      type: "system_prompt"
      order: 1
      system_prompt:
        mode: "decorator"
        content: "[FINAL CHECK]"

How It Works With Different Endpoints

For Enterprise guardrail capabilities (response-side rewriting, additional rule types), see Enterprise Guardrails.

Guardrails operate on a normalized message format internally. The adaptation between API-specific request types and this format happens automatically:

Endpoint	System prompt source	User messages source
`/v1/chat/completions`	`messages` with `role: "system"`	`messages` array
`/v1/responses`	`instructions` field	`input` field

Errors and Rejection

If a guardrail returns an error, the request is rejected immediately. The error is returned to the client and the request never reaches the LLM provider.

This is useful for future guardrail types that validate content (e.g., PII detection, content filtering). The system prompt guardrail does not reject requests â€” it only modifies them.