# Overview

Alephant AI Gateway is an OpenAI-compatible control layer for production AI applications, available as hosted SaaS or as a self-hosted gateway. It gives developers one stable API surface while the gateway handles provider-specific adaptation, model routing, policy enforcement, layered caching, retries, fallback, usage metadata, request logging, and audit trails.

Instead of wiring every application directly to every provider, teams connect once and route across 50+ providers, 320+ models, and custom model backends. Start with Alephant Cloud for a managed workspace, or self-host the gateway when you need private infrastructure, BYO keys, and direct operational control.

```typescript
import { OpenAI } from "openai";

// Virtual key issued by Alephant, read from the environment.
const ALEPHANT_VIRTUAL_KEY = process.env.ALEPHANT_VIRTUAL_KEY ?? "";

const client = new OpenAI({
  baseURL: "https://ai.alephant.io/v1",
  apiKey: ALEPHANT_VIRTUAL_KEY, // the SDK sends this as `Authorization: Bearer <key>`
  defaultHeaders: {
    "Alephant-Session-Id": "session-xxx", // optional
  },
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});
```

## **Why this exists**

AI applications are moving from single-model prototypes to production systems that call many providers, agents, tools, and custom model backends. Without a gateway, every team ends up rebuilding the same operational layer: provider adapters, routing rules, key management, usage metadata, retries, caching, and request logs.

Alephant AI Gateway centralizes that layer behind one OpenAI-compatible API. It gives developers a stable integration surface while platform teams get policy before provider access, cache before repeated calls, fallback before outages, and audit trails before production incidents.

The goal is simple: make AI traffic observable, governable, and reliable without slowing developers down. [*Learn more ->*](https://alephant.io/)

## **Features**

| **Capability**              | **What Alephant AI Gateway provides**                                                                                                                    |
| --------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------- |
| One API surface             | OpenAI-compatible `/v1/*` and `/ai/*` routes for chat, responses, embeddings, images, and provider-style model names                                     |
| Provider and model coverage | 50+ providers, 320+ models, local runtimes, OpenRouter-style catalogs, and custom/private backends                                                       |
| Provider adaptation         | Request, tool, streaming, error, usage, finish-reason, and response normalization across provider APIs                                                   |
| Routing and resilience      | Direct provider paths, policy routers, retries, fallback, health checks, provider 429 handling, and fail-open cache paths                                |
| Agent client compatibility  | OpenAI-compatible formats for Cursor, Codex, opencode, and Antigravity workflows                                                                         |
| IDE integration             | Cursor-ready with architecture rules, workflow guides, implementation skills, and task management; opencode, Codex, and Claude Code adapters in progress |
| Policy and key control      | Virtual keys, master key resolution, model policy, workspace provider allowlists, and concurrency controls                                               |
| Caching                     | Gateway-side LLM KV cache and semantic cache to avoid repeated upstream calls                                                                            |
| Observability               | Request logs, traces, metrics, usage metadata, optional body archival, and downstream log delivery                                                       |
| Live operations             | Route, virtual key, and provider key refresh from database changes without restarting the gateway                                                        |
| Deployment                  | Hosted SaaS through Alephant Cloud, or self-hosted Rust gateway with PostgreSQL, Redis, Qdrant, and S3-compatible integrations                           |

## **Developer surface**

| **Surface**               | **Purpose**                                                                 |
| ------------------------- | --------------------------------------------------------------------------- |
| `/v1/*`                   | Drop-in OpenAI-compatible API for existing SDKs and agent clients           |
| `/router/{id}/*`          | Policy-driven routing through a configured router                           |
| `/{provider}/*`           | Direct provider passthrough when you want explicit upstream control         |
| `model=provider/model_id` | Select a provider and model without changing application code               |
| Custom backends           | Put private models or self-hosted runtimes behind the same gateway contract |
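
As a rough illustration of how the three URL surfaces above differ, the sketch below builds the base URL a client would point at for each one. The `Entry` type and `entryBaseUrl` helper are hypothetical names used only for this example. It also assumes that the hosted origin `https://ai.alephant.io` serves all three prefixes, and the router ID and provider slug in the usage note are placeholders.

```typescript
// Hypothetical helper: maps an entry surface to the base URL a client would use.
// Assumption: all three prefixes are served from the same gateway origin.
const GATEWAY_ORIGIN = "https://ai.alephant.io";

type Entry =
  | { kind: "unified" } // /v1/*
  | { kind: "router"; id: string } // /router/{id}/*
  | { kind: "provider"; provider: string }; // /{provider}/*

function entryBaseUrl(entry: Entry): string {
  switch (entry.kind) {
    case "unified":
      return `${GATEWAY_ORIGIN}/v1`;
    case "router":
      return `${GATEWAY_ORIGIN}/router/${encodeURIComponent(entry.id)}`;
    case "provider":
      return `${GATEWAY_ORIGIN}/${encodeURIComponent(entry.provider)}`;
  }
}
```

For example, `entryBaseUrl({ kind: "router", id: "my-router" })` yields `https://ai.alephant.io/router/my-router`, which could then be passed as `baseURL` to an OpenAI-compatible SDK. The path that follows each prefix is not specified in this overview, so the helper stops at the prefix.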

## **Architecture & request lifecycle**

<img src="https://files.buildwithfern.com/visual-editor-images/alephantai.docs.buildwithfern.com/2026-05-13T05:42:40.452Z/docs/ai-gateway/overview/ai-gateway-architecture.png" alt="ai-gateway-architecture" title="ai-gateway-architecture" noZoom={false} />

Every request passes through the same gateway lifecycle: global middleware, routing, provider mapping, dispatch, cache, fallback, and async logging. The entry path depends on how much control you want:

| **Path**         | **Use it for**                                                 |
| ---------------- | -------------------------------------------------------------- |
| `/v1/*`          | Unified OpenAI-style access with `model=provider/model_id`     |
| `/router/{id}/*` | Policy-driven routing through a configured router              |
| `/{provider}/*`  | Direct provider passthrough when you want an explicit upstream |
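
To make the cache and fallback stages of the lifecycle concrete, here is a minimal in-memory sketch. It is not the gateway's implementation: `Upstream`, `GatewayResult`, and `handleRequest` are hypothetical names, the cache is an exact-match map rather than a KV or semantic cache, the upstream calls are synchronous stand-ins, and checking the cache before dispatch is an assumption about stage ordering.

```typescript
// Hypothetical stand-in for a provider call; synchronous to keep the sketch short.
type Upstream = { name: string; call: (prompt: string) => string };

interface GatewayResult {
  body: string;
  servedBy: string; // "cache" or the name of the upstream that answered
  attempts: number; // upstream calls made for this request
}

function handleRequest(
  prompt: string,
  cache: Map<string, string>,
  upstreams: Upstream[],
): GatewayResult {
  // 1. Check the cache first, so a repeated prompt never reaches a provider.
  const cached = cache.get(prompt);
  if (cached !== undefined) {
    return { body: cached, servedBy: "cache", attempts: 0 };
  }

  // 2. Dispatch in priority order; on failure, fall back to the next upstream.
  let attempts = 0;
  let lastError: unknown = new Error("no upstreams configured");
  for (const upstream of upstreams) {
    attempts += 1;
    try {
      const body = upstream.call(prompt);
      cache.set(prompt, body); // 3. Store the answer for later requests.
      return { body, servedBy: upstream.name, attempts };
    } catch (err) {
      lastError = err; // remember the failure and try the next upstream
    }
  }
  throw lastError;
}
```

With a failing primary and a healthy secondary, the first call is served by the secondary after two attempts, and an identical second call is served from the cache with zero upstream attempts.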

## **Multi-provider adaptation**

Use one OpenAI-style request shape across 50+ providers and 320+ models, including OpenAI-compatible APIs, Anthropic Messages, Gemini, Bedrock, Ollama, OpenRouter-style catalogs, and custom backends. The client selects a runtime with `model=provider/model_id`; Alephant resolves the provider, applies the right adapter, maps provider-specific fields, and returns a normalized OpenAI-style response.

Instead of listing every model in the README, this section focuses on the contract: one request format in, one consistent response out. The provider and model catalog can evolve independently without forcing application code changes.
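
The `model=provider/model_id` convention can be illustrated with a small parser. This is a sketch of the naming contract only, not the gateway's resolver: `ModelRef` and `parseModelRef` are hypothetical names, the provider slugs in the usage note are assumed, and treating a name with no provider prefix (such as `gpt-4o`) as "provider unspecified" is an assumption about how bare names are handled.

```typescript
interface ModelRef {
  provider: string | null; // null when the name carries no provider prefix
  modelId: string;
}

// Splits on the FIRST slash only, so model IDs that contain slashes stay intact.
function parseModelRef(model: string): ModelRef {
  const slash = model.indexOf("/");
  // No slash, a leading slash, or a trailing slash: treat the whole string as the model ID.
  if (slash <= 0 || slash === model.length - 1) {
    return { provider: null, modelId: model };
  }
  return {
    provider: model.slice(0, slash),
    modelId: model.slice(slash + 1),
  };
}
```

Under these assumptions, `parseModelRef("openai/gpt-4o")` gives provider `openai` and model ID `gpt-4o`, while `parseModelRef("gpt-4o")` leaves the provider as `null` for the gateway to resolve.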

<img src="https://files.buildwithfern.com/visual-editor-images/alephantai.docs.buildwithfern.com/2026-05-13T05:43:32.107Z/docs/ai-gateway/overview/ai-gateway-multi-provider.png" alt="ai-gateway-multi-provider" title="ai-gateway-multi-provider" noZoom={false} />

| **Category**                   | **Examples**                                                                                                                                                            |
| ------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Mainstream models**          | GPT-4o · GPT-4.1 · o3 · Claude 3.5/3.7 Sonnet · Claude Opus · Gemini 1.5/2.0 · Llama 3/4 · Mistral Large · Command R+                                                   |
| **Provider ecosystem**         | OpenAI · Anthropic · Google Gemini · AWS Bedrock · Azure OpenAI · OpenRouter · Together AI · Fireworks · Groq · Cohere · Mistral · Perplexity · DeepSeek · xAI · Ollama |
| **Agent client compatibility** | Cursor · Codex · opencode · Antigravity                                                                                                                                 |

## **IDE integration**

Alephant AI Gateway ships repository-level tooling for AI-assisted development inside supported IDEs.

| **IDE / Agent Client** | **Status**  | **What's included**                                                                                                                                                                                                                                      |
| ---------------------- | ----------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Cursor                 | Ready       | Project architecture & code-convention rules, development & API workflow guides, gated-module-implementation skill (Skill), file-based task management (Task Magic) — see the `.cursor` directory; also configure the gateway in Agent Settings → Models |
| opencode               | In progress | Adapter and configuration under development                                                                                                                                                                                                              |
| Codex                  | In progress | Adapter and configuration under development                                                                                                                                                                                                              |
| Claude Code            | In progress | Adapter and configuration under development                                                                                                                                                                                                              |

## **Comparison**

Portkey, Helicone, and LiteLLM are excellent projects, but they start from different centers of gravity. Alephant is built for teams shipping agentic AI products: a hosted SaaS workspace plus a self-hosted gateway path for agent development, cost control, provider routing, governance, and operational visibility.

| **Project**         | **Best known for**                                                                                            | **Best fit**                                                                                                           |
| ------------------- | ------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------- |
| Portkey             | Enterprise AI gateway controls, guardrails, and managed policy workflows                                      | Teams that want a managed AI control plane                                                                             |
| Helicone            | LLM observability, request analytics, sessions, and cost visibility                                           | Teams whose primary need is tracing and analytics                                                                      |
| LiteLLM             | Broad Python proxy/SDK ecosystem for many providers                                                           | Teams that want maximum provider breadth through a Python stack                                                        |
| Alephant AI Gateway | Agent development infrastructure, cost control, governance, provider routing, and SaaS + self-host deployment | Teams building production agents that need cost guardrails, request traceability, BYO keys, and multi-provider control |

| **Capability**          | **Portkey**                      | **Helicone**                            | **LiteLLM**                       | **Alephant AI Gateway**                                                                                         |
| ----------------------- | -------------------------------- | --------------------------------------- | --------------------------------- | --------------------------------------------------------------------------------------------------------------- |
| OpenAI-compatible API   | Yes                              | Yes                                     | Yes                               | Yes                                                                                                             |
| SaaS + self-host        | Enterprise/self-host options     | Hosted and self-host options            | Self-hosted proxy                 | Yes: Alephant Cloud plus self-hosted Rust gateway                                                               |
| Provider/model coverage | Broad                            | Broad logging/proxy coverage            | Very broad                        | 50+ providers, 320+ models, custom backends                                                                     |
| Agent coding clients    | No dedicated compatibility layer | No dedicated compatibility layer        | No dedicated compatibility layer  | Cursor, Codex, opencode, Antigravity workflows                                                                  |
| Agent cost control      | Guardrails and policy controls   | Cost analytics and request visibility   | Budgets and spend controls        | Agent/session-aware usage visibility, cache savings, budget controls, and governance workflows                  |
| Provider adaptation     | Gateway policies and routing     | Proxy plus observability pipeline       | Strong provider abstraction       | Explicit mappers for requests, streaming, errors, usage, and responses                                          |
| Routing and resilience  | Routing, retries, fallbacks      | Gateway controls plus observability     | Router, fallback, budgets         | Direct paths, policy routers, fallback, health checks, provider 429 handling                                    |
| BYO key control         | Key vault / enterprise controls  | BYO keys with proxy controls            | Virtual keys and self-hosted keys | BYO provider keys, master-key resolution, workspace allowlists                                                  |
| Cache                   | Gateway caching                  | Cache tracking/integrations             | Cache integrations                | LLM KV cache plus semantic cache                                                                                |
| Observability           | Logs and policy events           | Core strength                           | Callback/logging integrations     | Logs, traces, metrics, usage metadata, optional body archival                                                   |
| Governance path         | Strong enterprise guardrails     | Workspace controls around observability | Teams, budgets, rate limits       | Agent/session governance, model policy, provider allowlists, concurrency controls, and workspace-level controls |

Alephant's differentiator is the combination: hosted SaaS, self-hosted Rust gateway, agent-first developer compatibility, cost-control workflows, BYO-key governance, explicit provider adaptation, and workspace-level AI FinOps.

## **Repository structure**

```
alephant-ai-gateway/
├── ai-gateway/                 # Gateway service crate
├── crates/                     # Shared libraries and harnesses
├── docs/                       # In-repo notes; curated docs at https://api.alephant.io/
├── scripts/                    # CI and local automation
├── infrastructure/             # Deployment and observability infra
├── test/                       # Integration and runtime test helpers
├── AGENTS.md                   # Agent collaboration conventions
├── CLAUDE.md                   # Command and architecture reference
└── CHANGELOG.md                # Project changelog
```