Overview

Alephant is an Agentic Finance Gateway for production agents and workflows. The AI Gateway is the model-access layer inside that platform: it gives agents one reliable way to call models while Alephant applies routing, policy, tracing, cost attribution, and financial controls around the full agent run.

Instead of wiring every agent or application directly to every provider, teams connect once and route across 50+ providers, 320+ models, and custom model backends. Start with Alephant Cloud for a managed workspace, or self-host the gateway when you need private infrastructure, BYO keys, and direct operational control.

OpenAI compatibility is the entry point, not the product boundary. A request can still look like a standard /v1/chat/completions call, but Alephant can attach that request to an Agent ID, Run ID, session, policy decision, budget, cache event, request log, payment event, and margin record.

    import { OpenAI } from "openai";

    const client = new OpenAI({
      baseURL: "https://ai.alephant.io/v1",
      apiKey: process.env.ALEPHANT_VIRTUAL_KEY,
      defaultHeaders: {
        "Alephant-Agent-Id": "agt_support_bot_8f3a",
        "Alephant-Run-Id": "run_demo_001",
        "alephant-session-id": "session-xxx", // optional
      },
    });

    const response = await client.chat.completions.create({
      model: "openai/gpt-4o-mini",
      messages: [{ role: "user", content: "Hello!" }],
    });

Why this exists

AI applications are moving from single-model prototypes to agents and workflows that call models, tools, APIs, and paid endpoints. Without a gateway, every team ends up rebuilding the same operational layer: provider adapters, routing rules, key management, usage metadata, retries, caching, and request logs.

Alephant AI Gateway centralizes model access behind one developer-friendly API, then connects each request to the broader Alephant control plane. Developers keep a familiar SDK integration while platform and finance teams get policy before provider access, traceability before incidents, cost attribution before budget reviews, and revenue visibility when agent capabilities are sold as paid endpoints.

The goal is simple: make agent activity observable, governable, reliable, and financially accountable without slowing developers down.

Features

| Capability | What Alephant provides |
| --- | --- |
| Agent-aware gateway | OpenAI-compatible `/v1/*` and `/ai/*` routes that can attach traffic to Agents, Run IDs, sessions, Virtual Keys, departments, and workspaces |
| Provider and model coverage | 50+ providers, 320+ models, local runtimes, OpenRouter-style catalogs, and custom/private backends |
| Provider adaptation | Request, tool, streaming, error, usage, finish-reason, and response normalization across provider APIs |
| Routing and resilience | Direct provider paths, policy routers, retries, fallback, health checks, provider 429 handling, and fail-open cache paths |
| Agent client compatibility | OpenAI-compatible formats for Cursor, Codex, opencode, Antigravity, n8n, MCP, and existing OpenAI SDK workflows |
| Policy and key control | Virtual Keys, master key resolution, model policy, workspace provider allowlists, budgets, rate limits, and concurrency controls |
| Caching | Gateway-side LLM KV cache and semantic cache to avoid repeated upstream calls |
| Run observability | Request logs, traces, sessions, Agent IDs, Run IDs, usage metadata, policy events, optional body archival, and downstream log delivery |
| Agent finance | Token cost, external tool/API spend, outbound payment spend, endpoint revenue, cache savings, and known margin attribution |
| Monetization path | Governed paid endpoints for exposing agent capabilities through x402 and MPP payment rails |
| Live operations | Route, Virtual Key, and provider key refresh from database changes without restarting the gateway |
| Deployment | Hosted SaaS through Alephant Cloud, or self-hosted Rust gateway with PostgreSQL, Redis, Qdrant, and S3-compatible integrations |
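
The "Routing and resilience" row can be illustrated with a small fallback loop. This is a minimal sketch under stated assumptions, not Alephant's implementation: `callWithFallback`, the `Upstream` type, the model strings, and the choice to treat 429 and 5xx responses as retryable are all hypothetical. It is synchronous only to keep the example short; a real gateway would do this asynchronously.

```typescript
// Hypothetical sketch of fallback across upstream targets; not Alephant's implementation.
type UpstreamResult = { status: number; body: string };
type Upstream = (model: string) => UpstreamResult;

// Assumption: rate limits (429) and server errors (5xx) justify trying the next target.
function isRetryable(status: number): boolean {
  return status === 429 || status >= 500;
}

function callWithFallback(
  models: string[],
  upstream: Upstream,
): { model: string; result: UpstreamResult; attempts: string[] } {
  if (models.length === 0) {
    throw new Error("at least one model is required");
  }
  const attempts: string[] = [];
  let last: { model: string; result: UpstreamResult } | undefined;
  for (const model of models) {
    attempts.push(model);
    const result = upstream(model);
    last = { model, result };
    if (!isRetryable(result.status)) {
      break; // success or a non-retryable client error: stop here
    }
  }
  // `last` is always set because `models` is non-empty.
  return { ...last!, attempts };
}

// Usage with a fake upstream: the first target is rate limited, the second answers.
const fakeUpstream: Upstream = (model) =>
  model === "openai/gpt-4o-mini"
    ? { status: 429, body: "rate limited" }
    : { status: 200, body: "ok" };

const outcome = callWithFallback(
  ["openai/gpt-4o-mini", "anthropic/claude-3-5-sonnet"],
  fakeUpstream,
);
console.log(outcome.model, outcome.attempts.length);
```

If every target returns a retryable status, the sketch returns the last failure rather than throwing, so the caller can still inspect the upstream error.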

Developer surface

| Surface | Purpose |
| --- | --- |
| `/v1/*` | Drop-in model API for existing OpenAI SDKs and agent clients |
| `/router/{id}/*` | Policy-driven routing through a configured router |
| `/{provider}/*` | Direct provider passthrough when you want explicit upstream control |
| `model=provider/model_id` | Select a provider and model without changing application code |
| Agent/run headers | Attach `Alephant-Agent-Id`, `Alephant-Run-Id`, session, request, and custom properties to model traffic |
| Custom backends | Put private models or self-hosted runtimes behind the same gateway contract |
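
To make the three entry paths and the agent/run headers concrete, here is a minimal sketch of how a client might assemble them. The helpers `buildUrl` and `agentHeaders` are hypothetical, the origin is copied from the quick-start example, and it is an assumption that the router and provider paths are served from that same origin and accept a suffix such as `chat/completions`.

```typescript
// Hypothetical helpers; only the path shapes and header names come from the tables above.
const GATEWAY_ORIGIN = "https://ai.alephant.io"; // origin from the quick-start example

type EntryPath =
  | { kind: "unified" } // /v1/*
  | { kind: "router"; routerId: string } // /router/{id}/*
  | { kind: "provider"; provider: string }; // /{provider}/*

function buildUrl(entry: EntryPath, suffix: string): string {
  const tail = suffix.replace(/^\/+/, ""); // tolerate a leading slash
  switch (entry.kind) {
    case "unified":
      return `${GATEWAY_ORIGIN}/v1/${tail}`;
    case "router":
      return `${GATEWAY_ORIGIN}/router/${encodeURIComponent(entry.routerId)}/${tail}`;
    case "provider":
      return `${GATEWAY_ORIGIN}/${encodeURIComponent(entry.provider)}/${tail}`;
  }
}

// Builds the attribution headers; the session header is optional.
function agentHeaders(
  agentId: string,
  runId: string,
  sessionId?: string,
): Record<string, string> {
  const headers: Record<string, string> = {
    "Alephant-Agent-Id": agentId,
    "Alephant-Run-Id": runId,
  };
  if (sessionId !== undefined) {
    headers["alephant-session-id"] = sessionId;
  }
  return headers;
}

console.log(buildUrl({ kind: "unified" }, "/chat/completions"));
console.log(buildUrl({ kind: "router", routerId: "my-router" }, "chat/completions"));
console.log(agentHeaders("agt_support_bot_8f3a", "run_demo_001"));
```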

Architecture & request lifecycle

[Figure: ai-gateway-architecture]

Every request passes through the same gateway lifecycle: authentication, agent context, policy checks, routing, provider mapping, dispatch, cache, fallback, and async logging. The entry path depends on how much control you want:

| Path | Use it for |
| --- | --- |
| `/v1/*` | Unified OpenAI-style access with `model=provider/model_id` |
| `/router/{id}/*` | Policy-driven routing through a configured router |
| `/{provider}/*` | Direct provider passthrough when you want an explicit upstream |
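
The lifecycle described above can be pictured as an ordered list of stages that share one request context. The sketch below is illustrative only: the `RequestContext` shape, the stage bodies, the hard-coded allowlist, and the decision to record logging even for rejected requests are assumptions, and the stage order simply mirrors the order in which the text lists them.

```typescript
// Hypothetical sketch of a staged request lifecycle; stage bodies are placeholders.
interface RequestContext {
  authorization?: string;
  agentId?: string;
  model: string; // "provider/model_id"
  trace: string[]; // names of the stages that ran, in order
  rejection?: string; // set by a stage to stop the pipeline
}

type Stage = { name: string; run: (ctx: RequestContext) => void };

// Assumption: a workspace provider allowlist, hard-coded for the example.
const allowedProviders = new Set(["openai", "anthropic"]);
const noop = (): void => {};

const stages: Stage[] = [
  {
    name: "authentication",
    run: (ctx) => {
      if (!ctx.authorization) ctx.rejection = "missing credentials";
    },
  },
  {
    name: "agent context",
    run: (ctx) => {
      ctx.agentId = ctx.agentId ?? "unattributed";
    },
  },
  {
    name: "policy checks",
    run: (ctx) => {
      const [provider = ""] = ctx.model.split("/");
      if (!allowedProviders.has(provider)) {
        ctx.rejection = `provider "${provider}" is not allowed`;
      }
    },
  },
  { name: "routing", run: noop },
  { name: "provider mapping", run: noop },
  { name: "dispatch", run: noop },
  { name: "cache", run: noop },
  { name: "fallback", run: noop },
];

function handle(ctx: RequestContext): RequestContext {
  for (const stage of stages) {
    ctx.trace.push(stage.name);
    stage.run(ctx);
    if (ctx.rejection !== undefined) break; // stop at the first rejection
  }
  // Assumption: logging is recorded last, even when the request was rejected.
  ctx.trace.push("async logging");
  return ctx;
}

const accepted = handle({
  authorization: "Bearer vk_example",
  model: "openai/gpt-4o-mini",
  trace: [],
});
console.log(accepted.trace.join(" -> "));
```

Keeping each stage behind the same small interface is one way to let policy checks reject a request before any provider is contacted, which matches the "policy before provider access" goal stated earlier.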

Multi-provider adaptation

Use one familiar request shape across 50+ providers and 320+ models, including OpenAI-compatible APIs, Anthropic Messages, Gemini, Bedrock, Ollama, OpenRouter-style catalogs, and custom backends. The client selects a runtime with model=provider/model_id; Alephant resolves the provider, applies the right adapter, maps provider-specific fields, and returns a normalized response.

Instead of listing every model in the README, this section focuses on the contract: one request format in, one consistent response out. The provider and model catalog can evolve independently without forcing application code changes.
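
As a rough illustration of that contract, the sketch below splits a `provider/model_id` string and maps one provider-style response into a normalized shape. Everything named here is hypothetical (`parseModel`, `fromAnthropicStyle`, `NormalizedCompletion`); the field names follow the public OpenAI Chat Completions and Anthropic Messages conventions, only a subset of stop reasons is covered, and splitting on the first slash is an assumption made so that catalog-style model IDs containing slashes stay intact.

```typescript
// Hypothetical sketch of model resolution and response normalization.
interface ModelRef {
  provider: string;
  modelId: string;
}

// Assumption: the provider is everything before the first slash.
function parseModel(model: string): ModelRef {
  const slash = model.indexOf("/");
  if (slash <= 0 || slash === model.length - 1) {
    throw new Error(`expected "provider/model_id", got "${model}"`);
  }
  return { provider: model.slice(0, slash), modelId: model.slice(slash + 1) };
}

// Normalized shape, loosely modeled on OpenAI-style finish reasons and usage.
interface NormalizedCompletion {
  text: string;
  finishReason: "stop" | "length" | "tool_calls";
  promptTokens: number;
  completionTokens: number;
}

// Subset of an Anthropic Messages-style response body.
interface AnthropicStyleResponse {
  content: { type: string; text?: string }[];
  stop_reason: "end_turn" | "max_tokens" | "stop_sequence" | "tool_use";
  usage: { input_tokens: number; output_tokens: number };
}

function fromAnthropicStyle(raw: AnthropicStyleResponse): NormalizedCompletion {
  const finishReasons: Record<
    AnthropicStyleResponse["stop_reason"],
    NormalizedCompletion["finishReason"]
  > = {
    end_turn: "stop",
    stop_sequence: "stop",
    max_tokens: "length",
    tool_use: "tool_calls",
  };
  const text = raw.content
    .filter((block) => block.type === "text")
    .map((block) => block.text ?? "")
    .join("");
  return {
    text,
    finishReason: finishReasons[raw.stop_reason],
    promptTokens: raw.usage.input_tokens,
    completionTokens: raw.usage.output_tokens,
  };
}

const ref = parseModel("openai/gpt-4o-mini");
console.log(ref.provider, ref.modelId);
```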

[Figure: ai-gateway-multi-provider]

| Category | Examples |
| --- | --- |
| Mainstream models | GPT-4o · GPT-4.1 · o3 · Claude 3.5/3.7 Sonnet · Claude Opus · Gemini 1.5/2.0 · Llama 3/4 · Mistral Large · Command R+ |
| Provider ecosystem | OpenAI · Anthropic · Google Gemini · AWS Bedrock · Azure OpenAI · OpenRouter · Together AI · Fireworks · Groq · Cohere · Mistral · Perplexity · DeepSeek · xAI · Ollama |
| Agent client compatibility | Cursor · Codex · opencode · Antigravity |

IDE integration

Alephant AI Gateway ships repository-level tooling for AI-assisted development inside supported IDEs.

| IDE / Agent Client | Status | What’s included |
| --- | --- | --- |
| Cursor | Ready | Project architecture & code-convention rules, development & API workflow guides, gated-module-implementation skill (Skill), file-based task management (Task Magic); see the `.cursor` directory. Also configure the gateway in Agent Settings → Models |
| opencode | In progress | Adapter and configuration under development |
| Codex | In progress | Adapter and configuration under development |
| Claude Code | In progress | Adapter and configuration under development |

Comparison

Portkey, Helicone, and LiteLLM are excellent projects, but they start from different centers of gravity. Alephant is built for teams shipping agentic AI products: a hosted SaaS workspace plus a self-hosted gateway path for agent governance, run tracing, model routing, cost control, and agent monetization.

| Project | Best known for | Best fit |
| --- | --- | --- |
| Portkey | Enterprise AI gateway controls, guardrails, and managed policy workflows | Teams that want a managed AI control plane |
| Helicone | LLM observability, request analytics, sessions, and cost visibility | Teams whose primary need is tracing and analytics |
| LiteLLM | Broad Python proxy/SDK ecosystem for many providers | Teams that want maximum provider breadth through a Python stack |
| Alephant AI Gateway | Agentic finance, governance, run tracing, provider routing, and SaaS + self-host deployment | Teams building production agents that need cost guardrails, run traceability, BYO keys, paid endpoints, and multi-provider control |

| Capability | Portkey | Helicone | LiteLLM | Alephant AI Gateway |
| --- | --- | --- | --- | --- |
| OpenAI-compatible API | Yes | Yes | Yes | Yes |
| SaaS + self-host | Enterprise/self-host options | Hosted and self-host options | Self-hosted proxy | Yes: Alephant Cloud plus self-hosted Rust gateway |
| Provider/model coverage | Broad | Broad logging/proxy coverage | Very broad | 50+ providers, 320+ models, custom backends |
| Agent coding clients | No dedicated compatibility layer | No dedicated compatibility layer | No dedicated compatibility layer | Cursor, Codex, opencode, Antigravity workflows |
| Agent finance | Guardrails and policy controls | Cost analytics and request visibility | Budgets and spend controls | Agent/session-aware usage visibility, cache savings, external spend, endpoint revenue, and margin attribution |
| Provider adaptation | Gateway policies and routing | Proxy plus observability pipeline | Strong provider abstraction | Explicit mappers for requests, streaming, errors, usage, and responses |
| Routing and resilience | Routing, retries, fallbacks | Gateway controls plus observability | Router, fallback, budgets | Direct paths, policy routers, fallback, health checks, provider 429 handling |
| BYO key control | Key vault / enterprise controls | BYO keys with proxy controls | Virtual keys and self-hosted keys | BYO provider keys, master-key resolution, workspace allowlists |
| Cache | Gateway caching | Cache tracking/integrations | Cache integrations | LLM KV cache plus semantic cache |
| Observability | Logs and policy events | Core strength | Callback/logging integrations | Logs, traces, metrics, usage metadata, optional body archival |
| Governance path | Strong enterprise guardrails | Workspace controls around observability | Teams, budgets, rate limits | Agent/run governance, model policy, provider allowlists, paid endpoint rules, concurrency controls, and workspace-level controls |

Alephant’s differentiator is the combination: hosted SaaS, self-hosted Rust gateway, agent-first developer compatibility, run-level governance, BYO-key control, explicit provider adaptation, AI FinOps, and payment-aware monetization.

Repository structure

alephant-ai-gateway/
├── ai-gateway/ # Gateway service crate
├── crates/ # Shared libraries and harnesses
├── docs/ # In-repo notes; curated docs at https://api.alephant.io/
├── scripts/ # CI and local automation
├── infrastructure/ # Deployment and observability infra
├── test/ # Integration and runtime test helpers
├── AGENTS.md # Agent collaboration conventions
├── CLAUDE.md # Command and architecture reference
└── CHANGELOG.md # Project changelog