Agent Gateway
Alephant is an Agentic Finance Gateway for production AI agents and workflows. The Agent Gateway model explains how Alephant governs agent activity beyond individual LLM requests.
Traditional AI gateways route model requests. They answer questions such as which provider handled a request, how many tokens were used, whether the request failed, and how much the model call cost.
Production agents need a broader control layer. A single task can include model calls, tool calls, workflow steps, retries, policy checks, payment events, and revenue attribution. The primary unit is no longer only the request. It is the agent run.
AI Gateway vs Agent Gateway
Core Flow
Alephant starts with the same developer entry point teams already understand: model routing through a Virtual Key, usually with an OpenAI SDK-compatible request.
From there, Alephant adds agent context:
- Create or identify an Agent, member, workflow, or integration.
- Bind traffic to a Virtual Key.
- Route model calls through Alephant Gateway.
- Attach request activity to a run or session when available.
- Evaluate policies before expensive or risky actions continue.
- Record model cost, tool cost, policy decisions, latency, errors, and audit events.
- Attribute cost, revenue, and margin to the responsible agent or workflow.
What Alephant Controls
Alephant can govern activity across:
- LLM provider and model access
- Virtual Keys and provider keys
- Agent, member, department, and workspace budgets
- Rate limits, token limits, concurrency, and time windows
- Tool, workflow, and endpoint permissions
- Prompt templates and routing decisions
- Paid endpoint access and payment-aware execution
When To Use This Mental Model
Use the Agent Gateway model when you need to answer:
- Which agent or workflow created this cost?
- Which run caused a usage spike?
- Which tool or model call triggered a policy decision?
- Was this action allowed before it executed?
- Did a paid endpoint earn enough to cover its model and tool cost?
- What should be blocked, throttled, escalated, or optimized next?