Provider Routing

Alephant Provider Routing lets your application use one OpenAI-compatible API surface while Alephant routes requests across AI providers, models, and custom backends.

Routing in Alephant is not only provider forwarding. It is the decision layer that connects model selection, provider access, budget policy, fallback behavior, token accounting, and request observability.

Every routed request can be:

  • resolved to the correct provider and model
  • checked against workspace, key, user, agent, session, and model policy
  • dispatched through the right provider adapter
  • retried or failed over when configured
  • normalized back into an OpenAI-style response
  • recorded with provider, model, token, cost, latency, cache, retry, and trace metadata

Why Provider Routing Matters

Production AI systems rarely use only one model or one provider.

Teams may use OpenAI for general chat, Anthropic for reasoning, Gemini for long context, Bedrock for enterprise deployment, Ollama for local models, and custom endpoints for private inference. Without a gateway, every application has to manage provider-specific SDKs, model names, credentials, error formats, usage fields, retry behavior, and cost reporting.

Alephant gives teams one routing layer for all AI traffic.

Routing Surfaces

Alephant supports multiple routing surfaces depending on how much control you want at request time.

| Surface | Use Case |
| --- | --- |
| /v1/* | OpenAI-compatible gateway access for existing SDKs and agent clients |
| model="provider/model_id" | Explicitly select a provider and model from an OpenAI-style request |
| model="model_id" | Let Alephant resolve a bare model ID when it maps to one known provider |
| /router/{id}/* | Route through a configured policy router |
| /{provider}/* | Direct provider passthrough for explicit upstream control |

For most applications, start with /v1/* and provider-prefixed model names.

```
curl https://ai.alephant.io/v1/chat/completions \
  -H "Authorization: Bearer $ALEPHANT_VIRTUAL_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [
      { "role": "user", "content": "Write a short product summary." }
    ]
  }'
```

Model Naming

Alephant uses model names to resolve the target provider.

Provider-prefixed model IDs

Use provider/model_id when you want the request to go to a specific provider.

```
{
  "model": "openai/gpt-4o-mini"
}
```

Examples:

openai/gpt-4o-mini
anthropic/claude-3-5-sonnet
google/gemini-1.5-pro
bedrock/anthropic.claude-3-5-sonnet
ollama/llama3.1

Provider-prefixed model IDs are the clearest option for production traffic because they make routing intent explicit.

Bare model IDs

Alephant can also resolve a bare model ID when the model maps to exactly one known provider.

```
{
  "model": "gpt-4o-mini"
}
```

If the bare model is unique, Alephant expands it internally to the canonical provider/model form.

gpt-4o-mini -> openai/gpt-4o-mini

If the same model ID exists under multiple providers, Alephant returns a 400 Bad Request and asks you to specify the provider.

```
{
  "error": {
    "message": "Ambiguous model 'gpt-4o': matches multiple providers. Please specify one of: openai/gpt-4o, azure/gpt-4o",
    "type": "invalid_request_error",
    "code": "ambiguous_model"
  }
}
```
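The resolution rule can be sketched as follows. This is an illustrative helper, not Alephant's actual implementation; the catalog entries and the `resolveModel` name are assumptions for the example.

```typescript
// Hypothetical model catalog mapping bare model IDs to the providers
// that serve them. Entries are illustrative only.
const catalog: Record<string, string[]> = {
  "gpt-4o-mini": ["openai"],
  "gpt-4o": ["openai", "azure"],
  "claude-3-5-sonnet": ["anthropic"],
};

// Expand a model name to canonical provider/model form.
function resolveModel(model: string): string {
  if (model.includes("/")) return model; // already provider-prefixed
  const providers = catalog[model] ?? [];
  if (providers.length === 1) return `${providers[0]}/${model}`;
  if (providers.length === 0) throw new Error(`Unknown model '${model}'`);
  // Multiple providers match: mirror the 400 ambiguous_model behavior.
  const options = providers.map((p) => `${p}/${model}`).join(", ");
  throw new Error(
    `Ambiguous model '${model}': matches multiple providers. Please specify one of: ${options}`,
  );
}
```

The key property is that ambiguity is an error rather than a silent default, which keeps routing intent explicit.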

Request Lifecycle

Every routed request, regardless of surface, moves through the same gateway lifecycle:

  1. Receive request
    Your application or agent sends an OpenAI-compatible request to Alephant.
  2. Authenticate key
    Alephant validates the virtual key and loads workspace, user, agent, session, and key metadata.
  3. Resolve provider and model
    Alephant resolves the provider from the request path, router ID, provider-prefixed model, bare model ID, or key-bound provider configuration.
  4. Check policy before dispatch
    Alephant evaluates model allowlists, provider access, rate limits, budget rules, concurrency limits, and route-level policy before any upstream provider cost is created.
  5. Adapt and dispatch
    Alephant maps the OpenAI-style request into the selected provider format, calls the upstream provider, and applies retry or fallback behavior when configured.
  6. Normalize response
    Provider-specific responses, usage fields, errors, streaming events, and finish reasons are normalized into a consistent gateway response.
  7. Record cost and trace metadata
    Alephant records provider, model, tokens, cost, latency, status code, cache status, retry count, fallback path, session, agent, user, and workspace metadata.
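The ordering above can be sketched in code. Everything here is hypothetical scaffolding to illustrate the flow, not Alephant internals; the important detail is that policy runs before any upstream dispatch.

```typescript
// Illustrative lifecycle sketch: authenticate, resolve, then enforce
// policy before dispatch. Types, names, and data are assumptions.
type Key = { workspace: string; allowedModels: string[] };
type Target = { provider: string; model: string };

const keys: Record<string, Key> = {
  "vk-demo": { workspace: "acme", allowedModels: ["openai/gpt-4o-mini"] },
};

function authenticate(apiKey: string): Key {
  const key = keys[apiKey]; // 2. validate the virtual key
  if (!key) throw new Error("invalid virtual key");
  return key;
}

function resolveTarget(model: string): Target {
  const [provider, id] = model.split("/"); // 3. resolve provider and model
  return { provider, model: id };
}

function enforcePolicy(key: Key, model: string): void {
  // 4. rejected here means no upstream provider cost is ever created
  if (!key.allowedModels.includes(model)) {
    throw new Error(`model '${model}' not allowed for this key`);
  }
}

function route(apiKey: string, model: string): Target {
  const key = authenticate(apiKey);
  const target = resolveTarget(model);
  enforcePolicy(key, model);
  return target; // steps 5-7 (dispatch, normalize, record) omitted
}
```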

Policy-Aware Routing

Provider routing is connected to Alephant policy and budget control.

Before dispatching to a provider, Alephant can enforce:

  • workspace-level provider access
  • virtual-key scoped provider access
  • model allowlists and denylists
  • agent-level model rules
  • member or team-level budgets
  • per-session budget caps
  • rate limits and concurrency controls
  • route-specific fallback or downgrade rules

This means routing can answer both technical and financial questions:

  • Can this key use this provider?
  • Can this agent call this model?
  • Is the workspace still within budget?
  • Should this request be blocked, throttled, downgraded, or routed normally?
  • Which provider/model decision created the final cost?
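The budget side of that decision can be reduced to a simple pre-dispatch gate. This is a minimal sketch under assumed field names; Alephant's actual budget model covers workspace, member, and session scopes.

```typescript
// Hypothetical budget gate: block before dispatch so a request that
// would exceed the budget never creates upstream provider cost.
type Budget = { limitUsd: number; spentUsd: number };

function budgetDecision(
  budget: Budget,
  estimatedCostUsd: number,
): "route" | "block" {
  return budget.spentUsd + estimatedCostUsd <= budget.limitUsd
    ? "route"
    : "block";
}
```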

Fallback and Reliability

Provider routing can also support reliability behavior.

When configured, Alephant can retry failed upstream calls or fall back to another model or provider. This is useful when a provider is unavailable, rate-limited, overloaded, or returning transient errors.

Example fallback chain:

openai/gpt-4o-mini -> anthropic/claude-3-5-haiku -> groq/llama-3.1-70b

Fallback decisions are recorded so the dashboard can show which model actually served the request and how the fallback affected latency, cost, and reliability.
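A fallback chain like the one above can be sketched as a loop that tries each provider/model in order and records every attempt. This is a simplified, synchronous illustration; the `callUpstream` signature is an assumption, and real dispatch is asynchronous with retry and backoff.

```typescript
// Sketch of fallback routing: try each model in the chain, record
// which attempts failed, and return the model that actually served
// the request so the dashboard can attribute latency and cost.
type Attempt = { model: string; ok: boolean };

function withFallback(
  chain: string[],
  callUpstream: (model: string) => string,
): { result: string; served: string; attempts: Attempt[] } {
  const attempts: Attempt[] = [];
  for (const model of chain) {
    try {
      const result = callUpstream(model);
      attempts.push({ model, ok: true });
      return { result, served: model, attempts };
    } catch {
      attempts.push({ model, ok: false }); // transient failure: try next
    }
  }
  throw new Error(`all fallbacks failed: ${chain.join(" -> ")}`);
}
```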

Routing and Cost Attribution

Every route decision becomes part of the Alephant cost ledger.

For each request, Alephant can show:

  • requested model
  • resolved provider
  • resolved model
  • final provider used after retry or fallback
  • input tokens
  • output tokens
  • total token cost
  • cache hit or miss
  • latency
  • status code
  • workspace
  • virtual key
  • user or team member
  • agent
  • session
  • prompt or tool-call metadata

This makes provider routing visible to both engineering and finance teams.
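As a sketch, a per-request ledger entry might look like the following. The field names and the per-million-token rates are illustrative assumptions, not Alephant's actual schema or pricing.

```typescript
// Hypothetical shape of one cost-ledger entry.
interface LedgerRecord {
  requestedModel: string;
  resolvedModel: string; // final model after retry or fallback
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
  latencyMs: number;
  cacheHit: boolean;
  workspace: string;
}

// Token cost from per-million-token rates (rates are examples only).
function tokenCostUsd(
  inputTokens: number,
  outputTokens: number,
  inputUsdPerMTok: number,
  outputUsdPerMTok: number,
): number {
  return (
    (inputTokens * inputUsdPerMTok + outputTokens * outputUsdPerMTok) /
    1_000_000
  );
}
```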

Examples

OpenAI-compatible request

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.alephant.io/v1",
  apiKey: process.env.ALEPHANT_API_KEY,
});

const response = await client.chat.completions.create({
  model: "openai/gpt-4o-mini",
  messages: [
    { role: "user", content: "Summarize this support conversation." },
  ],
});
```

Bare model auto-resolution

```
{
  "model": "gpt-4o-mini",
  "messages": [
    { "role": "user", "content": "Explain gateway routing." }
  ]
}
```

If gpt-4o-mini is unique in the model catalog, Alephant resolves it automatically.

Explicit provider route

```
{
  "model": "anthropic/claude-3-5-sonnet",
  "messages": [
    { "role": "user", "content": "Review this architecture decision." }
  ]
}
```

Use explicit provider routes when you need deterministic provider selection.

  • Use provider-prefixed model IDs (provider/model_id) for production workloads.
  • Use bare model IDs when you want a simpler developer experience and the model name is unambiguous.
  • Use configured routers when routing should be controlled centrally by workspace policy rather than hardcoded in application code.
  • Use direct provider passthrough only when you intentionally want explicit upstream behavior.

Common Questions

Is Alephant just a model proxy?

No. Alephant routes requests, but it also applies policy, budget controls, provider adaptation, retry and fallback behavior, cost attribution, and observability in the gateway path.

Can one application use multiple providers?

Yes. A single application can send OpenAI-compatible requests to Alephant and select different providers by changing the model value or by using configured routers.

What happens if the model is not allowed?

Alephant rejects the request before dispatching it to the provider. This prevents disallowed provider usage and avoids creating upstream cost.

What happens if the model name is ambiguous?

Alephant returns a 400 Bad Request and asks you to specify the provider, such as openai/gpt-4o or azure/gpt-4o.

Does routing affect observability?

Yes. Alephant records both the requested model and the resolved provider/model so teams can see exactly how each request was routed and how much it cost.