> For clean Markdown of any page, append .md to the page URL.
> For a complete documentation index, see https://developers.alephant.io/llms.txt.
> For full documentation content, see https://developers.alephant.io/llms-full.txt.

# Prompt Manage

Prompt Manage lets your team create reusable prompt templates in the Alephant SaaS dashboard, assign each template a stable ID, and call it at runtime through the Alephant AI Gateway.

Instead of hardcoding prompts in application code, create the prompt once in Alephant, promote the tested version to production, and send the prompt ID with each request.

```text
Alephant-Prompt-ID: support-triage
```

Alephant applies the production prompt template, forwards the request to the selected model, and records token usage, cost, latency, route, agent, user, and session metadata.

## **Core Flow**

1. Create a prompt template in the Alephant dashboard.
   <img src="https://files.buildwithfern.com/visual-editor-images/alephantai.docs.buildwithfern.com/2026-05-13T06:37:43.552Z/ai-gateway/prompt-manage/image.png" alt="image" title="image" noZoom={false} />

2. Set a stable prompt ID, such as support-triage.

3. Add system, user, or assistant message segments.

4. Configure model binding and parameters.

5. Promote the tested version to production.

6. Call the gateway with Alephant-Prompt-ID.
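The flow ends in a plain OpenAI-compatible request plus one header. As a sketch, a hypothetical helper that assembles such a request might look like this (`buildPromptRequest` is not part of any Alephant SDK; only the URL and header name come from the curl examples on this page):

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Illustrative helper: bundles the endpoint, auth header, and the
// Alephant-Prompt-ID header together with an OpenAI-style request body.
function buildPromptRequest(
  promptId: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[]
) {
  return {
    url: "https://api.alephant.io/v1/chat/completions",
    headers: {
      "Authorization": `Bearer ${apiKey}`,
      "Content-Type": "application/json",
      "Alephant-Prompt-ID": promptId,
    },
    body: JSON.stringify({ model, messages }),
  };
}

const req = buildPromptRequest("support-triage", "sk-example", "openai/gpt-4o-mini", [
  { role: "user", content: "Explain last night's invoice increase." },
]);
console.log(req.headers["Alephant-Prompt-ID"]); // → support-triage
```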

## **Create A Template**

In **Prompt Manage**, create a new template and configure:

| **Field**     | **Description**                                    |
| ------------- | -------------------------------------------------- |
| ID            | Runtime ID used in the Alephant-Prompt-ID header.  |
| Template Name | Display name in the dashboard.                     |
| LLM Binding   | Provider/model configuration for the prompt.       |
| Parameters    | Temperature, max tokens, and top P.                |
| Messages      | Reusable prompt messages.                          |
| Variables     | Dynamic placeholders such as `{{customer_name}}`.  |
| Status        | Draft, Production, or Archived.                    |
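Taken together, a configured template can be pictured as an object like this. This is purely illustrative: the field names mirror the table above, and the real template is defined in the dashboard, not in code.

```typescript
// Hypothetical shape only: Alephant's actual template schema is managed in
// the dashboard and may differ from this sketch.
const supportTriageTemplate = {
  id: "support-triage",             // runtime ID used in Alephant-Prompt-ID
  name: "Support Triage",           // dashboard display name
  llmBinding: "openai/gpt-4o-mini", // provider/model binding
  parameters: { temperature: 0.2, maxTokens: 512, topP: 1 },
  messages: [
    { role: "system", content: "Triage the ticket for {{customer_name}}." },
  ],
  variables: ["customer_name"],
  status: "Production",             // Draft | Production | Archived
};
```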

Use stable IDs that your application can depend on:

* support-triage
* security-code-audit
* hermes-planner
* openclaw-browser-agent

## **Call A Prompt At Runtime**

Send a normal OpenAI-compatible request to Alephant and include the prompt ID header.

```bash
curl https://api.alephant.io/v1/chat/completions \
  -H "Authorization: Bearer $ALEPHANT_VIRTUAL_KEY" \
  -H "Content-Type: application/json" \
  -H "Alephant-Prompt-ID: support-triage" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Customer says their AI invoice increased last night. Explain what happened."
      }
    ]
  }'
```

Alephant prepends the production template messages before the runtime messages you send.
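The merge itself happens server-side inside the gateway, but the ordering can be illustrated with a small sketch:

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Illustrative only: template messages come first, runtime messages follow.
function applyTemplate(template: ChatMessage[], runtime: ChatMessage[]): ChatMessage[] {
  return [...template, ...runtime];
}

const merged = applyTemplate(
  [{ role: "system", content: "You are a billing support specialist." }],
  [{ role: "user", content: "Why did my AI invoice increase last night?" }]
);
// merged[0] is the template's system message; merged[1] is the runtime user message
```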

## **Use Variables**

Prompt templates can contain variables:

```text
You are helping {{customer_name}} on the {{plan_name}} plan
Limit the response to {{max_items}} action items
```

Pass variable values in `inputs`:

```json
{
  "model": "openai/gpt-4o-mini",
  "inputs": {
    "customer_name": "Acme",
    "plan_name": "Team",
    "max_items": 3
  },
  "messages": [
    {
      "role": "user",
      "content": "Summarize the usage spike."
    }
  ]
}
```

If required variables are missing, Alephant rejects the request before provider dispatch.
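Alephant performs that validation server-side; if you want to fail even earlier, a hypothetical client-side pre-flight check (assuming the `{{name}}` placeholder syntax used in templates) could look like:

```typescript
// Find every {{placeholder}} in the template text, confirm a value exists in
// inputs, and substitute when all are present. Throws if any are missing,
// mirroring the gateway's reject-before-dispatch behavior.
function renderTemplate(
  text: string,
  inputs: Record<string, string | number>
): string {
  const missing: string[] = [];
  const rendered = text.replace(/\{\{(\w+)\}\}/g, (_, name: string) => {
    if (!(name in inputs)) {
      missing.push(name);
      return "";
    }
    return String(inputs[name]);
  });
  if (missing.length > 0) {
    throw new Error(`Missing inputs: ${missing.join(", ")}`);
  }
  return rendered;
}

console.log(
  renderTemplate("You are helping {{customer_name}} on the {{plan_name}} plan", {
    customer_name: "Acme",
    plan_name: "Team",
  })
); // → You are helping Acme on the Team plan
```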

## **Versioning**

Each save creates a new prompt version.

| **Status** | **Use**                                     |
| ---------- | ------------------------------------------- |
| Draft      | Edit and test safely.                       |
| Production | Runtime version used by Alephant-Prompt-ID. |
| Archived   | Retired version kept for history.           |

Promote a version to production when you want future runtime calls for the same prompt ID to use it.

## **Cost And Observability**

Prompt-managed requests are attributed by prompt ID and version.

The dashboard can show:

* calls
* token usage
* prompt cost
* model and provider
* route
* latency
* linked agent, virtual key, user, and session
* errors or blocked requests

This helps teams understand which prompts are driving spend and whether a new prompt version increases token usage or cost.

## **Notes**

| **Case**                     | **Behavior**                                       |
| ---------------------------- | -------------------------------------------------- |
| No Alephant-Prompt-ID        | Request runs normally without prompt injection.    |
| Valid production prompt ID   | Alephant applies the production template.          |
| Missing inputs for variables | Request is rejected before provider dispatch.      |
| Draft-only prompt            | Promote a version before relying on runtime calls. |

Header names are case-insensitive. The recommended form is:

```bash
curl https://api.alephant.io/v1/chat/completions \
  -H "Authorization: Bearer $ALEPHANT_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Alephant-Prompt-ID: support-triage" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "The customer asks why AI usage increased last night."
      }
    ]
  }'
```
```