A unified AI model routing platform. One API key, … models, smart auto-routing.
Auto routing is enabled by default. To control it:
Console → Settings → Routing → Auto routing toggle
Once enabled, Gate.AI automatically selects the best model for each request. If you prefer to pick models yourself, skip this step and specify models directly (e.g. anthropic/claude-sonnet-4.6).
Fully compatible with the OpenAI API. Supports Python, Node.js, curl, and tools across the ecosystem.
Replace the Base URL ( https://api.gate.ai/openai/v1 ) and API key to start using it.
from openai import OpenAI

client = OpenAI(
    api_key="GATEAI_API_KEY",  # get GATEAI_API_KEY from gate.ai (API Key)
    base_url="https://api.gate.ai/openai/v1",
)
completion = client.chat.completions.create(
    model="auto",
    messages=[
        {"role": "system", "content": "system prompt"},
        {"role": "user", "content": "how are you?"},
    ],
)
# get the response from LLM (role=assistant)
print(completion.choices[0].message.content)

{
  "id": "243c850e-214c-431e-977f-ebaf4aa95f56",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! Nice to meet you. How can I help you?"
      },
      "finish_reason": "stop"
    }
  ],
  "created": 1773408946,
  "model": "deepseek.v3-v1:0",
  "object": "chat.completion",
  "usage": {
    "prompt_tokens": 5,
    "completion_tokens": 15,
    "total_tokens": 20
  }
}

Open the top-right menu → Settings.

In the left sidebar:


Fill in the integration information:
| Item | Description |
|---|---|
| Model ID | Copy a model ID from the model marketplace, such as deepseek/deepseek-v3.2. Do not use auto. |
| OpenAI API Key | Turn on the switch and enter your Gate.AI API Key. |
| OpenAI Base URL | Turn on the switch and enter https://api.gate.ai/openai/v1. |

After finishing the configuration, save it and close Settings.
In Chat, Composer, Agent, and other conversation surfaces, search for or choose the model you just added from the model selector.

| Symptom | Fix |
|---|---|
| 401 / Connection failed | Confirm the Base URL is https://api.gate.ai/openai/v1, the API Key is valid, and the account has available balance. |
| Model unavailable | Confirm the model ID comes from the model marketplace, uses the provider/model format, and is not auto. |
| Model not found in the list | Confirm the Settings page was saved; restart Cursor and try again. |
If you already have Claude Code (Anthropic's terminal / IDE coding assistant) installed, follow these steps to connect Gate.AI.
sk-or-v1- and use it to replace the placeholders below.
Claude Code reads environment variables. We recommend following the official LLM gateway documentation and setting:
| Variable | Value |
|---|---|
| ANTHROPIC_BASE_URL | https://api.gaterouter.ai/anthropic |
| ANTHROPIC_API_KEY | Your Gate.AI API key (sk-or-v1-…) |
export ANTHROPIC_BASE_URL="https://api.gaterouter.ai/anthropic"
export ANTHROPIC_API_KEY="sk-or-v1-xxxxxxxxxxxxxxxx"
claude
Append the following to ~/.zshrc or ~/.bashrc:
export ANTHROPIC_BASE_URL="https://api.gaterouter.ai/anthropic"
export ANTHROPIC_API_KEY="sk-or-v1-xxxxxxxxxxxxxxxx"
After running source ~/.zshrc (or opening a new terminal), run claude.
settings.json (recommended)
In user-level or project-level configuration, add an env block (see Claude Code settings for paths), for example in .claude/settings.json in your project directory:
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.gaterouter.ai/anthropic",
    "ANTHROPIC_API_KEY": "sk-or-v1-xxxxxxxxxxxxxxxx"
  }
}
Security note: Do not commit real keys to a public repository; use your OS secret store or CI secret injection, and keep local secrets in environment variables only.
To temporarily skip the gateway:
env -u ANTHROPIC_BASE_URL -u ANTHROPIC_API_KEY claude
(You must already have Anthropic official credentials or another default provider configured.)
In Gate.AI docs, model IDs look like provider/model-name (for example anthropic/claude-sonnet-4.6). They are not identical to Claude Code built-in aliases such as sonnet. Pick one approach:
export ANTHROPIC_MODEL="anthropic/claude-sonnet-4.6"
Or add the same key under env in settings.json.
Map aliases to Gate.AI model IDs (Sonnet example):
export ANTHROPIC_DEFAULT_SONNET_MODEL="anthropic/claude-sonnet-4.6"
export ANTHROPIC_DEFAULT_OPUS_MODEL="anthropic/claude-opus-4.6"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="anthropic/claude-haiku-4.5"
Use the Models list in the Gate.AI docs as the source of truth for IDs.
/model
To pick a gateway-backed model in the UI, use Claude Code's custom model option (see ANTHROPIC_CUSTOM_MODEL_OPTION in the official Model configuration docs):
export ANTHROPIC_CUSTOM_MODEL_OPTION="anthropic/claude-sonnet-4.6"
export ANTHROPIC_CUSTOM_MODEL_OPTION_NAME="Sonnet (Gate.AI)"
auto
If auto routing is enabled in the dashboard, you can try setting ANTHROPIC_MODEL to auto (same meaning as auto on the OpenAI setup page). If you see errors, switch back to an explicit model ID such as anthropic/claude-sonnet-4.6.
claude
After the session starts, send a simple prompt, for example: In one sentence, introduce yourself.
If you get a normal reply with no auth or routing errors, requests are reaching Gate.AI and the selected model is responding.
| Symptom | Likely cause | What to try |
|---|---|---|
| 401 / auth failure | Wrong API key or env not exported | Check ANTHROPIC_API_KEY matches the dashboard key |
| 404 on URL | Base URL points at the OpenAI path | Use https://api.gaterouter.ai/anthropic |
| Model missing / routing error | Bad model ID or model not allowed | Compare with the Models table; check routing and allow lists in the dashboard |
| Still hitting Anthropic directly | Env vars not applied | Confirm settings.json scope; in a new shell run echo $ANTHROPIC_BASE_URL to verify |
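
The checks in the table above can be partly automated. The following is a minimal sketch, not an official tool: the function name `diagnose_claude_env` is hypothetical, and the expected base URL and key prefix are taken from the values shown earlier in this section.

```python
import os

# Values copied from the setup table above; adjust if your dashboard shows others.
EXPECTED_BASE_URL = "https://api.gaterouter.ai/anthropic"
KEY_PREFIX = "sk-or-v1-"


def diagnose_claude_env(env):
    """Return a list of likely problems found in an environment mapping."""
    problems = []
    base_url = env.get("ANTHROPIC_BASE_URL", "")
    api_key = env.get("ANTHROPIC_API_KEY", "")
    if not base_url:
        problems.append("ANTHROPIC_BASE_URL is not set; requests go to Anthropic directly")
    elif "/openai" in base_url:
        problems.append("ANTHROPIC_BASE_URL points at the OpenAI path; expect 404")
    elif base_url.rstrip("/") != EXPECTED_BASE_URL:
        problems.append("ANTHROPIC_BASE_URL differs from " + EXPECTED_BASE_URL)
    if not api_key:
        problems.append("ANTHROPIC_API_KEY is not set; expect 401")
    elif not api_key.startswith(KEY_PREFIX):
        problems.append("ANTHROPIC_API_KEY does not start with " + KEY_PREFIX)
    return problems


if __name__ == "__main__":
    # Inspect the current shell environment and print any findings.
    for problem in diagnose_claude_env(os.environ) or ["no obvious problems found"]:
        print(problem)
```

Run it in the same shell where you start claude, so it sees the same exported variables. It does not read settings.json, so values set only there will be reported as missing.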

After enabling it, the Developer menu appears in the menu bar.

From the menu bar, choose Developer → Configure Third-Party Inference…



The connection is successful when the model picker shows available models.
| Symptom | Fix |
|---|---|
| Connection failed / 401 | Confirm the Base URL is https://api.gate.ai/anthropic/, Auth scheme is x-api-key, and the API key is valid. |
| Continue with Gateway does not appear | Confirm you clicked Apply Changes and fully quit and restarted Claude Desktop. |
| No models are shown | Confirm your Gate.AI account has available balance, and check that Model discovery is enabled. |
Run in your terminal:
hermes model
In the menu, choose Custom endpoint and fill in the fields:
| Item | Value |
|---|---|
| API base URL | https://api.gate.ai/openai/v1 |
| API key | Your Gate.AI API key |
| Model | auto (recommended for auto routing), or the full model ID from the console (e.g. deepseek/deepseek-v3.2) |
If you are asked for context length, press Enter to leave it blank (Hermes will detect it).
To edit configuration in the browser, run:
hermes dashboard
hermes chat "Hello"
Success means the request reached Gate.AI and smart routing or your chosen model returned a result.
You can also run hermes doctor to verify the connection.
macOS / Linux
~/.hermes/config.yaml, main config (model, provider, base_url, api_key, etc.)
~/.hermes/.env, secrets and env vars (recommended)
Windows
C:\Users\<username>\.hermes\config.yaml
C:\Users\<username>\.hermes\.env
.env (pick one)
Option A (same name as Gate.AI)
# Gate.AI API key
GATEAI_API_KEY=sk-or-v1_xxxxxxxxxxxxxxxxxxxxx
Option B (common Hermes custom-endpoint convention)
If model.api_key is not set for a custom endpoint, Hermes falls back to OPENAI_API_KEY. Put your Gate.AI key there:
OPENAI_API_KEY=sk-or-v1_xxxxxxxxxxxxxxxxxxxxx
model in config.yaml
Auto routing (auto)
model:
  default: auto
  provider: custom
  base_url: https://api.gate.ai/openai/v1
  api_key: ${GATEAI_API_KEY}
If you use Option B, leave api_key blank or remove it so Hermes uses OPENAI_API_KEY.
Hermes expands ${VAR} when loading config (variables must exist in the environment, usually from ~/.hermes/.env).
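
To make the ${VAR} expansion concrete, here is an illustrative sketch of this kind of substitution. It is not Hermes's actual implementation: the function name `expand_vars` is hypothetical, and raising an error for a missing variable is an assumption (Hermes may behave differently).

```python
import re

# Matches ${NAME} where NAME is a typical environment variable name.
_VAR_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_vars(text, env):
    """Replace each ${NAME} in text with env[NAME]; raise KeyError if NAME is missing."""
    def substitute(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]
    return _VAR_PATTERN.sub(substitute, text)


config_line = "api_key: ${GATEAI_API_KEY}"
print(expand_vars(config_line, {"GATEAI_API_KEY": "sk-or-v1-xxxxxxxxxxxxxxxx"}))
# prints: api_key: sk-or-v1-xxxxxxxxxxxxxxxx
```

The practical takeaway is that the variable has to exist before the config is loaded, which is why the key belongs in ~/.hermes/.env or the shell environment.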
Fixed model example
The model ID must match the Gate.AI model list.
model:
  default: deepseek/deepseek-v3.2
  provider: custom
  base_url: https://api.gate.ai/openai/v1
  api_key: ${GATEAI_API_KEY}
After saving, run hermes chat "Hello" to verify the Gate.AI connection.
If you need multiple logical routes under one Gate.AI key (e.g. one with auto and one fixed to deepseek/deepseek-v3.2), add custom_providers in config.yaml (names: letters, digits, hyphens; e.g. gateai-auto):
model:
  default: auto
  provider: custom
  base_url: https://api.gate.ai/openai/v1
  api_key: ${GATEAI_API_KEY}
custom_providers:
  - name: gateai-auto
    base_url: https://api.gate.ai/openai/v1
    api_key: ${GATEAI_API_KEY}
    model: auto
  - name: gateai-deepseek
    base_url: https://api.gate.ai/openai/v1
    api_key: ${GATEAI_API_KEY}
    model: deepseek/deepseek-v3.2
Run hermes model again and pick the named route or Custom endpoint.
Use the /model syntax from the Hermes docs, for example:
/model custom:gateai-auto:auto
/model custom:gateai-deepseek:deepseek/deepseek-v3.2
(Names follow custom_providers[].name; format is custom:<profile name>:<model id>.)
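
The custom:<profile name>:<model id> format is easy to get wrong by hand. As a rough sketch, the helpers below build and parse such references; the function names are hypothetical, and the name rule (letters, digits, hyphens) is the one stated above.

```python
import re

# Profile names: letters, digits, and hyphens only.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9-]+")


def build_model_ref(profile_name, model_id):
    """Return a reference like custom:gateai-auto:auto."""
    if not _NAME_PATTERN.fullmatch(profile_name):
        raise ValueError("profile name may contain only letters, digits, and hyphens")
    if not model_id:
        raise ValueError("model id must not be empty")
    return f"custom:{profile_name}:{model_id}"


def parse_model_ref(ref):
    """Split a reference into (profile_name, model_id)."""
    parts = ref.split(":", 2)  # at most 3 parts, so a model id may itself contain ':'
    if len(parts) != 3 or parts[0] != "custom":
        raise ValueError("expected custom:<profile name>:<model id>")
    return parts[1], parts[2]


print(build_model_ref("gateai-deepseek", "deepseek/deepseek-v3.2"))
print(parse_model_ref("custom:gateai-auto:auto"))
```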
Only some models work
Confirm model.provider is custom, and Base URL is https://api.gate.ai/openai/v1. If OpenAI-compatible models work but others do not, check model IDs and routing settings.
401 / invalid API key
Check that the key was copied correctly and has not expired. After editing .env, restart any running hermes and hermes gateway processes before retrying.
Model not found or empty reply
Check that the model ID matches the Gate.AI model list (e.g. deepseek/deepseek-v3.2).
Run hermes doctor to inspect config and connectivity.
If you already have QClaw installed, follow these steps to connect Gate.AI.
1. In the chat, send the message below. Replace the apiKey value with your Gate.AI API key.
Help me add a new provider
Name: Gate.AI
apiKey: sk-or-v1-xxxxxxxxxxxxxxxx
baseUrl: https://api.gate.ai/openai/v1
Model: auto
QClaw will add the provider and restart automatically.
Ask: “Help me verify that my Gate.AI configuration is working.” The assistant should reply with something like “Gate.AI provider was added successfully!” (exact wording may vary.)
Ask: “Switch to auto under Gate.AI.” The assistant should reply with something like “Switched successfully!” (exact wording may vary.)
Click Preferences in the bottom-left, go to Models & API, then click Add custom model.
Click the connection test. If you see “Test successful”, the setup is correct.
Select Gate.AI(deepseek-v3.2) to use it.

| Field | Value |
|---|---|
| Base URL | https://api.gate.ai/openai/v1 |
| Auth | Authorization: Bearer <API_KEY> |
| Format | OpenAI-compatible |
| Pricing | Pay-as-you-go |
Note: The API path is /openai/v1 (not /v1).
| Method | Path | Description |
|---|---|---|
| POST | /chat/completions | Chat completions (streaming supported) |
| GET | /models | List available models |
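
Since /chat/completions supports streaming, a client has to reassemble the reply from chunks. The sketch below assumes Gate.AI emits the same server-sent-event chunk format as the OpenAI API (lines of `data: {...}` ending with `data: [DONE]`); that follows from the stated OpenAI compatibility but is not spelled out in this reference. The function name is hypothetical.

```python
import json


def collect_stream_text(lines):
    """Concatenate delta.content pieces from OpenAI-style SSE lines."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            piece = choice.get("delta", {}).get("content")
            if piece:
                parts.append(piece)
    return "".join(parts)


sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"}}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"Hel"}}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"lo!"}}]}',
    "data: [DONE]",
]
print(collect_stream_text(sample))  # prints: Hello!
```

With the OpenAI SDK you would normally pass stream=True and iterate over the returned chunks instead; this parser is only useful when reading the raw HTTP response yourself.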
| Field | Value |
|---|---|
| Base URL | https://api.gate.ai |
| Auth | Authorization: Bearer <API_KEY> |
/api/v1/videos
Submit an async video generation job. Returns 202 Accepted with job_id for polling. Pass Idempotency-Key for safe retries.
Request Parameters
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| Authorization | header | string | Yes | Gate.AI API Key. Format: Bearer <API_KEY> |
| Content-Type | header | string | Yes | Request body format |
| Idempotency-Key | header | string | No | Idempotency key for safe retries — same key returns the existing job |
Request Body
| Name | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID, use bytedance/seedance-2.0 |
| prompt | string | Yes | Video prompt; describe subject, motion, scene, camera, and style |
| duration | integer | No | Duration in seconds: 4–15 |
| resolution | string | No | Output resolution: 480p, 720p, or 1080p |
| aspect_ratio | string | No | Aspect ratio: 16:9, 4:3, 1:1, 3:4, 9:16, 21:9, or adaptive |
| generate_audio | boolean | No | Whether to generate audio |
| seed | integer | No | Random seed, -1 to 4294967295; same seed does not guarantee identical output |
| size | string | No | Exact output size, e.g. 1280x720 |
| metadata | object | No | Business pass-through fields for auditing or source tagging |
| webhook_url | string | No | Callback URL invoked when the task completes or fails |
Example
{
  "model": "bytedance/seedance-2.0",
  "prompt": "A golden retriever running on a sunny beach, cinematic camera movement",
  "duration": 6,
  "resolution": "720p",
  "aspect_ratio": "16:9",
  "generate_audio": false,
  "seed": -1,
  "metadata": {
    "source": "playground"
  },
  "webhook_url": "https://example.com/webhooks/gateai-video"
}
Response Fields
| Name | Type | Description |
|---|---|---|
| job_id | string | Unique job ID |
| status | string | Task status: pending / in_progress / completed / failed |
| model | string | Model used |
| status_url | string | URL to poll task status |
| message | string | Server message |
| current_balance | string | Current account balance (USD) |
| estimated_cost | string | Estimated cost |
| pre_deduct_amount | string | Pre-deducted display amount |
| balance_after_estimate | string | Balance after applying estimated cost |
| currency | string | Currency, currently USD |
| billing_notice | string | Billing notice |
Response Example
{
  "code": 200,
  "msg": "",
  "data": {
    "job_id": "video_abc123",
    "status": "in_progress",
    "model": "bytedance/seedance-2.0",
    "status_url": "https://api.gate.ai/api/v1/videos/video_abc123",
    "message": "Video job submitted. Poll status_url to check generation progress.",
    "current_balance": "100.0000000000",
    "estimated_cost": "1.0800000000",
    "pre_deduct_amount": "1.0800000000",
    "balance_after_estimate": "98.9200000000",
    "currency": "USDT",
    "billing_notice": "You are charged based on the actual job result after generation completes. Keep a sufficient balance after submitting."
  }
}
Response
| Status | Meaning | Description | Schema |
|---|---|---|---|
| 202 | Accepted | Task accepted and queued. Returns job_id for status polling. | VideoSubmitResponse |
| 400 | Bad Request | Parameter error | ErrorResponse |
| 401 | Unauthorized | Not logged in or token invalid | ErrorResponse |
| 402 | Payment Required | Insufficient balance. Response includes estimated cost and current balance. | ErrorResponse |
| 429 | Too Many Requests | Too many requests. Please slow down. | ErrorResponse |
| 500 | Internal Server Error | Internal server error | ErrorResponse |
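
As a sketch of how a submission could be assembled, the helper below builds the POST request with the headers and body fields documented above, using only the Python standard library. The function name is hypothetical, `application/json` as the Content-Type is an assumption based on the JSON body, and generating a UUID as the idempotency key is one possible choice rather than a documented requirement.

```python
import json
import urllib.request
import uuid

SUBMIT_URL = "https://api.gate.ai/api/v1/videos"
ALLOWED_RESOLUTIONS = {"480p", "720p", "1080p"}


def build_submit_request(api_key, prompt, duration=6, resolution="720p", idempotency_key=None):
    """Validate the documented limits and return an unsent urllib Request."""
    if not 4 <= duration <= 15:
        raise ValueError("duration must be between 4 and 15 seconds")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError("resolution must be 480p, 720p, or 1080p")
    body = {
        "model": "bytedance/seedance-2.0",
        "prompt": prompt,
        "duration": duration,
        "resolution": resolution,
    }
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        # Reuse the same key when retrying so the existing job is returned.
        "Idempotency-Key": idempotency_key or str(uuid.uuid4()),
    }
    return urllib.request.Request(
        SUBMIT_URL, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )


request = build_submit_request("sk-or-v1-xxxxxxxxxxxxxxxx", "A golden retriever running on a sunny beach")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. Keep the idempotency key you used so that a retry after a timeout does not create a second billable job.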
/api/v1/videos/{job_id}
Poll job status, progress, and billing info. Returns download_url when status is completed.
Request Parameters
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| job_id | path | string | Yes | Video task ID |
| Authorization | header | string | Yes | Gate.AI API Key. Format: Bearer <API_KEY> |
Response Fields
| Name | Type | Description |
|---|---|---|
| job_id | string | Unique job ID |
| status | string | Task status: pending / in_progress / completed / failed |
| model | string | Model used |
| status_url | string | URL to poll task status |
| download_url | string | Video download URL (appears when status is completed) |
| duration | integer | Video duration in seconds |
| resolution | string | Output resolution |
| aspect_ratio | string | Aspect ratio |
| generate_audio | boolean | Whether audio is included |
| estimated_cost | string | Estimated cost |
| billed_cost | string | Actual billed amount |
| billing_status | string | Billing status: pre_deducted or settled |
| expires_at | string(ISO 8601) | Video file expiration time |
| created_at | string(ISO 8601) | Task creation time |
| completed_at | string(ISO 8601) | Task completion time |
Response Example
{
  "code": 200,
  "msg": "",
  "data": {
    "job_id": "video_abc123",
    "status": "completed",
    "model": "bytedance/seedance-2.0",
    "status_url": "https://api.gate.ai/api/v1/videos/video_abc123",
    "download_url": "https://api.gate.ai/api/v1/videos/video_abc123/content",
    "duration": 6,
    "resolution": "720p",
    "aspect_ratio": "16:9",
    "generate_audio": false,
    "estimated_cost": "1.0800000000",
    "billed_cost": "1.0800000000",
    "billing_status": "settled",
    "expires_at": "2026-06-26T05:00:00Z",
    "created_at": "2026-05-27T05:00:00Z",
    "completed_at": "2026-05-27T05:03:00Z"
  }
}
Response
| Status | Meaning | Description | Schema |
|---|---|---|---|
| 200 | OK | Success | VideoStatusResponse |
| 401 | Unauthorized | Not logged in or token invalid | ErrorResponse |
| 404 | Not Found | Job not found, or job does not belong to this API key. | ErrorResponse |
| 500 | Internal Server Error | Internal server error | ErrorResponse |
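
A polling loop around this endpoint might look like the sketch below. The HTTP call is left to a caller-supplied `fetch_status` function (hypothetical) that returns the `data` object of the status response, which keeps the loop testable. The 5-second interval and the attempt limit are assumptions, not documented values.

```python
import time


def wait_for_video(fetch_status, job_id, interval_seconds=5.0, max_attempts=120, sleep=time.sleep):
    """Poll until the job is completed or failed; return the final data object."""
    for _ in range(max_attempts):
        data = fetch_status(job_id)
        if data["status"] == "completed":
            return data
        if data["status"] == "failed":
            raise RuntimeError(f"video job {job_id} failed")
        sleep(interval_seconds)  # status is pending or in_progress
    raise TimeoutError(f"video job {job_id} did not finish after {max_attempts} polls")


# Demonstration with canned responses instead of real HTTP calls.
fake_responses = iter([
    {"status": "pending"},
    {"status": "in_progress"},
    {"status": "completed", "download_url": "https://api.gate.ai/api/v1/videos/video_abc123/content"},
])
result = wait_for_video(lambda job_id: next(fake_responses), "video_abc123", sleep=lambda seconds: None)
print(result["download_url"])
```

If you set webhook_url when submitting, the callback can replace most of this polling.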
/api/v1/videos/{job_id}/content
Authenticate and redirect (302) to the temporary video file URL.
Request Parameters
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| job_id | path | string | Yes | Video task ID |
| Authorization | header | string | Yes | Gate.AI API Key. Format: Bearer <API_KEY> |
Response Example
HTTP/1.1 302 Found
Location: https://cdn.example.com/videos/video_abc123.mp4?expires=1780000000
Response
| Status | Meaning | Description | Schema |
|---|---|---|---|
| 302 | Found | Redirects to temporary video download URL. | |
| 401 | Unauthorized | Not logged in or token invalid | ErrorResponse |
| 404 | Not Found | Job not found, or job does not belong to this API key. | ErrorResponse |
| 409 | Conflict | Task not yet completed; video is not available for download. | ErrorResponse |
| 410 | Gone | Video content has expired and is no longer available. | ErrorResponse |
| 500 | Internal Server Error | Internal server error | ErrorResponse |
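
The 409 and 410 cases can often be anticipated from the status response before requesting the file. The sketch below uses the `status` and `expires_at` fields from the status endpoint; the function name and return labels are hypothetical, and the server remains the authority on whether a download succeeds.

```python
from datetime import datetime, timezone


def download_readiness(job, now=None):
    """Return 'ready', 'not_completed' (would be 409), or 'expired' (would be 410)."""
    now = now or datetime.now(timezone.utc)
    if job.get("status") != "completed":
        return "not_completed"
    expires_at = job.get("expires_at")
    if expires_at:
        # Replace the trailing Z so older Python versions can parse the timestamp.
        expiry = datetime.fromisoformat(expires_at.replace("Z", "+00:00"))
        if now >= expiry:
            return "expired"
    return "ready"


job = {"status": "completed", "expires_at": "2026-06-26T05:00:00Z"}
print(download_readiness(job, now=datetime(2026, 5, 28, tzinfo=timezone.utc)))  # prints: ready
print(download_readiness(job, now=datetime(2026, 7, 1, tzinfo=timezone.utc)))   # prints: expired
```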
| Model ID | Description | Use Case |
|---|---|---|
| openai/gpt-5.2 | OpenAI latest | Reasoning tasks |
| openai/gpt-5 | OpenAI general-purpose flagship | General purpose |
| openai/gpt-5-mini | OpenAI lightweight | General / cost optimization |
| openai/gpt-5-nano | OpenAI ultra low cost | Simple tasks |
| openai/gpt-4.1 | OpenAI stable | General purpose |
| openai/gpt-4.1-nano | OpenAI lightweight stable | Simple tasks |
| anthropic/claude-opus-4.6 | Anthropic's most capable | Complex reasoning |
| anthropic/claude-sonnet-4.6 | Anthropic balanced | General purpose |
| anthropic/claude-sonnet-4.5 | Anthropic previous gen | General purpose |
| anthropic/claude-haiku-4.5 | Anthropic fast | Simple tasks |
| google/gemini-3.1-pro | Google latest flagship | Long context / reasoning |
| google/gemini-2.5-pro | Google previous gen flagship | Long context |
| deepseek/deepseek-v3.2 | DeepSeek latest | Cost-effective |
| deepseek/deepseek-v3.1 | DeepSeek previous gen | General purpose |
| x-ai/grok-4 | xAI latest flagship | Reasoning / real-time info |
| x-ai/grok-4.1-fast | xAI high-speed | Fast response |
| moonshotai/kimi-k2.5 | Moonshot strong long-context | Long context |
| z-ai/glm-5 | Z.ai latest | General purpose |
| z-ai/glm-5-turbo | Coding & reasoning | Multi-scenario |
| z-ai/glm-4.7-flash | Z.ai fast tier | Simple tasks |
| minimax/minimax-m2.5 | MiniMax multimodal | General purpose |
Model ID format: provider/model-name. Version numbers use . (e.g. 4.6), not -.
For more models, visit the Models page.
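
A quick format check can catch the most common ID mistakes before a request is sent. The patterns below are heuristics generalised from the IDs in the table above, not an official grammar, and `check_model_id` is a hypothetical name. It cannot tell whether a well-formed ID actually exists.

```python
import re

# provider/model-name in lower case, inferred from the listed IDs.
_MODEL_ID = re.compile(r"[a-z0-9-]+/[a-z0-9][a-z0-9.-]*")
# A trailing "-4-6" style suffix suggests the version was written with dashes.
_DASHED_VERSION = re.compile(r"-\d+-\d+$")


def check_model_id(model_id):
    """Return a short verdict string for a model ID."""
    if model_id == "auto":
        return "ok: auto routing"
    if not _MODEL_ID.fullmatch(model_id):
        return "error: expected provider/model-name"
    if _DASHED_VERSION.search(model_id):
        return "warning: version looks dash-separated; use '.' (e.g. 4.6)"
    return "ok"


for candidate in ["anthropic/claude-sonnet-4.6", "anthropic/claude-sonnet-4-6", "claude-sonnet-4.6"]:
    print(candidate, "->", check_model_id(candidate))
```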
| Error message | HTTP status | Cause | Solution |
|---|---|---|---|
| invalid api key | 401 | The API Key is invalid, expired, revoked, or disabled | Go to Console → API Keys and confirm the key status is "active". If it has expired, generate a new one |

| Error message | HTTP status | Cause | Solution |
|---|---|---|---|
| no model config found for: {model} | 404 | The requested model ID does not exist | Open the model list and confirm the model ID is spelled correctly |
| model field is required | 400 | The request body is missing the model field | Add "model": "model name" to the request JSON |
| invalid or empty requested model | 400 | The requested model name is empty or invalid | Open the model list and confirm you are using the correct model ID format |
| unknown api path | 404 | The API path is incorrect | Confirm the Base URL is https://api.gate.ai/openai/v1 or https://api.gate.ai/anthropic |

| Error message | HTTP status | Cause | Solution |
|---|---|---|---|
| invalid JSON body | 400 | The request body is not valid JSON | Check that the request body is valid JSON |
| failed to read request body | 400 | The request body could not be read | Confirm the body is not corrupted and Content-Type is set to application/json |
| failed to rewrite request body | 500 | Failed to rewrite the request body inside the gateway | Retry. If it keeps happening, contact technical support |
| images are not supported by this model | 400 | The target model does not support image input | Switch to a multimodal model that supports images, such as gpt-4o |
| audio is not supported by this model | 400 | The target model does not support audio input | Switch to a model that supports audio input |
| unsupported parameter: max_tokens | 400 | Some models do not support the max_tokens parameter | Use max_completion_tokens instead |

| Error message | HTTP status | Cause | Solution |
|---|---|---|---|
| api key budget quota exceeded | 429 | The API Key budget has been exhausted | Go to Console → API Keys → Budget settings, increase the budget, or wait for the quota to reset |
| guardrail budget limit exceeded | 429 | The guardrail budget limit has been exceeded | Check the budget limit in guardrail settings, then increase the limit or reduce usage frequency |
| organization guardrail budget limit exceeded | 429 | The organization-level guardrail budget has been exceeded | Contact an organization admin to adjust the organization-level budget limit |
| model not allowed by guardrail policy | 403 | The model is not allowed by the guardrail policy | Go to Console → Guardrails and add the target model to the allowlist |
| The free model usage has reached its daily global limit today. | 429 | The global daily limit for the free model has been reached | Wait for the next daily reset, or upgrade to a paid plan for unrestricted models |
| The free model usage has reached its daily limit today. | 429 | Your personal daily limit for the free model has been reached | Wait for the next daily reset, or upgrade to a paid plan |
| Guest daily spending limit exceeded. Please try again tomorrow or upgrade to a paid plan. | 429 | The guest daily spending limit has been exceeded | Register an account and upgrade to a paid plan, or wait for the next daily reset |

| Error message | HTTP status | Cause | Solution |
|---|---|---|---|
| Pending payment {amount} USD — Please top up... | 402 | The account balance is insufficient and has an outstanding payment | Top up with Gate Pay |
| Insufficient balance and account in debt | 402 | The balance is insufficient and the account is in debt | Top up with Gate Pay |
| billing model info not found for model "{model}" | 400 | Billing information for the model does not exist | Confirm the model ID is correct. If it is a new model, contact technical support to configure billing rules |
| billing model info is ambiguous for model "{model}" | 400 | Billing information is ambiguous because multiple records match | Contact technical support to check the model billing configuration |
| billing configuration error | 500 | Billing rules are misconfigured on the server | Contact technical support to fix the billing configuration |

| Error message | HTTP status | Cause | Solution |
|---|---|---|---|
| bad gateway | 502 | The gateway failed to communicate with the upstream service | Retry. If it keeps happening, check the service status page or contact technical support |
| upstream service unavailable | 502 | The upstream AI Provider is unavailable | Retry later, or switch to another available model |
| upstream service error | 502 | The upstream AI Provider returned an error | Check whether the request parameters match the target model requirements. If it keeps happening, contact technical support |
| request timeout | 504 | The upstream request timed out | Reduce the input length or increase the timeout, then retry |
| no provider handler configured for protocol | 502 | No protocol handler is configured in the gateway | Contact technical support to check the gateway configuration |

| Error message | HTTP status | Cause | Solution |
|---|---|---|---|
| internal server error | 500 | Internal gateway error | Retry. If it keeps happening, contact technical support with the Request ID |
| failed to record request log | 500 | Failed to record the request log | This does not affect the request result and can be ignored. Contact technical support if it happens frequently |
| failed to list models | 500 | Failed to query the model list | Retry. If it keeps happening, contact technical support |
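
The tables above suggest a simple client-side policy: retry the 5xx gateway and upstream errors, and treat the rest as problems to fix before resending. The following sketch encodes that reading. The grouping by status code alone is a simplification (some 500 errors, such as billing configuration errors, need support rather than retries), the function names are hypothetical, and the backoff parameters are assumptions.

```python
import random

# One-line summary per status code, condensed from the error tables.
ACTIONS = {
    400: "fix the request body or parameters",
    401: "check the API key",
    402: "top up the balance",
    403: "adjust the guardrail allowlist",
    404: "check the model ID or API path",
    429: "wait for the quota to reset or raise the budget",
    500: "retry, then contact support if it persists",
    502: "retry, then contact support if it persists",
    504: "retry with a shorter input or a longer timeout",
}


def action_for(status_code):
    """Return the suggested next step for an HTTP status code."""
    return ACTIONS.get(status_code, "unknown status; inspect the error message")


def is_retryable(status_code):
    """Only gateway and upstream failures are worth retrying automatically."""
    return status_code in {500, 502, 504}


def backoff_delay(attempt, base_seconds=1.0, cap_seconds=30.0, rng=random.random):
    """Exponential backoff with full jitter: a random delay below min(cap, base * 2**attempt)."""
    return rng() * min(cap_seconds, base_seconds * (2 ** attempt))


print(action_for(502), "| retryable:", is_retryable(502))
print(action_for(429), "| retryable:", is_retryable(429))
```

The 429 errors listed here are budget and daily-limit errors, so a short backoff is unlikely to help; that is why they are excluded from automatic retries in this sketch.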
# ❌ Incorrect
https://api.gate.ai/v1/chat/completions
# ✅ Correct
https://api.gate.ai/openai/v1/chat/completions # OpenAI protocol
https://api.gate.ai/anthropic/v1/messages # Anthropic protocol
// ❌ Some models do not support this
{ "model": "gpt-4o", "max_tokens": 100 }
// ✅ Use max_completion_tokens
{ "model": "openai/gpt-5.5", "max_completion_tokens": 100 }
# ✅ Correct request headers
Authorization: Bearer sk-xxxxxxxxxxxxxxxxxxxx # OpenAI protocol
X-api-key: sk-xxxxxxxxxxxxxxxxxxxx # Anthropic protocol
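
The three fixes above (path, token parameter, auth header) can be combined in a small helper. This is a sketch under the assumption that the URLs and header names shown above are all a request needs; the function names are hypothetical, and adding `Content-Type: application/json` is an assumption based on the JSON request bodies.

```python
def request_target(protocol, api_key):
    """Return (url, headers) for the 'openai' or 'anthropic' protocol."""
    if protocol == "openai":
        url = "https://api.gate.ai/openai/v1/chat/completions"
        headers = {"Authorization": f"Bearer {api_key}"}
    elif protocol == "anthropic":
        url = "https://api.gate.ai/anthropic/v1/messages"
        headers = {"x-api-key": api_key}
    else:
        raise ValueError("protocol must be 'openai' or 'anthropic'")
    headers["Content-Type"] = "application/json"
    return url, headers


def use_max_completion_tokens(body):
    """Return a copy of body with max_tokens renamed to max_completion_tokens."""
    adapted = dict(body)
    if "max_tokens" in adapted:
        # Keep an existing max_completion_tokens value if one is already present.
        adapted.setdefault("max_completion_tokens", adapted.pop("max_tokens"))
    return adapted


url, headers = request_target("anthropic", "sk-xxxxxxxxxxxxxxxxxxxx")
print(url, sorted(headers))
print(use_max_completion_tokens({"model": "openai/gpt-5.2", "max_tokens": 100}))
```

Applying `use_max_completion_tokens` only after a 400 with `unsupported parameter: max_tokens` avoids changing requests for models that accept the original parameter.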