Gate.AI Documentation

A unified AI model routing platform. One API key, … models, smart auto-routing.

Getting Started

1. Create an API key

Go to gate.ai, choose a login method, and authorize
Go to Console → Settings → API keys → Create a key

2. Auto routing (optional)

Auto routing is enabled by default. To control it:

Console → Settings → Routing → Auto routing toggle

Once enabled, Gate.AI automatically selects the best model for each request. If you prefer to pick models yourself, skip this step and specify models directly (e.g. anthropic/claude-sonnet-4.6).

Standard Setup

Fully compatible with the OpenAI API. Supports Python, Node.js, curl, and tools across the ecosystem.

Replace the Base URL ( https://api.gate.ai/openai/v1 ) and API key to start using it.

import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["GATEAI_API_KEY"],  # export GATEAI_API_KEY with the key created at gate.ai
    base_url="https://api.gate.ai/openai/v1",
)

completion = client.chat.completions.create(
    model="auto",
    messages=[
        {"role": "system", "content": "system prompt"},
        {"role": "user", "content": "how are you?"}
    ],
)

# get the response from LLM (role=assistant)
print(completion.choices[0].message.content)

Response example:

{
    "id": "243c850e-214c-431e-977f-ebaf4aa95f56",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Hello! Nice to meet you. How can I help you?"
            },
            "finish_reason": "stop"
        }
    ],
    "created": 1773408946,
    "model": "deepseek.v3-v1:0",
    "object": "chat.completion",
    "usage": {
        "prompt_tokens": 5,
        "completion_tokens": 15,
        "total_tokens": 20
    }
}
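
In the example above, the request asked for model="auto" while the response's model field reads deepseek.v3-v1:0, so that field appears to report the model that actually served the request. A minimal sketch of pulling that value and the token usage out of a response, working on the JSON form; summarize_completion is a hypothetical helper, not part of the OpenAI SDK:

```python
import json

# Trimmed copy of the response example above.
SAMPLE = json.loads("""
{
  "id": "243c850e-214c-431e-977f-ebaf4aa95f56",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant",
                  "content": "Hello! Nice to meet you. How can I help you?"},
      "finish_reason": "stop"
    }
  ],
  "model": "deepseek.v3-v1:0",
  "usage": {"prompt_tokens": 5, "completion_tokens": 15, "total_tokens": 20}
}
""")


def summarize_completion(response: dict) -> dict:
    """Pull out the reply text, the serving model, and the token usage."""
    choice = response["choices"][0]
    usage = response.get("usage", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "served_by": response["model"],
        "total_tokens": usage.get("total_tokens"),
    }


summary = summarize_completion(SAMPLE)
print(summary["served_by"], summary["total_tokens"])
```

With the SDK object from the snippet above, completion.model_dump() should yield a dict of the same shape.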

Claude Code CLI Setup

Create a Gate.AI API Key

Open gate.ai → Dashboard → API Keys, create and copy a key starting with sk-or-v1-…
Confirm your account has sufficient balance

Network connectivity check

Export your key as GATEAI_API_KEY, replacing the placeholder value:

export GATEAI_API_KEY="sk-or-v1-xxxxxxxxxxxxxxxx"

Anthropic-compatible endpoint (Claude Code):

curl -s -o /dev/null -w "%{http_code}" \
  -H "x-api-key: $GATEAI_API_KEY" \
  -H "content-type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{"model":"anthropic/claude-sonnet-4.6","max_tokens":16,"messages":[{"role":"user","content":"hi"}]}' \
  https://api.gate.ai/anthropic/v1/messages

Returns 200: gateway is reachable; proceed with install and configuration
Returns 401: invalid or expired key — check the dashboard
Connection timeout: check local network or DNS; do not use https://api.gate.ai/v1
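
The same check can be scripted. The sketch below rebuilds the curl request with the Python standard library and maps the outcome to the guidance above; build_check_request, interpret_status, and check_gateway are hypothetical helper names, and the 10-second timeout is an arbitrary choice:

```python
import json
import urllib.error
import urllib.request

MESSAGES_URL = "https://api.gate.ai/anthropic/v1/messages"


def build_check_request(api_key: str) -> urllib.request.Request:
    """Build the same POST request as the curl check above."""
    body = {
        "model": "anthropic/claude-sonnet-4.6",
        "max_tokens": 16,
        "messages": [{"role": "user", "content": "hi"}],
    }
    return urllib.request.Request(
        MESSAGES_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "content-type": "application/json",
            "anthropic-version": "2023-06-01",
        },
        method="POST",
    )


def interpret_status(status: int) -> str:
    """Map an HTTP status to the guidance listed above."""
    if status == 200:
        return "gateway reachable; proceed with install and configuration"
    if status == 401:
        return "invalid or expired key; check the dashboard"
    return f"unexpected status {status}"


def check_gateway(api_key: str, timeout: float = 10.0) -> str:
    """Send the check request and return a human-readable verdict."""
    request = build_check_request(api_key)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return interpret_status(response.status)
    except urllib.error.HTTPError as err:  # non-2xx responses land here
        return interpret_status(err.code)
    except OSError:  # DNS failure, refused connection, or timeout
        return "connection failed; check local network or DNS"
```

Calling check_gateway(os.environ["GATEAI_API_KEY"]) after importing os performs the actual network request.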

Install Claude Code CLI

macOS / Linux / WSL (recommended):

curl -fsSL https://claude.ai/install.sh | bash

Or with npm:

npm install -g @anthropic-ai/claude-code

If npm install fails

This is usually caused by registry.npmjs.org timeouts or instability — unrelated to the Gate.AI gateway. Try one of the following:

Option A: one-off install with a mirror (try first)

npm install -g @openai/codex --registry=https://registry.npmmirror.com
npm install -g @anthropic-ai/claude-code --registry=https://registry.npmmirror.com

Option B: set mirror globally

# Use npmmirror (common in China)
npm config set registry https://registry.npmmirror.com

# Verify registry
npm config get registry

# Reinstall
npm install -g @openai/codex
npm install -g @anthropic-ai/claude-code

Option C: current shell session only

export NPM_CONFIG_REGISTRY=https://registry.npmmirror.com
npm install -g @openai/codex
npm install -g @anthropic-ai/claude-code

Symptom	Fix
`ETIMEDOUT` / `ECONNRESET`	Switch mirror (Option A or B) and retry
`EACCES` permission error	Configure npm global prefix: `mkdir -p ~/.npm-global && npm config set prefix ~/.npm-global`, and add `export PATH=~/.npm-global/bin:$PATH` to `~/.zshrc`
`command not found: claude` / `codex`	Installed but PATH not updated: restart the terminal, or confirm `bin` under `npm config get prefix` is on PATH

Configure model

Replace every sk-or-v1-your-key placeholder with your real key. You only need to repeat this step when you change the key or model.

mkdir -p ~/.claude

cat > ~/.claude/settings.json <<'EOF'
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.gate.ai/anthropic",
    "ANTHROPIC_API_KEY": "sk-or-v1-your-key",
    "ANTHROPIC_MODEL": "anthropic/claude-sonnet-4.6"
  },
  "includeCoAuthoredBy": false
}
EOF

# Clear old Anthropic / proxy sessions to avoid auth conflicts
claude /logout 2>/dev/null || true
unset ANTHROPIC_AUTH_TOKEN

Store credentials only in settings.json. If ~/.zshrc still has ANTHROPIC_AUTH_TOKEN or a duplicate ANTHROPIC_API_KEY, comment them out or remove them.
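
The heredoc above overwrites ~/.claude/settings.json. If the file already holds other settings, a merge may be safer. A minimal sketch under that assumption; merge_gateway_env and update_settings_file are hypothetical helpers, and only the env entries named above are touched:

```python
import json
from pathlib import Path

GATEWAY_BASE_URL = "https://api.gate.ai/anthropic"


def merge_gateway_env(settings: dict, api_key: str, model: str) -> dict:
    """Return a copy of settings with the Gate.AI env entries applied."""
    merged = dict(settings)
    env = dict(merged.get("env", {}))
    env.update({
        "ANTHROPIC_BASE_URL": GATEWAY_BASE_URL,
        "ANTHROPIC_API_KEY": api_key,
        "ANTHROPIC_MODEL": model,
    })
    # Drop the token variable so it cannot conflict with the API key.
    env.pop("ANTHROPIC_AUTH_TOKEN", None)
    merged["env"] = env
    return merged


def update_settings_file(path: Path, api_key: str, model: str) -> None:
    """Read, merge, and rewrite a settings.json file, keeping unrelated keys."""
    current = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    merged = merge_gateway_env(current, api_key, model)
    path.write_text(json.dumps(merged, indent=2) + "\n")
```

For example, update_settings_file(Path.home() / ".claude" / "settings.json", key, "anthropic/claude-sonnet-4.6") applies the gateway entries without discarding existing settings.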

Verify setup

In your project directory, start the CLI for AI-assisted coding in the terminal.

Run claude

claude

First check: run /status and confirm Base URL is https://api.gate.ai/anthropic and auth token is ANTHROPIC_API_KEY.

Advanced configuration

Auth conflict

If you see Auth conflict: Both a token (ANTHROPIC_AUTH_TOKEN) and an API key (ANTHROPIC_API_KEY) are set:

Scenario	Fix
Previously logged in via Anthropic OAuth / local proxy	Run `claude /logout`, then start `claude` again
Both token and key configured in shell and settings	Keep only `ANTHROPIC_API_KEY`; comment out `ANTHROPIC_AUTH_TOKEN` in `~/.zshrc`
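
To see which of the two variables the current shell exports, a quick check like the following may help. It is a sketch only: find_auth_conflict and advice are hypothetical helpers, and they inspect the process environment, not settings.json or a stored login session.

```python
import os


def find_auth_conflict(env: dict) -> list:
    """Return the Anthropic auth variables that are set and non-empty."""
    candidates = ("ANTHROPIC_AUTH_TOKEN", "ANTHROPIC_API_KEY")
    return [name for name in candidates if env.get(name)]


def advice(env: dict) -> str:
    """Turn the detected variables into a short recommendation."""
    found = find_auth_conflict(env)
    if len(found) == 2:
        return "conflict: keep ANTHROPIC_API_KEY and remove ANTHROPIC_AUTH_TOKEN"
    if found == ["ANTHROPIC_API_KEY"]:
        return "ok: only ANTHROPIC_API_KEY is set"
    if found == ["ANTHROPIC_AUTH_TOKEN"]:
        return "only ANTHROPIC_AUTH_TOKEN is set; this guide expects ANTHROPIC_API_KEY"
    return "no Anthropic credentials in this environment"


print(advice(dict(os.environ)))
```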

Alternative configuration

Shell environment variables (use these instead of settings.json, not alongside it):

export ANTHROPIC_BASE_URL="https://api.gate.ai/anthropic"
export ANTHROPIC_API_KEY="sk-or-v1-xxxxxxxxxxxxxxxx"
export ANTHROPIC_MODEL="anthropic/claude-sonnet-4.6"

Project-level settings (share structure; never commit real keys):

Path	Purpose
`.claude/settings.json`	Project-level, committable (no secrets)
`.claude/settings.local.json`	Local project key; add to `.gitignore`
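
Before committing the project-level file, it can be scanned for anything that looks like a real key. A rough sketch: the sk-or-v1- prefix comes from the steps above, but the minimum length and character set in the pattern are assumptions, and find_real_keys and scan_file are hypothetical helpers.

```python
import re
from pathlib import Path

# Prefix taken from this guide; the body of the pattern is an assumption.
KEY_PATTERN = re.compile(r"sk-or-v1-[A-Za-z0-9]{8,}")
# Placeholders used in this guide, such as sk-or-v1-xxxxxxxxxxxxxxxx.
PLACEHOLDER = re.compile(r"^sk-or-v1-x+$")


def find_real_keys(text: str) -> list:
    """Return key-like strings in text, ignoring the all-x placeholder."""
    return [m for m in KEY_PATTERN.findall(text) if not PLACEHOLDER.match(m)]


def scan_file(path: Path) -> list:
    """Scan one file; a non-empty result suggests a secret is about to be committed."""
    return find_real_keys(path.read_text())
```

For example, scan_file(Path(".claude/settings.json")) should return an empty list for a file that is safe to commit.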

Model environment variables

Add to the env block in settings.json:

Variable	Purpose	Example
`ANTHROPIC_DEFAULT_SONNET_MODEL`	Sonnet-tier tasks	`anthropic/claude-sonnet-4.6`
`ANTHROPIC_DEFAULT_OPUS_MODEL`	Opus-tier tasks	`anthropic/claude-opus-4.6`
`ANTHROPIC_DEFAULT_HAIKU_MODEL`	Haiku-tier tasks	`anthropic/claude-haiku-4.5`
`CLAUDE_CODE_SUBAGENT_MODEL`	Sub-agent tasks	`anthropic/claude-sonnet-4.6`

Full list: model catalog. If auto fails, use an explicit model ID.

Restore direct Anthropic (optional)

env -u ANTHROPIC_BASE_URL -u ANTHROPIC_API_KEY claude

Troubleshooting

Common Base URL mistakes

# ❌ Wrong (missing /anthropic prefix, 404)
https://api.gate.ai/v1
https://api.gate.ai/v1/chat/completions

# ❌ Wrong (do not use full API path in Claude config)
https://api.gate.ai/anthropic/v1/messages
https://api.gate.ai/anthropic/messages

# ✅ Correct
https://api.gate.ai/anthropic/v1
https://api.gate.ai/anthropic          # Claude Code CLI
https://api.gate.ai/anthropic/v1/messages  # curl key check (Anthropic)
https://api.gate.ai/openai/v1/responses    # curl key check (Codex)

/openai/v1/chat/completions works, but Codex must use the Responses API. If /responses returns 404, check that wire_api = "responses" is set before changing the URL.
/openai/v1/models does not validate keys (invalid keys may still return 200). Use the curl endpoints above to verify keys.

Quick reference

Symptom	Type	Suggestion
Auth conflict warning	Auth conflict	Run `claude /logout`; keep only `ANTHROPIC_API_KEY`; remove or comment `ANTHROPIC_AUTH_TOKEN`
401 / auth failure	Auth error	Check the key is correct and not expired
404 on URL	Path error	Claude → `https://api.gate.ai/anthropic`; Codex → `https://api.gate.ai/openai/v1`. Do not use `https://api.gate.ai/v1/...`
Model not found	Model ID error	Use `provider/model-name` format (e.g. `anthropic/claude-sonnet-4.6`, `openai/gpt-5.2`); see the model catalog
Still connecting to official domain	Config not applied	Use user-level config: Claude → `~/.claude/settings.json`; Codex → `~/.codex/config.toml` (project `.codex/config.toml` cannot override gateway settings)
Shows offline	CLI behavior	Does not affect main chat; safe to ignore
402 / 429	Quota / rate limit	Top up or check key budget and rate limits
npm install fails	Network / permissions	See "If npm install fails"

Codex CLI Setup

Create a Gate.AI API Key

Open gate.ai → Dashboard → API Keys, create and copy a key starting with sk-or-v1-…
Confirm your account has sufficient balance

Network connectivity check

Export your key as GATEAI_API_KEY, replacing the placeholder value:

export GATEAI_API_KEY="sk-or-v1-xxxxxxxxxxxxxxxx"

OpenAI-compatible endpoint (Codex / standard API):

curl -s -o /dev/null -w "%{http_code}" \
  -H "Authorization: Bearer $GATEAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"openai/gpt-5.2","input":"hi","max_output_tokens":16}' \
  https://api.gate.ai/openai/v1/responses

Returns 200: gateway is reachable; proceed with install and configuration
Returns 401: invalid or expired key — check the dashboard
Connection timeout: check local network or DNS; do not use https://api.gate.ai/v1

Install Codex CLI

With npm:

npm install -g @openai/codex

Or with the install script:

curl -fsSL https://chatgpt.com/codex/install.sh | sh

If you use Homebrew

brew install --cask codex

Verify: codex --version

If npm install fails

This is usually caused by registry.npmjs.org timeouts or instability — unrelated to the Gate.AI gateway. Try one of the following:

Option A: one-off install with a mirror (try first)

npm install -g @openai/codex --registry=https://registry.npmmirror.com
npm install -g @anthropic-ai/claude-code --registry=https://registry.npmmirror.com

Option B: set mirror globally

# Use npmmirror (common in China)
npm config set registry https://registry.npmmirror.com

# Verify registry
npm config get registry

# Reinstall
npm install -g @openai/codex
npm install -g @anthropic-ai/claude-code

Option C: current shell session only

export NPM_CONFIG_REGISTRY=https://registry.npmmirror.com
npm install -g @openai/codex

Symptom	Fix
`ETIMEDOUT` / `ECONNRESET`	Switch mirror (Option A or B) and retry
`EACCES` permission error	Configure npm global prefix: `mkdir -p ~/.npm-global && npm config set prefix ~/.npm-global`, and add `export PATH=~/.npm-global/bin:$PATH` to `~/.zshrc`
`command not found: claude` / `codex`	Installed but PATH not updated: restart the terminal, or confirm `bin` under `npm config get prefix` is on PATH

Configure model

Replace every sk-or-v1-your-key placeholder with your real key. You only need to repeat this step when you change the key or model.

mkdir -p ~/.codex

cat > ~/.codex/config.toml <<'EOF'
model_provider = "gateai"
model = "openai/gpt-5.2"

[model_providers.gateai]
name = "Gate.AI"
base_url = "https://api.gate.ai/openai/v1"
env_key = "GATEAI_API_KEY"
wire_api = "responses"
requires_openai_auth = false
EOF

grep -q 'GATEAI_API_KEY=' ~/.zshrc 2>/dev/null || cat >> ~/.zshrc <<'EOF'

# Gate.AI for Codex CLI
export GATEAI_API_KEY="sk-or-v1-your-key"
EOF

source ~/.zshrc

Codex gateway settings must be in user-level ~/.codex/config.toml. Project-level .codex/config.toml cannot override these values.

Verify setup

In your project directory, start the CLI for AI-assisted coding in the terminal.

Run codex

codex

If you get a normal reply without 401 / 404, routing through Gate.AI succeeded.

Advanced configuration

Config reference

Field	Description	Example
`model_provider`	Provider name	`"gateai"`
`model`	Gate.AI model ID	`"openai/gpt-5.2"`
`base_url`	Gate.AI OpenAI-compatible endpoint	`https://api.gate.ai/openai/v1`
`env_key`	API key environment variable	`"GATEAI_API_KEY"`
`wire_api`	Codex protocol type	`"responses"`
`requires_openai_auth`	Set to `false` because the Gate.AI key is not in the official OpenAI key format	`false`
`model_reasoning_effort`	Reasoning effort (optional)	`"low"` / `"medium"` / `"high"`

Alternative configuration

Built-in OpenAI provider (try this if authentication fails with the custom provider configured above)

model = "openai/gpt-5.2"
openai_base_url = "https://api.gate.ai/openai/v1"

Environment variable: export OPENAI_API_KEY="sk-or-v1-xxxxxxxxxxxxxxxx"

Temporary CLI override

export GATEAI_API_KEY="sk-or-v1-xxxxxxxxxxxxxxxx"
codex --config openai_base_url='"https://api.gate.ai/openai/v1"' --config model='"openai/gpt-5.2"'

Switch model

Edit model in ~/.codex/config.toml, or run: codex --model openai/gpt-5.2

Restore direct OpenAI (optional)

Remove the Gate.AI-related settings from ~/.codex/config.toml and unset GATEAI_API_KEY.

Troubleshooting

Common Base URL mistakes

# ❌ Wrong (missing /openai prefix, 404)
https://api.gate.ai/v1
https://api.gate.ai/v1/chat/completions

# ❌ Wrong (do not use full API path in Codex config)
https://api.gate.ai/openai/v1/messages
https://api.gate.ai/openai/messages

# ✅ Correct
https://api.gate.ai/openai/v1          # Codex CLI
https://api.gate.ai/anthropic/v1/messages  # curl key check (Anthropic)
https://api.gate.ai/openai/v1/responses    # curl key check (Codex)

/openai/v1/chat/completions works, but Codex must use the Responses API. If /responses returns 404, check that wire_api = "responses" is set before changing the URL.
/openai/v1/models does not validate keys (invalid keys may still return 200). Use the curl endpoints above to verify keys.
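
The URL rules above for the Codex base_url can be expressed as a small classifier. This is a sketch of those rules only; classify_codex_base_url is a hypothetical helper and does not contact the gateway.

```python
from urllib.parse import urlsplit

CODEX_BASE_URL = "https://api.gate.ai/openai/v1"


def classify_codex_base_url(url: str) -> str:
    """Classify a candidate base_url value for ~/.codex/config.toml."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    if parts.netloc != "api.gate.ai":
        return "not a Gate.AI host"
    if path == "/openai/v1":
        return "ok"
    if path != "/openai" and not path.startswith("/openai/"):
        return "missing /openai prefix (expect 404)"
    return "full API path or wrong path; use " + CODEX_BASE_URL


for candidate in (
    "https://api.gate.ai/v1",
    "https://api.gate.ai/openai/v1/responses",
    "https://api.gate.ai/openai/v1",
):
    print(candidate, "->", classify_codex_base_url(candidate))
```

Note that /openai/v1/responses is a valid endpoint for the curl key check but is classified as a full API path here, because the config value must stop at /openai/v1.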

Quick reference

Symptom	Type	Suggestion
401 / auth failure	Auth error	Check the key is correct and not expired; for Codex confirm `requires_openai_auth = false`
404 on URL	Path error	Claude → `https://api.gate.ai/anthropic`; Codex → `https://api.gate.ai/openai/v1`. Do not use `https://api.gate.ai/v1/...`
404 on `/responses`	Protocol error	Base URL may be correct but Responses API not used. Check `~/.codex/config.toml`: `wire_api = "responses"`, `base_url = "https://api.gate.ai/openai/v1"`
curl `/models` returns 200 but CLI still 401	Wrong key check	`/openai/v1/models` does not validate keys. Use `/openai/v1/responses` or `/anthropic/v1/messages` instead
Model not found	Model ID error	Use `provider/model-name` format (e.g. `anthropic/claude-sonnet-4.6`, `openai/gpt-5.2`); see the model catalog
Still connecting to official domain	Config not applied	Use user-level config: Claude → `~/.claude/settings.json`; Codex → `~/.codex/config.toml` (project `.codex/config.toml` cannot override gateway settings)
402 / 429	Quota / rate limit	Top up or check key budget and rate limits
npm install fails	Network / permissions	See "If npm install fails"

Cursor Setup

Prerequisites

Cursor is installed
You have a Gate.AI API Key

Step 1: Open Cursor Settings

Open the top-right menu → Settings.

Screenshot: the Cursor Settings entry point.

Step 2: Open Models Settings

In the left sidebar:

Find and open Models
Click View All Models
Scroll to the bottom and click Add Custom Model

Screenshot: the Cursor Models settings menu.

Screenshot: Add Custom Model in Cursor.

Step 3: Add Configuration

Fill in the integration information:

Item	Description
Model ID	Copy a model ID from the model marketplace, such as `deepseek/deepseek-v3.2`. Do not use `auto`.
OpenAI API Key	Turn on the switch and enter your Gate.AI API Key.
OpenAI Base URL	Turn on the switch and enter `https://api.gate.ai/openai/v1`.

Screenshot: API Key and Base URL settings in Cursor.

Step 4: Save and Close Settings

After finishing the configuration, save it and close Settings.

Step 5: Use Gate.AI in Cursor

In Chat, Composer, Agent, and other conversation surfaces, search for or choose the model you just added from the model selector.

Screenshot: the Cursor model selector.

FAQ

Symptom	Fix
401 / Connection failed	Confirm the Base URL is `https://api.gate.ai/openai/v1`, the API Key is valid, and the account has available balance.
Model unavailable	Confirm the model ID comes from the model marketplace, uses the `provider/model` format, and is not `auto`.
Model not found in the list	Confirm the Settings page was saved; restart Cursor and try again.

Claude Code Setup

If you already have Claude Code (Anthropic's terminal / IDE coding assistant) installed, follow these steps to connect Gate.AI.

1. Create a Gate.AI API key

Go to Dashboard → Settings → API Keys and create a key.
Copy the key that starts with sk-or-v1- and use it to replace the placeholders below.
For auto routing: go to Dashboard → Settings → Routing and enable Auto routing. When it is off, specify a model ID explicitly in each request.

2. Configure Anthropic Base URL and API key

Claude Code reads environment variables. We recommend following the official LLM gateway documentation and setting:

Variable	Value
`ANTHROPIC_BASE_URL`	`https://api.gate.ai/anthropic`
`ANTHROPIC_API_KEY`	Your Gate.AI API key (`sk-or-v1-…`)

Option A: Current terminal session (temporary)

export ANTHROPIC_BASE_URL="https://api.gate.ai/anthropic"
export ANTHROPIC_API_KEY="sk-or-v1-xxxxxxxxxxxxxxxx"
claude

Option B: Shell profile

Append the following to ~/.zshrc or ~/.bashrc:

export ANTHROPIC_BASE_URL="https://api.gate.ai/anthropic"
export ANTHROPIC_API_KEY="sk-or-v1-xxxxxxxxxxxxxxxx"

After running source ~/.zshrc (or opening a new terminal), run claude.

Option C: Claude Code `settings.json` (recommended)

In user-level or project-level configuration, add an env block (see Claude Code settings for paths), for example in your project directory .claude/settings.json:

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.gate.ai/anthropic",
    "ANTHROPIC_API_KEY": "sk-or-v1-xxxxxxxxxxxxxxxx"
  }
}

Security note: Do not commit real keys to a public repository; use your OS secret store or CI secret injection, and keep local secrets in environment variables only.

Bypass the gateway and use Anthropic directly

To temporarily skip the gateway:

env -u ANTHROPIC_BASE_URL -u ANTHROPIC_API_KEY claude

(You must already have Anthropic official credentials or another default provider configured.)

3. Configure models (Gate.AI model IDs)

In Gate.AI docs, model IDs look like provider/model-name (for example anthropic/claude-sonnet-4.6). They are not identical to Claude Code built-in aliases such as sonnet. Pick one approach:

3.1 Default model via environment variable

export ANTHROPIC_MODEL="anthropic/claude-sonnet-4.6"

Or add the same key under env in settings.json.

3.2 Alias mapping

Map aliases to Gate.AI model IDs (Sonnet example):

export ANTHROPIC_DEFAULT_SONNET_MODEL="anthropic/claude-sonnet-4.6"
export ANTHROPIC_DEFAULT_OPUS_MODEL="anthropic/claude-opus-4.6"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="anthropic/claude-haiku-4.5"

Use the Gate.AI docs — Models list as the source of truth for IDs.
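
Because built-in aliases such as sonnet are not valid Gate.AI model IDs, it can help to check the format before exporting. A small sketch that renders the three export lines above after a format check; alias_exports is a hypothetical helper, and the character set in the pattern is an assumption (the model list remains the source of truth).

```python
import re

# provider/model-name, as described above; the allowed characters are assumed.
MODEL_ID = re.compile(r"^[a-z0-9][a-z0-9._-]*/[A-Za-z0-9][A-Za-z0-9._:-]*$")


def alias_exports(sonnet: str, opus: str, haiku: str) -> str:
    """Render export lines for the three alias variables after a format check."""
    mapping = {
        "ANTHROPIC_DEFAULT_SONNET_MODEL": sonnet,
        "ANTHROPIC_DEFAULT_OPUS_MODEL": opus,
        "ANTHROPIC_DEFAULT_HAIKU_MODEL": haiku,
    }
    for name, model_id in mapping.items():
        if not MODEL_ID.match(model_id):
            raise ValueError(f"{name}: {model_id!r} is not in provider/model-name format")
    return "\n".join(f'export {name}="{model_id}"' for name, model_id in mapping.items())


print(alias_exports(
    "anthropic/claude-sonnet-4.6",
    "anthropic/claude-opus-4.6",
    "anthropic/claude-haiku-4.5",
))
```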

3.3 Custom entry in `/model`

To pick a gateway-backed model in the UI, use Claude Code's custom model option (see the official Model configuration - ANTHROPIC_CUSTOM_MODEL_OPTION):

export ANTHROPIC_CUSTOM_MODEL_OPTION="anthropic/claude-sonnet-4.6"
export ANTHROPIC_CUSTOM_MODEL_OPTION_NAME="Sonnet (Gate.AI)"

3.4 Auto-routing model `auto`

If auto routing is enabled in the dashboard, you can try setting ANTHROPIC_MODEL to auto (same meaning as auto on the OpenAI setup page). If you see errors, switch back to an explicit model ID such as anthropic/claude-sonnet-4.6.

4. Verify the setup

In a terminal with the environment configured, run:

claude

After the session starts, send a simple prompt, for example: In one sentence, introduce yourself.
If you get a normal reply with no auth or routing errors, requests are reaching Gate.AI and the selected model is responding.

5. FAQ

Symptom	Likely cause	What to try
401 / auth failure	Wrong API key or env not exported	Check `ANTHROPIC_API_KEY` matches the dashboard key
404 on URL	Base URL points at the OpenAI path	Use `https://api.gate.ai/anthropic`
Model missing / routing error	Bad model ID or model not allowed	Compare with the Models table; check routing and allow lists in the dashboard
Still hitting Anthropic directly	Env vars not applied	Confirm `settings.json` scope; in a new shell run `echo $ANTHROPIC_BASE_URL` to verify

Claude Desktop Setup

Prerequisites

Claude Desktop is installed
You have a Gate.AI API Key

Step 1: Enable Developer Mode

Launch Claude Desktop. You do not need to sign in with an Anthropic account.
From the menu bar, choose Help → Troubleshooting → Enable Developer Mode.

Screenshot: the Enable Developer Mode menu item under Help → Troubleshooting.

After enabling it, the Developer menu appears in the menu bar.

Step 2: Open Third-Party Inference Settings

From the menu bar, choose Developer → Configure Third-Party Inference…

Screenshot: the Configure Third-Party Inference menu item in the Developer menu.

Step 3: Enter Gate.AI Credentials and Models

Choose Gateway and Static API key, then enter your Gate.AI credentials.
After filling them in, click Test connection to verify them.

Screenshot: Gateway and Static API key configuration in Claude Desktop.

Turn on Model discovery.
After enabling it, you can click Test connection again.

Screenshot: the Model discovery switch in Claude Desktop.

Click Apply Changes to save.

Step 4: Restart and Launch

Fully quit Claude Desktop, including the app process, then reopen it.
On the launch page, choose Continue with Gateway. No Anthropic account is required.

Step 5: Verify

The connection is successful when the model picker shows available models.

FAQ

Symptom	Fix
Connection failed / 401	Confirm the Base URL is `https://api.gate.ai/anthropic/`, Auth scheme is x-api-key, and the API key is valid.
Continue with Gateway does not appear	Confirm you clicked Apply Changes and fully quit and restarted Claude Desktop.
No models are shown	Confirm your Gate.AI account has available balance, and check that Model discovery is enabled.

Hermes Setup

Prerequisites

Create an API key in the Gate.AI Console.
If you use auto routing, enable it under Settings → Routing in the console.

Option 1: Terminal setup

1. Choose model and custom endpoint

Run in your terminal:

hermes model

In the menu, choose Custom endpoint and fill in the fields:

Item	Value
API base URL	https://api.gate.ai/openai/v1
API key	Your Gate.AI API key
Model	auto (recommended for auto routing), or the full model ID from the console (e.g. deepseek/deepseek-v3.2)

If you are asked for context length, press Enter to leave it blank (Hermes will detect it).

2. (Optional) Local web dashboard

To edit configuration in the browser, run:

hermes dashboard

3. Verify

hermes chat "Hello"

Success means the request reached Gate.AI and smart routing or your chosen model returned a result.

You can also run hermes doctor to verify the connection.

Option 2: Edit configuration files

1. File locations

macOS / Linux

~/.hermes/config.yaml, main config (model, provider, base_url, api_key, etc.)
~/.hermes/.env, secrets and env vars (recommended)

Windows

C:\Users\<username>\.hermes\config.yaml
C:\Users\<username>\.hermes\.env

2. Save the API key in `.env` (pick one)

Option A (same name as Gate.AI)

# Gate.AI API key
GATEAI_API_KEY=sk-or-v1_xxxxxxxxxxxxxxxxxxxxx

Option B (common Hermes custom-endpoint convention)

If model.api_key is not set for a custom endpoint, Hermes falls back to OPENAI_API_KEY. Put your Gate.AI key there:

OPENAI_API_KEY=sk-or-v1_xxxxxxxxxxxxxxxxxxxxx

3. Configure `model` in `config.yaml`

Auto routing (auto)

model:
  default: auto
  provider: custom
  base_url: https://api.gate.ai/openai/v1
  api_key: ${GATEAI_API_KEY}

If you use Option B, leave api_key blank or remove it so Hermes uses OPENAI_API_KEY.

Hermes expands ${VAR} when loading config (variables must exist in the environment, usually from ~/.hermes/.env).
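
To illustrate what ${VAR} expansion means here, the sketch below substitutes variables from a mapping and fails loudly when one is missing. It is not Hermes's actual implementation; expand_vars is a hypothetical helper, and how Hermes itself reacts to a missing variable is not stated in this guide.

```python
import re

VAR_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_vars(text: str, env: dict) -> str:
    """Replace each ${NAME} in text with env[NAME]; raise if NAME is missing."""
    def replace(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"{name} is referenced in the config but not set")
        return env[name]

    return VAR_PATTERN.sub(replace, text)


config_line = "api_key: ${GATEAI_API_KEY}"
print(expand_vars(config_line, {"GATEAI_API_KEY": "sk-or-v1-xxxxxxxxxxxxxxxx"}))
```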

Fixed model example

The model ID must match the Gate.AI model list.

model:
  default: deepseek/deepseek-v3.2
  provider: custom
  base_url: https://api.gate.ai/openai/v1
  api_key: ${GATEAI_API_KEY}

4. Verify after saving

After saving, run hermes chat "Hello" to verify the Gate.AI connection.

Multiple routes / models

If you need multiple logical routes under one Gate.AI key (e.g. one with auto and one fixed to deepseek/deepseek-v3.2), add custom_providers in config.yaml (names: letters, digits, hyphens; e.g. gateai-auto):

model:
  default: auto
  provider: custom
  base_url: https://api.gate.ai/openai/v1
  api_key: ${GATEAI_API_KEY}

custom_providers:
  - name: gateai-auto
    base_url: https://api.gate.ai/openai/v1
    api_key: ${GATEAI_API_KEY}
    model: auto

  - name: gateai-deepseek
    base_url: https://api.gate.ai/openai/v1
    api_key: ${GATEAI_API_KEY}
    model: deepseek/deepseek-v3.2

How to switch

Terminal: run hermes model again and pick the named route or Custom endpoint.
In chat (TUI): use the /model syntax from the Hermes docs, for example:

/model custom:gateai-auto:auto
/model custom:gateai-deepseek:deepseek/deepseek-v3.2

(Names follow custom_providers[].name; format is custom:<profile name>:<model id>.)
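
The selector format custom:<profile name>:<model id> can be split as follows. This sketch only illustrates the format described above; parse_model_selector is a hypothetical helper, not part of Hermes.

```python
import re

NAME_PATTERN = re.compile(r"^[A-Za-z0-9-]+$")  # letters, digits, hyphens


def parse_model_selector(selector: str) -> tuple:
    """Split custom:<profile name>:<model id> into (profile name, model id)."""
    prefix, sep, rest = selector.partition(":")
    if prefix != "custom" or not sep:
        raise ValueError("selector must start with 'custom:'")
    name, sep, model_id = rest.partition(":")
    if not sep or not model_id:
        raise ValueError("expected custom:<profile name>:<model id>")
    if not NAME_PATTERN.match(name):
        raise ValueError(f"profile name {name!r} may contain only letters, digits, and hyphens")
    return name, model_id


print(parse_model_selector("custom:gateai-deepseek:deepseek/deepseek-v3.2"))
```

Splitting on the first two colons only keeps any colon inside the model ID intact.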

FAQ

Only some models work

Confirm model.provider is custom, and Base URL is https://api.gate.ai/openai/v1. If OpenAI-compatible models work but others do not, check model IDs and routing settings.

401 / invalid API key

Check that the key was copied correctly and has not expired. After editing .env, restart any running hermes and hermes gateway processes before retrying.

Model not found or empty reply

Check that the model ID matches the console (e.g. deepseek/deepseek-v3.2).
If you use auto, check that auto routing is enabled in the Gate.AI console.
Run hermes doctor to inspect config and connectivity.

QClaw Setup

If you already have QClaw installed, follow these steps to connect Gate.AI.

Configure in chat

1. In the chat, send the message below. Replace the apiKey value with your Gate.AI API key.

Help me add a new provider
Name: Gate.AI
apiKey: sk-or-v1-xxxxxxxxxxxxxxxx
baseUrl: https://api.gate.ai/openai/v1
Model: auto

QClaw will add the provider and restart automatically.

2. Verify

Ask: “Help me verify that my Gate.AI configuration is working.” The assistant should reply with something like “Gate.AI provider was added successfully!” (exact wording may vary).

3. Switch to Gate.AI

Ask: “Switch to auto under Gate.AI.” The assistant should reply with something like “Switched successfully!” (exact wording may vary).

AutoClaw Setup

1. Open the configuration entry

Click Preferences in the bottom-left, go to Models & API, then click Add custom model.

2. Add a model

Set the provider to Custom.
Enter a Gate.AI-supported model ID, e.g. deepseek/deepseek-v3.2.
Enter a display name, e.g. Gate.AI(deepseek-v3.2).
Enter your API Key, e.g. sk-or-v1-xxxxxxxxxxxxxxxx.
Base URL: https://api.gate.ai/openai/v1

3. Test the configuration

Click the connection test. If you see “Test successful”, the setup is correct.

4. Use the model

Click Add. After it saves, return to the app.
Below the chat input, switch the model to your configured Gate.AI(deepseek-v3.2) to use it.

API Reference

General Chat API Reference

Field	Value
Base URL	https://api.gate.ai/openai/v1
Auth	Authorization: Bearer <API_KEY>
Format	OpenAI-compatible
Pricing	Pay-as-you-go

Note: The API path is /openai/v1 (not /v1).

Endpoints

Method	Path	Description
POST	/chat/completions	Chat completions (streaming supported)
GET	/models	List available models
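
Since /chat/completions supports streaming, a client usually needs to stitch the text deltas back together. The sketch below assumes the chunk shape used by the OpenAI Python SDK (chunk.choices[0].delta.content); collect_stream and fake_chunk are hypothetical helpers, and the fake chunks only stand in for a real stream so the example runs offline.

```python
from types import SimpleNamespace


def collect_stream(chunks) -> str:
    """Join the text deltas from a stream of chat completion chunks."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # some chunks may carry no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # role-only or final chunks have no text
            parts.append(delta)
    return "".join(parts)


def fake_chunk(text):
    """Stand-in with the same attribute shape as an SDK chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [fake_chunk("Hel"), fake_chunk(None), fake_chunk("lo")]
print(collect_stream(demo))
```

With the client from Standard Setup, passing stream=True to client.chat.completions.create(...) returns an iterable of chunks that can be handed to collect_stream.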

Seedance Video Generation API Reference

Field	Value
Base URL	https://api.gate.ai
Auth	Authorization: Bearer <API_KEY>

Submit Video Task

POST /api/v1/videos

Submit an async video generation job. Returns 202 Accepted with job_id for polling. Pass Idempotency-Key for safe retries.

Request Parameters

Name	In	Type	Required	Description
Authorization	header	string	Yes	Gate.AI API Key. Format: Bearer <API_KEY>
Content-Type	header	string	Yes	Request body format
Idempotency-Key	header	string	No	Idempotency key for safe retries — same key returns the existing job
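
For safe retries, the same Idempotency-Key has to be sent on every attempt of one logical submission. A minimal sketch of building the headers: build_submit_headers is a hypothetical helper, using a UUID as the key is an assumption (any unique string may work), and application/json is assumed from the JSON request body.

```python
import uuid


def build_submit_headers(api_key: str, idempotency_key: str = "") -> dict:
    """Build the headers for POST /api/v1/videos, generating a key if none is given."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",  # assumed: the request body is JSON
        "Idempotency-Key": idempotency_key or str(uuid.uuid4()),
    }


# Generate the headers once, then reuse the same dict for every retry.
headers = build_submit_headers("sk-or-v1-xxxxxxxxxxxxxxxx")
first_attempt_key = headers["Idempotency-Key"]
retry_key = headers["Idempotency-Key"]
print(first_attempt_key == retry_key)
```

Calling build_submit_headers again without a key would generate a new one, which the server would treat as a new job.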

Request Body

Name	Type	Required	Description
model	string	Yes	Model ID, use bytedance/seedance-2.0
prompt	string	Yes	Video prompt; describe subject, motion, scene, camera, and style. Limits vary by model: Sora 500 characters, Hailuo 2000 characters, Wan T2V 1500 characters, and Wan I2V 800 characters. Gate.AI does not enforce a hard limit; over-limit prompts are rejected upstream and surfaced as 502 / 503.
duration	integer	No	Duration in seconds: 4–15
resolution	string	No	Output resolution: 480p, 720p, or 1080p
aspect_ratio	string	No	Aspect ratio: 16:9, 4:3, 1:1, 3:4, 9:16, 21:9, or adaptive
generate_audio	boolean	No	Whether to generate audio
seed	integer	No	Random seed, -1 to 4294967295; same seed does not guarantee identical output
size	string	No	Exact output size, e.g. 1280x720
input_references	array	No	Optional reference media array for image-to-video, first/last frame driving, video style transfer, or audio-driven generation
input_references[].type	string	Yes	Reference media type: image, video, or audio. Must match the selected role
input_references[].url	string	Yes	HTTPS URL of the reference media
input_references[].role	string	Yes	Reference purpose: first_frame, last_frame, reference_video, or reference_audio. first_frame/last_frame allow one image each; last_frame requires first_frame
metadata	object	No	Business pass-through fields for auditing or source tagging
webhook_url	string	No	Callback URL invoked when the task completes or fails

Example

{
  "model": "bytedance/seedance-2.0",
  "prompt": "A golden retriever running on a sunny beach, cinematic camera movement",
  "duration": 6,
  "resolution": "720p",
  "aspect_ratio": "16:9",
  "generate_audio": false,
  "seed": -1,
  "input_references": [
    {
      "type": "image",
      "url": "https://cdn.example.com/portrait.jpg",
      "role": "first_frame"
    }
  ],
  "metadata": {
    "source": "playground"
  },
  "webhook_url": "https://example.com/webhooks/gateai-video"
}
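
The constraints in the Request Body table can be checked on the client before submitting. The sketch below encodes only the rules stated in that table; validate_video_request is a hypothetical helper, and the role-to-type pairing for reference_video and reference_audio is an assumption, since the table only says the type must match the role.

```python
RESOLUTIONS = {"480p", "720p", "1080p"}
ASPECT_RATIOS = {"16:9", "4:3", "1:1", "3:4", "9:16", "21:9", "adaptive"}
# Assumed pairing of role to media type.
ROLE_TYPES = {
    "first_frame": "image",
    "last_frame": "image",
    "reference_video": "video",
    "reference_audio": "audio",
}


def validate_video_request(body: dict) -> list:
    """Return a list of problems; an empty list means the body passes these checks."""
    errors = []
    for field in ("model", "prompt"):
        if not body.get(field):
            errors.append(f"{field} is required")
    if "duration" in body and not 4 <= body["duration"] <= 15:
        errors.append("duration must be between 4 and 15 seconds")
    if "resolution" in body and body["resolution"] not in RESOLUTIONS:
        errors.append("resolution must be 480p, 720p, or 1080p")
    if "aspect_ratio" in body and body["aspect_ratio"] not in ASPECT_RATIOS:
        errors.append("aspect_ratio is not one of the listed values")
    if "seed" in body and not -1 <= body["seed"] <= 4294967295:
        errors.append("seed must be between -1 and 4294967295")

    roles = []
    for i, ref in enumerate(body.get("input_references", [])):
        role = ref.get("role")
        roles.append(role)
        if role not in ROLE_TYPES:
            errors.append(f"input_references[{i}].role is not a listed role")
        elif ref.get("type") != ROLE_TYPES[role]:
            errors.append(f"input_references[{i}].type does not match role {role}")
        if not str(ref.get("url", "")).startswith("https://"):
            errors.append(f"input_references[{i}].url must be an HTTPS URL")
    for role in ("first_frame", "last_frame"):
        if roles.count(role) > 1:
            errors.append(f"only one {role} reference is allowed")
    if "last_frame" in roles and "first_frame" not in roles:
        errors.append("last_frame requires first_frame")
    return errors
```

Passing the example body above through validate_video_request should yield an empty list; server-side validation still has the final say.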

Response Fields

Name	Type	Description
job_id	string	Unique job ID
status	string	Task status: pending / in_progress / completed / failed
model	string	Model used
status_url	string	URL to poll task status
message	string	Server message
current_balance	string	Current account balance (USD)
estimated_cost	string	Estimated cost
pre_deduct_amount	string	Pre-deducted display amount
balance_after_estimate	string	Balance after applying estimated cost
currency	string	Currency, currently USD
billing_notice	string	Billing notice

Response Example

{
  "code": 200,
  "msg": "",
  "data": {
    "job_id": "video_abc123",
    "status": "in_progress",
    "model": "bytedance/seedance-2.0",
    "status_url": "https://api.gate.ai/api/v1/videos/video_abc123",
    "message": "Video task submitted. Call status_url to check generation progress.",
    "current_balance": "100.0000000000",
    "estimated_cost": "1.0800000000",
    "pre_deduct_amount": "1.0800000000",
    "balance_after_estimate": "98.9200000000",
    "currency": "USDT",
    "billing_notice": "You are charged based on the actual task result once video generation completes. Keep a sufficient balance after submitting."
  }
}

Response

Status	Meaning	Description	Schema
202	Accepted	Task accepted and queued. Returns job_id for status polling.	VideoSubmitResponse
400	Bad Request	Parameter error	ErrorResponse
401	Unauthorized	Not logged in or token invalid	ErrorResponse
402	Payment Required	Insufficient balance. Response includes estimated cost and current balance.	ErrorResponse
429	Too Many Requests	Too many requests. Please slow down.	ErrorResponse
500	Internal Server Error	Internal server error	ErrorResponse

Query Task Status

GET /api/v1/videos/{job_id}

Poll job status, progress, and billing info. Returns download_url when status is completed.

Request Parameters

Name	In	Type	Required	Description
job_id	path	string	Yes	Video task ID
Authorization	header	string	Yes	Gate.AI API Key. Format: Bearer <API_KEY>

Response Fields

Name	Type	Description
job_id	string	Unique job ID
status	string	Task status: pending / in_progress / completed / failed
model	string	Model used
status_url	string	URL to poll task status
download_url	string	Video download URL (appears when status is completed)
duration	integer	Video duration in seconds
resolution	string	Output resolution
aspect_ratio	string	Aspect ratio
generate_audio	boolean	Whether audio is included
estimated_cost	string	Estimated cost
billed_cost	string	Actual billed amount
billing_status	string	Billing status: pre_deducted or settled
expires_at	string(ISO 8601)	Video file expiration time
created_at	string(ISO 8601)	Task creation time
completed_at	string(ISO 8601)	Task completion time

Response Example

{
  "code": 200,
  "msg": "",
  "data": {
    "job_id": "video_abc123",
    "status": "completed",
    "model": "bytedance/seedance-2.0",
    "status_url": "https://api.gate.ai/api/v1/videos/video_abc123",
    "download_url": "https://api.gate.ai/api/v1/videos/video_abc123/content",
    "duration": 6,
    "resolution": "720p",
    "aspect_ratio": "16:9",
    "generate_audio": false,
    "estimated_cost": "1.0800000000",
    "billed_cost": "1.0800000000",
    "billing_status": "settled",
    "expires_at": "2026-06-26T05:00:00Z",
    "created_at": "2026-05-27T05:00:00Z",
    "completed_at": "2026-05-27T05:03:00Z"
  }
}

Response

Status	Meaning	Description	Schema
200	OK	Success	VideoStatusResponse
401	Unauthorized	Not logged in or token invalid	ErrorResponse
404	Not Found	Job not found, or job does not belong to this API key.	ErrorResponse
500	Internal Server Error	Internal server error	ErrorResponse
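
Polling can be sketched as a loop that stops once the job reaches completed or failed. This is an illustrative sketch only: poll_video_job, its interval, and its attempt limit are assumptions, and the fetch function is injected so the example runs without network access. In real use it would issue GET /api/v1/videos/{job_id} with the Authorization header and return the data object.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed"}


def poll_video_job(fetch_status: Callable[[str], dict], job_id: str,
                   interval_s: float = 5.0, max_attempts: int = 60) -> dict:
    """Call fetch_status(job_id) until the job reaches a terminal status."""
    for _ in range(max_attempts):
        data = fetch_status(job_id)
        if data["status"] in TERMINAL_STATUSES:
            return data
        time.sleep(interval_s)
    raise TimeoutError(f"job {job_id} still running after {max_attempts} polls")


# Stand-in responses replacing real HTTP calls for this demonstration.
_fake_responses = iter([
    {"status": "pending"},
    {"status": "in_progress"},
    {"status": "completed",
     "download_url": "https://api.gate.ai/api/v1/videos/video_abc123/content"},
])
result = poll_video_job(lambda job_id: next(_fake_responses), "video_abc123", interval_s=0)
print(result["status"])
```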

Download Video

GET /api/v1/videos/{job_id}/content

Authenticate and redirect (302) to the temporary video file URL.

Request Parameters

Name	In	Type	Required	Description
job_id	path	string	Yes	Video task ID
Authorization	header	string	Yes	Gate.AI API Key. Format: Bearer <API_KEY>

Response Example

HTTP/1.1 302 Found
Location: https://cdn.example.com/videos/video_abc123.mp4?expires=1780000000

Response

Status	Meaning	Description	Schema
302	Found	Redirects to temporary video download URL.
401	Unauthorized	Not logged in or token invalid	ErrorResponse
404	Not Found	Job not found, or job does not belong to this API key.	ErrorResponse
409	Conflict	Task not yet completed; video is not available for download.	ErrorResponse
410	Gone	Video content has expired and is no longer available.	ErrorResponse
500	Internal Server Error	Internal server error	ErrorResponse
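
The status codes above suggest three distinct outcomes for a download attempt: a redirect to follow, a job that is not ready, and a file that has expired. A minimal sketch of that branching follows; resolve_download and the two exception classes are hypothetical names. Many HTTP clients follow redirects automatically, in which case only the error branches are relevant.

```python
class VideoNotReady(Exception):
    """Raised for 409: the task has not completed yet."""


class VideoExpired(Exception):
    """Raised for 410: the video file has expired."""


def resolve_download(status: int, headers: dict) -> str:
    """Map a /content response to a file URL or a descriptive error."""
    if status == 302:
        return headers["Location"]
    if status == 409:
        raise VideoNotReady("task not completed; poll the status endpoint first")
    if status == 410:
        raise VideoExpired("video content has expired")
    raise RuntimeError(f"unexpected status {status}")


url = resolve_download(
    302, {"Location": "https://cdn.example.com/videos/video_abc123.mp4?expires=1780000000"}
)
print(url)
```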

Reference Image API Reference

Use the Gate.AI image-to-image endpoint to upload a reference image and generate or edit an image. This endpoint uses multipart/form-data and synchronously returns image URLs and billing details; usage.input_tokens includes image_tokens counted from the reference image.

Field	Value
Base URL	https://api.gate.ai/openai/v1
Auth	Authorization: Bearer <API_KEY>
Format	OpenAI-compatible; uses a multipart/form-data request body

Note: Image endpoints live under /openai/v1. Generated data[].url values are short-lived S3 presigned URLs, so download or persist them promptly. Stored objects have a 30-day TTL.

Generate Images from a Reference Image

POST /images/edits

Upload a reference image with multipart/form-data to synchronously generate or edit an image. usage.input_tokens includes image_tokens counted from the reference image.

Request Parameters

Name	In	Type	Required	Description
Authorization	header	string	Yes	Gate.AI API Key. Format: Bearer <API_KEY>
Content-Type	header	string	Yes	Request body format: multipart/form-data

Request Body

Name	Type	Required	Description
model	string	Yes	Image model ID, such as gpt-image-1, qwen-image-2.0-pro, or seedream-4.0. Missing model returns 400 model is required
image	file	Yes	Reference image file. gpt-image-1 requires PNG, under 4 MB, square; without mask, it must include an alpha channel as the mask
mask	file	No	PNG mask. Transparent areas are edited, and dimensions must match image
prompt	string	Yes	Edit prompt, up to 1000 characters
n	integer	No	Number of images to generate, 1-10, default 1. Multiple images are billed per image
size	string	No	Output size. gpt-image-1 supports 256x256, 512x512, or 1024x1024, and this is used for cost estimation
response_format	string	No	url or b64_json. gpt-image-1 does not support this parameter and may reject it upstream

Example

model=gpt-image-1
prompt=Add a yellow border and a small sun in the corner
size=1024x1024
image=@./input.png
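
The form fields above can be encoded into a multipart/form-data body using only the Python standard library. This is a sketch under stated assumptions: encode_multipart is a hypothetical helper, the image bytes are a placeholder, and the request is built but not sent. The resulting body would be POSTed to https://api.gate.ai/openai/v1/images/edits with the Authorization header and the returned Content-Type.

```python
import uuid


def encode_multipart(fields: dict, file_field: str, filename: str,
                     file_bytes: bytes, file_type: str = "image/png"):
    """Return (body, content_type) for a multipart/form-data request."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode("utf-8")
        )
    # The file part carries a filename and its own Content-Type.
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\nContent-Type: {file_type}\r\n\r\n'.encode("utf-8")
        + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


body, content_type = encode_multipart(
    {
        "model": "gpt-image-1",
        "prompt": "Add a yellow border and a small sun in the corner",
        "size": "1024x1024",
    },
    "image", "input.png", b"placeholder image bytes",
)
print(content_type.split(";")[0])
```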

Response Fields

Name	Type	Description
created	integer	Generation timestamp in seconds
data	array	Result array, length equals n
data[].url	string	Image S3 presigned URL, short-lived, with a 30-day stored object TTL
usage	object	Usage. OpenAI models return token details; Qwen models return width, height, and image_count
model_extend.cost	string	Actual billed amount in USD; billing is based on this value
model_extend.line_items	array	Billing line items. Token billing includes input/output/cache; per-image billing includes billing_unit, rate_usd_per_image, and resolution_tier
model_extend.provider	string	Actual upstream provider, such as openai or qwen
size / quality / output_format / background	string	Input fields echoed by some models
model	string	Only echoed at the top level for Qwen models

Response Example

{
  "created": 1781604390,
  "data": [
    {
      "url": "https://ai-gateway-file.s3.ap-northeast-1.amazonaws.com/multimodal/image/2026/06/16/example-edit-0.png?X-Amz-Expires=600&X-Amz-Signature=..."
    }
  ],
  "size": "1024x1024",
  "quality": "low",
  "output_format": "png",
  "usage": {
    "input_tokens": 220,
    "input_tokens_details": {
      "image_tokens": 194,
      "text_tokens": 26
    },
    "output_tokens": 272,
    "total_tokens": 492
  },
  "model_extend": {
    "cost": "0.009804",
    "provider": "openai",
    "total_tokens": "492"
  }
}
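
Because data[].url is a short-lived presigned URL, a client may want to know how long it has to download the file. The sketch below reads the X-Amz-Expires query parameter, which in S3 presigned URLs gives the validity period in seconds; presigned_ttl_seconds is a hypothetical helper, and it assumes the URL keeps the shape shown in the example.

```python
from urllib.parse import parse_qs, urlparse


def presigned_ttl_seconds(url: str):
    """Return the X-Amz-Expires value in seconds, or None if it is absent."""
    values = parse_qs(urlparse(url).query).get("X-Amz-Expires")
    return int(values[0]) if values else None


example_url = (
    "https://ai-gateway-file.s3.ap-northeast-1.amazonaws.com/multimodal/image/"
    "2026/06/16/example-edit-0.png?X-Amz-Expires=600&X-Amz-Signature=..."
)
print(presigned_ttl_seconds(example_url))  # 600
```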

Response

Status	Meaning	Description	Schema
200	OK	Success. Returns image results and billing information synchronously.	ImageResponse
400	Bad Request	Malformed request body, invalid JSON, or missing model / prompt.	OpenAIErrorResponse
401	Unauthorized	API key is invalid or missing.	OpenAIErrorResponse
402	Payment Required	Insufficient balance. Response includes current balance and estimated cost.	InsufficientBalanceResponse
404	Not Found	Model not found, or the image endpoint is not enabled.	OpenAIErrorResponse
413	Payload Too Large	Request body too large. Default limit is 8 MiB.	OpenAIErrorResponse
429	Too Many Requests	Too many requests. Please slow down.	OpenAIErrorResponse
500	Internal Server Error	Internal server error.	OpenAIErrorResponse
502	Bad Gateway	Upstream image service failed.	OpenAIErrorResponse

Text-to-Image API Reference

Use the Gate.AI text-to-image endpoint to call image models from OpenAI, Qwen, Seedream, and other providers. This endpoint uses a JSON request body and synchronously returns image URLs and billing details, with no job_id polling or webhook.

Field	Value
Base URL	https://api.gate.ai/openai/v1
Auth	Authorization: Bearer <API_KEY>
Format	OpenAI-compatible; uses a JSON request body

Note: Image endpoints live under /openai/v1. Generated data[].url values are short-lived S3 presigned URLs, so download or persist them promptly. Stored objects have a 30-day TTL.

Generate Images from Text

POST /images/generations

Generate images synchronously from a text prompt. Successful responses use the OpenAI-compatible image result shape, with data[].url pointing to the generated image.

Request Parameters

Name	In	Type	Required	Description
Authorization	header	string	Yes	Gate.AI API Key. Format: Bearer <API_KEY>
Content-Type	header	string	Yes	Request body format: application/json

Request Body

Name	Type	Required	Description
model	string	Yes	Image model ID, such as gpt-image-1, qwen-image-2.0-pro, or seedream-4.0. Missing model returns 400 model is required
prompt	string	Yes	Image text prompt. gpt-image-1 allows up to 32000 characters
n	integer	No	Number of images to generate, 1-10, default 1. Multiple images are billed per image
size	string	No	Output size. gpt-image-1 supports 1024x1024, 1536x1024, 1024x1536, or auto, and this is used for cost estimation
response_format	string	No	url or b64_json. gpt-image-1 does not support this parameter and may reject it upstream
stream	boolean	No	Pass-through field. The current image path is synchronous and does not route by this value

Example

{
  "model": "gpt-image-1",
  "prompt": "A golden retriever running on a sunny beach, cinematic",
  "n": 1,
  "size": "1024x1024"
}
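
The same request can be assembled in Python with the standard library. This is a minimal sketch, assuming the Base URL and headers listed above; build_image_request is a hypothetical helper, and the network call is left commented out so the example runs offline.

```python
import json
import urllib.request

BASE_URL = "https://api.gate.ai/openai/v1"


def build_image_request(api_key: str, prompt: str, model: str = "gpt-image-1",
                        n: int = 1, size: str = "1024x1024") -> urllib.request.Request:
    """Build (but do not send) a POST /images/generations request."""
    if not 1 <= n <= 10:
        raise ValueError("n must be between 1 and 10")
    payload = {"model": model, "prompt": prompt, "n": n, "size": size}
    return urllib.request.Request(
        f"{BASE_URL}/images/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_image_request(
    "GATEAI_API_KEY", "A golden retriever running on a sunny beach, cinematic"
)
# To send the request:
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp)
print(req.get_method(), req.full_url)
```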

Response Fields

Name	Type	Description
created	integer	Generation timestamp in seconds
data	array	Result array, length equals n
data[].url	string	Image S3 presigned URL, short-lived, with a 30-day stored object TTL
usage	object	Usage. OpenAI models return token details; Qwen models return width, height, and image_count
model_extend.cost	string	Actual billed amount in USD; billing is based on this value
model_extend.line_items	array	Billing line items. Token billing includes input/output/cache; per-image billing includes billing_unit, rate_usd_per_image, and resolution_tier
model_extend.provider	string	Actual upstream provider, such as openai or qwen
size / quality / output_format / background	string	Input fields echoed by some models
model	string	Only echoed at the top level for Qwen models

Response Example

{
  "created": 1781604363,
  "data": [
    {
      "url": "https://ai-gateway-file.s3.ap-northeast-1.amazonaws.com/multimodal/image/2026/06/16/example-0.png?X-Amz-Expires=600&X-Amz-Signature=..."
    }
  ],
  "size": "1024x1024",
  "quality": "low",
  "output_format": "png",
  "background": "opaque",
  "usage": {
    "input_tokens": 10,
    "output_tokens": 196,
    "total_tokens": 206,
    "output_tokens_details": {
      "image_tokens": 196,
      "text_tokens": 0
    }
  },
  "model_extend": {
    "cost": "0.006322",
    "provider": "openai",
    "total_tokens": "206",
    "line_items": [
      {
        "kind": "uncached_input",
        "tokens": 10,
        "rate_usd_per_million": "5.0000000000",
        "amount_usd": "0.0000500000"
      },
      {
        "kind": "output",
        "tokens": 196,
        "rate_usd_per_million": "32.0000000000",
        "amount_usd": "0.0062720000"
      }
    ]
  }
}
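
In the example above, each line item equals tokens × rate_usd_per_million / 1,000,000, and the items sum to model_extend.cost. The sketch below checks both relations with Decimal. Whether they hold for every model (for instance under rounding or per-image billing) is an assumption; line_items_total and item_matches_rate are hypothetical helpers.

```python
from decimal import Decimal


def line_items_total(model_extend: dict) -> Decimal:
    """Sum amount_usd over all billing line items."""
    items = model_extend.get("line_items", [])
    return sum((Decimal(item["amount_usd"]) for item in items), Decimal("0"))


def item_matches_rate(item: dict) -> bool:
    """Check amount_usd == tokens * rate_usd_per_million / 1_000_000."""
    expected = Decimal(item["tokens"]) * Decimal(item["rate_usd_per_million"]) / Decimal(1_000_000)
    return expected == Decimal(item["amount_usd"])


# Values copied from the response example.
model_extend = {
    "cost": "0.006322",
    "line_items": [
        {"kind": "uncached_input", "tokens": 10,
         "rate_usd_per_million": "5.0000000000", "amount_usd": "0.0000500000"},
        {"kind": "output", "tokens": 196,
         "rate_usd_per_million": "32.0000000000", "amount_usd": "0.0062720000"},
    ],
}
total = line_items_total(model_extend)
print(total == Decimal(model_extend["cost"]))
```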

Response

Status	Meaning	Description	Schema
200	OK	Success. Returns image results and billing information synchronously.	ImageResponse
400	Bad Request	Malformed request body, invalid JSON, or missing model / prompt.	OpenAIErrorResponse
401	Unauthorized	API key is invalid or missing.	OpenAIErrorResponse
402	Payment Required	Insufficient balance. Response includes current balance and estimated cost.	InsufficientBalanceResponse
404	Not Found	Model not found, or the image endpoint is not enabled.	OpenAIErrorResponse
413	Payload Too Large	Request body too large. Default limit is 8 MiB.	OpenAIErrorResponse
429	Too Many Requests	Too many requests. Please slow down.	OpenAIErrorResponse
500	Internal Server Error	Internal server error.	OpenAIErrorResponse
502	Bad Gateway	Upstream image service failed.	OpenAIErrorResponse

Speech-to-Text API Reference

Upload an audio file through the Gate.AI speech-to-text endpoint and receive an OpenAI-compatible transcription result synchronously. When stream=true is set, the gateway relays upstream SSE as text/event-stream, with usage and model_extend on the final frame.

Field	Value
Base URL	https://api.gate.ai/openai/v1
Auth	Authorization: Bearer <API_KEY>
Format	OpenAI-compatible; uploads audio with multipart/form-data

Note: Audio endpoints live under /openai/v1. Speech-to-text and text-to-speech are synchronous capabilities: no job_id is returned and no polling is required. For default binary TTS responses, billing details are recorded in Console Generations.

Speech to Text

POST /audio/transcriptions

Upload an audio file for transcription and synchronously receive text, usage, and model_extend.cost. gpt-4o-transcribe and gpt-4o-mini-transcribe are token billed; whisper-1 is typically billed by audio duration.

Request Parameters

Name	In	Type	Required	Description
Authorization	header	string	Yes	Gate.AI API Key. Format: Bearer <API_KEY>
Content-Type	header	string	Yes	Request body format: multipart/form-data

Request Body

Name	Type	Required	Description
model	string	Yes	Speech-to-text model ID, such as whisper-1, gpt-4o-transcribe, or gpt-4o-mini-transcribe. Availability depends on the platform catalog
file	file	Yes	Audio file to transcribe, such as mp3, wav, or m4a
language	string	No	Optional language hint, such as zh, to improve transcription accuracy
stream	boolean	No	Set true to return text/event-stream transcription events. The final transcript.text.done frame includes usage and model_extend

Example

model=gpt-4o-transcribe
language=zh
file=@./input.mp3

Response Fields

Name	Type	Description
text	string	Transcribed text
usage	object	Speech-to-text usage. Different models may return token usage or audio-duration usage
usage.input_token_details.audio_tokens	integer	Audio input token count for token-billed transcription models
model_extend.cost	string	Actual billed amount in USD. Billing is based on model_extend.cost
model_extend.provider	string	Actual upstream provider, such as openai
model_extend.line_items	array	Billing line items, such as input_audio, output, or duration-billed items

Response Example

{
  "text": "今天天气很好，我们一起去海边散步吧。",
  "usage": {
    "type": "tokens",
    "input_tokens": 14,
    "input_token_details": {
      "text_tokens": 0,
      "audio_tokens": 14
    },
    "output_tokens": 18,
    "total_tokens": 32
  },
  "model_extend": {
    "cost": "0.000264",
    "provider": "openai",
    "total_tokens": "32",
    "line_items": [
      {
        "kind": "input_audio",
        "tokens": 14,
        "rate_usd_per_million": "6.0000000000",
        "amount_usd": "0.0000840000"
      },
      {
        "kind": "output",
        "tokens": 18,
        "rate_usd_per_million": "10.0000000000",
        "amount_usd": "0.0001800000"
      }
    ]
  }
}

Response

Status	Meaning	Description	Schema
200	OK	Success. Returns transcription text, audio data, or SSE events synchronously.	AudioResponse
400	Bad Request	Invalid parameters, malformed body, or missing required fields such as model, file, or input.	OpenAIErrorResponse
401	Unauthorized	API key is invalid or missing.	OpenAIErrorResponse
402	Payment Required	Insufficient balance.	InsufficientBalanceResponse
404	Not Found	Model not found, or the audio endpoint is not enabled.	OpenAIErrorResponse
413	Payload Too Large	Uploaded file or request body is too large.	OpenAIErrorResponse
429	Too Many Requests	Too many requests. Please slow down.	OpenAIErrorResponse
500	Internal Server Error	Internal server error.	OpenAIErrorResponse
502	Bad Gateway	Upstream audio service failed.	OpenAIErrorResponse
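
With stream=true, a client needs to read the event stream and pick out the final transcript.text.done frame that carries usage and model_extend. The sketch below assumes each frame arrives as a data: line holding a JSON object with a type field, as in OpenAI-style streaming; the sample frames are made up for illustration and are not captured Gate.AI output.

```python
import json


def parse_sse_data(lines):
    """Yield the decoded JSON payload of each `data:` line in an SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload and payload != "[DONE]":
            yield json.loads(payload)


def final_transcript(lines):
    """Return the transcript.text.done event, or None if the stream lacks one."""
    for event in parse_sse_data(lines):
        if event.get("type") == "transcript.text.done":
            return event
    return None


# Made-up sample frames; real payloads may carry additional fields.
sample_stream = [
    'data: {"type": "transcript.text.delta", "delta": "Hello"}',
    "",
    'data: {"type": "transcript.text.done", "text": "Hello", "usage": {"total_tokens": 5}}',
]
done = final_transcript(sample_stream)
print(done["text"])
```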

Text-to-Speech API Reference

Submit text through the Gate.AI text-to-speech endpoint and generate audio synchronously. By default the response is binary audio; with stream_format=sse, the endpoint returns speech.audio.delta frames and a final speech.audio.done frame with usage and model_extend.

Field	Value
Base URL	https://api.gate.ai/openai/v1
Auth	Authorization: Bearer <API_KEY>
Format	OpenAI-compatible; uses a JSON request body, returns binary audio by default, and can stream audio chunks over SSE

Text to Speech

POST /audio/speech

Submit text to synthesize speech and synchronously receive audio. The default response body is a binary audio stream whose Content-Type follows response_format; SSE mode returns base64 audio chunks and final billing details.

Request Parameters

Name	In	Type	Required	Description
Authorization	header	string	Yes	Gate.AI API Key. Format: Bearer <API_KEY>
Content-Type	header	string	Yes	Request body format: application/json

Request Body

Name	Type	Required	Description
model	string	Yes	Text-to-speech model ID. Currently only gpt-4o-mini-tts is supported
input	string	Yes	Text to synthesize
voice	string	Yes	Voice name, such as alloy
response_format	string	No	Output audio format, such as wav or mp3. Defaults to wav
stream_format	string	No	Set to sse to return text/event-stream. speech.audio.delta frames contain base64 audio chunks, and speech.audio.done includes usage and model_extend

Example

{
  "model": "gpt-4o-mini-tts",
  "input": "今天天气很好，我们一起去海边散步吧。",
  "voice": "alloy",
  "response_format": "mp3"
}

Response Fields

Name	Type	Description
audio bytes	binary	Default binary audio response data
speech.audio.delta	SSE event	SSE audio chunk event carrying a base64-encoded audio segment
speech.audio.done	SSE event	SSE completion event carrying usage and model_extend
usage	object	Text-to-speech usage. For binary responses it is written to backend logs; for SSE it appears on the final frame
model_extend.cost	string	Actual billed amount in USD. Billing is based on model_extend.cost

Response Example

HTTP/1.1 200 OK
Content-Type: audio/mpeg

(binary audio bytes)

Response

Status	Meaning	Description	Schema
200	OK	Success. Returns transcription text, audio data, or SSE events synchronously.	AudioResponse
400	Bad Request	Invalid parameters, malformed body, or missing required fields such as model, file, or input.	OpenAIErrorResponse
401	Unauthorized	API key is invalid or missing.	OpenAIErrorResponse
402	Payment Required	Insufficient balance.	InsufficientBalanceResponse
404	Not Found	Model not found, or the audio endpoint is not enabled.	OpenAIErrorResponse
413	Payload Too Large	Uploaded file or request body is too large.	OpenAIErrorResponse
429	Too Many Requests	Too many requests. Please slow down.	OpenAIErrorResponse
500	Internal Server Error	Internal server error.	OpenAIErrorResponse
502	Bad Gateway	Upstream audio service failed.	OpenAIErrorResponse
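
In SSE mode, the audio arrives as base64 chunks that the client must decode and concatenate. The sketch below works on already-parsed events. It assumes the base64 payload sits in an audio field of each speech.audio.delta event, which follows the OpenAI streaming format and is not confirmed by this page; assemble_audio is a hypothetical helper and the sample events are made up.

```python
import base64


def assemble_audio(events):
    """Concatenate decoded delta chunks and return (audio_bytes, done_event)."""
    chunks = []
    done_event = None
    for event in events:
        kind = event.get("type")
        if kind == "speech.audio.delta":
            chunks.append(base64.b64decode(event["audio"]))
        elif kind == "speech.audio.done":
            done_event = event
    return b"".join(chunks), done_event


# Made-up events standing in for a parsed stream.
sample_events = [
    {"type": "speech.audio.delta", "audio": base64.b64encode(b"abc").decode("ascii")},
    {"type": "speech.audio.delta", "audio": base64.b64encode(b"def").decode("ascii")},
    {"type": "speech.audio.done", "usage": {"total_tokens": 12}},
]
audio, done = assemble_audio(sample_events)
print(len(audio))
```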

Gemini Native Protocol API Reference

Call Gemini models through the Gate.AI Gemini native protocol. Best for apps already using Gemini contents[], parts[], and tools[] request structures.

Field	Value
Base URL	https://api.gate.ai/gemini/v1beta
Auth	Authorization: Bearer <API_KEY>
Format	Gemini native JSON request body
Text generation	POST /models/{model}:generateContent
Streaming generation	POST /models/{model}:streamGenerateContent?alt=sse

Note: Gate.AI uses its own API Key. Do not use Google Gemini's ?key= query parameter.

The Gemini native protocol selects the model via {model} in the URL path, for example POST /models/gemini-2.5-pro:generateContent. Do not pass model again in the request body. This differs from OpenAI Chat Completions, where model is sent in the request body.
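
Since the model travels in the URL path rather than the body, a small helper can keep endpoint construction in one place. This is a sketch; gemini_endpoint and gemini_headers are hypothetical names built from the Base URL and paths listed above.

```python
GEMINI_BASE_URL = "https://api.gate.ai/gemini/v1beta"


def gemini_endpoint(model: str, stream: bool = False) -> str:
    """Return the generateContent or streamGenerateContent URL for a model."""
    if stream:
        return f"{GEMINI_BASE_URL}/models/{model}:streamGenerateContent?alt=sse"
    return f"{GEMINI_BASE_URL}/models/{model}:generateContent"


def gemini_headers(api_key: str) -> dict:
    """Gate.AI authenticates with a Bearer header, not a ?key= query parameter."""
    return {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}


print(gemini_endpoint("gemini-2.5-pro"))
```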

Equivalent Base URL paths

Scenario	Base URL
Gate.AI explicit Gemini protocol (recommended)	https://api.gate.ai/gemini/v1beta
Google AI Studio SDK direct access (without /gemini prefix)	https://api.gate.ai/v1beta
Vertex AI style path (with prefix)	https://api.gate.ai/gemini/v1/publishers/google/models/{model}:{action}
Vertex AI style path (without prefix)	https://api.gate.ai/v1/publishers/google/models/{model}:{action}

Text generation

POST /models/{model}:generateContent

Generate a model reply from a Gemini native contents[] request body.

Request Parameters

Name	In	Type	Required	Description
Authorization	header	string	Yes	Gate.AI API Key. Format: Bearer <API_KEY>
Content-Type	header	string	Yes	Request body format: application/json
model	path	string	Yes	Gemini model ID, such as gemini-2.5-pro

Request Body

Name	Type	Required	Description
contents	array	Yes	Gemini conversation turns in order
contents[].role	string	No	user or model
contents[].parts	array	Yes	Content parts in one turn
parts[].text	string	No	Text input
parts[].inlineData	object	No	Multimodal input with mimeType and base64 data
tools	array	No	Tool declarations in Gemini functionDeclarations format
toolConfig	object	No	Tool calling policy
generationConfig	object	No	Generation settings such as temperature and maxOutputTokens
systemInstruction	object	No	System instruction

Example

export GATEAI_API_KEY="sk-or-v1-xxxxxxxxxxxxxxxx"

curl https://api.gate.ai/gemini/v1beta/models/gemini-2.5-pro:generateContent \
  -H "Authorization: Bearer $GATEAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [
          {"text": "Introduce Gate.AI in one sentence"}
        ]
      }
    ]
  }'

Response Fields

Name	Type	Description
candidates	array	Model candidate replies
candidates[].index	integer	Candidate index, starting at 0
candidates[].content.role	string	Always model
candidates[].content.parts[].text	string	Text reply content
candidates[].content.parts[].thought	boolean	Whether the part is model reasoning content
candidates[].content.parts[].thoughtSignature	string	Thought signature for multi-turn reasoning context
candidates[].content.parts[].functionCall	object	Tool call request
candidates[].finishReason	string	Finish reason, such as STOP
usageMetadata	object	Usage and billing metadata
usageMetadata.promptTokenCount	integer	Input token count
usageMetadata.candidatesTokenCount	integer	Output token count
usageMetadata.totalTokenCount	integer	Total token count
usageMetadata.cost	string	Request cost in USD

Response Example

{
  "candidates": [
    {
      "index": 0,
      "content": {
        "role": "model",
        "parts": [
          {
            "text": "Gate.AI is a unified AI model routing platform."
          }
        ]
      },
      "finishReason": "STOP"
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 13,
    "candidatesTokenCount": 37,
    "totalTokenCount": 50,
    "cost": "0.0000964"
  }
}
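
Reading the reply means walking candidates[].content.parts[] and joining the text parts. The sketch below skips parts flagged as thought, on the assumption that reasoning content should not appear in the visible answer; reply_text is a hypothetical helper.

```python
def reply_text(response: dict, candidate_index: int = 0) -> str:
    """Join the non-thought text parts of one candidate."""
    parts = response["candidates"][candidate_index]["content"].get("parts", [])
    return "".join(p["text"] for p in parts if "text" in p and not p.get("thought"))


# Shape copied from the response example.
example_response = {
    "candidates": [
        {
            "index": 0,
            "content": {
                "role": "model",
                "parts": [{"text": "Gate.AI is a unified AI model routing platform."}],
            },
            "finishReason": "STOP",
        }
    ],
    "usageMetadata": {"promptTokenCount": 13, "candidatesTokenCount": 37,
                      "totalTokenCount": 50, "cost": "0.0000964"},
}
print(reply_text(example_response))
```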

Streaming generation

POST /models/{model}:streamGenerateContent?alt=sse

Stream Gemini response chunks. The final chunk usually includes usageMetadata. Read candidates[].content.parts[].text according to the Gemini streaming protocol.

Request Parameters

Name	In	Type	Required	Description
Authorization	header	string	Yes	Gate.AI API Key. Format: Bearer <API_KEY>
Content-Type	header	string	Yes	Request body format: application/json
model	path	string	Yes	Gemini model ID, such as gemini-2.5-pro

Request Body

Name	Type	Required	Description
contents	array	Yes	Gemini conversation turns in order
contents[].role	string	No	user or model
contents[].parts	array	Yes	Content parts in one turn
parts[].text	string	No	Text input
parts[].inlineData	object	No	Multimodal input with mimeType and base64 data
tools	array	No	Tool declarations in Gemini functionDeclarations format
toolConfig	object	No	Tool calling policy
generationConfig	object	No	Generation settings such as temperature and maxOutputTokens
systemInstruction	object	No	System instruction

Example

curl -N "https://api.gate.ai/gemini/v1beta/models/gemini-2.5-pro:streamGenerateContent?alt=sse" \
  -H "Authorization: Bearer $GATEAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [
          {"text": "List three scenarios where model routing is useful"}
        ]
      }
    ]
  }'
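
On the client side, a streamed reply is rebuilt by concatenating the text of each chunk and keeping the last usageMetadata seen. The following sketch assumes each chunk arrives as a data: line containing one JSON object, as is usual with alt=sse; collect_stream is a hypothetical helper and the sample lines are made up.

```python
import json


def collect_stream(lines):
    """Return (full_text, last_usage_metadata) from SSE data lines."""
    text_parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue
        chunk = json.loads(line[len("data:"):])
        for candidate in chunk.get("candidates", []):
            for part in candidate.get("content", {}).get("parts", []):
                text_parts.append(part.get("text", ""))
        # Keep the most recent usageMetadata; it usually arrives on the final chunk.
        usage = chunk.get("usageMetadata", usage)
    return "".join(text_parts), usage


# Made-up sample chunks for illustration.
sample_lines = [
    'data: {"candidates": [{"content": {"role": "model", "parts": [{"text": "Cost "}]}}]}',
    "",
    'data: {"candidates": [{"content": {"role": "model", "parts": [{"text": "control."}]}}],'
    ' "usageMetadata": {"totalTokenCount": 20}}',
]
text, usage = collect_stream(sample_lines)
print(text)
```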

Multimodal input

Put images, audio, video, PDF, and other content in the same parts[] array as the text. The Gate.AI Gemini endpoint recommends camelCase field names, such as inlineData.mimeType.

Type	mimeType examples
Image	image/png, image/jpeg
Audio	audio/mp3, audio/wav
Video	video/mp4
PDF	application/pdf

Image input example

IMAGE_B64="$(base64 -i ./image.png | tr -d '\n')"

curl https://api.gate.ai/gemini/v1beta/models/gemini-2.5-pro:generateContent \
  -H "Authorization: Bearer $GATEAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"contents\": [
      {
        \"role\": \"user\",
        \"parts\": [
          {\"text\": \"Describe this image\"},
          {
            \"inlineData\": {
              \"mimeType\": \"image/png\",
              \"data\": \"${IMAGE_B64}\"
            }
          }
        ]
      }
    ]
  }"
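
The same request body can be built in Python, where base64.b64encode produces output without line breaks, so no extra stripping step is needed. A minimal sketch; inline_data_part and image_request_body are hypothetical helpers, and the image bytes are a placeholder.

```python
import base64
import json


def inline_data_part(data: bytes, mime_type: str) -> dict:
    """Wrap raw bytes as a Gemini inlineData part (camelCase field names)."""
    return {
        "inlineData": {
            "mimeType": mime_type,
            "data": base64.b64encode(data).decode("ascii"),
        }
    }


def image_request_body(prompt: str, image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build a single-turn body with a text part followed by an image part."""
    return {
        "contents": [
            {"role": "user",
             "parts": [{"text": prompt}, inline_data_part(image_bytes, mime_type)]}
        ]
    }


# In real use, read the bytes with open("image.png", "rb").read().
body = image_request_body("Describe this image", b"placeholder image bytes")
print(json.dumps(body)[:60])
```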

Tool calling

Gemini tools use tools[].functionDeclarations[]. Put the tool policy in toolConfig.functionCallingConfig.

{
  "contents": [
    {
      "role": "user",
      "parts": [
        {"text": "What should I wear in Beijing today?"}
      ]
    }
  ],
  "tools": [
    {
      "functionDeclarations": [
        {
          "name": "get_weather",
          "description": "Query city weather",
          "parameters": {
            "type": "object",
            "properties": {
              "city": {
                "type": "string",
                "description": "City name"
              }
            },
            "required": ["city"]
          }
        }
      ]
    }
  ],
  "toolConfig": {
    "functionCallingConfig": {
      "mode": "AUTO"
    }
  }
}

When the model triggers a tool, it returns functionCall in candidates[].content.parts[].
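
Handling a tool call means finding the functionCall parts and routing each to a local function. The sketch below assumes a functionCall object carries name and args, as in the Gemini protocol; find_function_calls, dispatch, the registry, and the sample response are all illustrative.

```python
def find_function_calls(response: dict) -> list:
    """Collect every functionCall object from candidates[].content.parts[]."""
    calls = []
    for candidate in response.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            if "functionCall" in part:
                calls.append(part["functionCall"])
    return calls


def dispatch(call: dict, registry: dict):
    """Invoke the local function registered under the call's name."""
    return registry[call["name"]](**call.get("args", {}))


def get_weather(city: str) -> dict:
    # Placeholder implementation; a real tool would query a weather service.
    return {"city": city, "forecast": "sunny"}


# Made-up response showing a triggered tool.
sample_response = {
    "candidates": [
        {"content": {"role": "model", "parts": [
            {"functionCall": {"name": "get_weather", "args": {"city": "Beijing"}}}
        ]}}
    ]
}
calls = find_function_calls(sample_response)
print(dispatch(calls[0], {"get_weather": get_weather}))
```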

FAQ

Symptom	Cause	Suggested fix
404	Missing /gemini/v1beta in the path or wrong model ID	Use https://api.gate.ai/gemini/v1beta/models/{model}:generateContent
Multimodal content not read	Wrong inlineData field names, MIME type, or base64	Use inlineData.mimeType/data and ensure base64 has no line breaks
Tool not triggered	Schema exceeds Gemini JSON Schema subset	Start with basic fields such as type, properties, required, and description

Status codes

Status	Meaning	Description
200	OK	Request succeeded
400	Bad Request	Invalid request body or JSON, or missing required contents / parts
401	Unauthorized	Invalid or missing API Key
402	Payment Required	Insufficient balance
404	Not Found	Invalid API path, unknown model, or model does not support this protocol
413	Payload Too Large	Multimodal request body too large
429	Too Many Requests	Too many requests. Reduce call frequency.
500	Internal Server Error	Internal server error
502	Bad Gateway	Upstream Gemini service failed

Difference from OpenAI-compatible access

If your app already uses OpenAI Chat Completions, you can keep calling Gemini models via https://api.gate.ai/openai/v1/chat/completions. If your app already uses Gemini native contents[] / parts[] format, use the /gemini/v1beta/models/... endpoints on this page for lower migration cost.

Model List and Pricing API Reference

Use the Gate.AI model list endpoint to retrieve every available model, its core metadata, and list pricing. It is suited to model selection, comparison tools, and programmatic agent workflows. Each request returns the complete list without pagination.

Field	Value
Base URL	https://gate.ai
Auth	No authentication required
Format	Gate.AI REST JSON

List Models

GET /api/v1/models

Returns every model currently available on the platform (150+), including display names, providers, input and output modalities, context lengths, and complete list pricing.

Request Parameters

Name	In	Type	Required	Description
lang	query	string	No	Response text language: en (default), zh (Simplified Chinese), or zh-tw (Traditional Chinese). This affects description and tags[].name. Values are case-sensitive; invalid values fall back to English.
context_length	query	string	No	Filter by context length. Supports comma-separated values, for example 128000,64000. Pass 0, an empty string, or omit the parameter for no limit.
input_modality	query	string	No	Filter by input modality. Supported values include text, image, and video. Multiple values can be comma-separated, for example text,image.
keyword	query	string	No	Fuzzy-search model names against model or base_info.name. Matching is case-insensitive.
max_input_price	query	number	No	Maximum input price. Pass 0 or omit the parameter for no limit.
max_output_price	query	number	No	Maximum output price. Pass 0 or omit the parameter for no limit.
min_input_price	query	number	No	Minimum input price. Defaults to 0 when omitted and uses an open lower bound, excluding models priced at 0.
min_output_price	query	number	No	Minimum output price. Defaults to 0 when omitted and uses an open lower bound, excluding models priced at 0.
output_modality	query	string	No	Filter by output modality. Supported values include text, image, and video. Multiple values can be comma-separated, for example text,image.
provider_name	query	string	No	Filter by the provider name visible to users. Matching ignores case and surrounding whitespace.

Example

curl "https://gate.ai/api/v1/models?lang=zh-tw"
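
Several filters can be combined in one query string. The sketch below joins list values with commas, as the parameter table describes, and URL-encodes the result; models_url is a hypothetical helper, and the filter names are taken from the table above.

```python
from urllib.parse import urlencode

MODELS_URL = "https://gate.ai/api/v1/models"


def models_url(**filters) -> str:
    """Build a model-list URL; list values become comma-separated strings."""
    params = {}
    for key, value in filters.items():
        if value is None:
            continue
        if isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)
        params[key] = value
    return f"{MODELS_URL}?{urlencode(params)}" if params else MODELS_URL


print(models_url(lang="en", input_modality=["text", "image"], max_input_price=1.0))
```

The comma is percent-encoded as %2C by urlencode; servers decode it back to a comma when parsing the query string.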

Response Fields

Name	Type	Description
code	integer	Business status code. Success is always 200.
message	string	Status description. The value is success for successful requests.
data.data	array	Model array. Each element is a model object.
timestamp	integer	Server timestamp in milliseconds.
data.data[].model	string	Model ID to pass as the model parameter to generation endpoints.
data.data[].name	string	Display name.
data.data[].provider_name	string	Model provider name, such as OpenAI, Anthropic, or ByteDance.
data.data[].type	string	Supported endpoint forms, separated by | when multiple apply: completions, responses, messages, videos, generations, edits, transcriptions, or speech.
data.data[].input_price	number | null	Input price in USD per million tokens. When pricing_tiers exists, this is the first-tier price. It is null for non-token-billed models.
data.data[].output_price	number | null	Output price in USD per million tokens, with the same semantics as input_price.
data.data[].cache_read_price	number | null	Cache-read price in USD per million tokens.
data.data[].cache_write_price	number | null	Cache-write price in USD per million tokens.
data.data[].context_length	integer	Context length in tokens. Models for which this does not apply return 0.
data.data[].description	string	Model description localized by the lang parameter.
data.data[].tags	array	Capability tags containing key (stable English identifier), name (localized by lang), and category.
data.data[].input_modalities	array	Input modalities: text, image, video, or audio. This is platform-maintained reference data and may lag individual upstream model updates.
data.data[].output_modalities	array	Output modalities, using the same value set as input_modalities.
data.data[].pricing_tiers	object	Present only for tier-priced models. input, output, cache_read, and cache_write contain tier arrays with an upper bound and the corresponding price.
data.data[].video_capabilities	object	Present only for video models: durations (seconds), resolutions, and aspect_ratios.
data.data[].video_pricing	object	Pricing details present only for video models.
video_pricing.currency	string	Currency. Currently USD.
video_pricing.billing_mode	string	per_second_resolution (output seconds × resolution tier) or per_film (per generated video).
video_pricing.unit	string	Billing unit: second or film.
video_pricing.tiers	array	Pricing tiers. Per-second entries contain resolution + price; per-film entries contain resolution + duration + price. Prices are strings.
data.data[].image_pricing	object	Pricing details present only for image models.
image_pricing.currency	string	Currency. Currently USD.
image_pricing.billing_mode	string	per_image or token.
image_pricing.unit	string	Billing unit: image or 1M.
image_pricing.per_image	string	Price per image for per-image billing.
image_pricing.text_input / image_input	object	Present for token billing. Each object contains input, output, and cache_read prices in USD per million tokens as strings.
data.data[].tts_stt_pricing	object	Pricing details present only for speech models.
tts_stt_pricing.billing_mode	string	token or audio_duration.
tts_stt_pricing.input_tokens / output_tokens / cache_read_tokens / cache_write_tokens	string	Present for token billing, in USD per million tokens.
tts_stt_pricing.input_per_minute	string	Present for duration billing. Input-audio price per minute in USD.
data.data[].model_publish_time	string (ISO 8601)	Model publication time.
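
For tier-priced models, a client has to pick the tier that applies to a given token count. The sketch below assumes tiers are ordered by ascending upper bound, that the bound is inclusive, and that a null upper means no limit; none of these semantics are confirmed by this page. tier_price is a hypothetical helper, and the tier values mirror the pricing_tiers.input array in the response example.

```python
from decimal import Decimal


def tier_price(tiers: list, tokens: int, price_key: str) -> Decimal:
    """Return the price of the first tier whose upper bound covers `tokens`."""
    for tier in tiers:
        upper = tier["upper"]
        if upper is None or tokens <= upper:
            return Decimal(tier[price_key])
    raise ValueError("no tier covers this token count")


input_tiers = [
    {"upper": 32000, "input_price": "0.144"},
    {"upper": 128000, "input_price": "0.216"},
    {"upper": None, "input_price": "0.431"},
]
print(tier_price(input_tiers, 50000, "input_price"))  # 0.216
```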

Response Example

{
  "code": 200,
  "message": "success",
  "data": {
    "data": [
      {
        "id": 224,
        "model": "qwen3-vl-plus",
        "name": "Qwen3 VL Plus",
        "icon": "https://gimg2.staticimgs.com/image/qwen_20260402_161657_e06ffb27e1219c85184d939bce7cd1bd.webp",
        "provider_type": "qwen",
        "provider_name": "Qwen",
        "type": "completions",
        "input_price": 0.144,
        "output_price": 1.434,
        "cache_read_price": 0.029,
        "cache_write_price": 0.539,
        "price_type": 0,
        "context_length": 262144,
        "description": "Qwen3-VL 系列的 Plus 主力视觉语言模型，支持图像、视频与文本联合输入，擅长文档图表解析、视频理解、空间定位与智能体工具调用，适合多模态问答与视觉自动化场景。",
        "overview": "",
        "tags": [
          {
            "id": 3,
            "key": "vision",
            "name": "Vision",
            "category": "general"
          },
          {
            "id": 28,
            "key": "ui-understanding",
            "name": "UI understanding",
            "category": "general"
          },
          {
            "id": 30,
            "key": "document-ocr",
            "name": "Document recognition",
            "category": "general"
          },
          {
            "id": 31,
            "key": "multimodal-understanding",
            "name": "Multimodal understanding",
            "category": "general"
          }
        ],
        "input_modalities": [
          "text",
          "image",
          "video"
        ],
        "output_modalities": [
          "text"
        ],
        "playground_enabled": true,
        "model_publish_time": "2025-09-23T00:00:00Z",
        "create_time": "2026-06-05T14:18:56Z",
        "update_time": "2026-07-06T01:57:19Z",
        "pricing_tiers": {
          "input": [
            {
              "upper": 32000,
              "input_price": "0.144"
            },
            {
              "upper": 128000,
              "input_price": "0.216"
            },
            {
              "upper": null,
              "input_price": "0.431"
            }
          ],
          "output": [
            {
              "upper": 32000,
              "output_price": "1.434"
            },
            {
              "upper": 128000,
              "output_price": "2.151"
            },
            {
              "upper": null,
              "output_price": "4.301"
            }
          ],
          "cache_read": [
            {
              "upper": 32000,
              "cache_read_price": "0.029"
            },
            {
              "upper": 128000,
              "cache_read_price": "0.044"
            },
            {
              "upper": null,
              "cache_read_price": "0.087"
            }
          ],
          "cache_write": [
            {
              "upper": null,
              "cache_write_price": "0.539"
            }
          ]
        }
      }
    ]
  },
  "timestamp": "1784201129"
}
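
As a rough illustration of how the pricing_tiers in the example above might be used to estimate cost, the sketch below picks the first tier whose upper bound covers the prompt size. It assumes that upper is an inclusive bound on prompt tokens, that null marks the open-ended last tier, and that prices are in USD per million tokens; this page does not confirm any of these, and the helper names are hypothetical.

```python
from decimal import Decimal

# Input tiers copied from the response example above.
INPUT_TIERS = [
    {"upper": 32000, "input_price": "0.144"},
    {"upper": 128000, "input_price": "0.216"},
    {"upper": None, "input_price": "0.431"},
]


def pick_tier_price(tiers, price_key, prompt_tokens):
    """Return the price of the first tier that covers prompt_tokens.

    Assumption: `upper` is an inclusive bound on prompt tokens and
    None (JSON null) marks the open-ended last tier.
    """
    for tier in tiers:
        upper = tier["upper"]
        if upper is None or prompt_tokens <= upper:
            # Prices arrive as strings, so Decimal avoids float rounding.
            return Decimal(tier[price_key])
    raise ValueError("no tier covers this token count")


def estimate_input_cost(tiers, prompt_tokens):
    """Estimate input cost in USD, assuming prices are per million tokens."""
    price = pick_tier_price(tiers, "input_price", prompt_tokens)
    return price * prompt_tokens / Decimal(1_000_000)
```

Under these assumptions, a 50,000-token prompt falls into the second tier. Keeping the prices as Decimal rather than float preserves the exact string values the API returns.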

Response

Status	Meaning	Description	Schema
200	OK	Success. Returns the complete model list.	ModelsResponse
500	Internal Server Error	Internal server error.	ErrorResponse

Compatibility notice: Response fields not listed on this page are internal platform fields and may change at any time; do not depend on them. New fields may be added alongside the documented ones in a backward-compatible way, but the documented fields themselves will not receive breaking changes.

Check Balance

Query the balance status for the user that owns the current API Key.

Field	Value
Base URL	https://api.gate.ai
Auth	Authorization: Bearer <API_KEY>
Format	Gate.AI REST JSON

Get Credits Balance

GET /api/v1/credits/balance

Query the balance status for the user that owns the current API Key.

Request Parameters

Name	In	Type	Required	Description
Authorization	header	string	Yes	API Key authentication. Format: Bearer <API_KEY>

Response Fields

Name	Type	Description
code	integer	Business status code. Success is always 200.
data	object	Balance data.
data.status	integer	Balance status: 1 = tradable, 2 = insufficient balance, 4 = overdue debt.
data.message	string	Status description.
data.asset_type	string	Asset ownership type: user_asset for personal assets, organization_asset for organization assets.
data.recharge_balance	string	Recharge balance.
data.debt_amount	string	Debt amount. Empty string or a zero-value string when there is no debt.

Response Example

{
  "code": 200,
  "data": {
    "status": 1,
    "message": "Tradable",
    "asset_type": "user_asset",
    "recharge_balance": "12.34000000",
    "debt_amount": ""
  }
}

Response

Status	Meaning	Description	Schema
200	OK	Query succeeded. Returns code and data.	CreditsBalanceResponse
401	Unauthorized	API Key is missing, invalid, expired, revoked, disabled, or inactive.	ErrorResponse
403	Forbidden	The member that owns the current API Key does not have permission to view the balance.	ErrorResponse
429	Too Many Requests	Rate limit reached.	ErrorResponse
500	Internal Server Error	Gateway internal error, such as API Key lookup failure or response serialization failure.	ErrorResponse
502	Bad Gateway	Balance upstream service is unavailable, returned a non-success status, response parsing failed, or data is empty.	ErrorResponse
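
A minimal sketch of calling this endpoint from Python with only the standard library follows. The URL, header, and status values come from the tables above; the function names, the 10-second timeout, and the summary format are illustrative assumptions rather than part of the Gate.AI API.

```python
import json
import urllib.request

BALANCE_URL = "https://api.gate.ai/api/v1/credits/balance"

# Status values as documented in the Response Fields table.
STATUS_LABELS = {1: "tradable", 2: "insufficient balance", 4: "overdue debt"}


def build_balance_request(api_key):
    """Build (but do not send) the GET request for the balance endpoint."""
    return urllib.request.Request(
        BALANCE_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def summarize_balance(payload):
    """Turn a decoded response body into a short human-readable line."""
    data = payload["data"]
    label = STATUS_LABELS.get(data["status"], "unknown status")
    debt = data["debt_amount"] or "0"  # an empty string means no debt
    return f"{label}: balance={data['recharge_balance']} debt={debt}"


def fetch_balance(api_key, timeout=10):
    """Send the request and return the summary (performs a network call)."""
    request = build_balance_request(api_key)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return summarize_balance(json.load(resp))
```

For non-2xx statuses, urllib.request.urlopen raises urllib.error.HTTPError, so a caller would typically wrap fetch_balance in a try/except and inspect the error body.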

Error Response Example

{
  "error": {
    "type": "authentication_error",
    "code": "7022002001",
    "message": "invalid api key"
  }
}
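
One way to read an error body shaped like the example above is sketched below. The GateError class and parse_error helper are hypothetical names introduced for illustration, and the sketch assumes error responses carry a top-level error object with type, code, and message, as in this example.

```python
import json
from dataclasses import dataclass


@dataclass
class GateError:
    type: str
    code: str
    message: str


def parse_error(body):
    """Parse an error body shaped like the example above.

    Returns None when the body is not JSON or has no `error` object,
    so callers can fall back to the raw response text.
    """
    try:
        doc = json.loads(body)
    except (TypeError, ValueError):
        return None
    err = doc.get("error") if isinstance(doc, dict) else None
    if not isinstance(err, dict):
        return None
    return GateError(
        type=str(err.get("type", "")),
        code=str(err.get("code", "")),
        message=str(err.get("message", "")),
    )
```

Returning None instead of raising keeps the helper safe to call on bodies that are not JSON errors, such as an HTML page from an intermediate proxy.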

Troubleshooting

Authentication Errors

Error message	HTTP status	Cause	Solution
`invalid api key`	401	The API Key is invalid, expired, revoked, or disabled	Go to Console → API Keys and confirm the key status is "active". If it has expired, generate a new one

Routing and Model Errors

Error message	HTTP status	Cause	Solution
`no model config found for: {model}`	404	The requested model ID does not exist	Open the model list and confirm the model ID is spelled correctly
`model field is required`	400	The request body is missing the `model` field	Add `"model": "model name"` to the request JSON
`invalid or empty requested model`	400	The requested model name is empty or invalid	Open the model list and confirm you are using the correct model ID format
`unknown api path`	404	The API path is incorrect	Confirm the Base URL is `https://api.gate.ai/openai/v1` or `https://api.gate.ai/anthropic`

Request Parameter Errors

Error message	HTTP status	Cause	Solution
`invalid JSON body`	400	The request body is not valid JSON	Check that the request body is valid JSON
`failed to read request body`	400	The request body could not be read	Confirm the body is not corrupted and Content-Type is set to `application/json`
`failed to rewrite request body`	500	Failed to rewrite the request body inside the gateway	Retry. If it keeps happening, contact technical support
`images are not supported by this model`	400	The target model does not support image input	Switch to a multimodal model that supports images, such as `gpt-4o`
`audio is not supported by this model`	400	The target model does not support audio input	Switch to a model that supports audio input
`unsupported parameter: max_tokens`	400	Some models do not support the `max_tokens` parameter	Use `max_completion_tokens` instead

Quota and Rate Limit Errors

Error message	HTTP status	Cause	Solution
`api key budget quota exceeded`	429	The API Key budget has been exhausted	Go to Console → API Keys → Budget settings, increase the budget, or wait for the quota to reset
`guardrail budget limit exceeded`	429	The guardrail budget limit has been exceeded	Check the budget limit in guardrail settings, then increase the limit or reduce usage frequency
`organization guardrail budget limit exceeded`	429	The organization-level guardrail budget has been exceeded	Contact an organization admin to adjust the organization-level budget limit
`model not allowed by guardrail policy`	403	The model is not allowed by the guardrail policy	Go to Console → Guardrails and add the target model to the allowlist
`The free model usage has reached its daily global limit today.`	429	The global daily limit for the free model has been reached	Wait for the next daily reset, or upgrade to a paid plan for unrestricted models
`The free model usage has reached its daily limit today.`	429	Your personal daily limit for the free model has been reached	Wait for the next daily reset, or upgrade to a paid plan
`Guest daily spending limit exceeded. Please try again tomorrow or upgrade to a paid plan.`	429	The guest daily spending limit has been exceeded	Register an account and upgrade to a paid plan, or wait for the next daily reset

Billing and Balance Errors

Error message	HTTP status	Cause	Solution
`Pending payment {amount} USD — Please top up...`	402	The account balance is insufficient and a payment is outstanding	Top up with Gate Pay
`Insufficient balance and account in debt`	402	The balance is insufficient and the account is in debt	Top up with Gate Pay
`billing model info not found for model "{model}"`	400	Billing information for the model does not exist	Confirm the model ID is correct. If it is a new model, contact technical support to configure billing rules
`billing model info is ambiguous for model "{model}"`	400	Billing information is ambiguous because multiple records match	Contact technical support to check the model billing configuration
`billing configuration error`	500	Billing rules are misconfigured on the server	Contact technical support to fix the billing configuration

Upstream Service Errors

Error message	HTTP status	Cause	Solution
`bad gateway`	502	The gateway failed to communicate with the upstream service	Retry. If it keeps happening, check the service status page or contact technical support
`upstream service unavailable`	502	The upstream AI Provider is unavailable	Retry later, or switch to another available model
`upstream service error`	502	The upstream AI Provider returned an error	Check whether the request parameters match the target model requirements. If it keeps happening, contact technical support
`request timeout`	504	The upstream request timed out	Reduce the input length or increase the timeout, then retry
`no provider handler configured for protocol`	502	No protocol handler is configured in the gateway	Contact technical support to check the gateway configuration

Server Internal Errors

Error message	HTTP status	Cause	Solution
`internal server error`	500	Internal gateway error	Retry. If it keeps happening, contact technical support with the Request ID
`failed to record request log`	500	Failed to record the request log	This does not affect the request result and can be ignored. Contact technical support if it happens frequently
`failed to list models`	500	Failed to query the model list	Retry. If it keeps happening, contact technical support
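
The tables above recommend a plain retry for only a handful of messages. The sketch below encodes that subset and wraps a call in exponential backoff. The helper names, the three-attempt limit, and the one-second base delay are assumptions; send is a hypothetical callable that performs one request and returns (ok, message).

```python
import time

# Messages whose documented solution starts with "Retry".
RETRYABLE_MESSAGES = {
    "failed to rewrite request body",
    "bad gateway",
    "upstream service unavailable",
    "internal server error",
    "failed to list models",
}


def should_retry(message):
    """Return True when the tables above suggest retrying as-is."""
    return message.strip().lower() in RETRYABLE_MESSAGES


def call_with_retry(send, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds, fails permanently, or attempts run out.

    send() is a hypothetical callable returning (ok, message).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        ok, message = send()
        if ok or not should_retry(message) or attempt == max_attempts:
            return ok, message
        # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
        sleep(base_delay * 2 ** (attempt - 1))
```

Errors such as request timeout or the 429 quota messages are deliberately left out of the set, because the tables ask for a change first (a shorter input, a longer timeout, or waiting for the daily reset) before retrying.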

Common Quick Checks

Incorrect Base URL

# ❌ Incorrect
https://api.gate.ai/v1/chat/completions

# ✅ Correct
https://api.gate.ai/openai/v1/chat/completions  # OpenAI protocol
https://api.gate.ai/anthropic/v1/messages  # Anthropic protocol

Incompatible max_tokens Parameter

// ❌ Some models do not support this
{ "model": "gpt-4o", "max_tokens": 100 }

// ✅ Use max_completion_tokens
{ "model": "openai/gpt-5.5", "max_completion_tokens": 100 }
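
For client code that builds request bodies as dictionaries, the rename shown above could be applied by a small helper like the one below. The function name is hypothetical, and the sketch assumes the target model accepts max_completion_tokens.

```python
def use_max_completion_tokens(payload):
    """Return a copy of payload with max_tokens renamed.

    The input dict is left untouched so the caller can still send the
    original body to models that do accept max_tokens.
    """
    fixed = dict(payload)
    if "max_tokens" in fixed:
        limit = fixed.pop("max_tokens")
        # Keep an explicit max_completion_tokens if the caller already set one.
        fixed.setdefault("max_completion_tokens", limit)
    return fixed
```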

API Key Format

# ✅ Correct request headers
Authorization: Bearer sk-xxxxxxxxxxxxxxxxxxxx # OpenAI protocol
X-api-key: sk-xxxxxxxxxxxxxxxxxxxx # Anthropic protocol
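
The Base URL and API key checks above can be combined into one lookup, sketched here. The endpoint URLs and header names are taken from the quick checks above, and the Content-Type value from the Request Parameter Errors table; the request_target helper itself is a hypothetical name.

```python
ENDPOINTS = {
    "openai": "https://api.gate.ai/openai/v1/chat/completions",
    "anthropic": "https://api.gate.ai/anthropic/v1/messages",
}


def request_target(protocol, api_key):
    """Return (url, headers) for the given protocol."""
    if protocol not in ENDPOINTS:
        raise ValueError(f"unknown protocol: {protocol!r}")
    if protocol == "openai":
        headers = {"Authorization": f"Bearer {api_key}"}
    else:
        # HTTP header names are case-insensitive; this matches the spelling above.
        headers = {"X-api-key": api_key}
    headers["Content-Type"] = "application/json"
    return ENDPOINTS[protocol], headers
```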
