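✨ Agent-First API

Overview

LemonData's Agent-First API adds structured hints to error responses that AI agents can parse and act on immediately, with no web search, documentation lookup, or guesswork. The standard error object in each error response may include optional fields such as did_you_mean, suggestions, hint, retryable, and retry_after. These fields are backward compatible: clients that do not use them see no difference.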

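Error hint fields

All hint fields are optional extensions inside the error object:

Field	Type	Description
`did_you_mean`	`string`	Closest matching model name
`suggestions`	`array`	List of suggested models with metadata
`alternatives`	`array`	Alternative models that are currently available
`hint`	`string`	Next-step guidance readable by humans or agents
`retryable`	`boolean`	Whether retrying the same request may succeed
`retry_after`	`number`	Seconds to wait before retrying
`balance_usd`	`number`	Current account balance in USD
`estimated_cost_usd`	`number`	Estimated cost of the failed request in USD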
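
Because every field in the table is optional, a client may find it convenient to load them into one structure with safe defaults. The following is a minimal sketch, not part of any LemonData SDK; `ErrorHints` and `parse_error_hints` are hypothetical names, and the input is assumed to be the decoded JSON body of an error response.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ErrorHints:
    """Optional hint fields from the `error` object; absent fields stay empty."""
    code: Optional[str] = None
    did_you_mean: Optional[str] = None
    suggestions: list = field(default_factory=list)
    alternatives: list = field(default_factory=list)
    hint: Optional[str] = None
    retryable: Optional[bool] = None
    retry_after: Optional[float] = None
    balance_usd: Optional[float] = None
    estimated_cost_usd: Optional[float] = None


def parse_error_hints(body: Any) -> ErrorHints:
    """Build ErrorHints from a decoded error body, ignoring unknown or null fields."""
    error = body.get("error") if isinstance(body, dict) else None
    if not isinstance(error, dict):
        return ErrorHints()
    known = ErrorHints.__dataclass_fields__.keys()
    values = {k: v for k, v in error.items() if k in known and v is not None}
    return ErrorHints(**values)
```

A client that ignores the hints keeps working unchanged, since an error body without them simply yields an `ErrorHints` with default values.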

Error code examples

model_not_found (400)

When the model name does not match any active model:

{
  "error": {
    "message": "Model not found: please check the model name",
    "type": "invalid_request_error",
    "param": "model",
    "code": "model_not_found",
    "did_you_mean": "gpt-5.4",
    "suggestions": [
      {"id": "gpt-5.4"},
      {"id": "gpt-5-mini"},
      {"id": "claude-sonnet-4-6"}
    ],
    "hint": "Did you mean 'gpt-5.4'? Use GET https://api.lemondata.cc/v1/models to list all available models."
  }
}

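did_you_mean is resolved using:

Static alias mapping (derived from production error data)
Normalized string matching (hyphens removed, case-insensitive)
Edit-distance matching (threshold ≤ 3)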
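
The three matching steps listed above can be illustrated with a small sketch. This is not LemonData's server-side implementation: the alias table is a made-up placeholder, and the order of the steps and the choice to compute edit distance on normalized names are assumptions made for the example.

```python
from typing import Optional

# Hypothetical alias table; the real mapping is derived from production error data.
ALIASES = {"gpt5": "gpt-5.4"}


def normalize(name: str) -> str:
    """Lowercase the name and drop hyphens."""
    return name.lower().replace("-", "")


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def did_you_mean(requested: str, models: list, max_distance: int = 3) -> Optional[str]:
    """Try alias lookup, then normalized equality, then nearest edit distance."""
    key = normalize(requested)
    if key in ALIASES:
        return ALIASES[key]
    for model in models:
        if normalize(model) == key:
            return model
    best = min(models, key=lambda m: edit_distance(key, normalize(m)), default=None)
    if best is not None and edit_distance(key, normalize(best)) <= max_distance:
        return best
    return None
```

With the models from the example response, `did_you_mean("gpt-5-mni", [...])` resolves through the edit-distance step, while a name that is far from every model yields `None`.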

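Public routes do not expose distinct error codes for hidden, deferred, or non-public models. Treat an unavailable public model as a miss: check did_you_mean, suggestions, and hint, then retry with a supported public model.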

insufficient_balance (402)

When the account balance is too low to cover the estimated cost:

{
  "error": {
    "message": "Insufficient balance: need ~$0.3500 for claude-sonnet-4-6, but balance is $0.1200.",
    "type": "insufficient_balance",
    "code": "insufficient_balance",
    "balance_usd": 0.12,
    "estimated_cost_usd": 0.35,
    "suggestions": [
      {"id": "gpt-5-mini"},
      {"id": "deepseek-v3-2"}
    ],
    "hint": "Insufficient balance: need ~$0.3500 for claude-sonnet-4-6, but balance is $0.1200. Try a cheaper model, or top up at https://lemondata.cc/dashboard/billing."
  }
}

suggestions lists models that cost less than the estimated cost and that the agent can switch to.
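
An agent could use these fields to pick a cheaper model or to report how much balance is missing. The sketch below assumes the `error` object has already been extracted from the response body; `next_model_for_balance` and `shortfall_usd` are hypothetical helper names.

```python
from typing import Optional


def next_model_for_balance(error: dict, tried: set) -> Optional[str]:
    """Return the first suggested model not yet tried, or None if nothing is left."""
    if error.get("code") != "insufficient_balance":
        return None
    for item in error.get("suggestions") or []:
        model_id = item.get("id") if isinstance(item, dict) else None
        if model_id and model_id not in tried:
            return model_id
    return None


def shortfall_usd(error: dict) -> Optional[float]:
    """Estimated cost minus current balance, when both fields are present."""
    balance = error.get("balance_usd")
    cost = error.get("estimated_cost_usd")
    if balance is None or cost is None:
        return None
    return round(cost - balance, 4)
```

Tracking the models already tried keeps the agent from looping between suggestions when every one of them is also too expensive.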

all_channels_failed (503)

When all upstream channels for a model are unavailable:

{
  "error": {
    "message": "Model claude-opus-4-6 temporarily unavailable",
    "code": "all_channels_failed",
    "retryable": true,
    "retry_after": 30,
    "alternatives": [
      {"id": "claude-sonnet-4-6", "status": "available", "tags": []},
      {"id": "gpt-5-mini", "status": "available", "tags": []}
    ],
    "hint": "All channels for 'claude-opus-4-6' are temporarily unavailable. Retry in 30s or try an alternative model."
  }
}

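When the reason is no_channels (no channels are configured for this model), retryable is false. It is true only for transient failures such as a tripped circuit breaker or exhausted quota.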
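
Since the API leaves the choice to the agent, a client needs some policy for deciding between waiting and switching models. One possible policy is sketched below; `plan_recovery` and the `max_wait_s` cutoff are assumptions for illustration, not behavior defined by LemonData.

```python
from typing import Optional, Tuple


def plan_recovery(error: dict, max_wait_s: float = 60.0) -> Tuple[str, Optional[object]]:
    """Return ("retry", seconds), ("switch", model_id), or ("give_up", None)."""
    if error.get("retryable") is True:
        wait = error.get("retry_after")
        is_number = isinstance(wait, (int, float)) and not isinstance(wait, bool)
        if is_number and 0 <= wait <= max_wait_s:
            return ("retry", float(wait))
    # Not retryable, or the wait is too long: look for an available alternative.
    for alt in error.get("alternatives") or []:
        if isinstance(alt, dict) and alt.get("status") == "available" and alt.get("id"):
            return ("switch", alt["id"])
    return ("give_up", None)
```

A latency-sensitive agent might pass a small `max_wait_s` so that it prefers an alternative model over a 30-second wait.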

rate_limit_exceeded (429)

{
  "error": {
    "message": "Rate limit: 60 rpm exceeded",
    "type": "rate_limit_exceeded",
    "code": "rate_limit_exceeded",
    "retryable": true,
    "retry_after": 8,
    "hint": "Rate limited. Retry after 8s. Current limit: 60/min for user role."
  }
}

The retry_after value is computed from the actual reset time of the rate-limit window.
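
Because retry_after reflects the real window reset, a client can sleep for exactly that long instead of using blind exponential backoff. The following sketch assumes the caller converts a 429 response into the hypothetical `RateLimited` exception, which carries the parsed `error` object; the one-second fallback is also an assumption.

```python
import time
from typing import Callable


class RateLimited(Exception):
    """Hypothetical exception carrying the parsed `error` object of a 429 response."""

    def __init__(self, error: dict):
        super().__init__(error.get("message", "rate limited"))
        self.error = error


def call_with_rate_limit_retry(call: Callable[[], object], max_attempts: int = 3,
                               sleep: Callable[[float], None] = time.sleep):
    """Run `call`, sleeping for `retry_after` seconds between rate-limited attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except RateLimited as exc:
            err = exc.error
            # Stop when attempts are used up or the server says a retry will not help.
            if attempt == max_attempts or not err.get("retryable"):
                raise
            sleep(float(err.get("retry_after") or 1.0))
```

The `sleep` parameter is injectable so the wait can be replaced in tests or by an async-friendly scheduler.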

OpenAI-compatible endpoints use LemonData's stable public error types, such as rate_limit_exceeded, upstream_error, and all_channels_failed. Anthropic-compatible and Gemini-compatible endpoints use their respective native response formats.

context_length_exceeded (400)

When the input exceeds the model's context window (an upstream error, with hints attached):

{
  "error": {
    "message": "This model's maximum context length is 128000 tokens...",
    "type": "invalid_request_error",
    "code": "context_length_exceeded",
    "retryable": false,
    "suggestions": [
      {"id": "gemini-2.5-pro"},
      {"id": "claude-sonnet-4-6"}
    ],
    "hint": "Reduce your input or switch to a model with a larger context window."
  }
}
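
The hint offers two ways out: switch models or reduce the input. A rough sketch of both follows, assuming the suggested models do have larger context windows and that dropping the oldest non-system messages is acceptable for the conversation; `trim_history` and `recover_from_context_overflow` are hypothetical names.

```python
from typing import Optional


def trim_history(messages: list, keep_last: int) -> list:
    """Keep all system messages (moved to the front) plus the last `keep_last` others."""
    system = [m for m in messages if m.get("role") == "system"]
    rest = [m for m in messages if m.get("role") != "system"]
    return system + (rest[-keep_last:] if keep_last > 0 else [])


def recover_from_context_overflow(error: dict, messages: list,
                                  keep_last: int = 4) -> Optional[dict]:
    """Prefer a suggested model; otherwise keep the model and shorten the input."""
    if error.get("code") != "context_length_exceeded":
        return None
    for item in error.get("suggestions") or []:
        if isinstance(item, dict) and item.get("id"):
            return {"model": item["id"], "messages": messages}
    # model=None means "keep the current model" in this sketch.
    return {"model": None, "messages": trim_history(messages, keep_last)}
```

Since retryable is false here, resending the identical request is pointless; one of the two changes has to be applied first.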

Native Endpoint Headers

When you call /v1/chat/completions with a model that has a native endpoint (Anthropic or Gemini), successful responses include optimization headers:

X-LemonData-Hint: This model supports native Anthropic format. Use POST /v1/messages for better performance (no format conversion).
X-LemonData-Native-Endpoint: /v1/messages

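Model provider	Recommended endpoint	Benefits
Anthropic (Claude)	`/v1/messages`	No format conversion, extended thinking, prompt caching
Google (Gemini)	`/v1beta/gemini`	No format conversion, grounding, safety settings
OpenAI	n/a	Chat completions is already the native format

These headers appear in both streaming and non-streaming responses.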
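
An agent can check for the header on a successful response and remember the native path for later calls. The helper below is a small sketch that works on any header mapping; `native_endpoint` is a hypothetical name.

```python
from typing import Mapping, Optional


def native_endpoint(headers: Mapping[str, str]) -> Optional[str]:
    """Return the X-LemonData-Native-Endpoint value, matching the name case-insensitively."""
    for name, value in headers.items():
        if name.lower() == "x-lemondata-native-endpoint":
            return value.strip() or None
    return None
```

With the OpenAI Python SDK, response headers should be reachable through `client.chat.completions.with_raw_response.create(...)`, whose result exposes `.headers` alongside `.parse()`; that wiring is left out here so the helper stays independent of any SDK.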

/v1/models enhancements

/v1/models now carries non-chat recommendation metadata that agents can use before calling image, video, music, 3D, TTS, STT, embedding, rerank, or translation endpoints.

{
  "id": "gemini-2.5-flash-image",
  "lemondata": {
    "category": "image",
    "pricing_unit": "per_request",
    "agent_preferences": {
      "image": {
        "preferred_rank": 1,
        "success_rate_24h": 0.98,
        "sample_count_24h": 423,
        "status": "ready",
        "updated_at": "2026-03-28T12:00:00.000Z",
        "basis": {
          "score_source": "clickhouse_24h",
          "channel_id": null,
          "physical_model": null
        }
      }
    }
  }
}

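Field	Values	Description
`category`	`chat`, `image`, `video`, `audio`, `tts`, `stt`, `3d`, `embedding`, `rerank`	Model type
`pricing_unit`	`per_token`, `per_image`, `per_second`, `per_request`	How the model is billed
`cache_pricing`	object or `null`	Returned only when the model has an upstream prompt cache price; platform-only semantic cache discounts do not appear separately in the list route
`agent_preferences.<scene>`	object	Non-chat recommendation snapshot for that scene, returned only for `GET /v1/models?recommended_for=<scene>`

When recommended_for is present, agent_preferences is derived from a cached snapshot of 24-hour success rates:

Window: 24 hours
Snapshot cache: stale-while-revalidate
status = "ready" means the model has enough recent samples to take part in ranking
status = "insufficient_samples" means the model remains visible but is not ranked ahead of scored models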
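
On the client side, the same ordering can be reproduced from the returned metadata. The sketch below assumes the input is a list of model objects shaped like the example above and that a lower preferred_rank is better; `rank_for_scene` is a hypothetical helper.

```python
def rank_for_scene(models: list, scene: str) -> list:
    """Return model ids for `scene`: "ready" models by preferred_rank, then the rest."""
    ready, unscored = [], []
    for model in models:
        prefs = (model.get("lemondata") or {}).get("agent_preferences") or {}
        entry = prefs.get(scene)
        if not isinstance(entry, dict):
            continue  # no recommendation snapshot for this scene
        if entry.get("status") == "ready" and entry.get("preferred_rank") is not None:
            ready.append((entry["preferred_rank"], model["id"]))
        else:
            unscored.append(model["id"])
    return [model_id for _, model_id in sorted(ready)] + unscored
```

Models with insufficient samples stay in the result, after every scored model, which mirrors the visibility rule described above.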

Category filtering

GET https://api.lemondata.cc/v1/models?category=chat          # Chat models only
GET https://api.lemondata.cc/v1/models?category=image         # Image generation models
GET https://api.lemondata.cc/v1/models?tag=coding&category=chat  # Coding-optimized chat models
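
The query strings above can be assembled programmatically. This is a small sketch using only the standard library and the three parameters that appear in this document (category, tag, recommended_for); `models_url` is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.lemondata.cc/v1/models"


def models_url(category: Optional[str] = None, tag: Optional[str] = None,
               recommended_for: Optional[str] = None) -> str:
    """Build a /v1/models URL, including only the filters that were given."""
    pairs = (("category", category), ("tag", tag), ("recommended_for", recommended_for))
    params = {key: value for key, value in pairs if value}
    return BASE_URL + ("?" + urlencode(params) if params else "")
```

For example, `models_url(category="chat", tag="coding")` produces the coding-optimized chat query, with the parameters in a fixed order.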

llms.txt

A machine-readable API overview is available at:

GET https://api.lemondata.cc/llms.txt

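It includes:

First-call templates with working examples
Common model names (generated dynamically from usage data)
All 12 API endpoints
Filter parameters for model discovery
Error handling guidance

AI agents that read llms.txt before their first API call can usually succeed on the first attempt.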
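
One way an agent might do this is to fetch the file once at startup and prepend it to its own system prompt. The sketch below uses only the standard library; the prompt layout and the `max_chars` cap are assumptions, and both function names are hypothetical.

```python
import urllib.request

LLMS_TXT_URL = "https://api.lemondata.cc/llms.txt"


def fetch_llms_txt(url: str = LLMS_TXT_URL, timeout: float = 10.0) -> str:
    """Download the machine-readable API overview as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def with_api_overview(system_prompt: str, overview: str, max_chars: int = 8000) -> str:
    """Append the (possibly truncated) overview to an agent's system prompt."""
    return f"{system_prompt}\n\n# LemonData API overview (llms.txt)\n{overview[:max_chars]}"
```

Typical usage would be `with_api_overview(base_prompt, fetch_llms_txt())`, with the result cached so the file is not downloaded on every request.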

Usage in agent code

Python (OpenAI SDK)

from openai import OpenAI, BadRequestError

client = OpenAI(
    api_key="sk-your-key",
    base_url="https://api.lemondata.cc/v1"
)

def smart_chat(messages, model="gpt-4o"):
    try:
        return client.chat.completions.create(
            model=model, messages=messages
        )
    except BadRequestError as e:
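        body = e.body if isinstance(e.body, dict) else {}
        # Recent openai-python versions already unwrap "error"; accept both shapes.
        error = body.get("error") if isinstance(body.get("error"), dict) else body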
        # Use did_you_mean for auto-correction
        if error.get("code") == "model_not_found" and error.get("did_you_mean"):
            return client.chat.completions.create(
                model=error["did_you_mean"], messages=messages
            )
        raise

JavaScript (OpenAI SDK)

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-your-key',
  baseURL: 'https://api.lemondata.cc/v1'
});

async function smartChat(messages, model = 'gpt-4o') {
  try {
    return await client.chat.completions.create({ model, messages });
  } catch (error) {
    const err = error?.error;
    if (err?.code === 'model_not_found' && err?.did_you_mean) {
      return client.chat.completions.create({
        model: err.did_you_mean, messages
      });
    }
    throw error;
  }
}

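Design principles

Fail fast, fail informatively

Errors immediately return all the data an agent needs to self-correct.

No auto-routing

The API never silently substitutes a different model behind the scenes. The agent decides.

Data-driven suggestions

All recommendations come from production data, not hard-coded lists.

Backward compatible

All hint fields are optional. Existing clients see no difference.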
