
Overview

LemonData enforces rate limits to ensure fair use and platform stability. Limits vary by account tier.

Rate Limit Tiers

Tier      Requests/min   Description
User      60             Default tier for all accounts
Partner   300            For integration partners
VIP       1,000          For high-volume users
Rate limits are subject to change. Contact [email protected] for custom limits.

Rate Limit Headers

Every API response includes rate limit information:
X-RateLimit-Limit: 60          # Your limit per minute
X-RateLimit-Remaining: 55      # Requests remaining
X-RateLimit-Reset: 1234567890  # Unix timestamp when limit resets
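
The three headers above can be turned into a pacing decision on the client side. The following is a minimal sketch, not an official LemonData helper: the function names `parse_rate_limit_headers` and `suggested_pause` are hypothetical, and it assumes `headers` is a mapping that uses the exact header spelling shown above (a plain `dict` is case-sensitive).

```python
import time

def parse_rate_limit_headers(headers, now=None):
    """Return (limit, remaining, seconds_until_reset) from response headers."""
    now = time.time() if now is None else now
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_at = int(headers["X-RateLimit-Reset"])  # Unix timestamp
    return limit, remaining, max(0.0, reset_at - now)

def suggested_pause(headers, now=None):
    """Spread the remaining requests evenly over the time left in the window."""
    _, remaining, seconds_left = parse_rate_limit_headers(headers, now)
    if remaining <= 0:
        # Nothing left in this window: wait until it resets.
        return seconds_left
    return seconds_left / remaining
```

With recent versions of the `openai` Python SDK, the raw headers should be reachable through `client.chat.completions.with_raw_response.create(...)`, whose result exposes `.headers`; treat that as an assumption to verify against the SDK version you use.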

Rate Limit Exceeded

When you exceed your limit, you receive a 429 response:
{
  "error": {
    "message": "Rate limit exceeded. Please slow down.",
    "type": "rate_limit_exceeded"
  }
}
The response also carries an additional header:
Retry-After: 60  # Seconds to wait before retrying
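
A client can read `Retry-After` to decide how long to pause before retrying. The sketch below is illustrative only: `retry_after_seconds` is a hypothetical helper, the 60-second default is an assumption, and the HTTP-date branch follows the general HTTP specification rather than anything this page states about LemonData (the page only shows the seconds form).

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def retry_after_seconds(headers, default=60.0, now=None):
    """Return how many seconds to wait, based on the Retry-After header."""
    value = headers.get("Retry-After")
    if value is None:
        return default
    try:
        # Common case, as shown above: a number of seconds.
        return max(0.0, float(value))
    except ValueError:
        pass
    try:
        # Fallback: HTTP allows Retry-After to be an HTTP-date.
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```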

Handling Rate Limits

Exponential Backoff

Implement exponential backoff to retry automatically:
import time
from openai import OpenAI, RateLimitError

client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://api.lemondata.cc/v1"
)

def make_request_with_backoff(messages, max_retries=5):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="gpt-4o",
                messages=messages
            )
        except RateLimitError:
            # Out of retries: re-raise so the caller sees the error
            if attempt == max_retries - 1:
                raise

            wait_time = 2 ** attempt  # 1, 2, 4, 8 seconds with the default max_retries=5
            print(f"Rate limited. Waiting {wait_time}s...")
            time.sleep(wait_time)
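
When many clients back off on the same schedule, their retries can arrive in synchronized bursts. A common refinement, which this page does not prescribe, is to add random jitter and to treat any server-supplied `Retry-After` value as a minimum. The sketch below shows only the delay calculation; `backoff_delay` and its `base`, `cap`, and `rand` parameters are hypothetical names chosen for illustration.

```python
import random

def backoff_delay(attempt, base=1.0, cap=30.0, retry_after=None, rand=random.random):
    """Exponential backoff with full jitter, optionally floored by Retry-After."""
    # Exponential ceiling: base, 2*base, 4*base, ... limited to `cap` seconds.
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: pick a random point in [0, ceiling).
    delay = ceiling * rand()
    if retry_after is not None:
        # Never retry sooner than the server asked.
        delay = max(delay, retry_after)
    return delay
```

In a retry loop, the result would replace the fixed `2 ** attempt` wait, for example `time.sleep(backoff_delay(attempt))`.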

Request Queue

For high-volume applications, implement a request queue:
import asyncio
from openai import AsyncOpenAI

class RateLimitedClient:
    def __init__(self, requests_per_minute=60):
        self.client = AsyncOpenAI(
            api_key="sk-your-api-key",
            base_url="https://api.lemondata.cc/v1"
        )
        self.rpm = requests_per_minute
        self.interval = 60 / requests_per_minute  # Minimum seconds between requests
        self.last_request = 0.0
        self.lock = asyncio.Lock()  # Serializes pacing across concurrent callers

    async def request(self, messages):
        loop = asyncio.get_running_loop()
        # Hold the lock only while pacing, so the requests themselves can overlap
        async with self.lock:
            wait_time = max(0, self.last_request + self.interval - loop.time())
            if wait_time > 0:
                await asyncio.sleep(wait_time)
            self.last_request = loop.time()

        return await self.client.chat.completions.create(
            model="gpt-4o",
            messages=messages
        )

Batch Processing

For bulk operations, process items in batches with a delay between them:
# Reuses `time` and `client` from the exponential backoff example above
def process_batch(items, batch_size=50, delay=1):
    results = []
    for i in range(0, len(items), batch_size):
        batch = items[i:i + batch_size]
        for item in batch:
            result = client.chat.completions.create(
                model="gpt-4o",
                messages=[{"role": "user", "content": item}]
            )
            results.append(result)
        time.sleep(delay)  # Pause between batches
    return results
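
How long the pause between batches needs to be depends on the tier. As a rough worked example, a batch of `batch_size` requests consumes `batch_size * 60 / requests_per_minute` seconds of rate budget, so the pause should cover whatever part of that budget the batch itself did not take. The helper below is a hypothetical sketch and assumes the limit is enforced as a simple per-minute average, which this page does not confirm.

```python
def min_batch_delay(batch_size, requests_per_minute, batch_duration=0.0):
    """Seconds to pause after a batch to stay at or under the per-minute limit."""
    # Time budget the batch "costs" at the allowed average rate.
    budget = batch_size * 60.0 / requests_per_minute
    # Time already spent sending the batch counts toward that budget.
    return max(0.0, budget - batch_duration)
```

Under this assumption, a batch of 50 on the User tier (60 requests/min) costs 50 seconds of budget, so a 1-second pause only keeps you under the limit if the requests themselves take about a second each. On the VIP tier (1,000 requests/min) the same batch costs 3 seconds.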

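Best Practices

Track the rate limit headers to stay within your limits proactively.
Cache responses for identical requests to reduce API calls.
Faster models (such as gpt-4o-mini) can provide higher throughput.
If you need higher limits, contact [email protected]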
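
The caching practice above can be sketched as a small in-memory cache keyed by the request contents. This is an illustration under stated assumptions, not a LemonData feature: `ResponseCache` and its `fetch` callable are hypothetical, entries never expire, and reusing a response only makes sense when identical inputs are expected to give an acceptable identical answer.

```python
import hashlib
import json

class ResponseCache:
    """In-memory cache keyed by a hash of (model, messages)."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable(model, messages) that performs the real API call
        self._store = {}

    def _key(self, model, messages):
        # sort_keys makes the key independent of dict key order.
        payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get(self, model, messages):
        key = self._key(model, messages)
        if key not in self._store:
            # Cache miss: spend one API call and remember the result.
            self._store[key] = self._fetch(model, messages)
        return self._store[key]
```

For example, `fetch` could be `lambda model, messages: client.chat.completions.create(model=model, messages=messages)`, so that repeated identical prompts cost a single request against the rate limit.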

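Upgrade Your Tier

To request a tier upgrade:
  1. Log in to your Dashboard
  2. Go to Settings → Account
  3. Contact the support team and describe your use case
Or send an email to [email protected] with:
  • Your account email
  • Your expected request volume
  • A description of your use case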