Overview
LemonData’s Agent-First API enriches error responses with structured hints that AI agents can parse and act on immediately, with no web searches, doc lookups, or guesswork. Error responses can include optional fields such as did_you_mean, suggestions, hint, retryable, and retry_after inside the standard error object. These fields are backward-compatible: clients that don’t use them see no difference.
Error Hint Fields
All hint fields are optional extensions inside the error object:
| Field | Type | Description |
|---|---|---|
| did_you_mean | string | Closest matching model name |
| suggestions | array | Recommended models with metadata |
| alternatives | array | Currently available alternative models |
| hint | string | Human/agent-readable next-step guidance |
| retryable | boolean | Whether retrying the same request may succeed |
| retry_after | number | Seconds to wait before retrying |
| balance_usd | number | Current account balance in USD |
| estimated_cost_usd | number | Estimated cost of the failed request |
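As a rough illustration of how a client might read these fields, the sketch below collects them into a dataclass and tolerates any that are missing. It assumes the error object sits under a top-level "error" key of the JSON response body; the names ErrorHints and parse_hints are this example's own, not part of the API.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class ErrorHints:
    """Optional hint fields from the table above; absent fields stay None or empty."""
    did_you_mean: Optional[str] = None
    suggestions: list = field(default_factory=list)
    alternatives: list = field(default_factory=list)
    hint: Optional[str] = None
    retryable: Optional[bool] = None
    retry_after: Optional[float] = None
    balance_usd: Optional[float] = None
    estimated_cost_usd: Optional[float] = None


def parse_hints(response_body: dict) -> ErrorHints:
    """Read hint fields from the error object, ignoring unknown or null keys."""
    # Assumption: the error object is nested under the "error" key.
    error = response_body.get("error") or {}
    known = {f.name for f in fields(ErrorHints)}
    picked = {k: v for k, v in error.items() if k in known and v is not None}
    return ErrorHints(**picked)
```

Because every field has a default, a client written this way behaves the same whether or not a given response carries hints.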
Error Code Examples
model_not_found (400)
Returned when a model name doesn’t match any active model. did_you_mean resolution uses:
- Static alias mapping (from production error data)
- Normalized string matching (strips hyphens, case-insensitive)
- Edit distance matching (threshold ≤ 3)
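The three strategies above can be sketched as a small lookup function. This is an illustration only, not the service's implementation: the alias table and model list are made-up placeholders, the strategies are tried in the order listed (an assumption), and the edit distance is computed on lowercased names (also an assumption).

```python
from typing import Optional

# Hypothetical data for illustration; the real alias table and model list are not shown here.
ALIASES = {"acme-latest": "acme-chat-2-mini"}
ACTIVE_MODELS = ["acme-chat-1", "acme-chat-2-mini", "acme-embed-1"]


def normalize(name: str) -> str:
    """Lowercase the name and strip hyphens."""
    return name.lower().replace("-", "")


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def did_you_mean(requested: str, max_distance: int = 3) -> Optional[str]:
    """Return the closest active model name, or None if nothing is close enough."""
    # 1. Static alias mapping
    if requested in ALIASES:
        return ALIASES[requested]
    # 2. Normalized string matching
    target = normalize(requested)
    for model in ACTIVE_MODELS:
        if normalize(model) == target:
            return model
    # 3. Edit distance matching with a threshold
    lowered = requested.lower()
    best = min(ACTIVE_MODELS, key=lambda m: edit_distance(lowered, m.lower()))
    if edit_distance(lowered, best.lower()) <= max_distance:
        return best
    return None
```

Checking the cheap exact strategies first means the quadratic edit-distance pass only runs when nothing simpler matches.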
insufficient_balance (402)
Returned when the account balance is too low for the estimated cost. suggestions lists models that cost less than the estimated cost, which the agent can switch to.
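An agent may want to know how far short the balance fell before deciding whether to switch models. A minimal sketch, assuming the error object is already available as a dict; balance_shortfall is a hypothetical helper name:

```python
from typing import Optional


def balance_shortfall(error: dict) -> Optional[float]:
    """USD missing to cover the failed request, or None if either field is absent."""
    balance = error.get("balance_usd")
    cost = error.get("estimated_cost_usd")
    if balance is None or cost is None:
        return None
    # Clamp at zero so a stale balance never yields a negative shortfall.
    return max(cost - balance, 0.0)
```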
all_channels_failed (503)
Returned when all upstream channels for a model are unavailable. retryable is false when the reason is no_channels (no channels configured for this model). It is true only for transient failures such as circuit breaker trips or quota exhaustion.
rate_limit_exceeded (429)
The retry_after value is calculated from the actual rate limit window reset time.
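One way an agent could combine retryable and retry_after into a single decision is sketched below. The function name and the default_delay fallback are assumptions of this example; the API only supplies the two fields.

```python
from typing import Optional


def retry_delay(error: dict, default_delay: float = 1.0) -> Optional[float]:
    """Seconds to wait before retrying, or None when a retry is not advised."""
    # An explicit retryable=false wins, even if retry_after is present.
    if error.get("retryable") is False:
        return None
    retry_after = error.get("retry_after")
    if retry_after is not None:
        return float(retry_after)
    # retryable=true without a wait time: fall back to a client-chosen delay.
    if error.get("retryable") is True:
        return default_delay
    # No hints at all: do not assume the request is safe to retry.
    return None
```

A caller would sleep for the returned number of seconds before resending the same request, and surface the error to the user when the result is None.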
context_length_exceeded (400)
Returned when the input exceeds the model’s context window (an upstream error, enriched with hints).
Native Endpoint Headers
When you call /v1/chat/completions with a model that has a native endpoint (Anthropic or Gemini), the success response includes optimization headers:
| Model Provider | Suggested Endpoint | Benefit |
|---|---|---|
| Anthropic (Claude) | /v1/messages | No format conversion, extended thinking, prompt caching |
| Google (Gemini) | /v1beta/gemini | No format conversion, grounding, safety settings |
| OpenAI | — | Chat completions is already the native format |
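A client that prefers native endpoints could keep the table above as a small lookup. In this sketch the provider keys ("anthropic", "google") are the example's own labels; how a client learns a model's provider is left open.

```python
from typing import Optional

# Mirrors the table above. Provider keys are labels chosen for this sketch.
NATIVE_ENDPOINTS = {
    "anthropic": "/v1/messages",
    "google": "/v1beta/gemini",
}


def native_endpoint(provider: str) -> Optional[str]:
    """Suggested native endpoint, or None when chat completions is already native."""
    return NATIVE_ENDPOINTS.get(provider.lower())
```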
/v1/models Enhancements
Three new fields in the lemondata extension of each model object:
| Field | Values | Description |
|---|---|---|
| category | chat, image, video, audio, tts, stt, 3d, embedding, rerank | Model type |
| pricing_unit | per_token, per_image, per_second, per_request | How the model is billed |
| cache_pricing | object or null | Upstream prompt cache prices + platform semantic cache discount |
Category Filtering
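The sketch below filters an already-fetched model list by lemondata.category on the client side; it does not show the API's own filter parameter. The sample model objects are hypothetical and trimmed to the fields used.

```python
from typing import Iterable


def filter_by_category(models: Iterable[dict], category: str) -> list:
    """Keep models whose lemondata.category equals the requested category."""
    return [
        m for m in models
        if (m.get("lemondata") or {}).get("category") == category
    ]


# Hypothetical model objects for illustration only.
sample_models = [
    {"id": "example-chat", "lemondata": {"category": "chat", "pricing_unit": "per_token"}},
    {"id": "example-image", "lemondata": {"category": "image", "pricing_unit": "per_image"}},
    {"id": "example-legacy"},  # no lemondata extension
]

chat_ids = [m["id"] for m in filter_by_category(sample_models, "chat")]
```

Models without the lemondata extension are simply skipped, which keeps the helper safe against older or third-party model listings.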
llms.txt
A machine-readable API overview is available as llms.txt. It includes:
- First-call template with a working example
- Common model names (dynamically generated from usage data)
- All 12 API endpoints
- Filter parameters for model discovery
- Error handling guidance
Agents that read llms.txt before their first API call can typically succeed on the first attempt.
Usage in Agent Code
Python (OpenAI SDK)
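A minimal sketch of an agent that corrects a wrong model name once, using did_you_mean. To stay runnable without network access, it takes the request as a callable named send (a hypothetical parameter) and reads hints from the exception's body attribute. With the OpenAI Python SDK, send would typically wrap client.chat.completions.create, and the raised openai.APIStatusError carries the decoded response body in body; treat that wiring as an assumption to verify against your SDK version.

```python
from typing import Any, Callable


def hints_from_exception(exc: Exception) -> dict:
    """Return the error object carried by an SDK exception, or {} if there is none."""
    body = getattr(exc, "body", None)
    if not isinstance(body, dict):
        return {}
    # Accept both an unwrapped error object and an {"error": {...}} envelope.
    inner = body.get("error")
    return inner if isinstance(inner, dict) else body


def complete_with_correction(send: Callable[[str], Any], model: str) -> Any:
    """Call send(model); on failure, retry once with did_you_mean if one is offered."""
    try:
        return send(model)
    except Exception as exc:  # with the SDK, narrow this to its status error type
        suggestion = hints_from_exception(exc).get("did_you_mean")
        if not suggestion or suggestion == model:
            raise
        # The agent, not the API, chooses to switch models here.
        return send(suggestion)
```

The retry happens at most once, so a second failure propagates to the caller with its own hints intact.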
JavaScript (OpenAI SDK)
Design Principles
Fail fast, fail informatively
Errors return immediately with all the data an agent needs to self-correct.
No auto-routing
The API never silently substitutes a different model. The agent decides.
Data-driven suggestions
All recommendations come from production data, not hardcoded lists.
Backward compatible
All hint fields are optional. Existing clients see no difference.