Request Body
messages (array)
A list of messages comprising the conversation. Each message object contains:
- role (string): system, user, or assistant
- content (string | array): the message content
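The message structure above can be sketched as a JSON request body. The model name below is a placeholder, not a value taken from this reference:

```python
import json

# Minimal request body sketch: a system message followed by a user message.
# "example-model" is a placeholder; substitute your deployment's model name.
request_body = {
    "model": "example-model",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
}

# Serialize for an HTTP POST with Content-Type: application/json.
payload = json.dumps(request_body)
```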
temperature (number)
Sampling temperature between 0 and 2. Higher values make output more random.
max_tokens (integer)
Maximum number of tokens to generate.
stream (boolean)
If true, partial message deltas will be sent as SSE events.
stream_options (object)
Options for streaming. Set include_usage: true to receive token usage in stream chunks.

top_p (number)
Nucleus sampling parameter. We recommend altering this or temperature, but not both.
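A streaming request combining the parameters above might look like the following sketch; the model name is a placeholder:

```python
# Sketch of a streaming request: stream enables SSE deltas, and
# stream_options.include_usage asks for token usage in the final chunk.
streaming_request = {
    "model": "example-model",  # placeholder
    "messages": [{"role": "user", "content": "Hi"}],
    "stream": True,
    "stream_options": {"include_usage": True},
    "top_p": 0.9,  # nucleus sampling; leave temperature at its default
}
```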
frequency_penalty (number)
Number between -2.0 and 2.0. Positive values penalize repeated tokens.

presence_penalty (number)
Number between -2.0 and 2.0. Positive values penalize tokens already present in the text.
stop (string | array)
Up to 4 sequences where the API will stop generating tokens.
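The sampling, penalty, and stop parameters can be combined in one request body, sketched here with placeholder values:

```python
# Sketch combining the sampling controls described above.
sampling_request = {
    "model": "example-model",  # placeholder
    "messages": [{"role": "user", "content": "List three fruits."}],
    "temperature": 0.7,        # 0-2; higher is more random
    "frequency_penalty": 0.5,  # -2.0 to 2.0; discourages repeated tokens
    "presence_penalty": 0.2,   # -2.0 to 2.0; discourages reusing seen tokens
    "stop": ["\n\n", "END"],   # up to 4 stop sequences
}
```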
tools (array)
A list of tools the model may call (function calling).
tool_choice (string | object)
Controls how the model uses tools. Options: auto, none, required, or a specific tool object.

parallel_tool_calls (boolean)
Whether to enable parallel function calling. Set to false to call functions sequentially.
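A function-calling request can be sketched as follows; the get_weather tool is a hypothetical example, not part of this reference:

```python
# Sketch of a request that declares one callable tool.
tool_request = {
    "model": "example-model",  # placeholder
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool
                "description": "Get current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",         # or "none", "required", or a tool object
    "parallel_tool_calls": False,  # call functions one at a time
}
```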
max_completion_tokens (integer)
Maximum tokens for the completion. An alternative to max_tokens, useful for newer reasoning-enabled model families.

reasoning_effort (string)
Reasoning effort for reasoning-enabled models. Options: low, medium, high.

seed (integer)
Random seed for deterministic sampling.
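For reasoning-enabled models, the three parameters above fit together as in this sketch (model name is a placeholder):

```python
# Sketch of a request targeting a reasoning-enabled model.
reasoning_request = {
    "model": "example-reasoning-model",  # placeholder
    "messages": [{"role": "user", "content": "Prove that 2 + 2 = 4."}],
    "max_completion_tokens": 1024,  # preferred over max_tokens here
    "reasoning_effort": "medium",   # low | medium | high
    "seed": 42,                     # deterministic sampling
}
```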
n (integer)
Number of completions to generate (1-128).
logprobs (boolean)
Whether to return log probabilities.
top_logprobs (integer)
Number of top log probabilities to return (0-20). Requires logprobs: true.

top_k (integer)
Top-K sampling parameter (for Anthropic/Gemini models).
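Requesting log probabilities pairs the two flags above, as in this sketch:

```python
# Sketch of a request asking for the top 5 log probabilities per token.
logprob_request = {
    "model": "example-model",  # placeholder
    "messages": [{"role": "user", "content": "Yes or no?"}],
    "logprobs": True,
    "top_logprobs": 5,  # 0-20; only valid alongside logprobs: true
}
```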
response_format (object)
Response format specification. Use {"type": "json_object"} for JSON mode, or {"type": "json_schema", "json_schema": {...}} for structured outputs.

logit_bias (object)
Modify the likelihood of specified tokens appearing. Map token IDs (as strings) to bias values from -100 to 100.
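A structured-output request can be sketched as below; the schema name, its fields, and the token ID in logit_bias are hypothetical examples:

```python
# Sketch of a structured-output request with a JSON schema and a token bias.
structured_request = {
    "model": "example-model",  # placeholder
    "messages": [{"role": "user", "content": "Describe a book as JSON."}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "book",  # hypothetical schema name
            "schema": {
                "type": "object",
                "properties": {
                    "title": {"type": "string"},
                    "pages": {"type": "integer"},
                },
                "required": ["title", "pages"],
            },
        },
    },
    # Token ID "1734" is an arbitrary example; -100 effectively bans it.
    "logit_bias": {"1734": -100},
}
```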
user (string)
A unique identifier representing your end-user, used for abuse monitoring.
LemonData cache control options.
- type (string): cache strategy - one of default, no_cache, no_store, response_only, semantic_only
- max_age (integer): cache TTL in seconds (max 86400)
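The cache options might appear in a request as sketched below. The top-level field name "cache_control" is an assumption (this reference does not show it); check the LemonData docs for the actual key:

```python
# Sketch of a request with cache options. The "cache_control" key name is
# assumed, not confirmed by this reference.
cached_request = {
    "model": "example-model",  # placeholder
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
    "cache_control": {
        "type": "response_only",  # default | no_cache | no_store | response_only | semantic_only
        "max_age": 3600,          # seconds, capped at 86400 (24 hours)
    },
}
```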
Response
id (string)
Unique identifier for the completion.
object (string)
Always chat.completion.

created (integer)
Unix timestamp of when the completion was created.
model (string)
The model used for the completion.
choices (array)
List of completion choices. Each choice contains:
- index (integer): index of the choice
- message (object): the generated message
- finish_reason (string): why the model stopped (stop, length, or tool_calls)
usage (object)
Token usage statistics.
- prompt_tokens (integer): tokens in the prompt
- completion_tokens (integer): tokens in the completion
- total_tokens (integer): total tokens used
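Reading the response fields described above can be sketched as follows; the response dict here is a hand-written example, not real API output:

```python
# Hand-written example response matching the schema described above.
response = {
    "id": "chatcmpl-123",  # example identifier
    "object": "chat.completion",
    "created": 1700000000,
    "model": "example-model",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello!"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 9, "completion_tokens": 3, "total_tokens": 12},
}

# Extract the generated text, stop reason, and token accounting.
choice = response["choices"][0]
text = choice["message"]["content"]
reason = choice["finish_reason"]
total = response["usage"]["total_tokens"]
```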