
Request Body

model
string
required
ID of the model to use. See Models for available options.
messages
array
required
A list of messages comprising the conversation. Each message object contains:
  • role (string): system, user, or assistant
  • content (string | array): The message content
temperature
number
default:"1"
Sampling temperature between 0 and 2. Higher values make output more random.
max_tokens
integer
Maximum number of tokens to generate.
stream
boolean
default:"false"
If true, partial message deltas are sent as server-sent events (SSE).
stream_options
object
Options for streaming. Set include_usage: true to receive token usage in stream chunks.
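For example, a request body that enables streaming and includes token usage in the final chunk might look like this (the model ID is illustrative; see Models):

```json
{
  "model": "gpt-4o",
  "messages": [{"role": "user", "content": "Hello!"}],
  "stream": true,
  "stream_options": {"include_usage": true}
}
```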
top_p
number
default:"1"
Nucleus sampling parameter. We recommend altering this or temperature, not both.
frequency_penalty
number
default:"0"
Number between -2.0 and 2.0. Positive values penalize tokens in proportion to how often they have already appeared, reducing verbatim repetition.
presence_penalty
number
default:"0"
Number between -2.0 and 2.0. Positive values penalize tokens that have already appeared at all, encouraging the model to introduce new topics.
stop
string | array
Up to 4 sequences where the API will stop generating tokens.
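stop accepts either a single string or an array of up to 4 strings; generation halts before any of them would be emitted. A sketch (model ID illustrative):

```json
{
  "model": "gpt-4o",
  "messages": [{"role": "user", "content": "List three fruits."}],
  "stop": ["\n\n", "END"]
}
```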
tools
array
A list of tools the model may call (function calling).
tool_choice
string | object
Controls how the model uses tools. Options: auto, none, required, or a specific tool object.
parallel_tool_calls
boolean
default:"true"
Whether to enable parallel function calling. Set to false to call functions sequentially.
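The three tool parameters above combine as in this sketch. The get_weather function and its schema are hypothetical, following the common OpenAI-style function-calling format:

```json
{
  "model": "gpt-4o",
  "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {"type": "string", "description": "City name"}
          },
          "required": ["city"]
        }
      }
    }
  ],
  "tool_choice": "auto",
  "parallel_tool_calls": false
}
```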
max_completion_tokens
integer
Maximum tokens for the completion. Alternative to max_tokens, useful for newer reasoning-enabled model families.
reasoning_effort
string
Reasoning effort for reasoning-enabled models. Options: low, medium, high.
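A sketch combining max_completion_tokens with reasoning_effort; the model ID is illustrative and must be a reasoning-enabled model from Models:

```json
{
  "model": "o3-mini",
  "messages": [{"role": "user", "content": "Solve 2x + 6 = 14 for x."}],
  "reasoning_effort": "medium",
  "max_completion_tokens": 2000
}
```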
seed
integer
Random seed for deterministic sampling.
n
integer
default:"1"
Number of completions to generate (1-128).
logprobs
boolean
Whether to return log probabilities.
top_logprobs
integer
Number of top log probabilities to return (0-20). Requires logprobs: true.
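For example, to request the five most likely alternatives for each output token (model ID illustrative):

```json
{
  "model": "gpt-4o",
  "messages": [{"role": "user", "content": "Hello!"}],
  "logprobs": true,
  "top_logprobs": 5
}
```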
top_k
integer
Top-K sampling parameter (for Anthropic/Gemini models).
response_format
object
Response format specification. Use {"type": "json_object"} for JSON mode, or {"type": "json_schema", "json_schema": {...}} for structured outputs.
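A structured-outputs sketch; the schema name and fields are illustrative, following the OpenAI-style json_schema format:

```json
{
  "model": "gpt-4o",
  "messages": [{"role": "user", "content": "Extract the name and age from: Alice is 30."}],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "person",
      "schema": {
        "type": "object",
        "properties": {
          "name": {"type": "string"},
          "age": {"type": "integer"}
        },
        "required": ["name", "age"]
      }
    }
  }
}
```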
logit_bias
object
Modify the likelihood of specified tokens appearing. Map token IDs (as strings) to bias values from -100 to 100.
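A sketch of suppressing one token and boosting another. The token IDs below are placeholders; real IDs depend on the model's tokenizer:

```json
{
  "model": "gpt-4o",
  "messages": [{"role": "user", "content": "Pick a color."}],
  "logit_bias": {"1234": -100, "5678": 10}
}
```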
user
string
A unique identifier representing your end-user for abuse monitoring.
cache_control
object
LemonData cache control options.
  • type (string): Cache strategy - default, no_cache, no_store, response_only, semantic_only
  • max_age (integer): Cache TTL in seconds (max 86400)
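For example, to serve semantically similar requests from cache for up to one hour (model ID illustrative):

```json
{
  "model": "gpt-4o",
  "messages": [{"role": "user", "content": "Hello!"}],
  "cache_control": {"type": "semantic_only", "max_age": 3600}
}
```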

Response

id
string
Unique identifier for the completion.
object
string
Always chat.completion.
created
integer
Unix timestamp of when the completion was created.
model
string
The model used for completion.
choices
array
List of completion choices. Each choice contains:
  • index (integer): Index of the choice
  • message (object): The generated message
  • finish_reason (string): Why the model stopped (stop, length, tool_calls)
usage
object
Token usage statistics.
  • prompt_tokens (integer): Tokens in the prompt
  • completion_tokens (integer): Tokens in the completion
  • total_tokens (integer): Total tokens used
Example Request

curl -X POST "https://api.lemondata.cc/v1/chat/completions" \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello!"}
    ],
    "temperature": 0.7,
    "max_tokens": 1000
  }'
Example Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1706000000,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 9,
    "total_tokens": 29
  }
}