
Documentation Index

Fetch the complete documentation index at: https://docs.lemondata.cc/llms.txt

Use this file to discover all available pages before exploring further.

For coding agents: discover the currently recommended STT shortlist first with GET /v1/models?recommended_for=stt, then pass the selected model explicitly to this endpoint.
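The discovery step above can be sketched as a small helper. The endpoint path comes from this page; the response shape (an OpenAI-style `{"data": [{"id": ...}]}` list) and the helper name are assumptions for illustration.

```python
def pick_stt_model(models_payload, fallback="whisper-1"):
    """Pick the first model id from a GET /v1/models?recommended_for=stt
    response. Assumes an OpenAI-style payload: {"data": [{"id": ...}, ...]}.
    Falls back to the documented default model when the list is empty."""
    for entry in models_payload.get("data", []):
        if "id" in entry:
            return entry["id"]
    return fallback
```

The selected id can then be passed as the `model` form field on the transcription request.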

Request Body

Synchronous request timeout: This non-chat endpoint waits for the routed model to finish. Large inputs, long audio, or large batches can exceed common 30s client defaults, so set your HTTP client timeout to at least 120s.
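A minimal sketch of the timeout advice above, using the `requests` library. The endpoint URL and form fields come from this page; the helper name and the 120-second constant are illustrative, not part of the API.

```python
import requests

# Long audio can exceed common 30 s client defaults; 120 s is a safer floor.
TIMEOUT_SECONDS = 120

def transcribe(path, api_key, model="whisper-1"):
    """POST an audio file to the transcriptions endpoint with a generous
    read timeout, and return the parsed JSON response."""
    with open(path, "rb") as f:
        resp = requests.post(
            "https://api.lemondata.cc/v1/audio/transcriptions",
            headers={"Authorization": f"Bearer {api_key}"},
            files={"file": f},
            data={"model": model},
            timeout=TIMEOUT_SECONDS,  # applies to both connect and read
        )
    resp.raise_for_status()
    return resp.json()
```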
file (file, required)
Audio file to transcribe. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm.

model (string, default: "whisper-1")
Model to use. Currently only whisper-1 is supported.

language (string, optional)
Language of the audio in ISO-639-1 format (e.g., en, zh, ja).

prompt (string, optional)
Optional text to guide the model's style or to continue a previous segment.

response_format (string, default: "json")
Output format: json, text, srt, verbose_json, or vtt.

temperature (number, default: 0)
Sampling temperature, from 0 to 1.

timestamp_granularities (array, optional)
Timestamp granularity: word and/or segment. Requires response_format=verbose_json.
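The parameter constraints above can be captured in a small form builder. The field names and rules come from this page; the helper name and the `timestamp_granularities[]` multipart key are assumptions (the bracketed key follows the OpenAI multipart convention for array fields).

```python
def build_transcription_form(model="whisper-1", language=None, prompt=None,
                             response_format="json", temperature=0.0,
                             timestamp_granularities=None):
    """Assemble the documented form fields, enforcing the constraints:
    temperature in [0, 1], and timestamp_granularities only with
    verbose_json output."""
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    if timestamp_granularities and response_format != "verbose_json":
        raise ValueError(
            "timestamp_granularities requires response_format='verbose_json'")
    form = {"model": model,
            "response_format": response_format,
            "temperature": str(temperature)}
    if language:
        form["language"] = language
    if prompt:
        form["prompt"] = prompt
    if timestamp_granularities:
        form["timestamp_granularities[]"] = timestamp_granularities
    return form
```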

Response

text (string)
The transcribed text.

For verbose_json responses, the following additional fields are returned:

task (string)
Always "transcribe".

language (string)
Detected language.

duration (number)
Audio duration in seconds.

segments (array)
Transcription segments with timestamps.

words (array)
Word-level timestamps (returned only when timestamp_granularities includes word).
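One common use of the verbose_json segments is building subtitles client-side. A sketch, assuming each segment carries `start`, `end`, and `text` keys (the usual Whisper shape; not spelled out on this page):

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02}:{minutes:02}:{secs:02},{ms:03}"

def segments_to_srt(segments):
    """Render verbose_json segments as SRT cue blocks."""
    blocks = []
    for i, seg in enumerate(segments, 1):
        blocks.append(
            f"{i}\n{srt_timestamp(seg['start'])} --> "
            f"{srt_timestamp(seg['end'])}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)
```

For server-rendered subtitles, requesting response_format=srt or vtt directly avoids this step.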
Example request:

curl -X POST "https://api.lemondata.cc/v1/audio/transcriptions" \
  -H "Authorization: Bearer sk-your-api-key" \
  -F file="@audio.mp3" \
  -F model="whisper-1" \
  -F language="en"

Example response:

{
  "text": "Hello, this is a test of the transcription API."
}

Translation

To translate audio to English, use the translations endpoint (shown here with the OpenAI Python SDK):

from openai import OpenAI

client = OpenAI(base_url="https://api.lemondata.cc/v1", api_key="sk-your-api-key")
with open("audio.mp3", "rb") as audio_file:
    response = client.audio.translations.create(
        model="whisper-1",
        file=audio_file,
    )