Documentation Index
Fetch the complete documentation index at: https://docs.lemondata.cc/llms.txt
Use this file to discover all available pages before exploring further.
For coding agents, discover the current recommended STT shortlist first with GET /v1/models?recommended_for=stt, then send the selected model explicitly to this endpoint.
Request Body
Synchronous request timeout: This non-chat endpoint waits for the routed model to finish. Large inputs, long audio, or large batches can exceed common 30s client defaults, so set your HTTP client timeout to at least 120s.
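The discovery call and the timeout guidance above can be sketched roughly as follows, using only the Python standard library. The base URL is a placeholder, and the Bearer authorization header and the helper names (`build_models_request`, `fetch`) are assumptions made for illustration; this page does not confirm them.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder; substitute your actual API base URL
TIMEOUT_SECONDS = 120  # the minimum suggested above for synchronous calls


def build_models_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build the GET /v1/models?recommended_for=stt discovery request."""
    query = urllib.parse.urlencode({"recommended_for": "stt"})
    url = f"{base_url.rstrip('/')}/v1/models?{query}"
    # Bearer authentication is an assumption, not stated on this page.
    headers = {"Authorization": f"Bearer {api_key}"}
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch(request: urllib.request.Request, timeout: float = TIMEOUT_SECONDS) -> bytes:
    """Send a request with an explicit timeout rather than a client default."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Passing the timeout explicitly on every call keeps long transcriptions from being cut off by a shorter default configured elsewhere in the client.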
Audio file to transcribe. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm.
Model to use. Currently only whisper-1 is supported.
Language of the audio in ISO-639-1 format (e.g., en, zh, ja).
Optional text to guide the model’s style or continue a previous segment.
Output format: json, text, srt, verbose_json, or vtt.
Sampling temperature (0 to 1).
Timestamp granularity: word and/or segment. Requires verbose_json.
Response
The transcribed text.
verbose_json:
Always transcribe.
Detected language.
Audio duration in seconds.
Transcription segments with timestamps.
Word-level timestamps (if requested).
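As a hedged illustration of the constraints described on this page, the sketch below validates the request parameters client-side and reads a verbose_json response. The form field names (`model`, `language`, `prompt`, `response_format`, `temperature`, `timestamp_granularities[]`) and the response keys (`text`, `task`, `language`, `duration`, `segments`, `words`) are assumptions based on common OpenAI-compatible conventions; this page does not name them. Both helper functions are hypothetical.

```python
from pathlib import PurePath
from typing import Optional, Sequence

SUPPORTED_EXTENSIONS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}
ALLOWED_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}
ALLOWED_GRANULARITIES = {"word", "segment"}


def build_form_fields(
    filename: str,
    model: str = "whisper-1",
    language: Optional[str] = None,
    prompt: Optional[str] = None,
    response_format: Optional[str] = None,
    temperature: Optional[float] = None,
    timestamp_granularities: Sequence[str] = (),
) -> list[tuple[str, str]]:
    """Validate parameters and return non-file form fields (names are assumed)."""
    extension = PurePath(filename).suffix.lstrip(".").lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {extension!r}")
    if response_format is not None and response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported output format: {response_format!r}")
    if temperature is not None and not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    unknown = set(timestamp_granularities) - ALLOWED_GRANULARITIES
    if unknown:
        raise ValueError(f"unknown timestamp granularity: {sorted(unknown)}")
    # Timestamp granularity is only valid together with verbose_json.
    if timestamp_granularities and response_format != "verbose_json":
        raise ValueError("timestamp granularity requires verbose_json")

    fields = [("model", model)]
    optional = (("language", language), ("prompt", prompt),
                ("response_format", response_format))
    fields += [(name, value) for name, value in optional if value is not None]
    if temperature is not None:
        fields.append(("temperature", str(temperature)))
    # Repeated field so that both "word" and "segment" can be requested.
    fields += [("timestamp_granularities[]", g) for g in timestamp_granularities]
    return fields


def summarize_verbose_json(payload: dict) -> dict:
    """Pull out the verbose_json fields listed above (key names are assumed)."""
    return {
        "text": payload.get("text", ""),
        "task": payload.get("task"),
        "language": payload.get("language"),
        "duration": payload.get("duration"),
        "segment_count": len(payload.get("segments") or []),
        "word_count": len(payload.get("words") or []),
    }
```

Checking the verbose_json requirement before sending avoids waiting on a long synchronous call only to have the request rejected.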