POST /v1/audio/transcriptions
curl -X POST "https://api.lemondata.cc/v1/audio/transcriptions" \
  -H "Authorization: Bearer sk-your-api-key" \
  -F file="@audio.mp3" \
  -F model="whisper-1" \
  -F language="en"
{
  "text": "Hello, this is a test of the transcription API."
}

Request Body

file (file, required): Audio file to transcribe. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm.
model (string, required): Model to use: whisper-1.
language (string): Language of the audio in ISO-639-1 format (e.g., en, zh, ja).
prompt (string): Optional text to guide the model’s style or continue a previous segment.
response_format (string, default: "json"): Output format: json, text, srt, verbose_json, vtt.
temperature (number, default: 0): Sampling temperature (0 to 1).
timestamp_granularities (array): Timestamp granularity: word and/or segment. Requires verbose_json.
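
The constraints among the parameters above (the allowed response_format values, the 0 to 1 temperature range, and timestamp_granularities requiring verbose_json) can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API itself: the helper name build_transcription_fields is hypothetical, and sending array items as repeated "timestamp_granularities[]" form fields is an assumption borrowed from OpenAI-compatible APIs that this page does not confirm.

```python
ALLOWED_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}
ALLOWED_GRANULARITIES = {"word", "segment"}


def build_transcription_fields(model="whisper-1", language=None, prompt=None,
                               response_format="json", temperature=0.0,
                               timestamp_granularities=None):
    """Return the non-file multipart form fields as a list of (name, value) pairs."""
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")

    granularities = list(timestamp_granularities or [])
    if granularities and response_format != "verbose_json":
        raise ValueError("timestamp_granularities requires response_format='verbose_json'")

    fields = [
        ("model", model),
        ("response_format", response_format),
        ("temperature", str(temperature)),
    ]
    # Optional parameters are only sent when the caller sets them.
    if language:
        fields.append(("language", language))
    if prompt:
        fields.append(("prompt", prompt))
    for granularity in granularities:
        if granularity not in ALLOWED_GRANULARITIES:
            raise ValueError(f"unsupported granularity: {granularity!r}")
        # Assumption: arrays are encoded as repeated "name[]" form fields.
        fields.append(("timestamp_granularities[]", granularity))
    return fields


fields = build_transcription_fields(
    language="en",
    response_format="verbose_json",
    timestamp_granularities=["word", "segment"],
)
```

The resulting list can be passed as the form data of any multipart-capable HTTP client, with the audio file attached separately under the file field and the Authorization header set as in the curl example.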

Response

text (string): The transcribed text.

For verbose_json:
task (string): Always transcribe.
language (string): Detected language.
duration (number): Audio duration in seconds.
segments (array): Transcription segments with timestamps.
words (array): Word-level timestamps (if requested).
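
As an illustration of working with a verbose_json response, the sketch below turns the segments array into SRT-style captions. This page does not list the fields inside each segment, so the start, end, and text keys are assumptions modeled on OpenAI-compatible responses, and the sample payload is invented for the example rather than captured from the API.

```python
import json


def format_timestamp(seconds):
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(payload):
    """Build SRT caption blocks from a verbose_json payload.

    Assumes each segment carries "start", "end" (seconds) and "text".
    """
    blocks = []
    for index, segment in enumerate(payload.get("segments", []), start=1):
        start = format_timestamp(segment["start"])
        end = format_timestamp(segment["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{segment['text'].strip()}")
    return "\n\n".join(blocks)


# Hypothetical verbose_json payload, for illustration only.
sample = json.loads("""
{
  "task": "transcribe",
  "language": "en",
  "duration": 3.5,
  "text": "Hello, this is a test of the transcription API.",
  "segments": [
    {"start": 0.0, "end": 3.5, "text": " Hello, this is a test of the transcription API."}
  ]
}
""")

print(segments_to_srt(sample))
```

If captions are all that is needed, requesting response_format srt or vtt directly avoids this conversion; parsing verbose_json is mainly useful when the timestamps feed further processing.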

Translation

To translate audio to English, use the translations endpoint:
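from openai import OpenAI  # assumes the OpenAI Python SDK is the client in use

client = OpenAI(api_key="sk-your-api-key", base_url="https://api.lemondata.cc/v1")
with open("audio.mp3", "rb") as audio_file:  # the file must be opened in binary mode
    response = client.audio.translations.create(
        model="whisper-1",
        file=audio_file
    )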