POST /v1/audio/translations
curl -X POST "https://api.lemondata.cc/v1/audio/translations" \
  -H "Authorization: Bearer sk-your-api-key" \
  -F "file=@german_audio.mp3" \
  -F "model=whisper-1"
{
  "text": "Hello, my name is Wolfgang and I come from Germany. Where are you from?"
}

Overview

Translates audio in any supported language into English text. Unlike transcription, this endpoint always outputs English text regardless of the input language.
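The curl request shown for this endpoint could also be sent from Python. The following is a minimal sketch using only the standard library, not an official client: the helper names (`encode_multipart`, `build_translation_request`, `translate`) are hypothetical, and sending the file part as `application/octet-stream` is an assumption about what the server accepts.

```python
import json
import os
import urllib.request
import uuid

API_URL = "https://api.lemondata.cc/v1/audio/translations"


def encode_multipart(fields, file_name, file_bytes, boundary=None):
    """Encode text fields plus one file part as multipart/form-data.

    Returns a (content_type_header, body_bytes) tuple.
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode()
        )
    # The file part; the generic content type is an assumption.
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)


def build_translation_request(api_key, file_name, file_bytes, model="whisper-1"):
    """Build (but do not send) the POST request for the translations endpoint."""
    content_type, body = encode_multipart({"model": model}, file_name, file_bytes)
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
    )


def translate(api_key, path):
    """Send the audio file at `path` and return the English text (network I/O)."""
    with open(path, "rb") as f:
        req = build_translation_request(api_key, os.path.basename(path), f.read())
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["text"]
```

Calling `translate("sk-your-api-key", "german_audio.mp3")` would perform the same request as the curl example. Building the request separately from sending it keeps the encoding step easy to inspect without network access.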

Request Body

file (file, required)
The audio file to translate. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm. Maximum file size is 25 MB.

model (string, default: "whisper-1")
The model to use. Currently only whisper-1 is supported.

prompt (string)
Optional text to guide the model's style or to continue a previous audio segment. Should be in English.

response_format (string, default: "json")
The format of the output. Options: json, text, srt, verbose_json, vtt.

temperature (number)
The sampling temperature, between 0 and 1. Higher values such as 0.8 produce more random output; lower values such as 0.2 make the output more focused and deterministic.
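The limits listed for these parameters can be checked on the client before uploading. The sketch below is illustrative only: `check_audio` and `build_form_fields` are hypothetical helpers, and treating "25 MB" as 25 × 1024 × 1024 bytes is an assumption, since the page does not say which unit convention applies.

```python
# Values copied from the parameter descriptions above.
SUPPORTED_FORMATS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}
RESPONSE_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}
MAX_FILE_BYTES = 25 * 1024 * 1024  # assumption: "25 MB" read as 25 MiB


def check_audio(filename, size_bytes):
    """Raise ValueError if the extension or size is outside the documented limits."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if ext not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported audio format: {ext!r}")
    if size_bytes > MAX_FILE_BYTES:
        raise ValueError(f"file is {size_bytes} bytes; the limit is {MAX_FILE_BYTES}")


def build_form_fields(model="whisper-1", prompt=None, response_format="json", temperature=None):
    """Return the non-file form fields as a dict of strings, omitting unset options."""
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if temperature is not None and not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    fields = {"model": model, "response_format": response_format}
    if prompt is not None:
        fields["prompt"] = prompt
    if temperature is not None:
        fields["temperature"] = str(temperature)
    return fields
```

For example, `build_form_fields(response_format="verbose_json", temperature=0.2)` yields the fields to send alongside the file part of the multipart request.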

Response

text (string)
The translated text in English.

When response_format is verbose_json, the response also includes:

language (string)
The detected language of the input audio.

duration (number)
The duration of the input audio in seconds.

segments (array)
Segments of the translated text with timestamps.
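A verbose_json response could be unpacked along the following lines. This is a sketch under stated assumptions: the page does not document the shape of each segment, so the `start`, `end`, and `text` keys are assumed here (they follow the common Whisper-style layout), and the sample payload is invented for illustration rather than real API output.

```python
import json


def summarize_translation(body):
    """Extract the documented top-level fields from a JSON response body."""
    data = json.loads(body)
    segments = data.get("segments") or []
    return {
        "text": data["text"],
        "language": data.get("language"),  # present only for verbose_json
        "duration": data.get("duration"),  # present only for verbose_json
        "segment_count": len(segments),
    }


def format_segments(body):
    """Return one line per segment, assuming 'start', 'end' and 'text' keys."""
    data = json.loads(body)
    lines = []
    for seg in data.get("segments") or []:
        lines.append(f"[{seg['start']:.2f}s - {seg['end']:.2f}s] {seg['text'].strip()}")
    return lines


# Illustrative payload only; not captured from the API.
SAMPLE = json.dumps({
    "text": "Hello, my name is Wolfgang.",
    "language": "german",
    "duration": 3.5,
    "segments": [{"start": 0.0, "end": 3.5, "text": " Hello, my name is Wolfgang."}],
})

if __name__ == "__main__":
    print(summarize_translation(SAMPLE))
    print("\n".join(format_segments(SAMPLE)))
```

Because the optional fields are read with `get`, the same helper also accepts the default json format, where only `text` is present.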

Translation vs Transcription

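| Feature            | Translation                      | Transcription              |
|--------------------|----------------------------------|----------------------------|
| Output language    | Always English                   | Same as input              |
| Use case           | Convert foreign audio to English | Preserve original language |
| Language parameter | Not applicable                   | Optional hint              |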
The translation endpoint detects the source language automatically and always translates to English. The language parameter accepted by the transcription endpoint is ignored here.
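The difference between the two endpoints can be captured in a small routing helper. This is only a sketch: the helper names are hypothetical, and the transcription path `/v1/audio/transcriptions` is an assumption based on the naming of the translation path, since this page does not state it.

```python
BASE_URL = "https://api.lemondata.cc/v1"


def audio_endpoint(want_english):
    """Pick the endpoint: translations for English output, transcriptions otherwise."""
    if want_english:
        return f"{BASE_URL}/audio/translations"
    # Assumed path; not given on this page.
    return f"{BASE_URL}/audio/transcriptions"


def request_fields(want_english, language=None):
    """Build form fields, sending the language hint only for transcription."""
    fields = {"model": "whisper-1"}
    if not want_english and language:
        fields["language"] = language
    return fields
```

Dropping the language hint for translation requests mirrors the comparison above, where the parameter is not applicable to this endpoint.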
