Create Transcription

curl -X POST "https://api.lemondata.cc/v1/audio/transcriptions" \
  -H "Authorization: Bearer sk-your-api-key" \
  -F file="@audio.mp3" \
  -F model="whisper-1" \
  -F language="en"

{
  "text": "Hello, this is a test of the transcription API."
}

POST /v1/audio/transcriptions


Request Body

file

required

Audio file to transcribe. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm.

model

string

required

Model to use: whisper-1.

language

string

Language of the audio in ISO-639-1 format (e.g., en, zh, ja).

prompt

string

Optional text to guide the model’s style or continue a previous segment.

response_format

string

default:"json"

Output format: json, text, srt, verbose_json, vtt.

temperature

number

default:"0"

Sampling temperature (0 to 1).

timestamp_granularities

array

Timestamp granularity: word and/or segment. Requires verbose_json.
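The constraints listed above (supported file formats, allowed response formats, the 0 to 1 temperature range, and the verbose_json requirement for timestamps) can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper names `is_supported_audio` and `build_form_fields` are hypothetical, and sending the array as repeated `timestamp_granularities[]` form fields is an assumption based on the convention of OpenAI-compatible APIs.

```python
import os

SUPPORTED_FORMATS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}
ALLOWED_RESPONSE_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}
ALLOWED_GRANULARITIES = {"word", "segment"}


def is_supported_audio(path):
    """Return True if the file extension is one of the documented formats."""
    extension = os.path.splitext(path)[1].lstrip(".").lower()
    return extension in SUPPORTED_FORMATS


def build_form_fields(model="whisper-1", language=None, prompt=None,
                      response_format="json", temperature=0.0,
                      timestamp_granularities=None):
    """Validate the options and return the non-file form fields as (name, value) pairs."""
    if response_format not in ALLOWED_RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    granularities = list(timestamp_granularities or [])
    unknown = set(granularities) - ALLOWED_GRANULARITIES
    if unknown:
        raise ValueError(f"unknown timestamp granularity: {sorted(unknown)}")
    if granularities and response_format != "verbose_json":
        raise ValueError("timestamp_granularities requires response_format='verbose_json'")

    fields = [("model", model),
              ("response_format", response_format),
              ("temperature", str(temperature))]
    if language:
        fields.append(("language", language))
    if prompt:
        fields.append(("prompt", prompt))
    # Assumption: array values are sent as repeated "timestamp_granularities[]" fields.
    fields += [("timestamp_granularities[]", g) for g in granularities]
    return fields
```

The returned pairs could then be passed as the form data of a multipart POST alongside the `file` part, for example with curl's `-F` flags or an HTTP client of your choice.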

Response

text

string

The transcribed text.

For verbose_json:

task

string

Always transcribe.

language

string

Detected language.

duration

number

Audio duration in seconds.

segments

array

Transcription segments with timestamps.

words

array

Word-level timestamps (if requested).
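To illustrate how a verbose_json response might be consumed, the sketch below turns the segments array into timestamped lines. The sample payload is hypothetical, and the per-segment keys (`start`, `end`, `text`) are assumed from the shape used by OpenAI-compatible APIs; this page does not specify them, so verify against a real response before relying on them.

```python
import json

# Hypothetical verbose_json payload; the segment keys are an assumption.
SAMPLE = json.loads("""
{
  "task": "transcribe",
  "language": "en",
  "duration": 3.2,
  "text": "Hello, this is a test of the transcription API.",
  "segments": [
    {"start": 0.0, "end": 3.2, "text": " Hello, this is a test of the transcription API."}
  ]
}
""")


def format_timestamp(seconds):
    """Render a time offset in seconds as HH:MM:SS.mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def segment_lines(payload):
    """Return one '[start -> end] text' line per segment, or an empty list if none."""
    return [
        f"[{format_timestamp(s['start'])} -> {format_timestamp(s['end'])}] {s['text'].strip()}"
        for s in payload.get("segments", [])
    ]


for line in segment_lines(SAMPLE):
    print(line)
```

Using `payload.get("segments", [])` keeps the helper safe for plain json responses, which contain only the text field.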


Translation

To translate audio to English, use the translations endpoint:

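from openai import OpenAI

# Assumption: the OpenAI Python SDK, pointed at the LemonData base URL.
client = OpenAI(api_key="sk-your-api-key", base_url="https://api.lemondata.cc/v1")

with open("audio.mp3", "rb") as audio_file:
    response = client.audio.translations.create(model="whisper-1", file=audio_file)
print(response.text)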
