Create embedding


POST /v1/embeddings

Creates an embedding vector representing the input text.

Request Body

model

string

Required

ID of the embedding model to use (e.g., text-embedding-3-small).

input

string | array

Required

Input text to embed. Can be a string or an array of strings.

encoding_format

string

Default: "float"

Format of the returned embeddings: float or base64.

dimensions

integer

Number of output dimensions (depends on the model).

user

string

A unique identifier representing the end user, used for abuse monitoring.
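The fields above map directly onto a JSON request body. As a minimal sketch, the helper below assembles that body and leaves optional fields out when they are not set. The name `build_embedding_request` is hypothetical and not part of any SDK, and the checks cover only what this page states; the server may enforce more.

```python
def build_embedding_request(model, input_text, encoding_format=None,
                            dimensions=None, user=None):
    """Assemble the JSON body for POST /v1/embeddings (hypothetical helper)."""
    is_string = isinstance(input_text, str)
    is_string_list = isinstance(input_text, list) and all(
        isinstance(item, str) for item in input_text
    )
    if not (is_string or is_string_list):
        raise TypeError("input must be a string or a list of strings")
    if encoding_format not in (None, "float", "base64"):
        raise ValueError("encoding_format must be 'float' or 'base64'")

    # The two required fields are always present.
    body = {"model": model, "input": input_text}
    # Optional fields are sent only when set, so server defaults apply otherwise.
    if encoding_format is not None:
        body["encoding_format"] = encoding_format
    if dimensions is not None:
        body["dimensions"] = dimensions
    if user is not None:
        body["user"] = user
    return body
```

Calling `build_embedding_request("text-embedding-3-small", "hello")` yields a body with only `model` and `input`, which matches the curl example on this page.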

Available Models

Model	Dimensions	Description
`text-embedding-3-large`	3072	Best quality
`text-embedding-3-small`	1536	Balanced
`text-embedding-ada-002`	1536	Legacy
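The table can be kept as a small lookup when code needs to know the vector size up front, for example to size a vector-store column. This is only a sketch: the dictionary mirrors the table above, and `check_dimensions` is a hypothetical helper. It assumes a requested `dimensions` value may not exceed the model's listed size, which this page does not state explicitly.

```python
# Default output sizes, copied from the table above.
MODEL_DIMENSIONS = {
    "text-embedding-3-large": 3072,
    "text-embedding-3-small": 1536,
    "text-embedding-ada-002": 1536,
}


def check_dimensions(model, dimensions=None):
    """Return the vector size to expect for `model` (hypothetical helper)."""
    if model not in MODEL_DIMENSIONS:
        raise KeyError(f"unknown model: {model}")
    default = MODEL_DIMENSIONS[model]
    if dimensions is None:
        return default
    # Assumption: a custom size is a positive integer no larger than the default.
    if not isinstance(dimensions, int) or not 0 < dimensions <= default:
        raise ValueError(f"dimensions must be between 1 and {default} for {model}")
    return dimensions
```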

Response

object

string

Always list.

data

array

Array of embedding objects. Each object contains:

object (string): embedding
index (integer): Index in the input array
embedding (array): The embedding vector

model

string

The model used.

usage

object

Token usage, including prompt_tokens and total_tokens.
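When `encoding_format` is set to `base64`, the `embedding` field presumably arrives as a string rather than an array of numbers. The sketch below assumes that string is a base64-encoded buffer of little-endian 32-bit floats. That is the convention in OpenAI-compatible APIs, but it is not confirmed on this page, and `decode_base64_embedding` is a hypothetical helper name.

```python
import base64
import struct


def decode_base64_embedding(encoded):
    """Decode a base64 string into a list of floats.

    Assumption: the payload is packed little-endian float32 values.
    """
    raw = base64.b64decode(encoded)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    # "<" selects little-endian, "f" a 4-byte float, repeated `count` times.
    return list(struct.unpack(f"<{count}f", raw))
```

If the decoded length does not match the dimension listed for the model, the byte-layout assumption is probably wrong for this API.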

curl -X POST "https://api.lemondata.cc/v1/embeddings" \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "The quick brown fox jumps over the lazy dog"
  }'

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0023, -0.0194, 0.0081, ...]
    }
  ],
  "model": "text-embedding-3-small",
  "usage": {
    "prompt_tokens": 9,
    "total_tokens": 9
  }
}
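Each entry in `data` carries an `index` that ties it back to a position in the request's `input`. As a sketch, the hypothetical helper below takes the response body as JSON text and returns the vectors ordered by that index, together with the token count from `usage`. The sort is defensive: this page does not say whether `data` always comes back in input order.

```python
import json


def extract_embeddings(response_text):
    """Return (vectors ordered by index, total_tokens) from a response body."""
    payload = json.loads(response_text)
    if payload.get("object") != "list":
        raise ValueError("unexpected response object type")
    items = sorted(payload["data"], key=lambda item: item["index"])
    vectors = [item["embedding"] for item in items]
    return vectors, payload["usage"]["total_tokens"]


# Sample shaped like the response above, with shortened, made-up vectors.
sample = """
{
  "object": "list",
  "data": [
    {"object": "embedding", "index": 1, "embedding": [0.5, 0.25]},
    {"object": "embedding", "index": 0, "embedding": [0.0023, -0.0194]}
  ],
  "model": "text-embedding-3-small",
  "usage": {"prompt_tokens": 9, "total_tokens": 9}
}
"""
vectors, total_tokens = extract_embeddings(sample)
print(len(vectors), total_tokens)
```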

Batch Embedding

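# Assumption: `client` is an OpenAI-compatible SDK client pointed at this API;
# the page itself does not show how `client` is created.
from openai import OpenAI

client = OpenAI(api_key="sk-your-api-key", base_url="https://api.lemondata.cc/v1")

# Embed multiple texts in a single request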
response = client.embeddings.create(
    model="text-embedding-3-small",
    input=[
        "First document text",
        "Second document text",
        "Third document text"
    ]
)

for i, data in enumerate(response.data):
    print(f"Document {i}: {len(data.embedding)} dimensions")
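For lists too large to send at once, the batch call can be repeated over slices of the input. This page does not state a maximum batch size, so `batch_size` below is an arbitrary assumption, and `embed_batch` is a hypothetical callable standing in for the API call.

```python
def embed_in_batches(texts, embed_batch, batch_size=100):
    """Embed `texts` in slices, preserving input order.

    `embed_batch` is any callable that takes a list of strings and returns
    one vector per string, e.g. a thin wrapper around the API call.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    vectors = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        result = embed_batch(batch)
        if len(result) != len(batch):
            raise RuntimeError("embed_batch returned the wrong number of vectors")
        vectors.extend(result)
    return vectors


# Stand-in for the real call: "embeds" each text as its length.
def fake_embed(batch):
    return [[float(len(text))] for text in batch]


print(embed_in_batches(["a", "bb", "ccc"], fake_embed, batch_size=2))
```

Passing the callable in keeps the slicing logic testable without network access; a real wrapper would call the embeddings endpoint and return the vectors in input order.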
