LlamaIndex

Overview

Type: Framework or platform
Primary path: OpenAI-compatible via OpenAILike
Support level: Supported via OpenAILike

For LemonData, the more robust LlamaIndex configuration uses the OpenAI-compatible integrations rather than the built-in OpenAI classes. The current LlamaIndex documentation explicitly recommends OpenAILike for third-party OpenAI-compatible endpoints, because the built-in OpenAI classes derive metadata from official model names. In other words, treat OpenAILike as the supported LemonData path here, not the built-in OpenAI classes.

Installation

pip install llama-index-core \
  llama-index-readers-file \
  llama-index-llms-openai-like \
  llama-index-embeddings-openai-like

Basic Configuration

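from llama_index.core import Settings
from llama_index.llms.openai_like import OpenAILike
from llama_index.embeddings.openai_like import OpenAILikeEmbedding

# Chat/completion model served through LemonData's OpenAI-compatible endpoint.
llm = OpenAILike(
    model="gpt-5.4",
    api_base="https://api.lemondata.cc/v1",
    api_key="sk-your-lemondata-key",  # placeholder, replace with your own key
    is_chat_model=True,  # use the chat completions endpoint
)

# Embedding model used for indexing and retrieval.
embed_model = OpenAILikeEmbedding(
    model_name="text-embedding-3-small",
    api_base="https://api.lemondata.cc/v1",
    api_key="sk-your-lemondata-key",  # placeholder, replace with your own key
)

# Register both as global defaults so indexes and query engines pick them up.
Settings.llm = llm
Settings.embed_model = embed_model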

Basic Usage

response = llm.complete("Explain LemonData in one sentence.")
print(response.text)

Chat

from llama_index.core.llms import ChatMessage

messages = [
    ChatMessage(role="system", content="You are a helpful assistant."),
    ChatMessage(role="user", content="What is the capital of France?")
]

response = llm.chat(messages)
print(response.message.content)

Streaming

for chunk in llm.stream_complete("Write a short poem about AI."):
    print(chunk.delta, end="", flush=True)
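
If the full text is needed after streaming finishes, the deltas can be accumulated while they are displayed. The following is a minimal sketch; collect_stream and its on_delta callback are hypothetical names introduced for illustration, and the only assumption about each chunk is that it exposes a delta attribute, as in the loop above.

```python
def collect_stream(chunks, on_delta=None):
    """Consume a stream of chunks and return the concatenated text.

    `chunks` is any iterable whose items expose a `.delta` attribute.
    `on_delta` is an optional callback invoked with each text fragment.
    """
    parts = []
    for chunk in chunks:
        # Treat a missing delta (None) as an empty fragment.
        delta = chunk.delta or ""
        if on_delta is not None:
            on_delta(delta)
        parts.append(delta)
    return "".join(parts)
```

With the llm from the configuration section, this could be called as collect_stream(llm.stream_complete("Write a short poem about AI."), on_delta=lambda d: print(d, end="", flush=True)).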

Embeddings

vector = embed_model.get_text_embedding("Hello, world!")
print(vector[:5])

RAG with Documents

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()
response = query_engine.query("What is in my documents?")
print(response)

Chat Engine

chat_engine = index.as_chat_engine(chat_mode="condense_question")

response = chat_engine.chat("What is LemonData?")
print(response)

response = chat_engine.chat("How many models does it support?")
print(response)

Async Usage

import asyncio

async def main():
    response = await llm.acomplete("Hello!")
    print(response.text)

asyncio.run(main())
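
The async API is most useful when several requests are in flight at once. The sketch below sends a batch of prompts concurrently with asyncio.gather; complete_all is a hypothetical helper name, and it assumes only that the object passed in has the acomplete method used above and that each response has a text attribute.

```python
import asyncio


async def complete_all(llm, prompts):
    """Send all prompts concurrently and return the response texts in prompt order."""
    # asyncio.gather runs the coroutines concurrently and preserves input order.
    responses = await asyncio.gather(*(llm.acomplete(p) for p in prompts))
    return [r.text for r in responses]
```

With the llm from the configuration section, a call might look like asyncio.run(complete_all(llm, ["Hello!", "Explain LemonData in one sentence."])).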

Best Practices

Use OpenAILike for LemonData

Prefer llama_index.llms.openai_like.OpenAILike and llama_index.embeddings.openai_like.OpenAILikeEmbedding for LemonData and other third-party OpenAI-compatible gateways.

Set api_base explicitly

Pass api_base="https://api.lemondata.cc/v1" directly in code instead of relying on older OpenAI environment variable names.
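
One way to follow this practice is to keep the base URL as a constant in code and read only the secret from the environment. This is a minimal sketch: the helper lemondata_kwargs and the variable name LEMONDATA_API_KEY are assumptions made for illustration, not names defined by LemonData or LlamaIndex.

```python
import os

# Endpoint from this guide, kept in code rather than read from the environment.
LEMONDATA_API_BASE = "https://api.lemondata.cc/v1"


def lemondata_kwargs(env=None):
    """Build the connection keyword arguments shared by the LLM and the embedding model.

    LEMONDATA_API_KEY is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    api_key = env.get("LEMONDATA_API_KEY")
    if not api_key:
        raise RuntimeError("Set LEMONDATA_API_KEY before creating LemonData clients.")
    return {"api_base": LEMONDATA_API_BASE, "api_key": api_key}
```

The result can be unpacked into the constructors from the configuration section, for example OpenAILike(model="gpt-5.4", is_chat_model=True, **lemondata_kwargs()).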

Keep model roles separate

Use chat/reasoning models for synthesis and text-embedding-3-small or text-embedding-3-large for retrieval.
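
To make the separation explicit in application code, the two roles can be named in one place and checked at startup. The sketch below is illustrative only: ModelRoles and EMBEDDING_MODELS are hypothetical names, and the check covers just the two embedding models mentioned in this guide.

```python
from dataclasses import dataclass

# Embedding model names mentioned in this guide.
EMBEDDING_MODELS = frozenset({"text-embedding-3-small", "text-embedding-3-large"})


@dataclass(frozen=True)
class ModelRoles:
    """Hypothetical container that names one model per role."""

    synthesis_model: str  # chat/reasoning model used to write answers
    retrieval_model: str  # embedding model used to index and search documents

    def __post_init__(self):
        # Catch the mix-up of passing an embedding model as the LLM, or vice versa.
        if self.synthesis_model in EMBEDDING_MODELS:
            raise ValueError(f"{self.synthesis_model!r} is an embedding model, not a chat model")
        if self.retrieval_model not in EMBEDDING_MODELS:
            raise ValueError(f"{self.retrieval_model!r} is not a known embedding model")


roles = ModelRoles(synthesis_model="gpt-5.4", retrieval_model="text-embedding-3-small")
```

The two fields would then feed the model argument of OpenAILike and the model_name argument of OpenAILikeEmbedding respectively.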

