Quickstart
Base URL:
https://api.speka.online/v1

Install the OpenAI SDK and make your first call:
from openai import OpenAI

client = OpenAI(
    base_url="https://api.speka.online/v1",
    api_key="sk-speka-live-...",  # your Speka key
)

resp = client.chat.completions.create(
    model="meta/llama-3.3-70b-instruct",
    messages=[{"role": "user", "content": "Write a haiku about GPUs."}],
    stream=True,
)

for chunk in resp:
    print(chunk.choices[0].delta.content or "", end="")

Authentication
Pass your key in the Authorization header as a bearer token. Create and revoke keys in your dashboard. Keys are shown once — store them securely.
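A minimal sketch of building that header in Python while keeping the key out of source code. The helper name auth_headers is hypothetical, and reading the key from the SPEKA_API_KEY environment variable is an assumption borrowed from the JavaScript example on this page.

```python
import os
from typing import Dict, Optional


def auth_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Return HTTP headers that carry the key as a bearer token.

    Hypothetical helper: falls back to the SPEKA_API_KEY environment
    variable (an assumed name) when no key is passed explicitly.
    """
    key = api_key or os.environ.get("SPEKA_API_KEY", "")
    if not key:
        raise ValueError("No API key: pass one or set SPEKA_API_KEY")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

On the wire, the resulting header has this form: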
Authorization: Bearer sk-speka-live-...

Chat completions
POST /v1/chat/completions — supports messages, temperature, max_tokens, tools, response_format and more.
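The other examples on this page stream their output. As a complement, here is a sketch of a non-streaming call in Python that sets two of the parameters listed above. The helper names and default values are illustrative, and reading the key from SPEKA_API_KEY is an assumption.

```python
import os


def build_chat_params(prompt: str,
                      model: str = "meta/llama-3.3-70b-instruct",
                      temperature: float = 0.2,
                      max_tokens: int = 256) -> dict:
    """Assemble keyword arguments for chat.completions.create (illustrative defaults)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


def ask(prompt: str) -> str:
    """Send one non-streaming request and return the assistant's reply text."""
    # Imported lazily so build_chat_params can be used without the SDK installed.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.speka.online/v1",
        api_key=os.environ["SPEKA_API_KEY"],  # assumed variable name
    )
    resp = client.chat.completions.create(**build_chat_params(prompt))
    return resp.choices[0].message.content or ""
```

Calling ask("Write a haiku about GPUs.") would return the whole reply as one string once generation finishes.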
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.speka.online/v1",
  apiKey: process.env.SPEKA_API_KEY, // sk-speka-live-...
});

const stream = await client.chat.completions.create({
  model: "deepseek-ai/deepseek-r1",
  messages: [{ role: "user", content: "Solve: 23 * 47" }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}

Streaming
Set stream: true to receive server-sent events. We proxy the upstream stream directly, so time-to-first-token stays low.
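A sketch of consuming such a stream in Python when the full reply is also needed afterwards. It assumes chunks shaped like those in the Quickstart loop (choices[0].delta.content) and skips chunks that have no choices or no text, which some OpenAI-compatible streams emit. collect_stream is a hypothetical helper name.

```python
import sys
from typing import Any, Iterable, List


def collect_stream(chunks: Iterable[Any], out=sys.stdout) -> str:
    """Write each text delta to `out` as it arrives and return the joined reply."""
    parts: List[str] = []
    for chunk in chunks:
        if not chunk.choices:      # e.g. a chunk that carries only metadata
            continue
        text = chunk.choices[0].delta.content
        if text:                   # role-only or final chunks carry no text
            parts.append(text)
            out.write(text)
            out.flush()
    return "".join(parts)
```

Passing the resp object from the Quickstart, as in reply = collect_stream(resp), would print tokens as they arrive and leave the complete text in reply.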
Embeddings
POST /v1/embeddings returns vectors for retrieval and semantic search.
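Since the vectors are meant for retrieval and semantic search, the following sketch fetches embeddings with the Python standard library and ranks documents against a query by cosine similarity. It assumes the response follows the OpenAI embeddings shape (a data list whose items each carry an embedding array, in input order) and that the key is in SPEKA_API_KEY. The names embed, cosine and rank are hypothetical.

```python
import json
import math
import os
import urllib.request
from typing import List, Sequence

EMBEDDINGS_URL = "https://api.speka.online/v1/embeddings"


def embed(texts: List[str],
          model: str = "nvidia/llama-3.2-nv-embedqa-1b-v2") -> List[List[float]]:
    """POST the texts and return one vector per input (assumed response shape)."""
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    req = urllib.request.Request(
        EMBEDDINGS_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {os.environ['SPEKA_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    return [item["embedding"] for item in payload["data"]]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]) -> List[int]:
    """Return document indices ordered from most to least similar to the query."""
    scores = [cosine(query_vec, v) for v in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)
```

The equivalent raw request with curl: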
curl https://api.speka.online/v1/embeddings \
  -H "Authorization: Bearer sk-speka-live-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nvidia/llama-3.2-nv-embedqa-1b-v2",
    "input": ["The quick brown fox"]
  }'

Image generation
POST /v1/images/generations with an image model id such as black-forest-labs/flux.1-dev returns generated images.
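A sketch of calling this endpoint from Python with the standard library. The request fields (prompt, n) and the response shape (a data list with either b64_json or url per image) are assumed to follow the OpenAI images API, which this page does not confirm. The helper names and the SPEKA_API_KEY variable are hypothetical.

```python
import base64
import json
import os
import urllib.request

IMAGES_URL = "https://api.speka.online/v1/images/generations"


def build_image_request(prompt: str, api_key: str,
                        model: str = "black-forest-labs/flux.1-dev") -> urllib.request.Request:
    """Build (but do not send) the POST request. Body fields are assumed."""
    body = json.dumps({"model": model, "prompt": prompt, "n": 1}).encode("utf-8")
    return urllib.request.Request(
        IMAGES_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def generate_image(prompt: str, out_path: str = "image.png") -> str:
    """Send the request; return the saved file path, or the image URL if one is given."""
    req = build_image_request(prompt, os.environ["SPEKA_API_KEY"])
    with urllib.request.urlopen(req) as resp:
        item = json.load(resp)["data"][0]
    if item.get("b64_json"):
        # Inline image data: decode and write it to disk.
        with open(out_path, "wb") as f:
            f.write(base64.b64decode(item["b64_json"]))
        return out_path
    return item["url"]
```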
Errors & rate limits
Errors use the OpenAI envelope: { "error": { "message", "type", "code" } }. Common statuses:
- 401: Missing or invalid key.
- 402: Usage allowance exhausted. Upgrade or add credits.
- 429: Rate limit exceeded. See the Retry-After header.
- 5xx: Upstream issue. We auto-retry across capacity.
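A sketch of client-side handling for these statuses: parse the error envelope, retry a 429 after the interval given in Retry-After, retry 5xx with exponential backoff, and fail fast on 401 and 402. The backoff constants and helper names are illustrative choices, and Retry-After is assumed to hold a number of seconds.

```python
import json
import time
import urllib.error
import urllib.request
from typing import Optional


def parse_error(body: bytes) -> str:
    """Format the error envelope as "type/code: message", tolerating odd bodies."""
    try:
        err = json.loads(body)["error"]
        return f'{err.get("type")}/{err.get("code")}: {err.get("message")}'
    except (ValueError, KeyError, TypeError, AttributeError):
        return body.decode("utf-8", errors="replace")


def retry_delay(status: int, retry_after: Optional[str], attempt: int) -> Optional[float]:
    """Return seconds to wait before retrying, or None if the error is not retryable."""
    if status == 429:
        try:
            return float(retry_after)   # assumed to be a number of seconds
        except (TypeError, ValueError):
            return float(2 ** attempt)  # header missing or not numeric
    if 500 <= status < 600:
        return float(2 ** attempt)      # 1s, 2s, 4s, ...
    return None                         # 401, 402 and other client errors


def send_with_retries(req: urllib.request.Request, max_attempts: int = 4) -> bytes:
    """Send req, retrying retryable failures up to max_attempts times in total."""
    for attempt in range(max_attempts):
        try:
            with urllib.request.urlopen(req) as resp:
                return resp.read()
        except urllib.error.HTTPError as exc:
            delay = retry_delay(exc.code, exc.headers.get("Retry-After"), attempt)
            if delay is None or attempt == max_attempts - 1:
                raise RuntimeError(f"HTTP {exc.code}: {parse_error(exc.read())}") from exc
            time.sleep(delay)
    raise RuntimeError("max_attempts must be at least 1")
```

Keeping the delay decision in a pure function makes the policy easy to test without network access, and the same function could back a retry wrapper around the OpenAI SDK client.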