Streaming

Handle streaming responses with Server-Sent Events

Overview

Set stream: true in your request to receive the response as Server-Sent Events (SSE). Your application can then display tokens as they are generated instead of waiting for the complete response.

Foza proxies the upstream provider's SSE stream directly to your client. The event format is identical to that of OpenAI's streaming API, so the OpenAI SDKs can consume it unchanged.

Examples

Python
from openai import OpenAI
 
client = OpenAI(base_url="https://api.foza.ai/v1", api_key="sk-foza-xxxxx")
 
stream = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Write a haiku about APIs"}],
    stream=True,
)
 
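for chunk in stream:
    # Some chunks may carry no choices (for example, a usage-only chunk), so guard before indexing.
    if not chunk.choices:
        continue
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)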
Node.js
import OpenAI from "openai";
 
const client = new OpenAI({
  baseURL: "https://api.foza.ai/v1",
  apiKey: "sk-foza-xxxxx",
});
 
const stream = await client.chat.completions.create({
  model: "openai/gpt-4o",
  messages: [{ role: "user", content: "Write a haiku about APIs" }],
  stream: true,
});
 
for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
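
The examples above print each token as it arrives. If you also need the full reply once the stream ends, one option is to accumulate the delta pieces as you go. The following is a minimal Python sketch, not part of the Foza or OpenAI SDKs: collect_text and fake_chunk are hypothetical helper names, and the fake chunks only imitate the shape of SDK chunk objects (choices[0].delta.content) so the snippet runs without a network call.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_text(chunks: Iterable) -> str:
    """Join the delta.content pieces of a chat completion stream into one string."""
    parts = []
    for chunk in chunks:
        # Skip chunks that carry no choices (assumed possible, e.g. usage-only chunks).
        if not chunk.choices:
            continue
        content = chunk.choices[0].delta.content
        if content:  # content can be None or empty, e.g. on role-only or final chunks
            parts.append(content)
    return "".join(parts)


def fake_chunk(content):
    """Hypothetical stand-in for an SDK chunk, used only for this demo."""
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [fake_chunk("Hello"), fake_chunk(None), fake_chunk(", world")]
print(collect_text(demo))  # prints "Hello, world"
```

With a real client, you would pass the stream object returned by client.chat.completions.create(..., stream=True) in place of demo.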

SSE Format

Each event in the stream is a data: line carrying a JSON chunk, followed by a blank line:

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"}}]}

data: [DONE]

The final event is always data: [DONE], which signals that the stream has ended.
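
If you read the stream without an SDK, the format above can be parsed line by line. The sketch below is one possible approach under stated assumptions, not Foza's reference implementation: iter_sse_chunks is a hypothetical helper name, and it assumes each event's JSON payload fits on a single data: line, as in the sample shown here.

```python
import json
from typing import Iterable, Iterator


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON chunks from SSE lines until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        # Blank lines separate events; lines starting with ":" are SSE comments.
        if not line or line.startswith(":"):
            continue
        if not line.startswith("data:"):
            continue  # ignore other SSE fields such as "event:" or "id:"
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end of stream
        yield json.loads(payload)


# Sample input copied from the format shown above.
sample = [
    'data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"}}]}',
    "",
    "data: [DONE]",
    "",
]

text = "".join(
    chunk["choices"][0]["delta"].get("content") or ""
    for chunk in iter_sse_chunks(sample)
)
print(text)  # prints "Hello"
```

In practice, the lines would come from the decoded lines of a streaming HTTP response body rather than from a list.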