Streaming AI Responses

The AI Toolkit can consume a Server-Sent Events (SSE) stream instead of waiting for the full response, so tokens appear as they arrive, the same way the ChatGPT / Claude / Gemini surfaces render replies. The server implementation lives in RichTextBox.AspNetCore: MapRichTextBoxUploads() automatically maps /richtextbox/ai/stream, and the OpenAI / Anthropic / Azure OpenAI resolvers implement IStreamingRichTextBoxAiResolver.


Server side: streaming endpoint

MapRichTextBoxUploads() wires /richtextbox/ai/stream. If the registered IRichTextBoxAiResolver also implements IStreamingRichTextBoxAiResolver, every delta the resolver yields is flushed as an SSE data: frame. Non-streaming resolvers fall back to a single event: response frame so the client always sees the same message shape.
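Concretely, assuming standard SSE framing and an illustrative JSON payload (the `delta` / `text` field names are assumptions, not confirmed by this document), the two paths might look like this on the wire:

```
: streaming resolver — one data frame per delta

data: {"delta":"This"}

data: {"delta":" document"}

: non-streaming fallback — a single response frame

event: response
data: {"text":"This document…"}
```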

// Program.cs
builder.Services.AddRichTextBox();
builder.Services.AddRichTextBoxOpenAiResolver(opts =>
{
    opts.ApiKey = builder.Configuration["OpenAi:ApiKey"]!;
    opts.Model  = "gpt-4o-mini";
});

app.MapRichTextBoxUploads();   // also maps /richtextbox/ai/stream

Client side: aiToolkit.streamRequest()

var stream = editor.aiToolkit.streamRequest({
    url: "/richtextbox/ai/stream",
    body: { mode: "summarize", documentText: editor.getHTMLCode() },
    onDelta: function (delta, accumulated) { /* token */ },
    onDone: function (text) { editor.aiToolkit.acceptInline(text); },
    onError: function (err) { console.error(err); }
});

// stream.abort();  // cancel mid-flight
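For reference, here is a minimal sketch of the kind of SSE frame parsing a helper like streamRequest() performs under the hood. This is not the library's actual implementation; the event names match the endpoint's documented frames, but the JSON payload shapes (`delta`, `text`) are assumptions for illustration.

```javascript
// Parse a chunk of SSE text into delta / response / error callbacks.
// Frames are separated by a blank line; comment lines (": ...") are ignored.
function parseSseChunk(chunk, handlers) {
  var accumulated = "";
  chunk.split("\n\n").forEach(function (frame) {
    if (!frame.trim()) return;                       // skip empty trailing frame
    var event = "message";
    var dataLines = [];
    frame.split("\n").forEach(function (line) {
      if (line.startsWith(":")) return;              // heartbeat comment frame
      if (line.startsWith("event:")) event = line.slice(6).trim();
      if (line.startsWith("data:")) dataLines.push(line.slice(5).trim());
    });
    if (!dataLines.length) return;
    var payload = JSON.parse(dataLines.join("\n"));
    if (event === "message") {                       // default SSE event: a delta
      accumulated += payload.delta;
      if (handlers.onDelta) handlers.onDelta(payload.delta, accumulated);
    } else if (event === "response" && handlers.onDone) {
      handlers.onDone(payload.text);                 // single-frame fallback
    } else if (event === "error" && handlers.onError) {
      handlers.onError(payload);
    }
  });
  return accumulated;
}
```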

Resilience knobs

Two RichTextBoxOptions properties tune streaming behaviour:

builder.Services.AddRichTextBox(opts =>
{
    // Hard ceiling on a single AI stream. Default: 5 minutes.
    opts.AiStreamTimeout = TimeSpan.FromMinutes(2);

    // Comment-frame heartbeat so nginx / ALB / CloudFront don't kill idle streams.
    // Default: 15 seconds.
    opts.AiStreamHeartbeatInterval = TimeSpan.FromSeconds(10);
});
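On the wire, an SSE heartbeat is just a comment line starting with a colon, which conforming clients ignore, so intermediaries see traffic while the payload is unaffected. A hypothetical sketch of the interleaved frames:

```
data: {"delta":"first tokens"}

: heartbeat

data: {"delta":"more tokens after a pause"}
```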

Error handling

Resolver exceptions never leak to the client. The endpoint logs the full exception with a correlation id and emits event: error with a generic "AI streaming failed. Please retry." body plus the correlation id so support can find the matching log entry.
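Assuming a JSON body (the field names below are illustrative, not confirmed by this document), the emitted error frame might look like:

```
event: error
data: {"message":"AI streaming failed. Please retry.","correlationId":"<correlation-id>"}
```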