feat: display inference statistics in chat UI #43
## Summary
Add a UI component to display LLM inference statistics (token counts, context window utilization, throughput) in the chat interface.
## Current State
Once the backend includes `InferenceStats` in its streaming response (see dependency below), the UI needs to consume and display these metrics.

## Tasks
### 1. Regenerate TypeScript proto types

Regenerate the TypeScript protobuf types to include the new
`InferenceStats` message and the updated `ProcessRequestResponse` with the optional stats field.

### 2. Create `InferenceStatsPanel.svelte` component

Build a new Svelte component that displays:
- `total_tokens` / `context_window_size` percentage

### 3. Integrate into chat page
- Render `InferenceStatsPanel` as a collapsible panel below each assistant message (or at the bottom of the chat stream).
- Extract `InferenceStats` from the final streamed `ProcessRequestResponse` and pass it to the component.

## Dependencies
Depends on llm-multiverse/llm-multiverse#236
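As a rough sketch of the display logic from tasks 2 and 3 (the helper names here are illustrative, not the actual component API; the `InferenceStats` shape mirrors the proto message this issue depends on):

```typescript
// Sketch only: assumed TypeScript shape of the InferenceStats proto message.
interface InferenceStats {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
  context_window_size: number;
  tokens_per_second: number;
}

// Percentage of the context window consumed, clamped to [0, 100].
function contextUtilization(stats: InferenceStats): number {
  if (stats.context_window_size <= 0) return 0;
  return Math.min(100, (stats.total_tokens / stats.context_window_size) * 100);
}

// Hypothetical color banding for the utilization bar:
// blue below 70%, amber from 70-90%, red above 90%.
function utilizationColor(pct: number): "blue" | "amber" | "red" {
  if (pct > 90) return "red";
  if (pct >= 70) return "amber";
  return "blue";
}
```

The component itself would bind `contextUtilization` to a progress bar's width and `utilizationColor` to its fill class; the exact thresholds are a design choice for the panel, not mandated by the proto.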
## Implementation Complete
PR #45 implements the inference statistics display:
- Added an `InferenceStats` message to `common.proto` with fields `prompt_tokens`, `completion_tokens`, `total_tokens`, `context_window_size`, and `tokens_per_second`. Added an optional `inference_stats` field to `ProcessRequestResponse`.
- Created `InferenceStatsPanel.svelte`: a collapsible `<details>` panel (collapsed by default) showing token counts in a grid layout, throughput, and a color-coded context utilization progress bar (blue <70%, amber 70-90%, red >90%).
- Rendered below the `FinalResult` in the chat page after orchestration completes. Stats are extracted from the streaming response in `useOrchestration`.

Note: Proto source changes are local only; the upstream submodule will be updated when llm-multiverse/llm-multiverse#236 is merged.
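The extraction step could look roughly like the following (a sketch with assumed shapes for the generated proto types; the actual logic inside `useOrchestration` may differ). Since `inference_stats` is optional and arrives on the final streamed message, the consumer just keeps the last defined value it sees:

```typescript
// Sketch only: assumed shape of the generated proto types.
interface InferenceStats {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
  context_window_size: number;
  tokens_per_second: number;
}

interface ProcessRequestResponse {
  inference_stats?: InferenceStats; // optional field added by PR #45
  // ...other streamed fields omitted
}

// Keep the last defined inference_stats seen while draining the stream;
// returns undefined if no message carried stats.
function extractInferenceStats(
  responses: Iterable<ProcessRequestResponse>,
): InferenceStats | undefined {
  let stats: InferenceStats | undefined;
  for (const r of responses) {
    if (r.inference_stats !== undefined) stats = r.inference_stats;
  }
  return stats;
}
```

The result would then be passed as a prop to `InferenceStatsPanel`, which stays hidden when the value is `undefined`.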