feat: display inference statistics in chat UI #43

Closed
opened 2026-03-12 22:33:05 +01:00 by shahondin1624 · 2 comments

Summary

Add a UI component to display LLM inference statistics (token counts, context window utilization, throughput) in the chat interface.

Current State

  • The UI has no stats display component.
  • Once the orchestrator forwards InferenceStats in its streaming response (see dependency below), the UI needs to consume and display these metrics.

Tasks

1. Regenerate TypeScript proto types

Regenerate the TypeScript protobuf types to include the new InferenceStats message and the updated ProcessRequestResponse with the optional stats field.

2. Create InferenceStatsPanel.svelte component

Build a new Svelte component that displays:

  • Prompt tokens — number of tokens in the prompt
  • Completion tokens — number of tokens generated
  • Total tokens — sum of prompt + completion tokens
  • Context window size — model's maximum context length
  • Context utilization — visual progress bar showing total_tokens / context_window_size percentage
  • Tokens/sec — generation throughput
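The fields above can be sketched as a TypeScript shape plus the derived utilization value. This is a hypothetical sketch: the real field names come from the regenerated proto types (which may use snake_case or camelCase depending on the generator config).

```typescript
// Hypothetical shape of the stats the panel consumes; field names are
// assumptions based on the proto fields described in this issue.
export interface InferenceStats {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
  contextWindowSize: number;
  tokensPerSecond: number;
}

// Derived value for the progress bar: percent of the context window used,
// clamped to 100 and guarded against a missing/zero window size.
export function contextUtilization(stats: InferenceStats): number {
  if (stats.contextWindowSize <= 0) return 0;
  return Math.min(100, (stats.totalTokens / stats.contextWindowSize) * 100);
}
```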

3. Integrate into chat page

  • Add the InferenceStatsPanel as a collapsible panel below each assistant message (or at the bottom of the chat stream).
  • The panel should be collapsed by default to keep the UI clean.
  • Extract the InferenceStats from the final streamed ProcessRequestResponse and pass it to the component.
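The extraction step could look something like the following sketch. The type and field names (`ProcessRequestResponse`, `inferenceStats`, `collectStats`) are illustrative assumptions, not the final API; the stats shape is repeated here so the snippet stands on its own.

```typescript
// Hypothetical stats shape (see the proto fields described in this issue).
interface InferenceStats {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
  contextWindowSize: number;
  tokensPerSecond: number;
}

// Hypothetical streamed response chunk with the optional stats field.
interface ProcessRequestResponse {
  text?: string;
  inferenceStats?: InferenceStats | null;
}

// Drain the stream, forwarding text chunks to the chat view and keeping
// the last chunk that carries stats. Stats are expected only on the final
// chunk, but keeping the latest tolerates servers that attach them earlier.
export async function collectStats(
  stream: AsyncIterable<ProcessRequestResponse>,
  onText: (t: string) => void,
): Promise<InferenceStats | null> {
  let stats: InferenceStats | null = null;
  for await (const chunk of stream) {
    if (chunk.text) onText(chunk.text);
    if (chunk.inferenceStats) stats = chunk.inferenceStats;
  }
  return stats;
}
```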

Dependencies

  • Depends on: feat: forward inference stats through orchestrator streaming response
Depends on llm-multiverse/llm-multiverse#236

Implementation Complete

PR #45 implements the inference statistics display:

  • Proto: Added InferenceStats message to common.proto with fields: prompt_tokens, completion_tokens, total_tokens, context_window_size, tokens_per_second. Added optional inference_stats field to ProcessRequestResponse.
  • Component: New InferenceStatsPanel.svelte — collapsible <details> panel (collapsed by default) showing token counts in a grid layout, throughput, and a color-coded context utilization progress bar (blue <70%, amber 70-90%, red >90%).
  • Integration: Panel renders below FinalResult in the chat page after orchestration completes. Stats extracted from the streaming response in useOrchestration.
  • Quality: Full dark mode support, ARIA accessibility on progress bar, 0 typecheck/lint errors.
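The color-coding rule stated above (blue <70%, amber 70-90%, red >90%) reduces to a small threshold function. This is a sketch of the described behavior; the returned strings are illustrative labels, not the component's actual CSS classes.

```typescript
// Map a context-utilization percentage to the color band described above.
// Boundary handling is an assumption: 70% and 90% fall in the amber band.
export function utilizationColor(percent: number): "blue" | "amber" | "red" {
  if (percent < 70) return "blue";
  if (percent <= 90) return "amber";
  return "red";
}
```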

Note: Proto source changes are local only — the upstream submodule will be updated when llm-multiverse/llm-multiverse#236 is merged.


Reference: llm-multiverse/llm-multiverse-ui#43