Issue #43: Display Inference Statistics in Chat UI
Status: COMPLETED
Issue: #43
Branch: feature/issue-43-inference-stats
Overview
Add a collapsible UI panel that displays LLM inference statistics (token counts, context window utilization, throughput) below assistant messages after orchestration completes.
Phases
Phase 1: Proto Types
Files:
- proto/upstream/proto/llm_multiverse/v1/common.proto: Add InferenceStats message
- proto/upstream/proto/llm_multiverse/v1/orchestrator.proto: Add optional inference_stats field to ProcessRequestResponse
InferenceStats message fields:
- prompt_tokens (uint32): tokens in the prompt
- completion_tokens (uint32): tokens generated
- total_tokens (uint32): sum of prompt + completion
- context_window_size (uint32): model's maximum context length
- tokens_per_second (float): generation throughput
Then regenerate types: npm run generate
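For orientation, here is a hand-written sketch of roughly what the TypeScript side of the message could look like. It is not the generated output: the camelCase property names and the isConsistent helper are assumptions for illustration, and the real shape depends on the code generator behind npm run generate.

```typescript
// Hypothetical mirror of the InferenceStats proto message.
// The generated type may differ in naming and optionality.
interface InferenceStats {
  promptTokens: number;      // uint32: tokens in the prompt
  completionTokens: number;  // uint32: tokens generated
  totalTokens: number;       // uint32: sum of prompt + completion
  contextWindowSize: number; // uint32: model's maximum context length
  tokensPerSecond: number;   // float: generation throughput
}

// Illustrative helper (not part of the plan): checks the invariant
// stated in the field list, total_tokens = prompt_tokens + completion_tokens.
function isConsistent(stats: InferenceStats): boolean {
  return stats.totalTokens === stats.promptTokens + stats.completionTokens;
}

// Made-up sample values, for demonstration only.
const sampleStats: InferenceStats = {
  promptTokens: 1500,
  completionTokens: 548,
  totalTokens: 2048,
  contextWindowSize: 8192,
  tokensPerSecond: 42.5,
};
```

A check like isConsistent can be handy in tests for whichever backend populates the message, since total_tokens is redundant with the two other counts.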
Phase 2: Orchestration State
Files:
- src/lib/composables/useOrchestration.svelte.ts: Extract inferenceStats from response, expose via store getter
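The extract-and-expose step might look something like the following. This is a framework-free sketch: the actual composable lives in a .svelte.ts file and presumably uses Svelte reactivity, which is omitted here. The OrchestrationState class, its method names, and the trimmed-down response type are all hypothetical.

```typescript
// Hypothetical, simplified types; the real ones come from the generated proto code.
interface InferenceStats {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
  contextWindowSize: number;
  tokensPerSecond: number;
}

interface ProcessRequestResponse {
  // Optional, matching the optional inference_stats proto field.
  inferenceStats?: InferenceStats;
}

class OrchestrationState {
  private stats: InferenceStats | null = null;

  // Keep the most recent stats seen; responses without stats leave the value untouched.
  handleResponse(response: ProcessRequestResponse): void {
    if (response.inferenceStats !== undefined) {
      this.stats = response.inferenceStats;
    }
  }

  // Clear stats when a new request starts, so stale numbers are never shown.
  reset(): void {
    this.stats = null;
  }

  // Read-only access for the UI layer.
  get inferenceStats(): InferenceStats | null {
    return this.stats;
  }
}
```

Returning null rather than a zeroed object lets the page distinguish "no stats available" from "zero tokens", which matters for the conditional rendering in Phase 4.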
Phase 3: InferenceStatsPanel Component
Files:
- src/lib/components/InferenceStatsPanel.svelte: New component
Design:
- Follow the <details> pattern from FinalResult.svelte
- Collapsed by default
- Summary line shows key stat (e.g., total tokens + tokens/sec)
- Expanded content shows all stats in a grid layout
- Context utilization shown as a progress bar
- Blue/indigo color scheme (neutral, info-like)
- Full dark mode support
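The summary line and progress bar both need small derived values. A minimal sketch of those calculations follows, assuming context utilization is defined as total tokens divided by context window size (the plan does not spell out the formula). The function names and the exact summary format are hypothetical.

```typescript
// Hypothetical, simplified type; the real one comes from the generated proto code.
interface InferenceStats {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
  contextWindowSize: number;
  tokensPerSecond: number;
}

// Context utilization as a percentage for the progress bar.
// Assumption: utilization = totalTokens / contextWindowSize.
// Clamped to [0, 100]; returns 0 when the window size is missing or zero.
function contextUtilizationPercent(stats: InferenceStats): number {
  if (stats.contextWindowSize <= 0) return 0;
  const percent = (stats.totalTokens / stats.contextWindowSize) * 100;
  return Math.min(100, Math.max(0, percent));
}

// One-line text for the collapsed <details> summary.
function summaryLine(stats: InferenceStats): string {
  return `${stats.totalTokens} tokens, ${stats.tokensPerSecond.toFixed(1)} tok/s`;
}

// Made-up sample values, for demonstration only.
const demo: InferenceStats = {
  promptTokens: 1500,
  completionTokens: 548,
  totalTokens: 2048,
  contextWindowSize: 8192,
  tokensPerSecond: 42.5,
};

console.log(contextUtilizationPercent(demo)); // 25
console.log(summaryLine(demo)); // "2048 tokens, 42.5 tok/s"
```

Guarding against a zero window size avoids a NaN or Infinity bar width if a backend leaves context_window_size unset, since proto3 scalar fields default to 0.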
Phase 4: Chat Page Integration
Files:
- src/routes/chat/+page.svelte: Render InferenceStatsPanel after FinalResult when stats are available
Acceptance Criteria
- InferenceStats proto message defined and TypeScript types generated
- InferenceStatsPanel displays all required metrics
- Panel is collapsible, collapsed by default
- Context utilization shows visual progress bar
- Integrates cleanly into chat page below assistant message
- Dark mode support
- Build, lint, typecheck pass