shahondin1624 99ad3630fc stream: report cached tokens; indicator: suppress pi's "Working..."
Two small fixes:

ai-server/stream.ts
- llama.cpp reports cached prompt tokens via
    usage.prompt_tokens_details.cached_tokens
  and we were ignoring it. Populate output.usage.cacheRead so pi's
  footer can show the "R<tokens>" field. cacheRead is a subset of
  prompt_tokens (already counted in input), so totalTokens stays
  input + output, with no double-counting.
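
  A minimal sketch of the mapping described above. The type and function
  names (ChunkUsage, OutputUsage, mapUsage) are hypothetical, and the
  prompt_tokens / completion_tokens field names are assumed from the
  OpenAI-compatible usage format; the actual code in ai-server/stream.ts
  may be shaped differently.

```typescript
// Hypothetical shape of the usage object in a streamed chunk.
// Only prompt_tokens_details.cached_tokens is named in this commit;
// the other fields are assumed from the OpenAI-compatible format.
interface ChunkUsage {
  prompt_tokens?: number;
  completion_tokens?: number;
  prompt_tokens_details?: { cached_tokens?: number };
}

// Hypothetical shape of output.usage; only cacheRead and totalTokens
// are named in the commit message.
interface OutputUsage {
  input: number;
  output: number;
  cacheRead: number;
  totalTokens: number;
}

function mapUsage(u: ChunkUsage): OutputUsage {
  const input = u.prompt_tokens ?? 0;
  const output = u.completion_tokens ?? 0;
  // Cached tokens are a subset of prompt_tokens, so they are reported
  // separately and not added to the total.
  const cacheRead = u.prompt_tokens_details?.cached_tokens ?? 0;
  return { input, output, cacheRead, totalTokens: input + output };
}
```

  Servers that omit prompt_tokens_details simply yield cacheRead = 0 here.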

dark-mechanicus-indicator.ts
- Pi appends "Working... (ESC to interrupt)" next to custom working
  indicator frames via a separate message slot. Call
  ctx.ui.setWorkingMessage("") on session_start and on every turn_start
  to clear that suffix so the indicator line is just
    ⚙ <quote> · <elapsed>
  with no trailing "Working...".
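
  A rough sketch of the hook wiring described above. The interfaces
  below are hypothetical stand-ins written for illustration, not pi's
  real extension types; only ctx.ui.setWorkingMessage and the
  session_start / turn_start event names come from this commit.

```typescript
// Hypothetical minimal stand-ins for the extension API surface.
interface UiLike {
  setWorkingMessage(message: string): void;
}
interface ContextLike {
  ui: UiLike;
}
type Handler = (event: unknown, ctx: ContextLike) => void;
interface ExtensionHostLike {
  on(event: "session_start" | "turn_start", handler: Handler): void;
}

// Register one handler for both events; each call clears the separate
// message slot so only the custom indicator frame is rendered.
function suppressWorkingSuffix(host: ExtensionHostLike): void {
  const clear: Handler = (_event, ctx) => ctx.ui.setWorkingMessage("");
  host.on("session_start", clear);
  host.on("turn_start", clear);
}
```

  Clearing on every turn_start, not only once per session, matches the
  behavior described in this commit.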

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 23:28:16 +02:00
Description
Personal pi (pi-coding-agent) extensions: ai-server (mTLS remote provider + admin commands) and local-llama.
490 KiB
Languages
TypeScript 79.3%
JavaScript 11.3%
Shell 9.4%