99ad3630fc23f773d9aa130a65e679e631d0a5bc
Two small fixes:
ai-server/stream.ts
  - llama.cpp reports cached prompt tokens via
      usage.prompt_tokens_details.cached_tokens
    and we were ignoring it. Populate output.usage.cacheRead so pi's
    footer can show the "R<tokens>" field. cacheRead is a subset of
    prompt_tokens (already counted in input), so totalTokens stays
    input + output, with no double-counting.
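
A minimal sketch of the mapping described above. The ChunkUsage and OutputUsage shapes and the mapUsage helper are hypothetical names used for illustration; only prompt_tokens_details.cached_tokens, cacheRead, and totalTokens come from this message, and the real types in ai-server/stream.ts may differ.

```typescript
// Hypothetical shape of the usage object on an OpenAI-compatible response.
interface ChunkUsage {
  prompt_tokens: number;
  completion_tokens: number;
  prompt_tokens_details?: { cached_tokens?: number };
}

// Hypothetical shape of the usage record the provider hands back.
interface OutputUsage {
  input: number;
  output: number;
  cacheRead: number;
  totalTokens: number;
}

function mapUsage(u: ChunkUsage): OutputUsage {
  const input = u.prompt_tokens;
  const output = u.completion_tokens;
  // Servers that do not report cache hits omit the field; treat that as 0.
  const cacheRead = u.prompt_tokens_details?.cached_tokens ?? 0;
  // cacheRead is already included in input, so it is not added to the total.
  return { input, output, cacheRead, totalTokens: input + output };
}
```

With prompt_tokens = 1200, completion_tokens = 80, and cached_tokens = 1000, this yields cacheRead = 1000 and totalTokens = 1280.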
dark-mechanicus-indicator.ts
  - Pi appends "Working... (ESC to interrupt)" next to custom working
    indicator frames via a separate message slot. Call
    ctx.ui.setWorkingMessage("") on session_start + every turn_start to
    clear that suffix so the indicator line is just
      ⚙ <quote> · <elapsed>
    with no trailing "Working...".
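
A rough sketch of the hook wiring described above. The UiLike, ContextLike, and HostLike interfaces and the clearWorkingSuffix helper are stand-ins invented for this example, not pi's actual typings; only ctx.ui.setWorkingMessage("") and the session_start / turn_start event names are taken from this message.

```typescript
// Stand-in types (assumptions): just enough surface to show the call pattern.
interface UiLike {
  setWorkingMessage(message: string): void;
}
interface ContextLike {
  ui: UiLike;
}
type Handler = (event: unknown, ctx: ContextLike) => void;
interface HostLike {
  on(event: string, handler: Handler): void;
}

// Register the same handler for both events so the suffix is cleared
// once at session start and again at the start of every turn.
function clearWorkingSuffix(host: HostLike): void {
  for (const name of ["session_start", "turn_start"]) {
    host.on(name, (_event, ctx) => {
      ctx.ui.setWorkingMessage("");
    });
  }
}
```

Clearing on every turn_start, not only at session_start, covers the case where the host restores its default message between turns.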
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Description
Personal pi (pi-coding-agent) extensions: ai-server (mTLS remote provider + admin commands) and local-llama.