
shahondin1624 c464f6b903 fix generation token-rate disappearing on empty completions

A clean completion that emits a single token with no content delta never
captured firstOutputMs, so the footer's generation rate (G) computed null while
the processing rate (P) survived via the responseEndMs fallback — the stat
visibly dropped out on those turns.

Add findDisplayableTokenStats, which walks back to the most recent turn that has
a usable generation rate so a degenerate turn no longer blanks the display, and
point the footer at it. Falls back to the newest turn with any stats so P still
shows when no turn has a generation rate. findLatestPiTokenStats (persistence)
is unchanged.
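
The walk-back described above could look roughly like the following TypeScript sketch. The `TurnTokenStats` shape, its field names, and the `isUsableRate` helper are assumptions made for illustration; the real types live in `shared/token-stats.ts` and may differ.

```typescript
// Hypothetical per-turn stats shape (assumption, not the repo's actual type).
interface TurnTokenStats {
  processingRate: number | null; // P: prompt tokens per second
  generationRate: number | null; // G: generated tokens per second
}

// A rate is usable when it is a finite, positive number.
function isUsableRate(rate: number | null): rate is number {
  return rate !== null && Number.isFinite(rate) && rate > 0;
}

// Walk from the newest turn to the oldest. Return the most recent turn with a
// usable generation rate; if none exists, return the newest turn that has any
// stats at all, so P can still be shown.
function findDisplayableTokenStats(
  turns: ReadonlyArray<TurnTokenStats | undefined>,
): TurnTokenStats | undefined {
  let newestWithAnyStats: TurnTokenStats | undefined;
  for (let i = turns.length - 1; i >= 0; i--) {
    const stats = turns[i];
    if (!stats) continue;
    if (isUsableRate(stats.generationRate)) return stats;
    newestWithAnyStats ??= stats;
  }
  return newestWithAnyStats;
}
```

With turns `[{ P: 900, G: 40 }, { P: 850, G: null }]`, the sketch returns the older turn, so a degenerate newest turn does not blank the footer.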

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-28 20:33:22 +02:00

ai-server

migrate ai-server extension from llama.cpp router to llama-swap

2026-05-27 10:42:19 +02:00

dark-mechanicus

Refactor extension structure

2026-05-17 22:55:46 +02:00

llama.cpp

fix llama.cpp tool conversion for direct provider

2026-05-19 09:18:23 +02:00

scripts

update install script

2026-05-26 14:57:06 +02:00

session-handoff

fix session-handoff truncation persisting only one turn

2026-05-28 19:26:55 +02:00

shared

fix generation token-rate disappearing on empty completions

2026-05-28 20:33:22 +02:00

tests

fix generation token-rate disappearing on empty completions

2026-05-28 20:33:22 +02:00

themes

Darken warpGreen to nurgle-green + patch Editor for cursor/typing

2026-04-23 22:09:01 +02:00

token-stats

fix generation token-rate disappearing on empty completions

2026-05-28 20:33:22 +02:00

.gitignore

update install script

2026-05-26 14:57:06 +02:00

README.md

reshape pi-extensions layout to match installed extensions

2026-05-18 22:21:48 +02:00

README.md

pi-extensions

Personal collection of extensions, themes, and scripts for pi — Mario Zechner's CLI coding agent.

The anchor of the repo is ai-server, a multi-file pi extension that lets pi talk to a self-hosted llama.cpp router behind mTLS. Everything else (theme, banner, custom footer, working indicator, session-name generator, etc.) is Warhammer 40k "Dark Mechanicum" flavoring on top of pi's interactive TUI.

Installed extensions

| Extension | What it does |
| --- | --- |
| `ai-server/` | Remote llama.cpp provider over mTLS. Dynamic model discovery. Admin slash commands (load / unload / ctx / preset / restart / refresh). Custom SSE stream implementation with tool calls, reasoning, cache token reporting. See ai-server/README.md for the full setup. |
| `token-stats/` + `shared/token-stats.ts` | Footer owner for context-window + token-rate display. Tracks prefill and generation speed, including reasoning/thinking tokens, and reads `tokenStats.enabled` from `~/.pi/agent/settings.json`. |
| `dark-mechanicus/` | TUI customization bundle for the dark-mechanicus theme, loaded as one extension via `index.ts`. Includes: `indicator.ts` (working indicator: `⚙ <quote> · <elapsed>`, pulsing cog, 45-quote pool), `banner.ts` (cog-and-skull header art), `status-line.ts` (third footer line with rotating flavor text), `session-names.ts` (auto `<adj>-<noun> · <NNN>` session names + tab title), `thinking-label.ts` (`Cogitating...` for folded thinking blocks), `markdown-body-color.ts` (forces lavender body text). Display toggles now come from `darkMechanicus` settings instead of slash commands. |
| `llama.cpp/` | Local llama.cpp provider extension. Dynamic `/v1/models` discovery, fallback model registration, slash commands, and a custom streaming adapter that preserves `piTokenStats`. |

Theme

| File | Palette |
| --- | --- |
| `themes/dark-mechanicus.json` | Dark purple aubergine background (`#16101e`), AdMech blood-red accents (`#a8232c`), burnished brass borders (`#b8803d`), nurgle-green syntax strings (`#5a6b2e`), cognitor-pink syntax types (`#c75a8a`). All 51 color tokens defined. |

Activate with /settings inside pi (or set "theme": "dark-mechanicus" in ~/.pi/agent/settings.json). Pi hot-reloads on edits to the active theme file.

A full commented sample config is in settings.sample.jsonc.

Setup scripts

| Script | Runs on | Purpose |
| --- | --- | --- |
| `scripts/install-client.sh` | New pi client | Provisions or updates a client. `--components LIST` picks parts (`ai-server`, `dark-mechanicus`, `token-stats`, `themes`, `local-llama`, `ai-complete`, `certs`, `ssh`, `shared`, or `all`). `local-llama` now syncs the split `llama.cpp/` extension and `token-stats` syncs `token-stats/`. `--update-only` is non-destructive: it only adds missing files and never overwrites existing ones. Verifies mTLS `/health` at the end. |
| `scripts/issue-client-cert.sh` | Caddy host | Generates a new client identity (key + cert + modern + legacy p12) for a named device. Use for each new machine. |
| `scripts/install-browser-certs.sh` | New client | Imports client cert + root CA into Brave Flatpak NSS, system NSS, Firefox profiles, and optionally the system trust store. Mostly obsolete since the Caddy switch to Let's Encrypt but still useful for the mTLS client cert side. |
| `scripts/ai-complete` (docs) | Any pi client | Minimal shell CLI for direct llama-server router access. Wraps `https://ai.shahondin1624.de/v1/chat/completions` with mTLS, with retries, prompt-from-file, streaming, and model load/status helpers. Useful for scripts and agents (Claude Code, etc.) that want to delegate generation. |
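
For readers who want to call the same endpoint from TypeScript rather than through `scripts/ai-complete`, the sketch below shows one way an mTLS chat-completion request might be assembled with Node's `https` module. It is an illustration under assumptions, not a port of the script: `ClientIdentity`, `buildChatRequest`, and `sendChat` are hypothetical names, the payload assumes an OpenAI-style chat schema, and loading the PEM files from disk is left to the caller.

```typescript
import { request } from "node:https";
import type { RequestOptions } from "node:https";

// PEM-encoded client identity (hypothetical type; loading from disk is omitted).
interface ClientIdentity {
  key: string;
  cert: string;
}

// Builds the request options and JSON body for one non-streaming completion.
function buildChatRequest(
  identity: ClientIdentity,
  model: string,
  prompt: string,
): { options: RequestOptions; body: string } {
  const body = JSON.stringify({
    model,
    messages: [{ role: "user", content: prompt }],
    stream: false,
  });
  return {
    options: {
      hostname: "ai.shahondin1624.de",
      path: "/v1/chat/completions",
      method: "POST",
      key: identity.key, // client private key presented during the TLS handshake
      cert: identity.cert, // client certificate presented during the TLS handshake
      headers: {
        "Content-Type": "application/json",
        "Content-Length": Buffer.byteLength(body),
      },
    },
    body,
  };
}

// Sends the request and resolves with the raw response text.
function sendChat(identity: ClientIdentity, model: string, prompt: string): Promise<string> {
  const { options, body } = buildChatRequest(identity, model, prompt);
  return new Promise((resolve, reject) => {
    const req = request(options, (res) => {
      let text = "";
      res.setEncoding("utf8");
      res.on("data", (chunk: string) => {
        text += chunk;
      });
      res.on("end", () => resolve(text));
    });
    req.on("error", reject);
    req.end(body);
  });
}
```

Retries, prompt-from-file, and streaming are deliberately left out; the sketch only covers the mTLS handshake inputs and the request shape.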

Tests

Seventy-four tests total, no external dependencies. Runs with Node 22+'s built-in test runner:

```shell
node --experimental-strip-types --test tests/*.test.ts llama.cpp/llama.cpp.test.mjs
```

| File | Coverage |
| --- | --- |
| `tests/messages.test.ts` | 15 unit tests over `ai-server/messages.ts`: pi Context → OpenAI payload conversion (system prompts, user/assistant/tool-result roles, tool calls, image-only messages). |
| `tests/router-utils.test.ts` | 12 unit tests over `ai-server/router-utils.ts`: `extractCtxSize`, `isShardArtefact`, and reasoning-model detection helpers. |
| `tests/integration.test.ts` | 6 live-endpoint tests: `/health`, `/models`, model-entry shape, mTLS enforcement, publicly-trusted cert (Let's Encrypt contract), chat completion usage shape including `prompt_tokens_details.cached_tokens`. Auto-skip if the server is unreachable. |
| `tests/token-stats.test.ts` | 6 unit tests over `shared/token-stats.ts`: timing metadata parsing and rate calculation, including thinking-token-aware generation speed. |
| `llama.cpp/llama.cpp.test.mjs` | 35 tests over the split local llama.cpp extension: reasoning-model detection, model discovery, provider registration, compat flags, slash commands, env overrides, and streaming token-stats behavior. |
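
The rate calculation that `tests/token-stats.test.ts` covers can be pictured with the sketch below. It is a guess at the arithmetic, not the code in `shared/token-stats.ts`: the `TurnTiming` shape and `requestStartMs` are hypothetical, while `firstOutputMs` and `responseEndMs` are the names used in the latest commit message. Generation speed counts reasoning tokens alongside visible output, and the processing rate falls back to `responseEndMs` when no first-output timestamp was captured.

```typescript
// Hypothetical timing record for one turn (field names partly assumed).
interface TurnTiming {
  requestStartMs: number;
  firstOutputMs: number | null; // null when no content delta was ever seen
  responseEndMs: number;
  promptTokens: number;
  outputTokens: number; // visible completion tokens
  reasoningTokens: number; // thinking tokens, counted toward generation speed
}

// Tokens per second, or null when the interval is not positive.
function tokensPerSecond(tokens: number, elapsedMs: number): number | null {
  return elapsedMs > 0 ? (tokens * 1000) / elapsedMs : null;
}

function computeRates(t: TurnTiming): {
  processingRate: number | null;
  generationRate: number | null;
} {
  // Prefill ends at the first output; fall back to the end of the response.
  const prefillEndMs = t.firstOutputMs ?? t.responseEndMs;
  const processingRate = tokensPerSecond(t.promptTokens, prefillEndMs - t.requestStartMs);
  // Without a first-output timestamp there is no generation interval to measure.
  const generationRate =
    t.firstOutputMs === null
      ? null
      : tokensPerSecond(t.outputTokens + t.reasoningTokens, t.responseEndMs - t.firstOutputMs);
  return { processingRate, generationRate };
}
```

For example, 1000 prompt tokens prefilled in 500 ms gives P = 2000 tokens/s, and 60 output plus 40 reasoning tokens over the following 2000 ms gives G = 50 tokens/s.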

Stream-parsing edge cases (SSE framing, tool-call splits across chunks, reasoning deltas, abort mid-stream) remain deferred: they need a mock HTTPS server harness, which is not worth the complexity for a one-user setup.
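
To make the SSE framing case concrete, here is a minimal incremental parser of the kind such tests would exercise. It is a standalone sketch, not the parser in `ai-server/stream.ts`; `SseFrameParser` is a hypothetical name, and the sketch only handles `data:` fields with LF or CRLF line endings.

```typescript
// Buffers partial input and emits the data payload of each complete event.
// Events are terminated by a blank line; comment and non-data lines are dropped.
class SseFrameParser {
  private buffer = "";

  // Feed one network chunk; returns the payloads of all events completed by it.
  push(chunk: string): string[] {
    // Normalize on the whole buffer so a CRLF split across chunks is still caught.
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, "\n");
    const events: string[] = [];
    let end = this.buffer.indexOf("\n\n");
    while (end !== -1) {
      const frame = this.buffer.slice(0, end);
      this.buffer = this.buffer.slice(end + 2);
      const data = frame
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, "")) // strip one optional space
        .join("\n");
      if (data !== "") events.push(data);
      end = this.buffer.indexOf("\n\n");
    }
    return events;
  }
}
```

Feeding `data: {"delta":"Hel` and then `lo"}` followed by a blank line yields nothing for the first chunk and one complete payload for the second, which is the property a split-across-chunks test would assert.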

Quick onboarding for another machine

```shell
# 1. Mint a new client identity (on the Caddy host)
ssh shahondin1624@192.168.2.2 '~/pi-extensions/scripts/issue-client-cert.sh laptop-alice'

# 2. On the new pi client
git clone https://git.shahondin1624.de/shahondin1624/pi-extensions.git ~/pi-extensions
cd ~/pi-extensions
scripts/install-client.sh

# 3. Activate the theme
#    /settings in pi, pick "dark-mechanicus"
```

Layout

```text
pi-extensions/
├── ai-server/                  the core mTLS provider (multi-file extension)
│   ├── index.ts                entry: provider + admin commands
│   ├── config.ts               URLs, SSH host, cert loading, MODELS[]
│   ├── messages.ts             Context → OpenAI messages
│   ├── stream.ts               custom SSE stream, mTLS HTTPS, pi-ai events
│   ├── admin.ts                router HTTP client + SSH helpers
│   ├── router-utils.ts         pure helpers (test-friendly)
│   └── README.md               full mTLS + systemd + Caddy setup notes
├── token-stats/
│   └── index.ts                footer owner for context + token rate display
├── settings.sample.jsonc       commented template for ~/.pi/agent/settings.json
├── dark-mechanicus/            theme TUI bundle (one extension, multi-file)
│   ├── index.ts                entry: sequences each module's registrar
│   ├── indicator.ts            working indicator (cog + quote + timer)
│   ├── banner.ts               TUI header art
│   ├── status-line.ts          rotating flavor status line
│   ├── session-names.ts        auto-name generator + tab title
│   ├── thinking-label.ts       "Cogitating..." for folded thinking blocks
│   └── markdown-body-color.ts  forces body text color for chat + editor
├── llama.cpp/                  local llama.cpp extension (multi-file)
│   ├── config.ts               base URL, fallback model, provider identity
│   ├── discovery.ts            /v1/models discovery helpers
│   ├── index.ts                entry: provider registration + slash commands
│   ├── model-utils.ts          provider model mapping + streamSimple adapter
│   └── llama.cpp.test.mjs      self-contained node:test suite
├── themes/
│   └── dark-mechanicus.json    51-token theme
├── scripts/                    install helpers
├── tests/                      node:test suites
└── README.md                   this file
```

License

Personal use. No license declared; the repo is private.