make llama.cpp base URL configurable via settings + document live-symlink dev setup

Resolve the local llama.cpp provider's server URL from LLAMA_BASE_URL env → localLlama.baseUrl in settings.json → built-in default, reading settings inline (node:fs) so the flat-copy test build stays self-contained. A PI_SETTINGS_PATH override keeps the suite deterministic across hosts.

Document the live-development workflow of symlinking each extension dir AND shared/ into ~/.pi/agent/extensions/, with a warning that a symlinked extension paired with a stale copied shared/ silently loads the wrong helpers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
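
The resolution order in the commit message can be illustrated as a pure function. This is a minimal sketch rather than the code in the commit: `resolveBaseUrl`, `Env`, `Settings`, and the placeholder fallback URL are hypothetical names and values chosen for illustration, and the settings object stands in for the parsed content of `settings.json`.

```typescript
// Hypothetical sketch: `resolveBaseUrl`, `Env`, and `Settings` are illustrative
// names, not identifiers taken from the commit.
type Env = Record<string, string | undefined>;
type Settings = Record<string, unknown>;

/** Pick the base URL: env var first, then settings, then the fallback. */
function resolveBaseUrl(env: Env, settings: Settings | undefined, fallback: string): string {
  // Accept either `localLlama` or `local-llama` as the settings section key.
  const section = (settings?.localLlama ?? settings?.["local-llama"]) as
    | Record<string, unknown>
    | undefined;
  const url = section?.baseUrl;
  // Only a non-empty string counts as a configured value.
  const fromSettings = typeof url === "string" && url.length > 0 ? url : undefined;
  // `??` skips only null/undefined, so an empty-string env value still wins.
  return env.LLAMA_BASE_URL ?? fromSettings ?? fallback;
}

// Example: the parsed content of a settings.json that sets the URL.
const settings: Settings = { localLlama: { baseUrl: "http://10.0.0.9:8123/v1" } };
const fallback = "http://127.0.0.1:8080/v1"; // placeholder default for this sketch

console.log(resolveBaseUrl({}, settings, fallback)); // → http://10.0.0.9:8123/v1
console.log(resolveBaseUrl({ LLAMA_BASE_URL: "http://env-host:9999/v1" }, settings, fallback)); // → http://env-host:9999/v1
console.log(resolveBaseUrl({}, undefined, fallback)); // → http://127.0.0.1:8080/v1
```

Passing the environment and settings in as arguments keeps the precedence rule testable without touching `process.env` or the file system. One consequence of chaining with `??` is that an environment variable set to the empty string is still treated as set.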
@@ -15,7 +15,7 @@ Warhammer 40k "Dark Mechanicum" flavoring on top of pi's interactive TUI.
 | [`ai-server/`](ai-server/) | Remote llama.cpp provider over mTLS. Dynamic model discovery. Admin slash commands (load / unload / ctx / preset / restart / refresh). Custom SSE stream implementation with tool calls, reasoning, cache token reporting. See [ai-server/README.md](ai-server/README.md) for the full setup. |
 | [`token-stats/`](token-stats/) + [`shared/token-stats.ts`](shared/token-stats.ts) | Footer owner for context-window + token-rate display. Tracks prefill and generation speed, including reasoning/thinking tokens, and reads `tokenStats.enabled` from `~/.pi/agent/settings.json`. |
 | [`dark-mechanicus/`](dark-mechanicus/) | TUI customization bundle for the dark-mechanicus theme — loaded as one extension via `index.ts`. Includes: `indicator.ts` (working indicator: `⚙ <quote> · <elapsed>`, pulsing cog, 45-quote pool), `banner.ts` (cog-and-skull header art), `status-line.ts` (third footer line with rotating flavor text), `session-names.ts` (auto `<adj>-<noun> · <NNN>` session names + tab title), `thinking-label.ts` (`Cogitating...` for folded thinking blocks), `markdown-body-color.ts` (forces lavender body text). Display toggles now come from `darkMechanicus` settings instead of slash commands. |
-| [`llama.cpp/`](llama.cpp/) | Local llama.cpp provider extension. Dynamic `/v1/models` discovery, fallback model registration, slash commands, and a custom streaming adapter that preserves `piTokenStats`. |
+| [`llama.cpp/`](llama.cpp/) | Local llama.cpp provider extension. Dynamic `/v1/models` discovery, fallback model registration, slash commands, and a custom streaming adapter that preserves `piTokenStats`. Server base URL resolves from `LLAMA_BASE_URL` env → `localLlama.baseUrl` in `~/.pi/agent/settings.json` → built-in default. |
 
 ## Theme
 
@@ -39,7 +39,7 @@ A full commented sample config is in [`settings.sample.jsonc`](settings.sample.j
 
 ## Tests
 
-Seventy-four tests total, no external dependencies. Runs with Node 22+'s
+83 tests total, no external dependencies. Runs with Node 22+'s
 built-in test runner:
 
 ```bash
@@ -49,10 +49,10 @@ node --experimental-strip-types --test tests/*.test.ts llama.cpp/llama.cpp.test.
 | File | Coverage |
 |---|---|
 | `tests/messages.test.ts` | 15 unit tests over `ai-server/messages.ts` — pi Context → OpenAI payload conversion (system prompts, user/assistant/tool-result roles, tool calls, image-only messages). |
-| `tests/router-utils.test.ts` | 12 unit tests over `ai-server/router-utils.ts` — `extractCtxSize`, `isShardArtefact`, and reasoning-model detection helpers. |
+| `tests/router-utils.test.ts` | 14 unit tests over `ai-server/router-utils.ts` — `extractCtxSize`, `isShardArtefact`, and reasoning-model detection helpers. |
 | `tests/integration.test.ts` | 6 live-endpoint tests: `/health`, `/models`, model-entry shape, mTLS enforcement, publicly-trusted cert (Let's Encrypt contract), chat completion usage shape including `prompt_tokens_details.cached_tokens`. Auto-skip if the server is unreachable. |
-| `tests/token-stats.test.ts` | 6 unit tests over `shared/token-stats.ts` — timing metadata parsing and rate calculation, including thinking-token-aware generation speed. |
+| `tests/token-stats.test.ts` | 10 unit tests over `shared/token-stats.ts` — timing metadata parsing and rate calculation, including thinking-token-aware generation speed and displayable-turn fallback. |
-| `llama.cpp/llama.cpp.test.mjs` | 35 tests over the split local llama.cpp extension — reasoning-model detection, model discovery, provider registration, compat flags, slash commands, env overrides, and streaming token-stats behavior. |
+| `llama.cpp/llama.cpp.test.mjs` | 38 tests over the split local llama.cpp extension — reasoning-model detection, model discovery, provider registration, compat flags, slash commands, env + `localLlama.baseUrl` settings resolution, and streaming token-stats behavior. |
 
 Stream-parsing edge cases (SSE framing, tool-call splits across chunks,
 reasoning deltas, abort mid-stream) remain deferred — they need a mock
@@ -73,6 +73,40 @@ scripts/install-client.sh
 # /settings in pi, pick "dark-mechanicus"
 ```
 
+## Local development (syncing the repo into pi)
+
+pi loads extensions from `~/.pi/agent/extensions/`. `install-client.sh` *copies*
+files there, but for active development it's easier to **symlink** each
+extension (and `shared/`) so edits in this repo take effect on the next pi
+restart — no re-copy needed.
+
+```bash
+REPO="$HOME/Projects/pi-extensions" # this checkout
+EXT="$HOME/.pi/agent/extensions"
+mkdir -p "$EXT"
+
+# Symlink every tracked extension directory + shared/ into the load dir.
+for d in ai-server dark-mechanicus llama.cpp memory session-handoff token-stats shared; do
+  rm -rf "$EXT/$d" # remove any stale copy/symlink first
+  ln -s "$REPO/$d" "$EXT/$d"
+done
+
+# Sanity check
+ls -la "$EXT" # each entry should be a symlink -> the repo
+```
+
+> **Important:** `shared/` **must** be symlinked too, not left as a copy.
+> Extensions import sibling helpers via `../shared/*.js`, and pi's loader
+> resolves those relative to the *install* path (it does not canonicalize
+> symlinks). A symlinked extension paired with a stale copied `shared/` will
+> silently load the wrong helpers — e.g. an extension can import a function the
+> copy doesn't have yet, throw at render, and (for the footer) blank out
+> entirely. Keep them in lockstep by symlinking both.
+
+After changing symlinks, **restart pi** to reload extensions. To go back to a
+copy-based install, delete the symlinks and re-run
+`scripts/install-client.sh`.
+
 ## Layout
 
 ```
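
The `ls -la` sanity check in the Local development section above could also be scripted, which makes a stale copied `shared/` harder to miss. The following is a rough sketch under stated assumptions: `isSymlink`, `findNonSymlinked`, and `EXTENSION_DIRS` are hypothetical names that do not appear in the repository, and the directory list simply mirrors the README's `for` loop.

```typescript
import { lstatSync } from "node:fs";
import { join } from "node:path";

// Hypothetical names for illustration; the list mirrors the README's loop.
const EXTENSION_DIRS = [
  "ai-server",
  "dark-mechanicus",
  "llama.cpp",
  "memory",
  "session-handoff",
  "token-stats",
  "shared",
];

/** True when `path` exists and is itself a symlink (lstat does not follow links). */
function isSymlink(path: string): boolean {
  return lstatSync(path, { throwIfNoEntry: false })?.isSymbolicLink() ?? false;
}

/** Return the entries under `extDir` that are missing or are plain copies. */
function findNonSymlinked(
  extDir: string,
  names: readonly string[] = EXTENSION_DIRS,
  check: (path: string) => boolean = isSymlink,
): string[] {
  return names.filter((name) => !check(join(extDir, name)));
}

// Report on the default load directory described in the README.
const extDir = join(process.env.HOME ?? "", ".pi", "agent", "extensions");
const stale = findNonSymlinked(extDir);
console.log(
  stale.length === 0 ? "all entries are symlinks" : `not symlinked: ${stale.join(", ")}`,
);
```

The `check` parameter exists only so the filtering logic can be exercised without creating real symlinks. In normal use the default `isSymlink` is enough.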
@@ -1,13 +1,45 @@
 /**
  * Configuration constants for the llama.cpp provider extension.
  *
- * All values are configurable via environment variables. Defaults are
+ * The server base URL resolves in this order:
+ * 1. LLAMA_BASE_URL environment variable
+ * 2. `localLlama.baseUrl` in ~/.pi/agent/settings.json
+ * 3. Built-in default
+ * All other values are configurable via environment variables. Defaults are
  * suitable for a typical LAN-based llama.cpp server.
  */
 
+import { existsSync, readFileSync } from "node:fs";
+import { join } from "node:path";
+
+// ─── Settings lookup ────────────────────────────────────────────────────
+
+const HOME = process.env.HOME ?? process.env.USERPROFILE ?? "";
+// PI_SETTINGS_PATH lets tests point at an isolated settings file (or a
+// nonexistent one) so resolution is deterministic regardless of the host.
+const SETTINGS_PATH = process.env.PI_SETTINGS_PATH ?? join(HOME, ".pi", "agent", "settings.json");
+
+/** Read `localLlama.baseUrl` (or `local-llama.baseUrl`) from pi's settings.json. */
+function baseUrlFromSettings(): string | undefined {
+  try {
+    if (!SETTINGS_PATH || !existsSync(SETTINGS_PATH)) {
+      return undefined;
+    }
+    const settings = JSON.parse(readFileSync(SETTINGS_PATH, "utf8")) as Record<string, unknown>;
+    const section = (settings.localLlama ?? settings["local-llama"]) as
+      | Record<string, unknown>
+      | undefined;
+    const url = section?.baseUrl;
+    return typeof url === "string" && url.length > 0 ? url : undefined;
+  } catch {
+    return undefined;
+  }
+}
+
 // ─── Server configuration ───────────────────────────────────────────────
 
-export const BASE_URL = process.env.LLAMA_BASE_URL ?? "http://192.168.2.35:8123/v1";
+export const BASE_URL =
+  process.env.LLAMA_BASE_URL ?? baseUrlFromSettings() ?? "http://192.168.2.35:8123/v1";
 
 // ─── Fallback model ─────────────────────────────────────────────────────
 
@@ -82,6 +82,9 @@ function cleanLlamaEnv() {
   delete process.env.LLAMA_MODEL_ID;
   delete process.env.LLAMA_CTX;
   delete process.env.LLAMA_MAX_OUT;
+  // Point settings resolution at a nonexistent file so BASE_URL falls through
+  // to the built-in default, independent of the developer's real settings.json.
+  process.env.PI_SETTINGS_PATH = join(tmpdir(), "llama-test-no-such-settings.json");
 }
 
 // ─── Mock PI ────────────────────────────────────────────────────────────────
@@ -811,6 +814,53 @@ test("extension entry: registers slash commands", async () => {
   }
 });
 
+test("config: reads baseUrl from localLlama settings when env unset", async () => {
+  const { outputDir } = buildCompiledModule();
+  const settingsDir = mkdtempSync(join(tmpdir(), "llama-settings-"));
+  const settingsPath = join(settingsDir, "settings.json");
+  writeFileSync(
+    settingsPath,
+    JSON.stringify({ localLlama: { baseUrl: "http://10.0.0.9:8123/v1" } }),
+    "utf8",
+  );
+  try {
+    cleanLlamaEnv();
+    process.env.PI_SETTINGS_PATH = settingsPath;
+    const { pi, state } = createMockPI();
+    const mod = await importModule(outputDir);
+    mod.registerProviderWithModels(pi, [{ id: "m" }]);
+    assert.equal(state.providers[0].config.baseUrl, "http://10.0.0.9:8123/v1");
+  } finally {
+    cleanLlamaEnv();
+    rmSync(settingsDir, { recursive: true, force: true });
+    rmSync(outputDir, { recursive: true, force: true });
+  }
+});
+
+test("config: LLAMA_BASE_URL env overrides localLlama settings", async () => {
+  const { outputDir } = buildCompiledModule();
+  const settingsDir = mkdtempSync(join(tmpdir(), "llama-settings-"));
+  const settingsPath = join(settingsDir, "settings.json");
+  writeFileSync(
+    settingsPath,
+    JSON.stringify({ localLlama: { baseUrl: "http://10.0.0.9:8123/v1" } }),
+    "utf8",
+  );
+  try {
+    cleanLlamaEnv();
+    process.env.PI_SETTINGS_PATH = settingsPath;
+    process.env.LLAMA_BASE_URL = "http://env-host:9999/v1";
+    const { pi, state } = createMockPI();
+    const mod = await importModule(outputDir);
+    mod.registerProviderWithModels(pi, [{ id: "m" }]);
+    assert.equal(state.providers[0].config.baseUrl, "http://env-host:9999/v1");
+  } finally {
+    cleanLlamaEnv();
+    rmSync(settingsDir, { recursive: true, force: true });
+    rmSync(outputDir, { recursive: true, force: true });
+  }
+});
+
 test("extension entry: uses env overrides for BASE_URL", async () => {
   const { outputDir } = buildCompiledModule();
   try {