llm-multiverse

Author	SHA1	Message	Date
Pi Agent	e5614825de	feat: implement confidence signal handling (issue #78) Add ConfidenceEvaluator to parse and score subtask results based on result quality and memory candidate confidence, with configurable aggregation strategies (weighted_mean, minimum, median). Add ConfidenceReplanner to generate follow-up subtasks when confidence falls below the replan threshold, with attempt tracking and max retries. Add build_confidence_summary for human-readable confidence reporting in final responses. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 21:39:24 +01:00
Pi Agent	60b5266666	feat: implement memory write gating (issue #77) Add MemoryWriteGate that evaluates subagent memory candidates against configurable quality gates (confidence threshold, content length bounds, structure heuristic, result quality) and writes accepted candidates to the Memory Service with provenance tagging and audit logging. - Create memory_gate.py with MemoryWriteGate, GatingDecision, GatingReport - Add MemoryGatingConfig to config.py with YAML loading - Add write_memory() to MemoryClient in clients.py - 29 tests covering all gating rules, memory writes, tagging, audit logging, and error handling (95% coverage) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 21:14:10 +01:00
Pi Agent	45c572da5e	feat: implement rolling context compaction (issue #76) Add OrchestratorCompactor that monitors context size and automatically compacts completed subtask results when the threshold is exceeded. Uses Model Gateway inference for LLM-based summarization with truncation fallback when gateway is unavailable. - Create compaction.py with OrchestratorCompactor class - Extend OrchestratorContext with compacted_summaries field and get_pending_subtask_ids() method - Add CompactionConfig to config.py with YAML loading - Integrate compactor into SubagentDispatcher (async _safe_add_result) - 25 tests covering threshold detection, compaction logic, gateway interaction, context integrity, and dispatcher integration (97% coverage) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 21:04:11 +01:00
Pi Agent	6677e8add6	feat: implement parallel dispatch via asyncio (issue #74) Add SubagentDispatcher with dependency-aware scheduling using asyncio.wait(FIRST_COMPLETED), semaphore concurrency control, per-subtask and overall timeouts, transitive dependent cancellation, and graceful error handling for partial failures. - Create dispatcher.py with SubagentDispatcher class - Add DispatcherConfig to config.py with YAML loading - 23 tests covering dependency graphs, timeouts, error handling, concurrency control, and result ordering (95%+ coverage) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 20:52:23 +01:00
Pi Agent	f84ed9ffca	feat: implement orchestrator context management (issue #75) Add OrchestratorContext class that tracks the full orchestration state for a ProcessRequest call: user request, decomposition plan, subtask results, session context propagation, and agent lineage construction. Key features: - Factory method from ProcessRequestRequest proto - Agent lineage chain construction (orchestrator → subagent) - SubagentRequest builder with session config propagation - JSON serialization/deserialization using orjson + protobuf json_format - Context size monitoring with warning (512KB) and hard limit (1MB) 192 tests pass (44 new context tests), ruff clean, 100% coverage on context.py. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 20:43:21 +01:00
Pi Agent	58746687ff	feat: implement task decomposition for orchestrator (issue #73) Add TaskDecomposer that uses Model Gateway Inference to decompose user requests into subtasks with dependency graphs and agent type assignments. Key components: - decomposer.py: TaskDecomposer class, decomposition prompt template, JSON parsing, validation (cycle detection via Kahn's algorithm), proto conversion, and single-task fallback on failure - config.py: Add DecomposerConfig with max_tokens and max_subtasks - 42 tests covering parsing, validation, agent type mapping, proto conversion, fallback behavior, and end-to-end decompose calls All 148 tests pass, ruff clean, 98% coverage on decomposer.py. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 20:37:01 +01:00
Pi Agent	6354f877f5	test: add end-to-end validation for researcher agent (issue #71) Add 13 e2e tests using real gRPC mock servers to validate the full researcher agent loop through actual gRPC channels. Tests cover: - Web search task completion with tool execution verification - Memory query enrichment with prompt inspection - Tool failure handling (application-level and gRPC errors) - Context compaction triggering on long research tasks - Confidence signal mapping (VERIFIED/INFERRED/UNCERTAIN) - SubagentResult schema validation including memory candidates - Graceful degradation (no tools, gateway down, memory down) - Factory function create_researcher_agent() validation Also adds KNOWN_LIMITATIONS.md documenting 10 known limitations and failure modes of the researcher agent. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 20:30:04 +01:00
Pi Agent	6f89b3f83d	docs: mark issue #70 as COMPLETED in plan index Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 20:22:50 +01:00
Pi Agent	98f4e01d18	feat: implement context compaction for subagent prompt (issue #70) Add context compaction to the researcher agent to handle long-running research tasks that exceed the context window budget. When estimated tokens exceed 60% of max_tokens, older history entries are summarized via the Model Gateway's unary Inference RPC and replaced with a compact bullet-point summary, preserving the 3 most recent entries. Changes: - clients.py: Add inference() unary method to ModelGatewayClient - prompt.py: Add compact() method, compaction prompt template, and _truncate_entries() fallback for gateway failures - researcher.py: Replace hard context overflow termination with compaction-then-continue logic - 93 tests pass with 95%+ coverage on modified files Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 20:22:02 +01:00
Pi Agent	41da7f866b	feat: implement researcher agent loop with tool use cycle (issue #69) Add the core researcher agent: gRPC client wrappers for Model Gateway, Tool Broker, and Memory Service; prompt builder with context window management; JSON output parser for tool calls and done signals; and the main agent loop with discover → infer → execute → observe cycle. Includes termination on max iterations, timeout, context overflow, and consecutive tool failures. 78 tests total (20 parser + 11 prompt + 12 client + 24 researcher + 11 existing), 98-100% coverage on new files. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 20:08:58 +01:00
shahondin1624	54034b1a38	Merge pull request 'feat: scaffold Orchestrator Python project (#72)' (#167) from feature/issue-72-scaffold-orchestrator into main	2026-03-10 17:06:47 +01:00
Pi Agent	32f43e0f22	feat: scaffold orchestrator Python project (issue #72) Create the orchestrator service with gRPC server boilerplate, YAML configuration loading, and stub ProcessRequest endpoint. Includes 11 tests (8 config + 3 service) with full coverage. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 17:06:20 +01:00
shahondin1624	ac41f41480	Merge pull request 'test: add end-to-end integration tests for Tool Broker (#67)' (#165) from feature/issue-67-integration-tests into main	2026-03-10 16:54:49 +01:00
Pi Agent	0a986f3e5c	test: add end-to-end integration tests for Tool Broker (issue #67) Add 13 gRPC integration tests that spin up a real ToolBrokerService server and test the full pipeline via client: - ExecuteTool: valid call, manifest block, path block, loop detection - ValidateCall: allowed, denied, no side effects - DiscoverTools: per agent type, unknown agent, override ALL Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 16:54:34 +01:00
shahondin1624	0e5d3f9c40	Merge pull request 'test: add edge case unit tests for enforcement layers (#66)' (#164) from feature/issue-66-enforcement-unit-tests into main	2026-03-10 16:51:43 +01:00
Pi Agent	640bbcc9bb	test: add edge case unit tests for all enforcement layers (issue #66) Add 21 new edge case tests across all 5 enforcement layers: - Session override: invalid level, empty tool, case sensitivity - Agent manifest: case sensitivity, negative ID, empty tools - Lineage: self-spawn, zero depth, unknown child type - Path allowlist: relative path, traversal, trailing slash - Network egress: IP address, malformed URL, localhost Total enforcement tests: 74 (was 53). Overall: 186 tests pass. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 16:51:27 +01:00
shahondin1624	8965e849ae	Merge pull request 'feat: implement ValidateCall gRPC endpoint (#65)' (#163) from feature/issue-65-validate-call-endpoint into main	2026-03-10 16:47:56 +01:00
Pi Agent	e9cb88eb28	feat: implement ValidateCall gRPC endpoint (issue #65) Wire the ValidateCall dry-run endpoint that runs the 5-layer enforcement pipeline without executing the tool. Reuses the existing enforce() method by constructing an ExecuteToolRequest from the ValidateCallRequest. Returns is_allowed, denial_reason, and enforcement_layer. 6 new tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 16:47:36 +01:00
shahondin1624	966c6eeaa8	Merge pull request 'feat: implement ExecuteTool gRPC endpoint (#64)' (#162) from feature/issue-64-execute-tool-endpoint into main	2026-03-10 16:43:54 +01:00
Pi Agent	1bba1ad35d	feat: implement ExecuteTool gRPC endpoint (issue #64) Wire the full tool execution pipeline in the ToolBrokerService: 5-layer enforcement → loop detection → credential injection → dispatch → injection firewall → result tagging. Also wire DiscoverTools to the discovery module and update main.rs to construct all dependencies. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 16:43:32 +01:00
shahondin1624	2d7ef7a3d8	Merge pull request 'feat: implement tool result tagging (#63)' (#161) from feature/issue-63-result-tagging into main	2026-03-10 16:35:46 +01:00
Pi Agent	9d0d35f1bc	feat: implement tool result tagging (issue #63) Add result_tagger module that wraps tool outputs with provenance metadata (tool name, execution time, agent/session IDs, trust level). Trust classification: Internal (memory, inference), External (web, fs, shell), Unknown. Tagging does not modify actual tool result content. 13 unit tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 16:35:29 +01:00
shahondin1624	cc8b024dfb	Merge pull request 'feat: implement prompt injection firewall (#62)' (#160) from feature/issue-62-injection-firewall into main	2026-03-10 16:33:23 +01:00
Pi Agent	a243030dd0	feat: implement prompt injection firewall (issue #62) Add heuristic scanner for common prompt injection patterns in tool results. Supports three sensitivity levels (Low/Medium/High) with configurable sanitization. Detects role manipulation, delimiter injection, jailbreak attempts, and system prompt extraction. 19 tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 16:33:08 +01:00
shahondin1624	492db4051a	Merge pull request 'feat: implement credential injection (#61)' (#159) from feature/issue-61-credential-injection into main	2026-03-10 16:30:19 +01:00
Pi Agent	8ea30b813c	feat: implement credential injection (issue #61) Add CredentialInjector that fetches secrets from the Secrets Service at tool execution time and injects them into parameters. Credentials are never logged or returned to agents. Uses __credential parameter key for injection. 9 tests with mock gRPC server. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 16:29:57 +01:00
shahondin1624	08eca54c72	Merge pull request 'feat: implement loop and thrash detection (#60)' (#158) from feature/issue-60-loop-detection into main	2026-03-10 16:26:38 +01:00
Pi Agent	baac330fd2	feat: implement loop and thrash detection (issue #60) Add LoopDetector with per-session/agent sliding window tracking. Detects exact-match loops (same tool + args → block), near-match loops (same tool, varying args → warning), and thrash patterns (A→B→A→B alternation → warning). Configurable thresholds for window size, max repeats, and thrash cycles. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 16:26:23 +01:00
shahondin1624	72041378fc	Merge pull request 'feat: implement tool discovery logic (#59)' (#157) from feature/issue-59-tool-discovery into main	2026-03-10 16:23:00 +01:00
Pi Agent	fe65ba6411	feat: implement tool discovery logic (issue #59) Add discovery module with builtin tool definitions for all well-known tools (web_search, memory_read/write, fs_read/write, run_code/shell, package_install, inference, generate_embedding). Filters by agent manifest and session overrides, returns ToolDefinition with parameter schemas. 11 unit tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 16:22:34 +01:00
shahondin1624	41514d9726	Merge pull request 'feat: implement tool execution dispatch (#58)' (#156) from feature/issue-58-tool-dispatch into main	2026-03-10 16:19:11 +01:00
Pi Agent	c12698faf5	feat: implement tool execution dispatch (issue #58) Add ToolDispatcher with dispatch table mapping tool names to executors. Three executor types: InternalExecutor (async functions), SubprocessExecutor (command with stdout/stderr capture), GrpcExecutor (placeholder for gRPC forwarding). Includes timeout enforcement via tokio::time::timeout and execution metadata (duration, exit code, success flag). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 16:18:51 +01:00
shahondin1624	637f8b6fb2	Merge pull request 'feat: implement network egress enforcement (#57)' (#155) from feature/issue-57-network-egress into main	2026-03-10 16:15:14 +01:00
Pi Agent	17e3e46889	feat: implement network egress enforcement layer (issue #57) Add enforcement layer 5 that verifies network destinations in tool parameters against agent type allowed egress patterns. Supports exact domain matching and wildcard subdomain patterns (*.example.com). Prevents data exfiltration by restricting agent network access. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 16:14:56 +01:00
shahondin1624	2f4dbb05a4	Merge pull request 'feat: implement path allowlist enforcement (#56)' (#154) from feature/issue-56-path-allowlist into main	2026-03-10 16:12:35 +01:00
Pi Agent	2953997e28	feat: implement path allowlist enforcement layer (issue #56) Add enforcement layer 4 that verifies file-system paths in tool parameters against agent type path allowlist glob patterns. Includes logical path canonicalization to prevent directory traversal attacks. Uses glob-match crate for pattern matching. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 16:12:11 +01:00
shahondin1624	fc892c59bb	Merge pull request 'feat: implement lineage constraint enforcement (#55)' (#153) from feature/issue-55-lineage-constraint into main	2026-03-10 16:08:50 +01:00
Pi Agent	253926c898	feat: implement lineage constraint enforcement layer (issue #55) Add enforcement layer 3 that verifies agent lineage chains to prevent privilege escalation through agent spawning. Checks that each parent in the chain has permission to spawn its child and that spawn depth limits are respected. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 16:08:28 +01:00
shahondin1624	90f08dcdc7	Merge pull request 'feat: enforcement layer 2 — agent type manifest check (#54)' (#152) from feature/issue-54-agent-manifest-check into main	2026-03-10 16:04:56 +01:00
Pi Agent	bfce35ed22	feat: implement enforcement layer 2 — agent type manifest check (issue #54) Add agent_manifest enforcement layer that verifies the requested tool is in the calling agent type's allowed tool list from the manifest. Denies with clear reason if no manifest found or tool not permitted. 7 tests covering allowed/denied tools, cross-type checks, unknown agents, empty tools list. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 16:04:43 +01:00
shahondin1624	b3f5fe2576	Merge pull request 'feat: enforcement layer 1 — session override check (#53)' (#151) from feature/issue-53-session-override-check into main	2026-03-10 16:02:58 +01:00
Pi Agent	f2fedbf013	feat: implement enforcement layer 1 — session override check (issue #53) Add session override enforcement layer that checks OverrideLevel from SessionContext: ALL bypasses all enforcement, RELAX grants tools but preserves lineage checks, NONE/UNSPECIFIED applies full manifest enforcement. Returns typed SessionOverrideResult enum for downstream layers. 8 tests covering all override levels and edge cases. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 16:02:33 +01:00
shahondin1624	11d7bab132	Merge pull request 'feat: implement Agent Type Manifest loader (#52)' (#150) from feature/issue-52-manifest-loader into main	2026-03-10 15:59:59 +01:00
Pi Agent	c5ceb98a92	feat: implement Agent Type Manifest loader (issue #52) Add ManifestStore that loads TOML agent type manifests from a directory. Each manifest defines allowed tools, path allowlists, network egress policies, lineage constraints (can_spawn), and max spawn depth. Includes validation, reload support, and lookup by ID or name. 14 manifest tests + 8 existing = 22 total, clippy clean. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 15:59:34 +01:00
shahondin1624	b9064bfe98	Merge pull request 'feat: scaffold Tool Broker Rust project (#51)' (#149) from feature/issue-51-scaffold-tool-broker into main	2026-03-10 15:56:21 +01:00
Pi Agent	09b516ec3e	feat: scaffold Tool Broker Rust project (issue #51) Create the Tool Broker service skeleton as a Cargo workspace member: - Tonic gRPC server with DiscoverTools, ExecuteTool, ValidateCall stubs - TOML config loading (host, port, manifest_dir, audit/secrets addrs) - Server-streaming support for ExecuteTool via ReceiverStream - 8 tests (5 config, 3 service stub) passing, clippy clean Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 15:55:53 +01:00
shahondin1624	986584b759	Merge pull request 'test: integration tests for Search Service (#50)' (#148) from feature/issue-50-search-integration-tests into main	2026-03-10 15:51:33 +01:00
Pi Agent	cd75318f45	test: add integration tests for Search Service (issue #50) 8 integration tests wiring real service components with mocked external services (SearXNG via aioresponses, Model Gateway/Audit via mock gRPC servers). Tests cover: full pipeline with all fields populated, clean text extraction, summarization, unreachable URL handling, audit logging, SearXNG unavailability, result ordering, and Model Gateway fallback. Total: 71 tests passing across the Search Service. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 15:51:13 +01:00
shahondin1624	2a16c98597	Merge pull request 'feat: implement Search gRPC endpoint (#49)' (#147) from feature/issue-49-search-endpoint into main	2026-03-10 15:48:30 +01:00
Pi Agent	6ecc8b8f38	feat: implement Search gRPC endpoint with full pipeline (issue #49) Wire the Search RPC handler to orchestrate the full search pipeline: SearXNG query → content extraction → Model Gateway summarization. Supports configurable pipeline stages (extraction/summarization can be disabled), audit logging via Audit Service, and graceful degradation at each stage. 14 tests covering full pipeline, partial pipelines, validation, error handling, and audit logging. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 15:48:11 +01:00

180 Commits