llm-multiverse/implementation-plans/issue-037.md
Pi Agent 640871e554 test: add integration tests for Memory Service (issue #37)
Add 16 integration tests exercising the full gRPC flow through a real
tonic server with mock Model Gateway and mock Audit Service:

- WriteMemory: stores entry, generates embeddings, verifies DB contents
- QueryMemory: returns streamed results, verifies cache hit on repeat
- GetCorrelated: by memory_id, explicit IDs, and session context
- Provenance: external sanitization, clean external, internal trusted
- RevokeMemory: verifies revocation in provenance table
- Audit logging: verifies write (action 4) and read (action 3) entries
- End-to-end lifecycle: write -> query -> correlate -> audit verify

Also fix clippy warnings for redundant ..Default::default() in tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 13:30:10 +01:00


Implementation Plan — Issue #37: Integration tests for Memory Service

Metadata

| Field | Value |
|---|---|
| Issue | #37 |
| Title | Integration tests for Memory Service |
| Milestone | Phase 4: Memory Service |
| Labels | |
| Status | COMPLETED |
| Language | Rust |
| Related Plans | issue-021.md, issue-034.md, issue-035.md, issue-036.md |
| Blocked by | #34 (completed), #35 (completed), #36 (completed) |

Acceptance Criteria

  • Test: WriteMemory stores entry and generates embedding
  • Test: QueryMemory returns relevant results via staged retrieval
  • Test: Semantic cache hit on repeated similar queries
  • Test: GetCorrelated returns linked memories
  • Test: Provenance tagging and poisoning protection
  • Test: Audit logging for all operations
  • Tests run in CI

Architecture Analysis

Service Context

This issue creates an integration test file at services/memory/tests/integration_test.rs that exercises all four gRPC endpoints (WriteMemory, QueryMemory, GetCorrelated, RevokeMemory) through a real tonic gRPC server, verifying the full client-to-server-to-DB-to-response flow.

Key difference from unit tests: The existing 246 unit tests in service.rs call MemoryService trait methods directly on MemoryServiceImpl (bypassing gRPC transport). Integration tests will instead:

  1. Spin up a real tonic gRPC server with MemoryServiceServer
  2. Connect via MemoryServiceClient
  3. Exercise the full request serialization, transport, handler, DB, and response path
  4. Also spin up a mock Model Gateway server (for embeddings/inference) and a mock Audit Service server (to capture audit log calls)

Existing Patterns

The audit service integration tests at services/audit/tests/integration_test.rs establish the pattern:

  • start_test_server() binds a TcpListener on 127.0.0.1:0, spawns a tonic Server, sleeps 50ms, returns (SocketAddr, ...)
  • connect_client(addr) creates a gRPC client via XxxClient::connect(format!("http://{addr}"))
  • Tests use #[tokio::test], each test gets its own server instance for isolation
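The port-0 binding in this pattern can be sketched with the standard library alone. This is a minimal illustration, not the audit service's actual helper: `bind_ephemeral` and `endpoint_uri` are hypothetical names, and the real tests presumably use an async listener handed to the tonic server rather than `std::net::TcpListener`.

```rust
use std::net::{SocketAddr, TcpListener};

/// Bind a listener on an OS-assigned loopback port and report the address.
/// Port 0 asks the OS for any free port, so tests running in parallel
/// each get their own server address. (Hypothetical helper name.)
fn bind_ephemeral() -> std::io::Result<(TcpListener, SocketAddr)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    Ok((listener, addr))
}

/// Build the URI a gRPC client would connect to, mirroring the
/// `format!("http://{addr}")` call described above. (Hypothetical helper name.)
fn endpoint_uri(addr: SocketAddr) -> String {
    format!("http://{addr}")
}
```

Because the address is only known after binding, the helper has to return it to the caller, which is why `start_test_server()` returns a `SocketAddr`.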

The memory service already supports this pattern: lib.rs exposes all modules publicly, so integration tests can import memory_service::service::MemoryServiceImpl, memory_service::db::DuckDbManager, etc.

The unit test helpers in service.rs (inside #[cfg(test)] mod tests) are not accessible from integration tests. The integration test file must replicate the mock gateway pattern.

Implementation Steps

1. Test Infrastructure — Mock Servers

a) Mock Model Gateway Server — Replicates the MockGateway from unit tests: generate_embedding returns a fixed 768-dimensional vector with every component set to 0.10, and inference returns configurable text.

b) Mock Audit Service Server — Implements AuditService trait: append stores AppendRequest in Arc<Mutex<Vec<AppendRequest>>> and returns success. Allows tests to inspect captured audit entries.

c) Real Memory Service Server — Constructed with DuckDbManager::in_memory(), EmbeddingClient connected to mock gateway, AuditServiceClient connected to mock audit, default configs.

d) Helper Functions:

  • start_mock_gateway(inference_response: Option<&str>) -> SocketAddr
  • start_mock_audit() -> (SocketAddr, Arc<Mutex<Vec<AppendRequest>>>)
  • start_memory_server(gateway_addr, audit_addr) -> (SocketAddr, Arc<DuckDbManager>)
  • connect_memory_client(addr) -> MemoryServiceClient<Channel>
  • valid_ctx() -> SessionContext
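The capture mechanism behind `start_mock_audit()` can be sketched without any gRPC machinery. Everything here is an assumption made for illustration: the `AppendRequest` struct stands in for the generated protobuf message, its `action` and `detail` fields are hypothetical, and `MockAudit::append` is a plain method rather than the async `AuditService` trait method.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the generated `AppendRequest` message.
#[derive(Clone, Debug, PartialEq)]
struct AppendRequest {
    action: i32,
    detail: String,
}

/// Shared log that the mock handler writes to and the test reads from.
type Captured = Arc<Mutex<Vec<AppendRequest>>>;

/// Hypothetical mock: records every request and always reports success.
struct MockAudit {
    captured: Captured,
}

impl MockAudit {
    /// Returns the mock plus a second handle to the same log, so the test
    /// can inspect entries after the mock has been moved into a server.
    fn new() -> (Self, Captured) {
        let captured: Captured = Arc::new(Mutex::new(Vec::new()));
        (Self { captured: Arc::clone(&captured) }, captured)
    }

    /// Store the request and report success.
    fn append(&self, req: AppendRequest) -> bool {
        self.captured.lock().unwrap().push(req);
        true
    }
}
```

Returning the `Arc` alongside the server address is what lets a test assert on audit entries without querying the mock over gRPC.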

2. Test Cases

| Test | Acceptance Criterion |
|---|---|
| test_grpc_write_memory_stores_entry | WriteMemory stores entry and generates embedding |
| test_grpc_write_memory_with_provided_id | WriteMemory stores entry |
| test_grpc_query_memory_returns_results | QueryMemory returns relevant results |
| test_grpc_query_memory_empty_db | QueryMemory returns empty stream on empty DB |
| test_grpc_query_memory_cache_hit | Semantic cache hit on repeated queries |
| test_grpc_get_correlated_by_memory_id | GetCorrelated returns linked memories |
| test_grpc_get_correlated_by_explicit_ids | GetCorrelated returns linked memories |
| test_grpc_get_correlated_by_session | GetCorrelated returns linked memories |
| test_grpc_write_external_sanitized | Provenance tagging and poisoning protection |
| test_grpc_write_external_clean | Provenance tagging and poisoning protection |
| test_grpc_write_internal_trusted | Provenance tagging and poisoning protection |
| test_grpc_revoke_memory | Provenance tagging and poisoning protection |
| test_grpc_write_memory_audit_logged | Audit logging for all operations |
| test_grpc_get_correlated_audit_logged | Audit logging for all operations |
| test_grpc_write_then_query_then_correlate | End-to-end lifecycle test |

3. File Structure

services/memory/tests/
  integration_test.rs    -- all integration tests with mock infrastructure

Files to Create/Modify

| File | Action | Purpose |
|---|---|---|
| services/memory/tests/integration_test.rs | Create | Integration test suite with mock servers and ~15 tests |

Risks and Edge Cases

  • Mock embedding uniformity: Mock returns identical embeddings for all inputs (cosine similarity = 1.0). Tests verify gRPC flow, not retrieval quality.
  • HNSW index rebuild: After writing via gRPC, the HNSW index needs rebuilding. The write handler calls ensure_hnsw_index after inserting embeddings, so this should work automatically.
  • Race between server start and client connect: 50ms sleep after spawning server, same as audit service pattern.
  • Correlations via gRPC: WriteMemory inserts correlating_ids into correlations table, so writing with correlating_ids should create the correlation rows.
  • Cache test ordering: Must issue two identical queries in sequence. CacheConfig::default() has enabled: true.
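The embedding-uniformity risk can be made concrete with a small sketch. It assumes the mock fills all 768 components with 0.1 (one reading of the mock description in step 1a); `mock_embedding` and `cosine_similarity` are hypothetical names, not functions from the memory service.

```rust
/// Dimension and fill value assumed for the mock embedding.
const DIM: usize = 768;
const FILL: f32 = 0.1;

/// Mock embedding: ignores the input text and returns the same vector.
fn mock_embedding(_text: &str) -> Vec<f32> {
    vec![FILL; DIM]
}

/// Cosine similarity; returns 0.0 if either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}
```

Since every stored entry and every query maps to the same vector, similarity ranking carries no information in these tests. Assertions should therefore check counts, IDs, and cache behavior rather than result ordering.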

Deviation Log

(Filled during implementation if deviations from plan occur)

| Deviation | Reason |
|---|---|