Files
llm-multiverse/implementation-plans/issue-122.md
Pi Agent a8be3accb4 fix: resolve tech debt from issue #32 review
Closes #122

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 09:46:00 +01:00

67 lines
2.8 KiB
Markdown

# Issue #122: Tech debt: minor findings from issue #32 review
## Summary
Address five tech debt items from the semantic cache review:
1. **Return best match, not first match** - `lookup()` returns the first entry above the similarity threshold via linear scan. Track and return the highest-similarity match instead of short-circuiting.
2. **Deduplicate extraction segment building** - The logic for building `(extracted_segment, extraction_confidence)` tuples is duplicated between cache population and streaming response paths. Extract into a helper.
3. **Reuse params.tag_filter** - The `tag_filter_ref` derivation duplicates the same empty-string-to-None conversion already done for `RetrievalParams`. Derive from the already-constructed params instead.
4. **Prevent duplicate cache entries** - `insert()` doesn't check if a sufficiently similar entry already exists, so repeated identical queries can accumulate duplicates.
5. **Proactive expired entry eviction** - `evict_expired()` only runs during insert. Also evict during the periodic metrics logging.
## Item 1: Return best match
### Approach
Replace the `for` loop with early return in `lookup()` with a scan that tracks the best (highest similarity) match. After scanning all non-expired, tag-matching entries, return the best match if it exceeds the threshold.
### Files changed
- `services/memory/src/cache/mod.rs` - Rewrite `lookup()` loop
## Item 2: Deduplicate extraction segment building
### Approach
Extract a helper function `get_extraction_pair(extraction_results, rank)` that returns `(Option<String>, Option<f32>)`. Use it in both the cache population and streaming response paths.
### Files changed
- `services/memory/src/service.rs` - Add helper, use in both places
## Item 3: Reuse params.tag_filter
### Approach
The `RetrievalParams` already converts empty `memory_type` to `None` via the `from_config` call. Use `params.tag_filter.as_deref()` instead of re-deriving `tag_filter_ref` from `req.memory_type`.
### Files changed
- `services/memory/src/service.rs` - Replace `tag_filter_ref` derivation
## Item 4: Prevent duplicate cache entries
### Approach
In `insert()`, before pushing the new entry, check if any existing non-expired entry has similarity >= threshold and the same tag filter. If so, replace it instead of adding a duplicate.
### Files changed
- `services/memory/src/cache/mod.rs` - Add duplicate check in `insert()`
## Item 5: Proactive expired entry eviction
### Approach
Add a public `evict_expired_entries()` method that acquires the write lock and calls the internal `evict_expired()`. Call it from the periodic metrics logging task in `main.rs`.
### Files changed
- `services/memory/src/cache/mod.rs` - Add public `evict_expired_entries()` method
- `services/memory/src/main.rs` - Call eviction before metrics logging