67 lines
2.8 KiB
Markdown
67 lines
2.8 KiB
Markdown
# Issue #122: Tech debt: minor findings from issue #32 review
|
|
|
|
## Summary
|
|
|
|
Address five tech debt items from the semantic cache review:
|
|
|
|
1. **Return best match, not first match** - `lookup()` returns the first entry above the similarity threshold via linear scan. Track and return the highest-similarity match instead of short-circuiting.
|
|
|
|
2. **Deduplicate extraction segment building** - The logic for building `(extracted_segment, extraction_confidence)` tuples is duplicated between cache population and streaming response paths. Extract into a helper.
|
|
|
|
3. **Reuse params.tag_filter** - The `tag_filter_ref` derivation duplicates the same empty-string-to-None conversion already done for `RetrievalParams`. Derive from the already-constructed params instead.
|
|
|
|
4. **Prevent duplicate cache entries** - `insert()` doesn't check if a sufficiently similar entry already exists, so repeated identical queries can accumulate duplicates.
|
|
|
|
5. **Proactive expired entry eviction** - `evict_expired()` only runs during insert. Also evict during the periodic metrics logging.
|
|
|
|
## Item 1: Return best match
|
|
|
|
### Approach
|
|
|
|
Replace the `for` loop with early return in `lookup()` with a scan that tracks the best (highest similarity) match. After scanning all non-expired, tag-matching entries, return the best match if it exceeds the threshold.
|
|
|
|
### Files changed
|
|
|
|
- `services/memory/src/cache/mod.rs` - Rewrite `lookup()` loop
|
|
|
|
## Item 2: Deduplicate extraction segment building
|
|
|
|
### Approach
|
|
|
|
Extract a helper function `get_extraction_pair(extraction_results, rank)` that returns `(Option<String>, Option<f32>)`. Use it in both the cache population and streaming response paths.
|
|
|
|
### Files changed
|
|
|
|
- `services/memory/src/service.rs` - Add helper, use in both places
|
|
|
|
## Item 3: Reuse params.tag_filter
|
|
|
|
### Approach
|
|
|
|
The `RetrievalParams` already converts empty `memory_type` to `None` via the `from_config` call. Use `params.tag_filter.as_deref()` instead of re-deriving `tag_filter_ref` from `req.memory_type`.
|
|
|
|
### Files changed
|
|
|
|
- `services/memory/src/service.rs` - Replace `tag_filter_ref` derivation
|
|
|
|
## Item 4: Prevent duplicate cache entries
|
|
|
|
### Approach
|
|
|
|
In `insert()`, before pushing the new entry, check if any existing non-expired entry has similarity >= threshold and the same tag filter. If so, replace it instead of adding a duplicate.
|
|
|
|
### Files changed
|
|
|
|
- `services/memory/src/cache/mod.rs` - Add duplicate check in `insert()`
|
|
|
|
## Item 5: Proactive expired entry eviction
|
|
|
|
### Approach
|
|
|
|
Add a public `evict_expired_entries()` method that acquires the write lock and calls the internal `evict_expired()`. Call it from the periodic metrics logging task in `main.rs`.
|
|
|
|
### Files changed
|
|
|
|
- `services/memory/src/cache/mod.rs` - Add public `evict_expired_entries()` method
|
|
- `services/memory/src/main.rs` - Call eviction before metrics logging
|