feat: implement Inference + GenerateEmbedding endpoints (#42) #140

Merged
shahondin1624 merged 2 commits from feature/issue-42-inference-embedding-endpoints into main 2026-03-10 15:09:23 +01:00

## Summary
- Implement unary `Inference` endpoint: validates request, routes model, calls Ollama `generate()`, returns text + finish_reason + tokens_used
- Implement unary `GenerateEmbedding` endpoint: validates request, resolves embedding model, calls Ollama `embed()`, returns embedding vector + dimensions
- Both endpoints use model routing, audit logging, and consistent error mapping
- Added 7 validation tests, removed 2 stale unimplemented stub tests

## Test plan
- [x] All 74 model-gateway tests pass
- [x] Clippy clean (no warnings)
- [x] Request validation covers missing params, missing context, empty prompt/text, empty session_id
- [ ] Integration tests with mocked Ollama (deferred to issue #43)
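For readers who want a concrete picture of the validation cases listed in the test plan, here is a minimal sketch. The type names (`InferenceRequest`, `InferenceParams`, `RequestContext`, `ValidationError`), the field layout, and the function `validate_inference` are all hypothetical and are not taken from this PR; the actual generated request types and the order of the checks may differ.

```rust
// Hypothetical request shapes for illustration only; the real
// protobuf-generated types in model-gateway may look different.
#[derive(Debug, Default)]
pub struct RequestContext {
    pub session_id: String,
}

#[derive(Debug, Default)]
pub struct InferenceParams {
    pub prompt: String,
}

#[derive(Debug, Default)]
pub struct InferenceRequest {
    pub params: Option<InferenceParams>,
    pub context: Option<RequestContext>,
}

// One variant per validation case named in the test plan.
#[derive(Debug, PartialEq, Eq)]
pub enum ValidationError {
    MissingParams,
    MissingContext,
    EmptyPrompt,
    EmptySessionId,
}

// Assumed check order: presence of params and context first,
// then the contents of the prompt and the session id.
pub fn validate_inference(req: &InferenceRequest) -> Result<(), ValidationError> {
    let params = req.params.as_ref().ok_or(ValidationError::MissingParams)?;
    let context = req.context.as_ref().ok_or(ValidationError::MissingContext)?;
    if params.prompt.trim().is_empty() {
        return Err(ValidationError::EmptyPrompt);
    }
    if context.session_id.trim().is_empty() {
        return Err(ValidationError::EmptySessionId);
    }
    Ok(())
}
```

Under these assumptions, a request built with `InferenceRequest::default()` is rejected with `ValidationError::MissingParams`, and a request with a non-empty prompt and session id passes. A `GenerateEmbedding` validator would presumably follow the same pattern with the input text in place of the prompt.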
shahondin1624 added 2 commits 2026-03-10 15:09:17 +01:00
- Inference: validates request, routes model via ModelRouter, calls
  Ollama generate(), returns text + finish_reason + tokens_used
- GenerateEmbedding: validates request, resolves embedding model,
  calls Ollama embed(), returns embedding vector + dimensions
- Both endpoints use audit logging (best-effort) and consistent
  error mapping via ollama_err_to_status()
- Added 7 validation unit tests, removed 2 stale unimplemented tests
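The error mapping and best-effort audit logging mentioned above could look roughly like the sketch below. Only the name `ollama_err_to_status()` comes from the commit message. The `OllamaError` variants, the stand-in `Code` and `Status` types (used here instead of a real gRPC status type so the example has no dependencies), the chosen mapping, and the helper `audit_best_effort` are assumptions made for illustration.

```rust
// Stand-in for a gRPC status code; hypothetical, dependency-free.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Code {
    Unavailable,
    DeadlineExceeded,
    NotFound,
    Internal,
}

#[derive(Debug, PartialEq, Eq)]
pub struct Status {
    pub code: Code,
    pub message: String,
}

// Hypothetical backend error type; the real client error may differ.
#[derive(Debug)]
pub enum OllamaError {
    Connection(String),
    Timeout,
    ModelNotFound(String),
    Other(String),
}

// One place that turns backend errors into statuses, so both
// endpoints report failures the same way (assumed mapping).
pub fn ollama_err_to_status(err: &OllamaError) -> Status {
    let (code, message) = match err {
        OllamaError::Connection(detail) => {
            (Code::Unavailable, format!("backend unreachable: {detail}"))
        }
        OllamaError::Timeout => {
            (Code::DeadlineExceeded, "backend timed out".to_string())
        }
        OllamaError::ModelNotFound(model) => {
            (Code::NotFound, format!("model not found: {model}"))
        }
        OllamaError::Other(detail) => {
            (Code::Internal, format!("backend error: {detail}"))
        }
    };
    Status { code, message }
}

// Best-effort audit: a failed write is reported on stderr and
// otherwise ignored, so it never fails the request itself.
// Returns true if the audit record was written.
pub fn audit_best_effort<F>(write: F) -> bool
where
    F: FnOnce() -> Result<(), String>,
{
    match write() {
        Ok(()) => true,
        Err(e) => {
            eprintln!("audit log write failed (ignored): {e}");
            false
        }
    }
}
```

Keeping the mapping in a single function means a new backend error variant forces a compile error at one `match` rather than silently diverging between the two endpoints.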

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
shahondin1624 merged commit 80c272bd56 into main 2026-03-10 15:09:23 +01:00
shahondin1624 deleted branch feature/issue-42-inference-embedding-endpoints 2026-03-10 15:09:23 +01:00