feat: add LiteLLM as embedding provider#809
Conversation
c029eb3 to
849f9f5
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c029eb3b86
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| elif provider_name == "litellm": | ||
| from basic_memory.repository.litellm_provider import LiteLLMEmbeddingProvider | ||
|
|
||
| model_name = app_config.semantic_embedding_model or "openai/text-embedding-3-small" |
There was a problem hiding this comment.
Map the built-in default model for LiteLLM
When users switch only semantic_embedding_provider to litellm, BasicMemoryConfig still supplies the non-empty default model bge-small-en-v1.5, so this or never selects the LiteLLM provider default. The factory then instantiates LiteLLMEmbeddingProvider(model_name="bge-small-en-v1.5") instead of a LiteLLM-routable model such as openai/text-embedding-3-small, making the new provider fail for the documented minimal configuration; mirror the OpenAI branch's remapping of the FastEmbed default or otherwise treat it as unset.
Useful? React with 👍 / 👎.
|
cc @phernandez |
|
Thanks for opening this. I took a careful maintainer pass because this adds a new runtime provider and dependency. The direction is useful, but I do not think we can merge this as-is yet. Main blockers:
A smaller design question for maintainers/contributor: adding |
|
@RheagalFire thanks again for contributing this. The overall idea is useful, but there are a few correctness and test-coverage issues we need fixed before we can move it toward merge. Would you like to take a pass at addressing the review notes above? If so, we are happy to review another commit on this PR. If you would rather not, just say so and we can decide whether someone on the Basic Memory side should pick it up from here. |
Thanks for the review. I'm happy to pick up the changes. |
b106193 to
d6100c7
Compare
Signed-off-by: RheagalFire <[email protected]>
…ckfile Signed-off-by: RheagalFire <[email protected]>
d6100c7 to
7d068f5
Compare
Signed-off-by: RheagalFire <[email protected]>
|
Addressed all 3 blockers:
On the design question about dependency surface -- you are right, litellm pulls in a sizable transitive set. If you would prefer it as an optional extra rather than a core dependency, I am happy to move it to |
|
Thanks @RheagalFire — the rewrite addresses all three earlier blockers cleanly, and the factory mapping mirrors the OpenAI branch nicely. On the dependency-surface question: keep Before we merge I'd like to add two small things on top of your branch. I'll push a fixup commit so you don't have to context-switch:
Both are small; I'll keep your authorship on the commit history. |
Bring the LiteLLM provider in line with the unit-norm contract from sqlite_search_repository.py (lines 65-67): the cosine-similarity formula `1 - L²/2` is correct only for unit-normalized vectors. LiteLLM routes to many backends (Cohere, Vertex, Bedrock, etc.) that do not return normalized embeddings, so normalize at the provider boundary — same fix shape as the parallel FastEmbed change in basicmachines-co#843. Also align the response handling with OpenAIEmbeddingProvider: - attribute access on response items (item.index / item.embedding) - explicit duplicate-index guard Tests cover the three behaviors directly (unit norm, zero-vector pass-through, duplicate-index error) and the existing ordering test now reconstructs the expected normalized vectors so a normalization regression would be caught. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Signed-off-by: phernandez <[email protected]>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f9e7029ae7
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if normalized and len(normalized[0]) != self.dimensions: | ||
| raise RuntimeError( |
There was a problem hiding this comment.
Derive LiteLLM dimensions instead of hard-coding 1536
LiteLLMEmbeddingProvider enforces len(normalized[0]) == self.dimensions, but when semantic_embedding_dimensions is unset the factory leaves dimensions at the constructor default of 1536; this makes any non-1536 LiteLLM model (for example many Cohere/Azure deployments) fail at runtime even though the new provider is presented as broadly provider-agnostic. In practice, switching semantic_embedding_model to a valid non-OpenAI model will reliably raise this error during indexing unless users manually discover and set dimensions, so the provider should infer dimensions from the first response (or fail earlier with explicit config validation).
Useful? React with 👍 / 👎.
Summary
LiteLLMEmbeddingProviderimplementing theEmbeddingProviderprotocol, following the exact same pattern asOpenAIEmbeddingProvidercreate_embedding_provider()factory withprovider_name == "litellm"Changes
src/basic_memory/repository/litellm_provider.py- newLiteLLMEmbeddingProviderwith:litellm.aembedding()for async embeddingdrop_params=Truefor cross-provider kwargs compatibilitysrc/basic_memory/repository/embedding_provider_factory.py- addedelif provider_name == "litellm"branchpyproject.toml- addedlitellm>=1.60.0,<2.0.0to dependenciestests/repository/test_litellm_provider.py- 13 unit tests (all passing)Tests
Unit tests (13/13 passing):
Example usage
See https://docs.litellm.ai/docs/embedding/supported_embedding for all supported embedding models.
Impact
litellmadded as dependency in pyproject.tomldrop_params=Truesilently drops provider-unsupported kwargsOpenAIEmbeddingProviderprovider_name == "litellm"config