feat: add LiteLLM as embedding provider by RheagalFire · Pull Request #809 · basicmachines-co/basic-memory

RheagalFire · 2026-05-08T22:36:42Z

Summary

- Adds LiteLLM as a new semantic embedding provider, giving access to 100+ embedding backends (OpenAI, Cohere, Azure, Bedrock, etc.) through a single unified SDK
- Adds a new LiteLLMEmbeddingProvider that implements the EmbeddingProvider protocol and follows the same pattern as OpenAIEmbeddingProvider
- Wires the provider into the create_embedding_provider() factory under provider_name == "litellm"

Changes

- src/basic_memory/repository/litellm_provider.py - new LiteLLMEmbeddingProvider with:
  - litellm.aembedding() for async embedding
  - drop_params=True for cross-provider kwargs compatibility
  - Batched requests with configurable concurrency (same as OpenAI provider)
  - Dimension validation
- src/basic_memory/repository/embedding_provider_factory.py - added elif provider_name == "litellm" branch
- pyproject.toml - added litellm>=1.60.0,<2.0.0 to dependencies
- tests/repository/test_litellm_provider.py - 13 unit tests (all passing)
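The "batched requests with configurable concurrency" item can be illustrated with a minimal sketch. This is not the PR's implementation: `embed_in_batches`, `batch_size`, `max_concurrency`, and `fake_embed_batch` are hypothetical names, and the `embed_batch` callable stands in for whatever coroutine wraps litellm.aembedding() in the real provider.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence

Vector = List[float]
EmbedBatch = Callable[[Sequence[str]], Awaitable[List[Vector]]]


async def embed_in_batches(
    texts: Sequence[str],
    embed_batch: EmbedBatch,
    batch_size: int = 64,
    max_concurrency: int = 4,
) -> List[Vector]:
    """Embed texts in fixed-size batches, at most max_concurrency at a time."""
    if batch_size < 1 or max_concurrency < 1:
        raise ValueError("batch_size and max_concurrency must be >= 1")

    semaphore = asyncio.Semaphore(max_concurrency)
    batches = [texts[i : i + batch_size] for i in range(0, len(texts), batch_size)]

    async def run_one(batch: Sequence[str]) -> List[Vector]:
        # The semaphore caps how many backend requests are in flight.
        async with semaphore:
            return await embed_batch(batch)

    # gather() returns results in the order the awaitables were passed,
    # so flattening keeps vectors aligned with the input texts.
    results = await asyncio.gather(*(run_one(batch) for batch in batches))
    return [vector for batch_vectors in results for vector in batch_vectors]


async def fake_embed_batch(batch: Sequence[str]) -> List[Vector]:
    """Hypothetical stand-in backend: one 2-d vector per text."""
    await asyncio.sleep(0)
    return [[float(len(text)), 1.0] for text in batch]
```

Calling `asyncio.run(embed_in_batches(texts, fake_embed_batch, batch_size=2))` exercises the batching path without any network access.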

Tests

Unit tests (13/13 passing):

$ pytest tests/repository/test_litellm_provider.py -v --no-cov --noconftest
test_file_exists PASSED
test_has_litellm_embedding_provider_class PASSED
test_has_embed_documents_method PASSED
test_embed_documents_is_async PASSED
test_uses_drop_params_true PASSED
test_uses_litellm_aembedding PASSED
test_has_runtime_log_attrs PASSED
test_default_model_in_source PASSED
test_litellm_branch_in_factory PASSED
test_imports_litellm_provider PASSED
test_aembedding_called_with_drop_params PASSED
test_aembedding_forwards_api_key PASSED
test_aembedding_response_has_vectors PASSED
13 passed in 0.04s

Example usage

# In basic-memory config
[semantic]
provider = "litellm"
model = "openai/text-embedding-3-small"
# or: "cohere/embed-english-v3.0", "azure/my-deployment", etc.

import asyncio

from basic_memory.repository.litellm_provider import LiteLLMEmbeddingProvider


async def main() -> None:
    provider = LiteLLMEmbeddingProvider(
        model_name="openai/text-embedding-3-small",
        dimensions=1536,
        # LiteLLM reads OPENAI_API_KEY, COHERE_API_KEY, etc. from env automatically
    )

    # Both methods are coroutines, so they must be awaited inside an async function.
    vectors = await provider.embed_documents(["hello world", "basic memory"])
    query_vec = await provider.embed_query("search term")


asyncio.run(main())

See https://docs.litellm.ai/docs/embedding/supported_embedding for all supported embedding models.

Impact

- Additive only; existing providers (fastembed, openai) are untouched
- litellm added as a dependency in pyproject.toml
- drop_params=True silently drops provider-unsupported kwargs
- Same batching, concurrency, and dimension validation as OpenAIEmbeddingProvider
- Factory selects the provider via the provider_name == "litellm" config value

CLAassistant · 2026-05-08T22:36:48Z

All committers have signed the CLA.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c029eb3b86


chatgpt-codex-connector · 2026-05-08T22:38:41Z

+    elif provider_name == "litellm":
+        from basic_memory.repository.litellm_provider import LiteLLMEmbeddingProvider
+
+        model_name = app_config.semantic_embedding_model or "openai/text-embedding-3-small"


Map the built-in default model for LiteLLM

When users switch only semantic_embedding_provider to litellm, BasicMemoryConfig still supplies the non-empty default model bge-small-en-v1.5, so the `or` fallback never selects the LiteLLM default. The factory then instantiates LiteLLMEmbeddingProvider(model_name="bge-small-en-v1.5") instead of a LiteLLM-routable model such as openai/text-embedding-3-small, and the new provider fails for the documented minimal configuration. Mirror the OpenAI branch's remapping of the FastEmbed default, or otherwise treat that value as unset.
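The remapping suggested here could look roughly like the sketch below. It is an illustration only: `resolve_litellm_model` and both constants are hypothetical names, and the actual factory code in the PR may be structured differently. The two model strings are the ones quoted in this review.

```python
from typing import Optional

# Assumed names for illustration; the values are the ones quoted in the review.
FASTEMBED_DEFAULT_MODEL = "bge-small-en-v1.5"
LITELLM_DEFAULT_MODEL = "openai/text-embedding-3-small"


def resolve_litellm_model(configured_model: Optional[str]) -> str:
    """Return a LiteLLM-routable model name for the configured value."""
    # Treat "unset" and "still the FastEmbed default" the same way:
    # neither is an explicit choice of a LiteLLM model.
    if not configured_model or configured_model == FASTEMBED_DEFAULT_MODEL:
        return LITELLM_DEFAULT_MODEL
    return configured_model
```

A plain `configured_model or LITELLM_DEFAULT_MODEL` only covers the empty case, which is the gap the review points out.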


RheagalFire · 2026-05-08T22:39:30Z

cc @phernandez

phernandez · 2026-05-14T14:59:54Z

Thanks for opening this. I took a careful maintainer pass because this adds a new runtime provider and dependency. The direction is useful, but I do not think we can merge this as-is yet.

Main blockers:

1. The new tests do not actually exercise LiteLLMEmbeddingProvider.

Most of tests/repository/test_litellm_provider.py parses source text with AST/string checks, and the SDK interaction tests call fake.aembedding() directly instead of importing the provider and calling embed_documents() / embed_query(). I ran:
```
python -m pytest tests/repository/test_litellm_provider.py --cov=basic_memory.repository.litellm_provider --cov-report=term-missing
```
and coverage reported that basic_memory.repository.litellm_provider was never imported; the new provider file stayed at 0% coverage. That means regressions in the actual provider implementation would not fail the test suite.

What I would expect here is closer to the existing OpenAI/FastEmbed provider tests: import LiteLLMEmbeddingProvider, monkeypatch sys.modules["litellm"] with an async aembedding, call the provider methods, and assert batching, output ordering, dimensions, API key forwarding, missing dependency behavior, and malformed response handling through the provider itself.

2. The default LiteLLM provider config currently creates an invalid model/dimension pairing.

BasicMemoryConfig.semantic_embedding_model defaults to "bge-small-en-v1.5". The new factory branch uses:
```
model_name = app_config.semantic_embedding_model or "openai/text-embedding-3-small"
```
so semantic_embedding_provider="litellm" with otherwise default config creates a LiteLLM provider with model_name="bge-small-en-v1.5" and dimensions=1536. I verified that a 384-dimensional response then fails the provider's dimension check.

This should either map the Basic Memory default model to a valid LiteLLM default, adjust dimensions consistently, or require explicit LiteLLM model/dimensions config with a clear error. The important part is that selecting provider = "litellm" should not create a broken provider by default.

3. uv.lock was not updated after adding litellm to pyproject.toml.

uv lock --check fails on this branch with:
```
The lockfile at `uv.lock` needs to be updated, but `--check` was provided.
```
Please run uv lock and include the lockfile update if this stays as a direct project dependency.
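The testing approach described in the first blocker (swap a fake module into sys.modules["litellm"], then call the provider itself) can be sketched as follows. Everything here is a stand-in: `SketchEmbeddingProvider` is a hypothetical simplified class, not the PR's LiteLLMEmbeddingProvider, and the fake response shape (objects with `.index` and `.embedding` under `.data`) is an assumption made for the sketch.

```python
import asyncio
import sys
import types
from typing import List
from unittest import mock


class SketchEmbeddingProvider:
    """Hypothetical stand-in for the provider under test (not the PR's class)."""

    def __init__(self, model_name: str, dimensions: int) -> None:
        self.model_name = model_name
        self.dimensions = dimensions

    async def embed_documents(self, texts: List[str]) -> List[List[float]]:
        import litellm  # imported lazily so a test can swap in a fake module

        response = await litellm.aembedding(
            model=self.model_name, input=texts, drop_params=True
        )
        # Sort by index so output order matches input order.
        items = sorted(response.data, key=lambda item: item.index)
        vectors = [list(item.embedding) for item in items]
        if vectors and len(vectors[0]) != self.dimensions:
            raise RuntimeError(
                f"expected {self.dimensions} dimensions, got {len(vectors[0])}"
            )
        return vectors


def make_fake_litellm(calls: list) -> types.ModuleType:
    """Build a fake `litellm` module whose aembedding records its kwargs."""

    async def aembedding(**kwargs):
        calls.append(kwargs)
        # Return items in reverse order to prove the provider re-sorts them.
        data = [
            types.SimpleNamespace(index=i, embedding=[float(i), 0.0])
            for i in reversed(range(len(kwargs["input"])))
        ]
        return types.SimpleNamespace(data=data)

    fake = types.ModuleType("litellm")
    fake.aembedding = aembedding
    return fake


def run_sketch_test() -> list:
    """Drive the provider end to end against the fake module."""
    calls: list = []
    # patch.dict restores sys.modules when the block exits.
    with mock.patch.dict(sys.modules, {"litellm": make_fake_litellm(calls)}):
        provider = SketchEmbeddingProvider("openai/text-embedding-3-small", 2)
        vectors = asyncio.run(provider.embed_documents(["a", "b", "c"]))
    assert vectors == [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
    assert calls[0]["drop_params"] is True
    return calls
```

In a pytest suite, `monkeypatch.setitem(sys.modules, "litellm", fake)` would play the role of `mock.patch.dict`. Because the assertions go through `embed_documents()`, a coverage run would import and execute the provider module instead of leaving it at 0%.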

A smaller design question for maintainers/contributor: adding litellm as a default dependency is a fairly large dependency surface for all Basic Memory installs. That may still be acceptable, but it is worth explicitly confirming whether this should be a core dependency or an optional semantic-provider extra.

phernandez · 2026-05-14T15:00:08Z

@RheagalFire thanks again for contributing this. The overall idea is useful, but there are a few correctness and test-coverage issues we need fixed before we can move it toward merge.

Would you like to take a pass at addressing the review notes above? If so, we are happy to review another commit on this PR. If you would rather not, just say so and we can decide whether someone on the Basic Memory side should pick it up from here.

RheagalFire · 2026-05-14T16:06:40Z

> @RheagalFire thanks again for contributing this. The overall idea is useful, but there are a few correctness and test-coverage issues we need fixed before we can move it toward merge.
>
> Would you like to take a pass at addressing the review notes above? If so, we are happy to review another commit on this PR. If you would rather not, just say so and we can decide whether someone on the Basic Memory side should pick it up from here.

Thanks for the review. I'm happy to pick up the changes.

RheagalFire · 2026-05-18T21:58:48Z

@phernandez

Addressed all 3 blockers:

1. Rewrote tests to exercise LiteLLMEmbeddingProvider directly -- 13 tests covering embed_query, embed_documents, batching, api_key forwarding, drop_params, dimension mismatch, missing dependency, output ordering, and factory selection
2. Fixed default model mapping -- bge-small-en-v1.5 now remaps to openai/text-embedding-3-small in the factory (matching the OpenAI branch pattern)
3. Also to confirm: uv lock --check now passes cleanly.

On the design question about dependency surface -- you are right, litellm pulls in a sizable transitive set. If you would prefer it as an optional extra rather than a core dependency, I am happy to move it to [project.optional-dependencies] so users install with pip install basic-memory[litellm]. Let me know your preference and I will adjust.

phernandez · 2026-05-26T18:25:36Z

Thanks @RheagalFire — the rewrite addresses all three earlier blockers cleanly, and the factory mapping mirrors the OpenAI branch nicely.

On the dependency-surface question: keep litellm as a core dependency. It aligns with our near-term roadmap where LiteLLM expands from embedding-only to a general LLM provider (BYO key / Ollama / cloud chat completions + provider fallback), so making it optional would just create churn when that work lands.

Before we merge I'd like to add two small things on top of your branch. I'll push a fixup commit so you don't have to context-switch:

1. L2-normalize the LiteLLM output vectors. sqlite_search_repository assumes unit-norm vectors (the 1 - L²/2 cosine-similarity formula). FastEmbed has the same gap (see "fix(core): L2-normalize FastEmbed vectors to satisfy unit-vector contract", #843). The contract is satisfied for free with OpenAI's text-embedding-3-* models, but routing through LiteLLM exposes us to backends (Cohere, Vertex, Bedrock, etc.) that don't return normalized vectors by default. The fix has the same shape as #843.
2. Mirror the OpenAI provider's response handling. Switch item["index"] / item["embedding"] to attribute access (item.index / item.embedding) and add the duplicate-index check the OpenAI provider already has. This keeps the two providers visually parallel.

Both are small; I'll keep your authorship on the commit history.
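Why the unit-norm assumption matters: for vectors a and b with Euclidean distance d, d² = |a|² + |b|² - 2(a·b). When both have length 1 this reduces to d² = 2 - 2·cos(a, b), so 1 - d²/2 equals the cosine similarity; for unnormalized vectors the identity breaks. The sketch below checks this numerically. The function names are hypothetical and the code is independent of the actual repository implementation.

```python
import math
from typing import List, Sequence


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; an all-zero vector passes through unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Reference cosine similarity, valid for any non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_from_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - d^2 / 2, where d is the Euclidean distance between a and b."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1.0 - squared_distance / 2.0


if __name__ == "__main__":
    a, b = [3.0, 4.0], [1.0, 0.0]
    # Raw vectors: the shortcut formula disagrees with true cosine similarity.
    print(similarity_from_l2(a, b), cosine_similarity(a, b))
    # Normalized vectors: the two values agree up to floating-point error.
    print(similarity_from_l2(l2_normalize(a), l2_normalize(b)))
```

Normalizing once at the provider boundary means every downstream consumer of the vectors can rely on the shortcut formula.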

Bring the LiteLLM provider in line with the unit-norm contract from sqlite_search_repository.py (lines 65-67): the cosine-similarity formula `1 - L²/2` is correct only for unit-normalized vectors. LiteLLM routes to many backends (Cohere, Vertex, Bedrock, etc.) that do not return normalized embeddings, so normalize at the provider boundary. This is the same fix shape as the parallel FastEmbed change in basicmachines-co#843.

Also align the response handling with OpenAIEmbeddingProvider:
- attribute access on response items (item.index / item.embedding)
- explicit duplicate-index guard

Tests cover the three behaviors directly (unit norm, zero-vector pass-through, duplicate-index error) and the existing ordering test now reconstructs the expected normalized vectors so a normalization regression would be caught.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Signed-off-by: phernandez <[email protected]>
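The duplicate-index guard with attribute access might look like the following sketch. `order_embeddings` is a hypothetical helper, not a function from the PR, and the item shape (objects exposing `.index` and `.embedding`) follows the attribute-access style described in the commit message.

```python
from types import SimpleNamespace
from typing import Dict, Iterable, List


def order_embeddings(items: Iterable, expected_count: int) -> List[List[float]]:
    """Place each embedding at its reported index, rejecting bad responses."""
    by_index: Dict[int, List[float]] = {}
    for item in items:
        # A repeated index would silently overwrite a vector, so fail loudly.
        if item.index in by_index:
            raise RuntimeError(f"duplicate embedding index {item.index}")
        by_index[item.index] = list(item.embedding)

    # Every input position must be covered exactly once: no gaps, no extras.
    if sorted(by_index) != list(range(expected_count)):
        raise RuntimeError(
            f"expected indexes 0..{expected_count - 1}, got {sorted(by_index)}"
        )
    return [by_index[i] for i in range(expected_count)]


if __name__ == "__main__":
    response_items = [
        SimpleNamespace(index=1, embedding=[0.0, 1.0]),
        SimpleNamespace(index=0, embedding=[1.0, 0.0]),
    ]
    print(order_embeddings(response_items, expected_count=2))
```

Keying by index instead of trusting response order keeps the output aligned with the input texts even when a backend returns items out of order.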

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f9e7029ae7


chatgpt-codex-connector · 2026-05-26T18:32:39Z

+        if normalized and len(normalized[0]) != self.dimensions:
+            raise RuntimeError(


Derive LiteLLM dimensions instead of hard-coding 1536

LiteLLMEmbeddingProvider enforces len(normalized[0]) == self.dimensions, but when semantic_embedding_dimensions is unset the factory leaves dimensions at the constructor default of 1536. Any non-1536 LiteLLM model (for example, many Cohere/Azure deployments) therefore fails at runtime, even though the new provider is presented as broadly provider-agnostic. In practice, switching semantic_embedding_model to a valid non-OpenAI model will raise this error during indexing unless users discover and set dimensions manually. The provider should infer dimensions from the first response, or fail earlier with explicit config validation.
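One way to read the "infer dimensions from the first response" suggestion is sketched below. `DimensionTracker` is a hypothetical helper, not part of the PR. Whether the rest of Basic Memory can accept a width learned at runtime is not settled in this thread, so treat this as an illustration of the idea only.

```python
from typing import List, Optional


class DimensionTracker:
    """Validate vector width, learning it from the first batch when unset."""

    def __init__(self, dimensions: Optional[int] = None) -> None:
        # None means "not configured": adopt the width of the first response.
        self.dimensions = dimensions

    def check(self, vectors: List[List[float]]) -> List[List[float]]:
        if not vectors:
            return vectors
        widths = {len(vector) for vector in vectors}
        if len(widths) != 1:
            raise RuntimeError(f"inconsistent vector widths in one batch: {sorted(widths)}")
        width = widths.pop()
        if self.dimensions is None:
            self.dimensions = width
        elif width != self.dimensions:
            raise RuntimeError(
                f"expected {self.dimensions} dimensions, got {width}; "
                "set the embedding dimensions explicitly for this model"
            )
        return vectors
```

With an explicit `dimensions` value the tracker behaves like the strict check quoted in the diff; with `None` it accepts the first batch and enforces that width afterwards.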


RheagalFire force-pushed the feat/add-litellm-provider branch from c029eb3 to 849f9f5 Compare May 8, 2026 22:37

chatgpt-codex-connector Bot reviewed May 8, 2026

phernandez added the On Hold Don't review or merge. Work is pending label May 14, 2026

RheagalFire force-pushed the feat/add-litellm-provider branch from b106193 to d6100c7 Compare May 18, 2026 21:15

RheagalFire added 2 commits May 19, 2026 02:46

feat: add LiteLLM as embedding provider

be7cd78

Signed-off-by: RheagalFire <[email protected]>

fix: rewrite tests to exercise provider, fix default model, update lo…

7d068f5

…ckfile Signed-off-by: RheagalFire <[email protected]>

RheagalFire force-pushed the feat/add-litellm-provider branch from d6100c7 to 7d068f5 Compare May 18, 2026 21:16

fix: minimal uv.lock update for litellm deps only

fce7a67

Signed-off-by: RheagalFire <[email protected]>

chatgpt-codex-connector Bot reviewed May 26, 2026

RheagalFire wants to merge 4 commits into basicmachines-co:main from RheagalFire:feat/add-litellm-provider

3 participants
