[bot] Add sentence-transformers integration for SentenceTransformer.encode, CrossEncoder, and SparseEncoder embedding instrumentation #515

@braintrust-bot

Summary

The sentence-transformers package is one of the most widely used Python libraries for generating text embeddings locally. Its SentenceTransformer.encode() method is the library's main embedding entry point and underpins RAG pipelines, semantic search, clustering, and reranking in many applications. The latest release is v5.5.1 (May 12, 2026). This repository has no instrumentation for any sentence-transformers execution surface: no integration directory, no wrapper, no patcher, and no auto_instrument() support.

This gap is distinct from the existing huggingface_hub integration, which traces cloud inference through InferenceClient.feature_extraction() (HuggingFace Inference API). sentence-transformers runs models locally using a different execution path: SentenceTransformer(model_name).encode(texts) does not call any remote API and cannot be traced through huggingface_hub or any existing integration.

Comparable embedding/execution libraries with dedicated integrations in this repo: huggingface_hub (cloud inference), openai (embeddings via client.embeddings.create()), cohere (embeddings via client.embed()).

What needs to be instrumented

The sentence-transformers package exposes these execution surfaces, none of which are instrumented:

Dense embeddings (highest priority)

| Class / Method | Description | Return type |
| --- | --- | --- |
| SentenceTransformer.encode(sentences, ...) | Generate dense vector embeddings for a list of sentences (the primary execution surface) | np.ndarray or list[Tensor] |
| SentenceTransformer.encode_multi_process(sentences, pool, ...) | Parallel embedding generation across multiple CPUs/GPUs | np.ndarray |

encode() accepts batch_size, show_progress_bar, output_value (sentence_embedding, token_embeddings), precision (float32, int8, uint8, binary, ubinary), convert_to_numpy, convert_to_tensor, device, normalize_embeddings.

Token counting: encode() calls the underlying tokenizer; prompt token counts can be extracted from the tokenizer before encoding.

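Async: No async variant exists in the standard API; parallelism is via encode_multi_process(). In the v5 releases this method is deprecated and encode() itself accepts a process pool or a list of devices, so an integration should expect multi-process calls to arrive through encode() as well.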
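As a rough illustration of how the encode() keyword arguments listed above could feed span metadata, the sketch below filters a call's kwargs down to those names. The helper name encode_call_metadata and the choice to stringify non-scalar values are assumptions made for this example, not part of sentence-transformers or Braintrust.

```python
from typing import Any

# Keyword arguments named in the issue text; treated here as span metadata.
ENCODE_METADATA_KEYS = (
    "batch_size",
    "output_value",
    "precision",
    "convert_to_numpy",
    "convert_to_tensor",
    "device",
    "normalize_embeddings",
)


def encode_call_metadata(kwargs: dict[str, Any]) -> dict[str, Any]:
    """Pick the keyword arguments worth logging out of an encode() call (hypothetical helper)."""
    metadata: dict[str, Any] = {}
    for key in ENCODE_METADATA_KEYS:
        if key not in kwargs:
            continue  # argument was omitted, so the library default applies; log nothing
        value = kwargs[key]
        # Device objects and similar values are not JSON-friendly,
        # so anything that is not a plain scalar is stringified.
        if value is None or isinstance(value, (bool, int, float, str)):
            metadata[key] = value
        else:
            metadata[key] = str(value)
    return metadata


example = encode_call_metadata(
    {"batch_size": 64, "precision": "int8", "show_progress_bar": False}
)
```

Only keyword arguments are inspected here; a real wrapper would also need to decide how to treat arguments passed positionally.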

Reranking (CrossEncoder)

| Class / Method | Description | Return type |
| --- | --- | --- |
| CrossEncoder.predict(sentence_pairs, ...) | Compute similarity scores for sentence pairs; used for reranking retrieved documents | np.ndarray |
| CrossEncoder.rank(query, documents, ...) | Rank documents against a query using cross-encoder scoring | list[dict] |
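For the rank() surface, a span would probably log a compact summary rather than the full result list. The sketch below assumes each result is a dict carrying corpus_id and score keys; the table above only says list[dict], so that shape, the helper name summarize_ranking, and the sample data are all assumptions for illustration.

```python
from typing import Any


def summarize_ranking(results: list[dict[str, Any]], top_k: int = 3) -> dict[str, Any]:
    """Reduce a ranking result to a count plus its top_k entries (hypothetical helper)."""
    # Sort defensively by score so the summary does not rely on input order.
    ordered = sorted(results, key=lambda r: r["score"], reverse=True)
    return {
        "count": len(ordered),
        "top": [
            {"corpus_id": int(r["corpus_id"]), "score": float(r["score"])}
            for r in ordered[:top_k]
        ],
    }


# Made-up sample shaped like the assumed rank() output.
sample = [
    {"corpus_id": 2, "score": 0.91},
    {"corpus_id": 0, "score": 0.15},
    {"corpus_id": 1, "score": 0.47},
]
summary = summarize_ranking(sample, top_k=2)
```

Casting to int and float keeps the summary serializable even if the scores arrive as NumPy scalars.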

Sparse embeddings (SparseEncoder)

| Class / Method | Description | Return type |
| --- | --- | --- |
| SparseEncoder.encode(sentences, ...) | Generate sparse vector representations; used with SPLADE and similar models | sparse torch.Tensor by default (vocabulary-sized weights per input) |

Implementation notes

Local inference model: Unlike cloud API SDKs, sentence-transformers loads models locally using PyTorch. Patching the execution surface means wrapping SentenceTransformer.encode() directly rather than intercepting HTTP requests.

Span metrics: Useful span fields include:

  • input: the list of sentences encoded
  • output: embedding dimensions/count (not the raw vectors — potentially large)
  • metadata: model_name_or_path, device, precision, normalize_embeddings, model architecture details from SentenceTransformer.model_card_data
  • metrics: encoding latency, sentence count, approximate token counts
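The span fields listed above could be assembled roughly as follows. This is a minimal sketch: summarize_embeddings and build_span_fields are hypothetical helper names, the field layout simply mirrors the bullets, and a nested list stands in for the real encode() output so the example runs without the package installed.

```python
import time
from typing import Any, Sequence


def summarize_embeddings(embeddings: Any) -> dict[str, int]:
    """Describe sentence embeddings by count and dimension instead of logging raw vectors."""
    shape = getattr(embeddings, "shape", None)  # present on arrays and tensors
    if shape is not None:
        if len(shape) == 1:  # a single sentence was encoded
            return {"count": 1, "dimensions": int(shape[0])}
        if len(shape) == 2:  # one row per sentence
            return {"count": int(shape[0]), "dimensions": int(shape[1])}
    rows = list(embeddings)  # fallback: a list of per-sentence vectors
    return {"count": len(rows), "dimensions": len(rows[0]) if rows else 0}


def build_span_fields(
    sentences: Sequence[str],
    embeddings: Any,
    metadata: dict[str, Any],
    duration_s: float,
) -> dict[str, Any]:
    """Group the logged values under input / output / metadata / metrics."""
    return {
        "input": list(sentences),
        "output": summarize_embeddings(embeddings),
        "metadata": dict(metadata),
        "metrics": {"duration_s": duration_s, "sentence_count": len(sentences)},
    }


start = time.perf_counter()
fake_embeddings = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]  # stand-in for encode() output
fields = build_span_fields(
    ["first sentence", "second sentence"],
    fake_embeddings,
    {"precision": "float32"},
    time.perf_counter() - start,
)
```

Logging only the count and dimension keeps span payloads small, at the cost of not being able to inspect individual vectors later.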

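Token count estimation: The tokenizer is accessible at SentenceTransformer.tokenizer. Token counts for the input can be computed before encoding via len(tokenizer.encode(sentence)). These counts are approximate: they do not reflect truncation to the model's max_seq_length, so they can overcount for long inputs.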
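The counting approach described above might look like the sketch below. It assumes only that the tokenizer object exposes encode(text) returning a sequence of token IDs; the WhitespaceTokenizer class is a stand-in invented so the example runs on its own, and the optional max_seq_length cap is an assumption about how truncation could be approximated.

```python
from typing import Any, Iterable, Optional


def approx_token_count(
    tokenizer: Any,
    sentences: Iterable[str],
    max_seq_length: Optional[int] = None,
) -> int:
    """Sum per-sentence token counts, optionally capping each at max_seq_length."""
    total = 0
    for sentence in sentences:
        n = len(tokenizer.encode(sentence))
        if max_seq_length is not None:
            n = min(n, max_seq_length)  # approximate the model's truncation
        total += n
    return total


class WhitespaceTokenizer:
    """Stand-in tokenizer for this sketch: one token ID per whitespace-separated word."""

    def encode(self, text: str) -> list[int]:
        return list(range(len(text.split())))


tok = WhitespaceTokenizer()
uncapped = approx_token_count(tok, ["a b c", "d e"])
capped = approx_token_count(tok, ["a b c", "d e"], max_seq_length=2)
```

Tokenizing every input a second time adds overhead, so an integration may prefer to make this metric optional for large batches.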

Model identification: SentenceTransformer.model_card_data.model_id or the constructor argument provides the model name for span metadata.

Patching strategy: Wrap SentenceTransformer.encode and CrossEncoder.predict at the class level. SparseEncoder.encode follows the same pattern as SentenceTransformer.encode.
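Class-level wrapping, as described above, can be sketched in plain Python. The patch_method helper, the marker attribute, and the record layout are hypothetical and do not reflect this repository's patcher API; StandInEncoder replaces SentenceTransformer so the example runs without the package installed.

```python
import functools
import time
from typing import Any, Callable

_PATCHED_FLAG = "_sketch_patched"  # hypothetical marker used to keep patching idempotent


def patch_method(cls: type, name: str, on_finish: Callable[[dict], None]) -> None:
    """Replace cls.<name> with a wrapper that reports one record per call."""
    original = getattr(cls, name)
    if getattr(original, _PATCHED_FLAG, False):
        return  # already wrapped; do not stack wrappers

    @functools.wraps(original)
    def wrapper(self: Any, *args: Any, **kwargs: Any) -> Any:
        start = time.perf_counter()
        error = None
        try:
            return original(self, *args, **kwargs)
        except Exception as exc:
            error = repr(exc)
            raise  # never swallow the caller's exception
        finally:
            on_finish(
                {
                    "name": f"{cls.__name__}.{name}",
                    "duration_s": time.perf_counter() - start,
                    "error": error,
                }
            )

    setattr(wrapper, _PATCHED_FLAG, True)
    setattr(cls, name, wrapper)


class StandInEncoder:
    """Stand-in for SentenceTransformer in this sketch."""

    def encode(self, sentences: list[str], batch_size: int = 32) -> list[list[float]]:
        return [[0.0, 0.0, 0.0] for _ in sentences]


records: list[dict] = []
patch_method(StandInEncoder, "encode", records.append)
patch_method(StandInEncoder, "encode", records.append)  # second call is a no-op
vectors = StandInEncoder().encode(["a", "b"])
```

Patching the class rather than individual instances means models constructed before or after the patch are both covered, which is why the idempotency check matters.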

No coverage in any instrumentation layer

  • No integration directory (py/src/braintrust/integrations/sentence_transformers/)
  • No wrapper function (e.g. wrap_sentence_transformers())
  • No patcher in any existing integration
  • No nox test session (test_sentence_transformers)
  • No version entry in py/src/braintrust/integrations/versioning.py
  • No mention in py/src/braintrust/integrations/__init__.py

A grep for sentence.transformers, sentence_transformers, or sbert across py/src/braintrust/ returns zero matches.

Braintrust docs status

not_found: sentence-transformers is not listed on the Braintrust integrations directory or the tracing guide. There is no auto_instrument() reference, no wrap_sentence_transformers() function, and no sentence-transformers setup documentation anywhere in Braintrust docs.

Upstream references

Local repo files inspected

  • py/src/braintrust/integrations/ — no sentence_transformers/ directory exists on main
  • py/src/braintrust/wrappers/ — no sentence-transformers wrapper
  • py/noxfile.py — no test_sentence_transformers session
  • py/src/braintrust/integrations/__init__.py — sentence-transformers not listed in integration registry
  • py/src/braintrust/integrations/versioning.py — no sentence-transformers version matrix
  • py/pyproject.toml [tool.braintrust.matrix] — no sentence-transformers entry
  • py/src/braintrust/auto.py — sentence-transformers not listed in auto_instrument() parameters
  • Full repo grep for sentence.transformers, sentence_transformers, sbert across py/src/braintrust/ — zero matches
