LCORE-2308: LlamaStack Pydantic AI Provider#1806
Conversation
WalkthroughThis PR adds a Pydantic AI provider that integrates with Llama Stack, enabling OpenAI-compatible API calls to be routed through in-process Llama Stack route handlers or remote Llama Stack servers. The implementation includes a dual-mode ChangesPydantic AI Llama Stack Integration
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
✨ Simplify code
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 8
♻️ Duplicate comments (3)
src/pydantic_ai_lightspeed/llamastack/_provider.py (2)
85-87:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winConfigure a timeout on the library-mode
httpx.AsyncClient.Although this transport dispatches in-process, the client is constructed without an explicit timeout (also flagged by static analysis). Set an explicit
timeoutto keep behavior predictable and silence the scanner.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/pydantic_ai_lightspeed/llamastack/_provider.py` around lines 85 - 87, The AsyncClient instantiation for lib_http_client is missing an explicit timeout; update the httpx.AsyncClient(...) call (where lib_http_client is created, passing the existing transport and base_url) to include a timeout parameter (e.g., timeout=httpx.Timeout(10.0) or another sensible duration) so the client has a bounded request timeout and the static scanner is satisfied.
74-81:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReplace
assertwith explicit exceptions for argument validation.These mutual-exclusion checks are stripped when Python runs with
-O, so the constraint silently disappears in optimized builds. Raise a proper error instead.🛡️ Proposed fix
- if library_client is not None: - assert ( - base_url is None - ), "Cannot provide both `library_client` and `base_url`" - assert api_key is None, "Cannot provide both `library_client` and `api_key`" - assert ( - http_client is None - ), "Cannot provide both `library_client` and `http_client`" + if library_client is not None: + if base_url is not None: + raise ValueError("Cannot provide both `library_client` and `base_url`") + if api_key is not None: + raise ValueError("Cannot provide both `library_client` and `api_key`") + if http_client is not None: + raise ValueError("Cannot provide both `library_client` and `http_client`")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/pydantic_ai_lightspeed/llamastack/_provider.py` around lines 74 - 81, Replace the runtime-optimized asserts in the mutual-exclusion checks with explicit exceptions so validation cannot be stripped under -O: in the block checking library_client against base_url, api_key, and http_client (symbols: library_client, base_url, api_key, http_client) raise a clear exception (e.g., ValueError or TypeError) for each conflict instead of using assert, preserving the existing messages like "Cannot provide both `library_client` and `base_url`" etc.; ensure each check uses raise with the same descriptive text so callers get deterministic validation errors regardless of Python optimization flags.src/pydantic_ai_lightspeed/llamastack/_transport.py (1)
100-100:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAvoid
assertfor runtime invariants.
assert self._client.route_impls is not Noneis removed underpython -O. Sincehandle_async_requestalready raisesRuntimeErrorwhenroute_impls is None, narrow the type explicitly or raise instead of asserting. Same applies to Line 133.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/pydantic_ai_lightspeed/llamastack/_transport.py` at line 100, Replace the runtime `assert` checks on `self._client.route_impls` with an explicit None-check and raise a RuntimeError (or use typing.cast after the check) so the invariant holds even under -O; specifically, in handle_async_request (and the similar occurrence at line 133) change `assert self._client.route_impls is not None` to an explicit `if self._client.route_impls is None: raise RuntimeError("route_impls is not initialized")` and then assign `route_impls = self._client.route_impls` (or `route_impls = cast(ConcreteType, self._client.route_impls)`) to narrow the type for subsequent code.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/pydantic_ai_lightspeed/llamastack/__init__.py`:
- Line 3: Replace the relative import in the package initializer by importing
the provider using its absolute module path: change the import that references
LlamaStackProvider (from ._provider import LlamaStackProvider) to use the full
package path pydantic_ai_lightspeed.llamastack._provider so that the symbol
LlamaStackProvider is imported via absolute import.
In `@src/pydantic_ai_lightspeed/llamastack/_provider.py`:
- Around line 111-112: Add a descriptive docstring to the _set_http_client
method explaining its purpose and behavior: state that it injects an
httpx.AsyncClient into the underlying client by setting self._client._client
(note the protected attribute access), describe expected parameter http_client:
httpx.AsyncClient and that the method returns None, and mention any side effects
or threading/async considerations if relevant; place this docstring immediately
above def _set_http_client to satisfy project conventions.
- Line 14: The import of LlamaStackLibraryTransport uses a relative import;
change it to an absolute import so internal modules follow the codebase
guideline — replace "from ._transport import LlamaStackLibraryTransport" with an
absolute import referencing the package path that exposes _transport (import the
LlamaStackLibraryTransport symbol via its full module path) and ensure any tests
or import consumers still resolve the new absolute name.
In `@src/pydantic_ai_lightspeed/llamastack/_transport.py`:
- Line 105: The code mutates the caller's body dict via the in-place union
operator (body |= path_params); instead, create a new merged mapping and use
that local variable (e.g., merged_body = {**(body or {}), **(path_params or
{})}) so the original body parameter is not modified, and replace both
occurrences of the in-place mutation (the body |= path_params at the shown
location and the similar one later) with assignment to a new dict variable used
for downstream processing.
- Around line 25-30: Add concise docstrings for the _AsyncByteStream class
methods: document __init__(self, gen: AsyncGenerator[bytes, None]) to explain
that it stores an async generator that yields raw bytes (mention the parameter
name "gen" and its type/purpose), and document async def __aiter__(self) ->
AsyncIterator[bytes] to state that it returns an async iterator yielding bytes
chunks for streaming (matching httpx.AsyncByteStream contract). Follow project
convention style (short summary line, param description for gen, and return
description indicating an async iterator of bytes).
- Around line 93-99: Both helper methods _handle_non_streaming and
_handle_streaming are missing docstrings; add descriptive docstrings to each
function that summarize purpose, list parameters with types (request:
httpx.Request, method: str, path: str, body: dict[str, Any] for
_handle_non_streaming and the appropriate params for _handle_streaming),
describe the return type (httpx.Response or async stream/response behavior), and
mention possible exceptions/errors raised. Place the docstrings immediately
below each def, using triple-quoted strings and concise sentences following the
project's docstring style so linters and readers can understand intent and
types.
In `@tests/unit/pydantic_ai_lightspeed/llamastack/test_provider.py`:
- Around line 70-75: The test currently only checks that provider.client is not
None, which won't detect if the constructor ignored the provided http_client;
update test_custom_http_client to assert identity against the provided instance
(e.g., assert provider.client is custom_client) so it verifies
LlamaStackProvider(http_client=custom_client) actually wires the same client
into provider.client.
- Around line 72-73: The tests leak httpx.AsyncClient instances because
LlamaStackProvider (and its _set_http_client path that assigns to AsyncOpenAI)
never closes a provided client; update the failing tests to either (a) replace
real httpx.AsyncClient() with a mock/stub (e.g., monkeypatch or a simple dummy
object) when testing non-I/O logic, or (b) convert the tests that create
httpx.AsyncClient() into async tests (use pytest.mark.asyncio) and explicitly
close the client after use with await custom_client.aclose() (or use "async with
httpx.AsyncClient() as custom_client:"), ensuring any use of LlamaStackProvider,
_set_http_client, or AsyncOpenAI in those tests does not leave open connections.
---
Duplicate comments:
In `@src/pydantic_ai_lightspeed/llamastack/_provider.py`:
- Around line 85-87: The AsyncClient instantiation for lib_http_client is
missing an explicit timeout; update the httpx.AsyncClient(...) call (where
lib_http_client is created, passing the existing transport and base_url) to
include a timeout parameter (e.g., timeout=httpx.Timeout(10.0) or another
sensible duration) so the client has a bounded request timeout and the static
scanner is satisfied.
- Around line 74-81: Replace the runtime-optimized asserts in the
mutual-exclusion checks with explicit exceptions so validation cannot be
stripped under -O: in the block checking library_client against base_url,
api_key, and http_client (symbols: library_client, base_url, api_key,
http_client) raise a clear exception (e.g., ValueError or TypeError) for each
conflict instead of using assert, preserving the existing messages like "Cannot
provide both `library_client` and `base_url`" etc.; ensure each check uses raise
with the same descriptive text so callers get deterministic validation errors
regardless of Python optimization flags.
In `@src/pydantic_ai_lightspeed/llamastack/_transport.py`:
- Line 100: Replace the runtime `assert` checks on `self._client.route_impls`
with an explicit None-check and raise a RuntimeError (or use typing.cast after
the check) so the invariant holds even under -O; specifically, in
handle_async_request (and the similar occurrence at line 133) change `assert
self._client.route_impls is not None` to an explicit `if
self._client.route_impls is None: raise RuntimeError("route_impls is not
initialized")` and then assign `route_impls = self._client.route_impls` (or
`route_impls = cast(ConcreteType, self._client.route_impls)`) to narrow the type
for subsequent code.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: b16d8b64-3aea-4572-a5f6-de9532c70e71
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (9)
pyproject.tomlsrc/pydantic_ai_lightspeed/__init__.pysrc/pydantic_ai_lightspeed/llamastack/__init__.pysrc/pydantic_ai_lightspeed/llamastack/_provider.pysrc/pydantic_ai_lightspeed/llamastack/_transport.pytests/unit/pydantic_ai_lightspeed/__init__.pytests/unit/pydantic_ai_lightspeed/llamastack/__init__.pytests/unit/pydantic_ai_lightspeed/llamastack/test_provider.pytests/unit/pydantic_ai_lightspeed/llamastack/test_transport.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (3)
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Usepytest.mark.asynciomarker for async tests
Files:
tests/unit/pydantic_ai_lightspeed/llamastack/__init__.pytests/unit/pydantic_ai_lightspeed/__init__.pytests/unit/pydantic_ai_lightspeed/llamastack/test_transport.pytests/unit/pydantic_ai_lightspeed/llamastack/test_provider.py
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules:from authentication import get_auth_dependency
Llama Stack imports: Usefrom llama_stack_client import AsyncLlamaStackClient
Checkconstants.pyfor shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Useasync deffor I/O operations and external API calls
Use standard log levels with clear purposes:debug()for diagnostic info,info()for program execution,warning()for unexpected events,error()for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes:Configuration,Error/Exception,Resolver,Interface
Abstract classes must use ABC with@abstractmethoddecorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes
Files:
src/pydantic_ai_lightspeed/__init__.pysrc/pydantic_ai_lightspeed/llamastack/__init__.pysrc/pydantic_ai_lightspeed/llamastack/_transport.pysrc/pydantic_ai_lightspeed/llamastack/_provider.py
src/**/__init__.py
📄 CodeRabbit inference engine (AGENTS.md)
Package
__init__.pyfiles must contain brief package descriptions
Files:
src/pydantic_ai_lightspeed/__init__.pysrc/pydantic_ai_lightspeed/llamastack/__init__.py
🔇 Additional comments (10)
src/pydantic_ai_lightspeed/__init__.py (1)
1-1: LGTM!src/pydantic_ai_lightspeed/llamastack/__init__.py (2)
1-1: LGTM!
5-5: LGTM!tests/unit/pydantic_ai_lightspeed/__init__.py (1)
1-1: LGTM!tests/unit/pydantic_ai_lightspeed/llamastack/__init__.py (1)
1-1: LGTM!pyproject.toml (1)
82-83: ⚡ Quick winConfirm
pydantic-ai>=1.99.0is valid.
pydantic-aihas a published release1.99.0on PyPI, so the dependency constraintpydantic-ai>=1.99.0is not based on a non-existent version.src/pydantic_ai_lightspeed/llamastack/_provider.py (1)
31-49: LGTM!src/pydantic_ai_lightspeed/llamastack/_transport.py (1)
10-19: Guard against private Llama Stack internals breakage
src/pydantic_ai_lightspeed/llamastack/_transport.pyimports several non-public Llama Stack internals (convert_pydantic_to_json_value,PROVIDER_DATA_VAR,request_provider_data_context,find_matching_route,preserve_contexts_async_generator, plus laterroute_impls/_convert_body). The repo pinsllama-stack==0.6.0(and related*-client/*-api), but the unit tests mostly patch these symbols rather than exercising real imports from the pinned package. Add a small CI smoke test that installs the pinnedllama-stackand imports/uses these symbols, and consider narrowing the version range to reduce silent breakage when internals change.tests/unit/pydantic_ai_lightspeed/llamastack/test_transport.py (1)
18-344: LGTM!tests/unit/pydantic_ai_lightspeed/llamastack/test_provider.py (1)
1-69: LGTM!Also applies to: 77-141, 146-151, 155-157
| lib_http_client = httpx.AsyncClient( | ||
| transport=transport, |
Description
Added Pydantic AI provider to use llama stack as a provider for Pydantic AI models.
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit
Dependencies
pydantic-ai>=1.99.0dependency to support new integrationsNew Features
Tests