Description
On Vertex AI, when the async Google-auth / mTLS path is selected
(BaseApiClient._use_google_auth_async() returns True), the client creates an
aiohttp AsyncAuthorizedSession and caches it on self._aiohttp_session
without any closed-loop recreation check. The non-auth aiohttp branch in the
same method does check self._aiohttp_session._loop.is_closed() and recreates, but
the async-auth branch returns the cached session unconditionally.
In any server that handles each request on a fresh event loop (e.g. a
sync-to-async bridge that calls asyncio.run() per request, which is exactly what
Vertex AI Agent Engine / ADK's deployed runner does), the cached
AsyncAuthorizedSession is bound to the first request's now-closed loop. The
second request fails with RuntimeError: Event loop is closed (raised from aiohttp
DNS resolution / getaddrinfo), and the call returns nothing.
This is the same class of bug as #1083 and #1518, but specific to the
Vertex + aiohttp + mTLS client-cert path.
Environment
- google-genai 1.75.0 (also present in nearby 1.7x releases)
- Vertex AI (vertexai=True), aiohttp installed
- mtls.should_use_client_cert() is True (i.e. a client certificate is present, e.g. an agent deployed to Vertex AI Agent Engine with agent identity, which provisions a SPIFFE x509 client cert)
- Server model: one asyncio.run() per request on a worker thread
Relevant code (google/genai/_api_client.py, ~v1.75.0)
_use_google_auth_async() (~line 859) returns True when
has_aiohttp and self.vertexai and mtls.should_use_client_cert().
In _get_aiohttp_session():
- async-auth branch (~lines 879-911): if self._aiohttp_session is None and self._use_google_auth_async(): → creates AsyncAuthorizedSession, caches it, returns. No loop / closed check on subsequent calls.
- non-auth branch (~lines 915-918): recreates when self._aiohttp_session is None or .closed or ._loop.is_closed().
Steps to reproduce
- Create a Vertex client with aiohttp installed and a client certificate available, so that _use_google_auth_async() is True.
- Issue a streaming/generate request inside one asyncio.run(...) call on a worker thread; let that loop close.
- Issue a second request inside a new asyncio.run(...) call on a new loop, reusing the same client.
Expected: the second request succeeds (session recreated on the current loop,
as the non-auth branch already does).
Actual: RuntimeError: Event loop is closed; the call returns no data.
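The failure mode can be illustrated without Vertex or aiohttp. The sketch below is stdlib-only; LoopBoundSession and CachingClient are hypothetical stand-ins (not google-genai classes) that mimic a session capturing its creation loop and a client caching that session without a closed-loop check.

```python
import asyncio


class LoopBoundSession:
    """Hypothetical stand-in for a session that captures its creation loop."""

    def __init__(self):
        self._loop = asyncio.get_running_loop()

    async def request(self):
        # call_soon() on a closed loop raises RuntimeError("Event loop is closed").
        self._loop.call_soon(lambda: None)
        return "ok"


class CachingClient:
    """Hypothetical client that caches the session with no closed-loop check."""

    def __init__(self):
        self._session = None

    async def call(self):
        if self._session is None:
            self._session = LoopBoundSession()
        return await self._session.request()


def run_two_requests(client):
    """Run two requests, each in its own asyncio.run(), on the same client."""
    results = []
    for _ in range(2):
        try:
            results.append(asyncio.run(client.call()))
        except RuntimeError as exc:
            results.append(str(exc))
    return results
```

Under these assumptions, the first request succeeds and the second fails with the same message as in the report, because the cached session still points at the loop that the first asyncio.run() closed.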
Real-world impact
ADK agents deployed to Vertex AI Agent Engine with agent identity return a
correct answer to the first query and then an empty response to every
subsequent query on the same warm instance. It is invisible in local dev because
no client cert is present there (so the self-healing non-auth branch is used).
Reproduced on multiple independent agents; identical signature.
Suggested fix
Apply the same closed-loop guard the non-auth branch uses to the async-auth
branch — recreate the AsyncAuthorizedSession when its bound loop is closed (or
when the running loop differs from the one it was created on).
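As a rough sketch of what such a guard could look like, the helper below decides whether a cached, loop-bound session should be rebuilt. The name needs_new_session and its bound_loop argument are hypothetical; the real fix would live inside _get_aiohttp_session() and may track the loop differently.

```python
import asyncio


def needs_new_session(session, bound_loop):
    """Return True when a cached, loop-bound session should be rebuilt.

    Both arguments are hypothetical: `session` is the cached object and
    `bound_loop` is the event loop it was created on.
    """
    if session is None or bound_loop is None:
        return True
    if bound_loop.is_closed():
        return True
    try:
        running = asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running, so there is nothing to compare against.
        return False
    # A session created on another (still open) loop is also unsafe to reuse.
    return running is not bound_loop
```

Checking both conditions covers the per-request asyncio.run() case (closed loop) and the less common case where two live loops share one client.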
Workaround
Wrap _get_aiohttp_session to drop the cached session when the event loop changes:
    import asyncio

    from google.genai import _api_client as g

    _orig = g.BaseApiClient._get_aiohttp_session

    async def _loop_safe(self):
        # Identify the loop this call is running on (None if there is none).
        try:
            running = asyncio.get_running_loop()
        except RuntimeError:
            running = None
        # Drop the cached session if it was created on a different loop.
        if (getattr(self, "_bound_loop", None) is not None
                and self._bound_loop is not running
                and getattr(self, "_aiohttp_session", None) is not None):
            self._aiohttp_session = None  # recreate on the current loop
        session = await _orig(self)
        self._bound_loop = running  # remember which loop the session belongs to
        return session

    g.BaseApiClient._get_aiohttp_session = _loop_safe
This keeps the mTLS auth intact and only repairs the loop binding. Verified to fix
repeated queries on a deployed Agent Engine instance (5/5 vs 1/5 before).