Feature/more llm metrics in chatmessage by rachel-facere · Pull Request #5909 · livekit/agents

rachel-facere · 2026-05-31T06:53:23Z

No description provided.

CLAassistant · 2026-05-31T06:53:30Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

devin-ai-integration

Devin Review found 2 potential issues.

View 3 additional findings in Devin Review.

devin-ai-integration · 2026-05-31T06:56:12Z

+            if chunk.usage is not None:
+                data.usage = chunk.usage.model_dump()


🔴 AttributeError when LLM node yields a str or FlushSentinel due to unconditional chunk.usage access

chunk.usage is accessed at line 149 before the type-dispatching isinstance checks on lines 153/157/191. The LLMNode type signature (AsyncIterable[llm.ChatChunk | str | FlushSentinel]) and the existing code both confirm that chunk can be a str or FlushSentinel—neither of which has a .usage attribute. When any LLM node yields a plain string token (which is a supported and documented code path per the comment on line 152), this line will raise AttributeError: 'str' object has no attribute 'usage', crashing the entire LLM inference task.

Suggested change

-            if chunk.usage is not None:
-                data.usage = chunk.usage.model_dump()
+            if isinstance(chunk, ChatChunk) and chunk.usage is not None:
+                data.usage = chunk.usage.model_dump()
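
The guard in the suggestion can be checked in isolation. The sketch below is a minimal, self-contained illustration and not the livekit-agents implementation: `CompletionUsage`, `ChatChunk`, and `FlushSentinel` are hypothetical stand-ins written as plain classes, `collect_usage` is an illustrative helper name, and `dataclasses.asdict` takes the place of the `model_dump()` call used in the PR.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass
from typing import Iterable, Optional, Union


@dataclass
class CompletionUsage:  # hypothetical stand-in for the real usage model
    completion_tokens: int
    prompt_tokens: int
    total_tokens: int


@dataclass
class ChatChunk:  # hypothetical stand-in for llm.ChatChunk
    text: str = ""
    usage: Optional[CompletionUsage] = None


class FlushSentinel:  # hypothetical stand-in; it has no .usage attribute
    pass


def collect_usage(
    chunks: Iterable[Union[ChatChunk, str, FlushSentinel]],
) -> Optional[dict]:
    """Return the last usage payload seen, skipping str and FlushSentinel items."""
    usage = None
    for chunk in chunks:
        # Narrow the type first: only ChatChunk carries a .usage attribute.
        if isinstance(chunk, ChatChunk) and chunk.usage is not None:
            usage = asdict(chunk.usage)  # the PR calls chunk.usage.model_dump()
    return usage


stream = ["hello", FlushSentinel(), ChatChunk(usage=CompletionUsage(3, 5, 8))]
print(collect_usage(stream))
```

Because `isinstance` is evaluated first, the `and` short-circuits for plain strings and sentinels, so `.usage` is never read on an object that lacks it.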


devin-ai-integration · 2026-05-31T06:56:13Z

+class TokenUsage(TypedDict, total=False):
+    completion_tokens: int
+    """The number of tokens in the completion."""
+    prompt_tokens: int
+    """The number of tokens read from the cache."""
+    total_tokens: int
+    """The number of input tokens used (includes cached tokens)."""
+    prompt_cached_tokens: int
+    """The number of cached input tokens used."""
+    cache_creation_tokens: int
+    """The number of tokens used to create the cache."""
+    cache_read_tokens: int
+    """The total number of tokens used (completion + prompt tokens)."""


🟡 Docstrings for prompt_tokens, total_tokens, and cache_read_tokens are swapped in TokenUsage

The docstrings in the new TokenUsage TypedDict are mismatched with their field names. Comparing with the source-of-truth CompletionUsage model at livekit-agents/livekit/agents/llm/llm.py:35-51:

prompt_tokens says "The number of tokens read from the cache" — should be "The number of input tokens used (includes cached tokens)."

total_tokens says "The number of input tokens used (includes cached tokens)" — should be "The total number of tokens used (completion + prompt tokens)."

cache_read_tokens says "The total number of tokens used (completion + prompt tokens)" — should be "The number of tokens read from the cache."

It appears the fields were reordered relative to CompletionUsage but the docstrings were left in the old positions. Since TokenUsage is a public-facing TypedDict (part of MetricsReport), wrong docs will mislead consumers interpreting metrics data.

Suggested change

 class TokenUsage(TypedDict, total=False):
     completion_tokens: int
     """The number of tokens in the completion."""
     prompt_tokens: int
-    """The number of tokens read from the cache."""
+    """The number of input tokens used (includes cached tokens)."""
     total_tokens: int
-    """The number of input tokens used (includes cached tokens)."""
+    """The total number of tokens used (completion + prompt tokens)."""
     prompt_cached_tokens: int
     """The number of cached input tokens used."""
     cache_creation_tokens: int
     """The number of tokens used to create the cache."""
     cache_read_tokens: int
-    """The total number of tokens used (completion + prompt tokens)."""
+    """The number of tokens read from the cache."""
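
To show why the field meanings matter to consumers, here is a small sketch that reads a `TokenUsage` dict according to the corrected docstrings. It assumes only what those docstrings state: `prompt_tokens` includes cached tokens, and `total_tokens` is completion plus prompt tokens. The helpers `uncached_prompt_tokens` and `total_is_consistent` are hypothetical names for illustration and are not part of the PR.

```python
from typing import TypedDict


class TokenUsage(TypedDict, total=False):
    completion_tokens: int
    prompt_tokens: int
    total_tokens: int
    prompt_cached_tokens: int
    cache_creation_tokens: int
    cache_read_tokens: int


def uncached_prompt_tokens(usage: TokenUsage) -> int:
    """Input tokens not served from the cache; missing keys count as 0."""
    # total=False means every key is optional, so read with .get().
    return usage.get("prompt_tokens", 0) - usage.get("prompt_cached_tokens", 0)


def total_is_consistent(usage: TokenUsage) -> bool:
    """Check that total_tokens equals completion_tokens + prompt_tokens."""
    expected = usage.get("completion_tokens", 0) + usage.get("prompt_tokens", 0)
    return usage.get("total_tokens", 0) == expected


usage: TokenUsage = {
    "completion_tokens": 20,
    "prompt_tokens": 100,
    "total_tokens": 120,
    "prompt_cached_tokens": 40,
}
print(uncached_prompt_tokens(usage), total_is_consistent(usage))
```

A consumer who trusted the swapped docstrings would treat `prompt_tokens` as a cache counter and get these derived numbers wrong, which is the risk the comment above describes.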


rachel-facere added 3 commits May 31, 2026 16:10

working changes to get ttlt

6c5a8e3

working token usage metrics

4af63b6

tts_node_ttlb

3754387

rachel-facere marked this pull request as draft May 31, 2026 06:53

devin-ai-integration Bot reviewed May 31, 2026

rachel-facere closed this May 31, 2026

rachel-facere wants to merge 3 commits into livekit:main from facere-ai:feature/more-llm-metrics-in-chatmessage

2 participants
