Skip to content

Feature/more llm metrics in chatmessage#5909

Closed
rachel-facere wants to merge 3 commits into
livekit:mainfrom
facere-ai:feature/more-llm-metrics-in-chatmessage
Closed

Feature/more llm metrics in chatmessage#5909
rachel-facere wants to merge 3 commits into
livekit:mainfrom
facere-ai:feature/more-llm-metrics-in-chatmessage

Conversation

@rachel-facere
Copy link
Copy Markdown

@rachel-facere rachel-facere commented May 31, 2026

No description provided.

@CLAassistant
Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@rachel-facere rachel-facere marked this pull request as draft May 31, 2026 06:53
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 2 potential issues.

View 3 additional findings in Devin Review.

Open in Devin Review

Comment on lines +149 to +150
if chunk.usage is not None:
data.usage = chunk.usage.model_dump()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

πŸ”΄ AttributeError when LLM node yields a str or FlushSentinel due to unconditional chunk.usage access

chunk.usage is accessed at line 149 before the type-dispatching isinstance checks on lines 153/157/191. The LLMNode type signature (AsyncIterable[llm.ChatChunk | str | FlushSentinel]) and the existing code both confirm that chunk can be a str or FlushSentinelβ€”neither of which has a .usage attribute. When any LLM node yields a plain string token (which is a supported and documented code path per the comment on line 152), this line will raise AttributeError: 'str' object has no attribute 'usage', crashing the entire LLM inference task.

Suggested change
if chunk.usage is not None:
data.usage = chunk.usage.model_dump()
if isinstance(chunk, ChatChunk) and chunk.usage is not None:
data.usage = chunk.usage.model_dump()
Open in Devin Review

Was this helpful? React with πŸ‘ or πŸ‘Ž to provide feedback.

Comment on lines +262 to +274
class TokenUsage(TypedDict, total=False):
completion_tokens: int
"""The number of tokens in the completion."""
prompt_tokens: int
"""The number of tokens read from the cache."""
total_tokens: int
"""The number of input tokens used (includes cached tokens)."""
prompt_cached_tokens: int
"""The number of cached input tokens used."""
cache_creation_tokens: int
"""The number of tokens used to create the cache."""
cache_read_tokens: int
"""The total number of tokens used (completion + prompt tokens)."""
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟑 Docstrings for prompt_tokens, total_tokens, and cache_read_tokens are swapped in TokenUsage

The docstrings in the new TokenUsage TypedDict are mismatched with their field names. Comparing with the source-of-truth CompletionUsage model at livekit-agents/livekit/agents/llm/llm.py:35-51:

  • prompt_tokens says "The number of tokens read from the cache" β€” should be "The number of input tokens used (includes cached tokens)."
  • total_tokens says "The number of input tokens used (includes cached tokens)" β€” should be "The total number of tokens used (completion + prompt tokens)."
  • cache_read_tokens says "The total number of tokens used (completion + prompt tokens)" β€” should be "The number of tokens read from the cache."

It appears the fields were reordered relative to CompletionUsage but the docstrings were left in the old positions. Since TokenUsage is a public-facing TypedDict (part of MetricsReport), wrong docs will mislead consumers interpreting metrics data.

Suggested change
class TokenUsage(TypedDict, total=False):
completion_tokens: int
"""The number of tokens in the completion."""
prompt_tokens: int
"""The number of tokens read from the cache."""
total_tokens: int
"""The number of input tokens used (includes cached tokens)."""
prompt_cached_tokens: int
"""The number of cached input tokens used."""
cache_creation_tokens: int
"""The number of tokens used to create the cache."""
cache_read_tokens: int
"""The total number of tokens used (completion + prompt tokens)."""
class TokenUsage(TypedDict, total=False):
completion_tokens: int
"""The number of tokens in the completion."""
prompt_tokens: int
"""The number of input tokens used (includes cached tokens)."""
total_tokens: int
"""The total number of tokens used (completion + prompt tokens)."""
prompt_cached_tokens: int
"""The number of cached input tokens used."""
cache_creation_tokens: int
"""The number of tokens used to create the cache."""
cache_read_tokens: int
"""The number of tokens read from the cache."""
Open in Devin Review

Was this helpful? React with πŸ‘ or πŸ‘Ž to provide feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants