Surface the real inference error; don't report a gripper-overload shutdown as failed #48
Open
nobullryder wants to merge 1 commit into
…tdown as failed

When a rollout exits non-zero the UI just said "failed — check logs". Three small improvements in rollout.py:

- _extract_error_from_log: pull the actual exception out of the rollout log so the UI can show it directly instead of sending the user digging through the HF cache.
- _friendly_hint: a plain-language, actionable headline for the common SO-101 failures (gripper overload, unresponsive motor, can't-connect, camera too slow, unsupported resolution, busy serial port).
- _classify_outcome: a non-zero exit *after* the rollout main loop started, where the error is a torque-disable/overload on shutdown (e.g. disabling torque on a gripper still holding an object), is reported as `ran_with_warning` rather than `failed`. The skill actually ran; only cleanup tripped.

Tests cover the classification, the hint mapping, and the log-error extraction.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Problem
When a rollout (inference) exits non-zero, the UI just says "failed — check logs". That (a) buries the real error in a log file inside the HF cache, and (b) labels a working run a failure when only shutdown tripped. For example, disabling torque on a gripper that is still holding an object trips an overload during cleanup.
What this does (all in `rollout.py`)
- `_extract_error_from_log`: pulls the actual exception out of the rollout log so the UI can show it directly.
- `_friendly_hint`: a plain-language, actionable headline for the common SO-101 failures (gripper overload, unresponsive motor id 6, can't-connect, camera-too-slow, unsupported resolution, busy serial port).
- `_classify_outcome`: a non-zero exit after the rollout main loop started, where the error is a torque-disable/overload on shutdown, is reported as `ran_with_warning` rather than `failed`. The skill ran; only cleanup complained.
Tests
`tests/test_rollout.py` adds coverage for the outcome classification, the hint mapping, and the log-error extraction.

🤖 Generated with Claude Code