
Add SayCan-style affordance grounding (embodied_ai/39) #15

Merged
rsasaki0109 merged 1 commit into main from add-saycan-affordance-grounding on Jun 5, 2026

What

A new example, examples/embodied_ai/39_saycan_affordance_grounding.py: a tiny, faithful SayCan loop (Ahn et al., 2022, Do As I Can, Not As I Say). The repo had no foundation-model loop; this adds the smallest honest one and ties it to the existing clarifying-question / conformal-ask-for-help line.

The lesson

A language model is a good planner and a bad robot. Asked to "wipe the table", it proposes "pick up the sponge", which is the right idea, without knowing whether the robot is anywhere near the sponge. SayCan scores every skill twice and multiplies:

score(skill) = p_LLM(skill furthers the instruction) × p_affordance(skill works now)

The product is high only for a skill that is both useful and executable, so the greedy argmax walks out a feasible plan with no separate planner and never commands a skill whose preconditions are unmet.
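The selection rule above can be sketched in a few lines. This is a minimal illustration, not the code from the example file: the function name `saycan_select` and every probability below are made-up assumptions chosen to show how a zero affordance vetoes the language model's favourite skill.

```python
from typing import Dict


def saycan_select(p_llm: Dict[str, float], p_afford: Dict[str, float]) -> str:
    """Return the skill with the highest product p_llm * p_afford.

    Hypothetical helper for illustration; a skill missing from p_afford
    is treated as not executable (affordance 0.0).
    """
    scores = {skill: p_llm[skill] * p_afford.get(skill, 0.0) for skill in p_llm}
    return max(scores, key=scores.get)


# Assumed situation: the robot stands at the table, the sponge is elsewhere.
# All numbers are illustrative, not taken from the example.
p_llm = {"go_to_sponge": 0.25, "pick_sponge": 0.60, "go_to_table": 0.05, "wipe_table": 0.10}
p_afford = {"go_to_sponge": 0.95, "pick_sponge": 0.0, "go_to_table": 0.95, "wipe_table": 0.0}

language_only = max(p_llm, key=p_llm.get)   # what the LLM alone would command
grounded = saycan_select(p_llm, p_afford)   # what the product commands
print(language_only, grounded)
```

With these assumed numbers the language-only choice is `pick_sponge`, which cannot be executed from the table, while the product picks `go_to_sponge` because it is the only skill that is both useful and feasible.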

The contrast is built into the same file via a ground flag (mirroring MCL's augment):

grounded   (SayCan):  go_to_sponge → pick_sponge → go_to_table → wipe_table   (goal in 4–5 steps)
ungrounded (Say only): LLM argmax = pick_sponge from the wrong place, every step → timeout

The failure log tells the story directly: the ungrounded run records an affordance_violation at every step and ends in a timeout; the grounded run records none (or at most a skill_slip that it retries).
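The grounded/ungrounded contrast and its failure log can be reproduced with a toy loop. This is a simplified sketch under stated assumptions, not the example's KitchenWorld or SayCanAgent: it is deterministic (no skill slips), it uses four skills instead of five, and the scorer values, the state layout, and the name `run_episode` are all hypothetical.

```python
from typing import List, Tuple

SKILLS = ["go_to_sponge", "pick_sponge", "go_to_table", "wipe_table"]


def run_episode(ground: bool, max_steps: int = 8) -> Tuple[List[str], List[str]]:
    """Return (commanded skills, failure log) for one toy episode."""
    # Assumed start: robot at the table, sponge at the other location.
    state = {"at": "table", "held": False, "clean": False}

    def p_llm(skill: str) -> float:
        # Stand-in "LLM": sees the running facts (held), not the robot's location.
        if not state["held"]:
            return {"pick_sponge": 0.6, "go_to_sponge": 0.3}.get(skill, 0.05)
        return {"wipe_table": 0.6, "go_to_table": 0.3}.get(skill, 0.05)

    def p_affordance(skill: str) -> float:
        # 1.0 if the skill's preconditions hold in the current state, else 0.0.
        feasible = {
            "go_to_sponge": state["at"] != "sponge",
            "go_to_table": state["at"] != "table",
            "pick_sponge": state["at"] == "sponge" and not state["held"],
            "wipe_table": state["at"] == "table" and state["held"],
        }
        return 1.0 if feasible[skill] else 0.0

    plan: List[str] = []
    failures: List[str] = []
    for _ in range(max_steps):
        if ground:
            skill = max(SKILLS, key=lambda s: p_llm(s) * p_affordance(s))
        else:
            skill = max(SKILLS, key=p_llm)
        plan.append(skill)
        if p_affordance(skill) == 0.0:
            # Commanded a skill whose preconditions are unmet; state is unchanged.
            failures.append("affordance_violation")
            continue
        if skill == "go_to_sponge":
            state["at"] = "sponge"
        elif skill == "go_to_table":
            state["at"] = "table"
        elif skill == "pick_sponge":
            state["held"] = True
        else:  # wipe_table
            state["clean"] = True
            return plan, failures
    failures.append("timeout")
    return plan, failures


print(run_episode(ground=True))
print(run_episode(ground=False))
```

In this toy version the grounded run commands the four-skill plan with an empty failure log, while the ungrounded run repeats `pick_sponge` until the step budget runs out. Because nothing changes after a violation, the language-only argmax never changes either, which is why the loop cannot recover on its own.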

Contents

  • A self-contained KitchenWorld (two locations, five stochastic skills with preconditions + affordances) and a SayCanAgent.
  • The "LLM" is a small, transparent scorer conditioned on the running facts (held / clean), which is the history-conditioned query SayCan makes. It is deliberately blind to physical preconditions, and that gap is exactly what the affordance term grounds. Comments make the stand-in explicit.
  • Three smoke tests: grounded walks the feasible plan with no affordance violations; grounded retries a stochastic skill_slip and still wins; ungrounded loops on affordance_violation and times out while grounded succeeds on the same seed.
  • examples/README.md row + a full examples/embodied_ai/README.md section; example count 41→42 and test count 115→118 in README.md / docs/status.md.
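The slip-and-retry behaviour that the second smoke test checks can be pictured with a short sketch. It assumes a slip leaves the world unchanged, so the same skill stays the argmax and is simply commanded again; the helper name `attempt_until_success`, the success probability, and the step budget are hypothetical and are not taken from the example file.

```python
import random
from typing import List, Optional, Tuple


def attempt_until_success(
    success_prob: float, rng: random.Random, max_steps: int = 10
) -> Tuple[Optional[int], List[str]]:
    """Retry one stochastic skill until it succeeds or the budget runs out.

    Returns (step at which it succeeded, or None on timeout; failure log).
    """
    log: List[str] = []
    for step in range(1, max_steps + 1):
        if rng.random() < success_prob:
            return step, log
        # The skill was feasible but slipped; state is unchanged, so retry.
        log.append("skill_slip")
    log.append("timeout")
    return None, log


rng = random.Random(0)  # seeded so a run is repeatable
step, log = attempt_until_success(0.8, rng)
print(step, log)
```

A slip differs from an affordance violation: the skill was feasible and failed by chance, so retrying is the right response, whereas retrying an infeasible skill can never succeed.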

Verification

  • Seeds 0–7: grounded cleans every seed in 4–5 steps (retrying slips); ungrounded never cleans.
  • Full suite green (131 passed).

References

  • M. Ahn et al., "Do As I Can, Not As I Say: Grounding Language in Robotic Affordances," CoRL 2022 (arXiv:2204.01691, https://say-can.github.io/).

🤖 Generated with Claude Code

A language model is a good planner and a bad robot: asked to "wipe the table" it
proposes "pick up the sponge" without knowing whether the robot is near the
sponge. SayCan (Ahn et al., 2022, "Do As I Can, Not As I Say") grounds the model
by scoring every skill twice and multiplying:

    score(skill) = p_LLM(skill furthers the instruction) * p_affordance(works now)

so the greedy argmax walks out a feasible plan with no separate planner and never
commands a skill whose preconditions are unmet. The repo had no foundation-model
loop; this adds the smallest honest one and ties it to the existing
clarifying-question / conformal-ask-for-help line.

The contrast is built into the same file via a `ground` flag (mirroring MCL's
`augment`):

  grounded:   go_to_sponge -> pick_sponge -> go_to_table -> wipe   (goal in 4-5 steps)
  ungrounded: argmax LLM = pick_sponge from the wrong place, forever -> timeout

The "LLM" is a small, transparent scorer conditioned on the running facts (the
history-conditioned query SayCan makes) but deliberately blind to physical
preconditions — which is exactly what the affordance term grounds.

- self-contained KitchenWorld (two locations, five stochastic skills with
  preconditions/affordances) + a SayCanAgent and a References section
- three smoke tests: grounded walks the feasible plan with no affordance
  violations; grounded retries a stochastic skill_slip and still wins; ungrounded
  language-only loops on affordance_violation and times out while grounded
  succeeds on the same seed
- examples index + embodied_ai README section; example 41->42, tests 115->118

Verified across seeds 0-7 (grounded cleans every seed in 4-5 steps, retrying
slips; ungrounded never cleans).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
rsasaki0109 merged commit ea7207e into main on Jun 5, 2026
3 checks passed
rsasaki0109 deleted the add-saycan-affordance-grounding branch on June 5, 2026 01:42