[codex] Add reproducible minimal PPO WSL workflow #484

Draft
HC-Seaple wants to merge 1 commit into Emerge-Lab:2.0 from HC-Seaple:codex/minimal-ppo-wsl

@HC-Seaple

What changed

  • Add a self-contained continuous-action PPO trainer with vectorized rollouts, GAE, clipped updates, checkpointing, and deterministic evaluation.
  • Add Windows/WSL setup and launch scripts for the Linux-native Raylib build.
  • Add generic WOMD JSON-to-map preparation without committing datasets or generated binaries.
  • Add native third-person checkpoint visualization and JSON metrics.
  • Ensure complete renderer frames are written to ffmpeg.
  • Document clone, setup, map preparation, training, visualization, and handoff.
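
The GAE and clipped-update steps named in the first bullet can be illustrated with a minimal, dependency-free sketch. This is not the trainer added in this PR: the function names, argument layout, and default hyperparameters (gamma, lam, clip_eps) are assumptions chosen for illustration, and a real trainer would operate on batched tensors rather than Python lists.

```python
import math


def compute_gae(rewards, values, dones, last_value, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation for one rollout of length T.

    values[t] is the critic's estimate V(s_t); last_value is V(s_T) and
    bootstraps the final step. dones[t] is True if the episode ended
    after step t, which cuts both the bootstrap and the running sum.
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error.
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        # Exponentially weighted sum of TD errors.
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
        next_value = values[t]
    # Value targets for the critic.
    returns = [a + v for a, v in zip(advantages, values)]
    return advantages, returns


def clipped_surrogate(new_logp, old_logp, advantage, clip_eps=0.2):
    """Per-sample PPO clipped objective (a quantity to maximize)."""
    ratio = math.exp(new_logp - old_logp)
    clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)
```

Taking the minimum of the clipped and unclipped terms means a sample stops contributing gradient once the policy ratio has moved outside the clip range in the direction the advantage favors, which is what keeps each update conservative.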

Validation

  • Python scripts pass python -m py_compile.
  • WSL launchers pass bash -n.
  • The staged change set passes git diff --check.
  • The end-to-end workflow was previously exercised in WSL with a 10,112-step checkpoint and 92-frame native render.
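
The py_compile check above can also be scripted when several files need checking at once. The sketch below is an assumption about how one might do that, not part of this PR; the helper name find_compile_errors is hypothetical, and which files to pass in is left to the caller because the PR description does not list them.

```python
import py_compile
import tempfile
from pathlib import Path


def find_compile_errors(paths):
    """Byte-compile each file and return {path: message} for failures.

    Compiled output goes to a scratch directory so the check leaves no
    __pycache__ entries next to the sources.
    """
    errors = {}
    with tempfile.TemporaryDirectory() as scratch:
        for index, path in enumerate(paths):
            target = Path(scratch) / f"{index}.pyc"
            try:
                py_compile.compile(str(path), cfile=str(target), doraise=True)
            except py_compile.PyCompileError as exc:
                errors[str(path)] = exc.msg
    return errors
```

An empty dictionary means every file compiled. Like python -m py_compile, this only catches syntax-level errors; it does not import or run the scripts.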

Current limitation

The current training setup is intended only as a smoke test. Before scaling up, reward shaping still needs a route-progress reward, reverse-motion penalties, and stronger collision and off-road costs.
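
As a rough illustration of the reward terms listed above, a shaped per-step reward could be combined as follows. Everything here is hypothetical: the function and field names, the input signals (progress along the route in metres, signed longitudinal speed, collision and off-road flags), and the weights are placeholders, not values from this PR or from the simulator.

```python
from dataclasses import dataclass


@dataclass
class RewardWeights:
    # Placeholder weights for illustration only; untuned.
    progress: float = 1.0
    reverse: float = 0.5
    collision: float = 10.0
    offroad: float = 5.0


def shaped_reward(progress_delta_m, signed_speed_mps, collided, offroad, weights=None):
    """Combine route progress with reverse, collision, and off-road costs."""
    w = weights if weights is not None else RewardWeights()
    # Reward distance gained along the route this step.
    reward = w.progress * progress_delta_m
    # Penalize backward motion in proportion to reverse speed.
    if signed_speed_mps < 0.0:
        reward -= w.reverse * (-signed_speed_mps)
    # Fixed costs for safety violations.
    if collided:
        reward -= w.collision
    if offroad:
        reward -= w.offroad
    return reward
```

Keeping the weights in one dataclass makes it easy to log them alongside a checkpoint and to sweep them later, which matters because the relative scale of the progress and penalty terms usually needs tuning.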

2 participants