roboimi/docs/superpowers/plans/2026-04-02-imf-rollout-trajectory-images-and-short-horizon-training.md

IMF Rollout Trajectory Images and Short-Horizon Training Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Export a front-view trajectory image for each training-time rollout episode and log those images to SwanLab, then start a new local IMF training run with n_emb=384, n_layer=12, pred_horizon=8, num_action_steps=4, and max_steps=50000.

Architecture: Extend eval_vla.py so a rollout can emit one per-episode static front-view image with red EE trajectory overlay. Extend train_vla.py so rollout validation forces image export, forces video off, and uploads those per-episode images to SwanLab. Launch the requested new run through explicit command-line overrides rather than branch-default config changes.

Tech Stack: Python, PyTorch, Hydra/OmegaConf, MuJoCo, OpenCV, SwanLab.


Task 1: Add and validate rollout image tests

Files:

  • Modify: tests/test_eval_vla_rollout_artifacts.py

  • Modify: tests/test_train_vla_swanlab_logging.py

  • Modify: tests/test_train_vla_rollout_validation.py

  • Add/adjust eval tests so they assert per-episode trajectory image paths are produced without requiring video export.

  • Add/adjust training tests so they assert training-time rollout validation forces record_video=false.

  • Add/adjust training tests so they assert trajectory image paths flow from eval summary into SwanLab media logging.

  • Add/adjust training tests so they assert image media is logged, not only scalar reward metrics.
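The override assertions above can be sketched with a recording stub in place of the real eval entry point. This is only an illustration: the two-argument `run_rollout_validation(cfg, eval_fn)` signature, the plain-dict config, and the `trajectory_images` summary key are assumptions made for the sketch, not the actual interfaces in train_vla.py.

```python
from copy import deepcopy


def run_rollout_validation(cfg, eval_fn):
    """Hypothetical stand-in for the training-side helper under test."""
    eval_cfg = deepcopy(cfg)  # never mutate the caller's config
    eval_cfg["record_video"] = False
    eval_cfg["save_trajectory_image"] = True
    eval_cfg["trajectory_image_camera_name"] = "front"
    return eval_fn(eval_cfg)


def test_rollout_validation_forces_image_export_without_video():
    seen = {}

    def fake_eval(eval_cfg):
        # Record what the training code passed down, then return a fake summary.
        seen.update(eval_cfg)
        return {"avg_reward": 0.0, "trajectory_images": ["ep0.png"]}

    # The caller asks for video; validation must override that request.
    summary = run_rollout_validation({"record_video": True}, fake_eval)

    assert seen["record_video"] is False
    assert seen["save_trajectory_image"] is True
    assert seen["trajectory_image_camera_name"] == "front"
    assert summary["trajectory_images"] == ["ep0.png"]
```

In the real tests the stub would likely be installed with pytest's `monkeypatch` rather than passed as an argument.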

Task 2: Implement per-episode front trajectory image export in eval

Files:

  • Modify: roboimi/demos/vla_scripts/eval_vla.py

  • Reuse/Read: roboimi/utils/raw_action_trajectory_viewer.py

  • Modify: roboimi/vla/conf/eval/eval.yaml

  • Add config plumbing for save_trajectory_image and trajectory_image_camera_name.

  • Ensure the camera name resolved by default during training-time rollouts is pinned to front.

  • Implement distinct per-episode image naming so 5 rollout episodes create 5 distinct PNGs.

  • Reuse the existing red trajectory representation logic when composing the PNG.

  • Ensure headless eval works under EGL even on machines with DISPLAY set.
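A minimal sketch of three of the steps above: forcing the EGL backend, naming one PNG per episode, and drawing a red polyline. The helper names and the file-name pattern are hypothetical. `MUJOCO_GL` is the environment variable MuJoCo's Python bindings read to select a rendering backend; that this repository selects its backend that way is an assumption. The overlay function is a generic OpenCV stand-in, not the logic in raw_action_trajectory_viewer.py, and it expects points already projected to pixel coordinates.

```python
import os
from pathlib import Path
from typing import MutableMapping, Optional


def force_egl_backend(env: Optional[MutableMapping[str, str]] = None) -> None:
    """Select EGL unconditionally, even if DISPLAY is set.

    Must run before `mujoco` is imported for the setting to take effect.
    """
    env = os.environ if env is None else env
    env["MUJOCO_GL"] = "egl"


def trajectory_image_path(output_dir, episode_index, camera_name="front"):
    """Return a distinct PNG path per episode (hypothetical naming scheme)."""
    return Path(output_dir) / f"rollout_{camera_name}_ep{episode_index:02d}.png"


def draw_trajectory_overlay(frame_bgr, pixel_points):
    """Draw the trajectory as a red open polyline on a copy of a BGR frame."""
    import cv2  # imported lazily so the helpers above work without OpenCV
    import numpy as np

    canvas = frame_bgr.copy()
    pts = np.asarray(pixel_points, dtype=np.int32).reshape(-1, 1, 2)
    # OpenCV uses BGR channel order, so red is (0, 0, 255).
    cv2.polylines(canvas, [pts], isClosed=False, color=(0, 0, 255), thickness=2)
    return canvas
```

With this naming, five rollout episodes map to `rollout_front_ep00.png` through `rollout_front_ep04.png`, so later episodes never overwrite earlier ones.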

Task 3: Implement SwanLab rollout image logging in training

Files:

  • Modify: roboimi/demos/vla_scripts/train_vla.py

  • Modify: tests/test_train_vla_swanlab_logging.py

  • Modify: tests/test_train_vla_rollout_validation.py

  • Make run_rollout_validation() force record_video=false.

  • Make run_rollout_validation() force save_trajectory_image=true and trajectory_image_camera_name=front.

  • Ensure rollout validation still uses 5 episodes per validation event for the requested run.

  • Add a best-effort helper that converts per-episode image paths into SwanLab image media payloads.

  • Keep image-upload failures non-fatal and warning-only.
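One way the best-effort helper could look is sketched below. The function names, the `rollout/front_trajectory` key prefix, and the injected `image_factory` and `log_fn` callables are all hypothetical; injecting them keeps the sketch testable without SwanLab installed. In the real code they would presumably wrap `swanlab.Image` and `swanlab.log`.

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def build_rollout_image_payload(image_paths, image_factory,
                                key_prefix="rollout/front_trajectory"):
    """Map per-episode image paths to media objects, skipping bad entries."""
    payload = {}
    for episode_index, path in enumerate(image_paths):
        if not path or not Path(path).is_file():
            logger.warning("Rollout image missing for episode %d: %r",
                           episode_index, path)
            continue
        try:
            payload[f"{key_prefix}/episode_{episode_index}"] = image_factory(
                str(path), f"episode {episode_index}"
            )
        except Exception as exc:  # best effort: one bad image must not stop the rest
            logger.warning("Could not wrap %s: %s", path, exc)
    return payload


def log_rollout_images(image_paths, step, image_factory, log_fn):
    """Upload rollout images; any failure is downgraded to a warning."""
    try:
        payload = build_rollout_image_payload(image_paths, image_factory)
        if payload:
            log_fn(payload, step=step)
    except Exception as exc:  # training must continue if the upload fails
        logger.warning("Rollout image upload failed at step %d: %s", step, exc)
```

A call site might pass `image_factory=lambda path, caption: swanlab.Image(path, caption=caption)` and `log_fn=swanlab.log`, assuming those SwanLab calls accept these arguments in the installed version.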

Task 4: Verify action-chunk semantics for the new run

Files:

  • Verify: roboimi/vla/agent.py

  • Verify: roboimi/vla/agent_imf.py

  • Test: tests/test_imf_vla_agent.py

  • Confirm the existing queue logic still means “predict 8, execute first 4”.

  • Do not change branch defaults unless strictly necessary; prefer launch-time overrides.
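The "predict 8, execute first 4" semantics can be illustrated with a small queue. This is a sketch under the plan's wording only; `ChunkedActionQueue` and `predict_fn` are hypothetical names, and the actual queue handling in agent.py and agent_imf.py may differ (for example, by executing from an offset inside the chunk).

```python
from collections import deque


class ChunkedActionQueue:
    """Predict a full chunk, execute only its leading actions, then re-plan."""

    def __init__(self, predict_fn, pred_horizon=8, num_action_steps=4):
        if not 0 < num_action_steps <= pred_horizon:
            raise ValueError("num_action_steps must be in (0, pred_horizon]")
        self._predict_fn = predict_fn
        self._pred_horizon = pred_horizon
        self._num_action_steps = num_action_steps
        self._queue = deque()

    def next_action(self, obs):
        if not self._queue:
            # Queue drained: predict a fresh chunk from the current observation.
            chunk = list(self._predict_fn(obs))
            if len(chunk) != self._pred_horizon:
                raise ValueError(
                    f"expected {self._pred_horizon} actions, got {len(chunk)}"
                )
            # Keep only the first num_action_steps; the tail is discarded.
            self._queue.extend(chunk[: self._num_action_steps])
        return self._queue.popleft()
```

With the default arguments, eight environment steps trigger exactly two predictions, one at step 0 and one at step 4.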

Task 5: Verify and launch the requested local training run

Files:

  • Use: roboimi/demos/vla_scripts/train_vla.py

  • Use: roboimi/demos/vla_scripts/eval_vla.py

  • Run the targeted verification suite.

  • Run one real headless smoke eval and confirm a front trajectory PNG is produced while video_mp4 stays null.

  • Launch the new local training run with explicit overrides including:

    • agent=resnet_imf_attnres
    • agent.head.n_emb=384
    • agent.head.n_layer=12
    • agent.pred_horizon=8
    • agent.num_action_steps=4
    • train.max_steps=50000
    • train.rollout_num_episodes=5
    • train.use_swanlab=true
    • current local baseline dataset/camera/CUDA/batch/lr/num_workers/backbone settings
  • Verify PID, GPU allocation, log tail, and SwanLab run URL.
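The launch step could be scripted roughly as follows. The override strings are taken from the list above. Invoking train_vla.py by path with the current interpreter, the log-file handling, and both function names are assumptions. The baseline dataset, camera, CUDA, batch, lr, num_workers, and backbone overrides are deliberately left to the caller, since this plan does not spell them out.

```python
import subprocess
import sys

TRAIN_SCRIPT = "roboimi/demos/vla_scripts/train_vla.py"

REQUESTED_OVERRIDES = [
    "agent=resnet_imf_attnres",
    "agent.head.n_emb=384",
    "agent.head.n_layer=12",
    "agent.pred_horizon=8",
    "agent.num_action_steps=4",
    "train.max_steps=50000",
    "train.rollout_num_episodes=5",
    "train.use_swanlab=true",
]


def build_launch_command(baseline_overrides=()):
    """Compose the argv list; baseline overrides are appended unchanged."""
    return [sys.executable, TRAIN_SCRIPT, *REQUESTED_OVERRIDES, *baseline_overrides]


def launch(cmd, log_path):
    """Start training detached from the current session and return the process."""
    with open(log_path, "ab") as log_file:
        # The child inherits the log handle, so closing it here is safe.
        return subprocess.Popen(
            cmd,
            stdout=log_file,
            stderr=subprocess.STDOUT,
            start_new_session=True,
        )
```

The returned object's `pid` attribute gives the PID to check against the GPU process list, and the file at `log_path` is the one to tail.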