# IMF Horizon Grid and AttnRes Ablation Implementation Plan
**For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Run a 6-run Phase-1 IMF horizon/action-step experiment grid across available GPUs, monitor progress and collect the best rollout metrics, then use the best horizon setting for a Phase-2 visual-attnres ablation.

**Architecture:** Use the current IMF training code as-is for Phase-1, sweeping explicit `(pred_horizon, num_action_steps)` overrides while keeping emb=384, layer=12, and max_steps=50k fixed. Maintain a local experiment-suite directory with a manifest and machine-readable status snapshots so progress can be resumed and summarized across turns. After Phase-1 completes, compare the current head-only attnres setup against a variant that also adds attnres into the visual ResNet path.

**Tech Stack:** Python, Hydra/OmegaConf, PyTorch, SSH/Tailscale, JSON/CSV status files, SwanLab.
## Task 1: Prepare the experiment suite manifest and state tracking
**Files:**
- Create: `experiment_suites/2026-04-04-imf-horizon-grid/manifest.json`
- Create: `experiment_suites/2026-04-04-imf-horizon-grid/status.json`
- Create: `experiment_suites/2026-04-04-imf-horizon-grid/notes.md`

**Steps:**
- [ ] Define the 6 legal Phase-1 combinations: (8,8), (16,8), (16,16), (32,8), (32,16), (32,32).
- [ ] Record for each run: name, host, GPU slot, command, log path, SwanLab run name, and completion criteria.
- [ ] Define the comparison metric as the maximum rollout average reward seen during training (`max avg_reward`), preferably read from the best-checkpoint metadata and cross-checked against the logs.
- [ ] Keep `status.json` updated with per-run state (queued / running / finished / failed) plus the latest parsed progress.
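The manifest and status files above can be scaffolded with a short script. The exact JSON schema below (keys such as `runs`, `state`, `best_avg_reward`) is an illustrative assumption, not a fixed format; only the 6 combinations, the fixed settings, and the per-run fields listed in the steps are taken from the plan.

```python
import json
from pathlib import Path

SUITE = Path("experiment_suites/2026-04-04-imf-horizon-grid")
SUITE.mkdir(parents=True, exist_ok=True)

# The 6 legal (pred_horizon, num_action_steps) combinations from Task 1.
COMBOS = [(8, 8), (16, 8), (16, 16), (32, 8), (32, 16), (32, 32)]

manifest = {
    "suite": SUITE.name,
    "fixed": {"emb": 384, "layer": 12, "max_steps": 50000},
    "metric": "max avg_reward",
    "runs": [
        {
            "name": f"imf_h{h}_a{a}",   # hypothetical naming scheme
            "pred_horizon": h,
            "num_action_steps": a,
            "host": None,               # filled in during Task 2
            "gpu": None,                # filled in during Task 3
            "command": None,
            "log_path": None,
            "swanlab_run": None,
        }
        for h, a in COMBOS
    ],
}
(SUITE / "manifest.json").write_text(json.dumps(manifest, indent=2))

# Per-run state starts as "queued"; Task 4 updates this file in place.
status = {r["name"]: {"state": "queued", "step": None, "best_avg_reward": None}
          for r in manifest["runs"]}
(SUITE / "status.json").write_text(json.dumps(status, indent=2))
```

Keeping both files as plain JSON makes them trivially machine-readable for the resumable monitoring loop in Task 4.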
## Task 2: Prepare the remote 8-GPU execution target
**Files:**
- Remote working directory under `/home/droid/`
- Reuse or create a synced code directory for this suite

**Steps:**
- [ ] Verify the remote dataset path and environment path.
- [ ] Verify GPU availability and reserve 6 GPUs for the Phase-1 launches.
- [ ] Sync the required code to a dedicated remote suite directory.
- [ ] Record the exact remote paths back into the local suite manifest.
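The GPU-availability check can be done over SSH with `nvidia-smi`'s CSV query output. A minimal sketch, assuming a placeholder SSH alias (`droid-8xl20`) and treating a GPU as free when its used memory is under a small threshold:

```python
import subprocess

HOST = "droid-8xl20"  # placeholder SSH alias; the real host goes into the manifest


def parse_free_gpus(csv_out: str, mem_threshold_mib: int = 1024) -> list[int]:
    """Parse `nvidia-smi --query-gpu=index,memory.used --format=csv,noheader,nounits`
    output and return indices of GPUs whose used memory is below the threshold."""
    free = []
    for line in csv_out.strip().splitlines():
        idx, used = (int(x.strip()) for x in line.split(","))
        if used < mem_threshold_mib:
            free.append(idx)
    return free


def free_gpus(host: str) -> list[int]:
    """Query the remote host over SSH and return free GPU indices."""
    out = subprocess.run(
        ["ssh", host, "nvidia-smi",
         "--query-gpu=index,memory.used", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_free_gpus(out)


# Reserve the first 6 free GPUs for the Phase-1 launches, e.g.:
# reserved = free_gpus(HOST)[:6]
```

Separating the parsing from the SSH call keeps the reservation logic testable without a live host.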
## Task 3: Launch the 6 Phase-1 experiments in parallel
**Files:**
- Reuse: `roboimi/demos/vla_scripts/train_vla.py`
- Modify only local suite tracking files unless a launch bug is discovered

**Steps:**
- [ ] Launch 6 runs concurrently with fixed settings: IMF, emb=384, layer=12, max_steps=50k.
- [ ] Keep all other relevant training hyperparameters aligned with the current strong baseline unless a concrete blocker appears.
- [ ] Assign one GPU per run on the 8xL20 host.
- [ ] Capture the PID, log path, and SwanLab URL for each run in `status.json`.
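The one-GPU-per-run launches can be generated programmatically. A sketch under assumptions: the Hydra override keys (`pred_horizon=`, `num_action_steps=`, `emb=`, `layer=`, `max_steps=`) mirror the plan's wording but must be confirmed against the actual config consumed by `train_vla.py`, and `nohup … & echo $!` is one way to detach the process and capture the PID for `status.json`.

```python
from pathlib import Path

SUITE = Path("experiment_suites/2026-04-04-imf-horizon-grid")
COMBOS = [(8, 8), (16, 8), (16, 16), (32, 8), (32, 16), (32, 32)]


def build_command(gpu: int, h: int, a: int) -> str:
    """One detached training run pinned to a single GPU.
    Override names are assumptions; verify before launching."""
    log = SUITE / "logs" / f"imf_h{h}_a{a}.log"
    return (
        f"CUDA_VISIBLE_DEVICES={gpu} nohup python "
        f"roboimi/demos/vla_scripts/train_vla.py "
        f"pred_horizon={h} num_action_steps={a} "
        f"emb=384 layer=12 max_steps=50000 "
        f"> {log} 2>&1 & echo $!"  # echo $! prints the PID for status.json
    )


# One command per combination, on GPUs 0-5 of the 8xL20 host.
commands = [build_command(gpu, h, a) for gpu, (h, a) in enumerate(COMBOS)]
```

Running each command over SSH and recording its echoed PID keeps `status.json` authoritative for later monitoring.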
## Task 4: Monitor and summarize Phase-1 until all 6 finish
**Files:**
- Update: `experiment_suites/2026-04-04-imf-horizon-grid/status.json`
- Update: `experiment_suites/2026-04-04-imf-horizon-grid/notes.md`

**Steps:**
- [ ] Periodically parse each run's log/checkpoints to extract the latest step, the latest rollout reward, and the best rollout reward so far.
- [ ] Keep a resumable local summary so progress can be continued in later turns without rediscovery.
- [ ] After all 6 runs finish, rank them by `max avg_reward` and write a compact Phase-1 summary.
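The final ranking step can be sketched as a small function over `status.json`; it assumes each entry carries the `state` and `best_avg_reward` fields used earlier in this plan (the schema itself is an illustrative assumption).

```python
import json
from pathlib import Path

SUITE = Path("experiment_suites/2026-04-04-imf-horizon-grid")


def rank_runs(status: dict) -> list[tuple[str, float]]:
    """Rank finished runs by best rollout average reward, descending.
    Runs still queued/running/failed, or without a recorded reward, are skipped."""
    finished = {
        name: s for name, s in status.items()
        if s.get("state") == "finished" and s.get("best_avg_reward") is not None
    }
    return sorted(((n, s["best_avg_reward"]) for n, s in finished.items()),
                  key=lambda kv: kv[1], reverse=True)


# Typical use once all 6 runs report "finished":
# status = json.loads((SUITE / "status.json").read_text())
# for name, reward in rank_runs(status):
#     print(f"{name}: {reward:.3f}")
```

The top-ranked run's `(pred_horizon, num_action_steps)` pair is what Task 5 fixes for Phase-2.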
## Task 5: Prepare the Phase-2 visual-attnres ablation
**Files:**
- Likely modify: the vision backbone implementation and config files (to be confirmed after code inspection)
- Add/update targeted tests for the visual backbone path if code changes are needed

**Steps:**
- [ ] Use the best Phase-1 `(pred_horizon, num_action_steps)` combination as the fixed rollout setting for Phase-2.
- [ ] Compare:
  - current setup: attnres only in the IMF head
  - ablation setup: attnres in both the IMF head and the visual encoder path
- [ ] Keep the rest of the training settings fixed.
- [ ] Launch and monitor the Phase-2 pair after the Phase-1 summary is complete.
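The Phase-2 pair differs from Phase-1 only in one toggle. A sketch, with the important caveat that `visual_attnres` is a hypothetical override name: the real flag depends on how the vision-backbone change is wired up after code inspection in this task.

```python
def phase2_commands(best_h: int, best_a: int) -> list[str]:
    """Two otherwise-identical runs: current head-only attnres vs. attnres
    also in the visual encoder path. `visual_attnres` is a placeholder
    override, not a confirmed config key."""
    base = (
        f"python roboimi/demos/vla_scripts/train_vla.py "
        f"pred_horizon={best_h} num_action_steps={best_a} "
        f"emb=384 layer=12 max_steps=50000"
    )
    return [
        base + " visual_attnres=false",  # current setup: attnres in IMF head only
        base + " visual_attnres=true",   # ablation: attnres in head + visual path
    ]
```

Because everything except the single toggle is held fixed, any reward gap between the two runs can be attributed to the visual-path attnres.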