# IMF Horizon Grid and AttnRes Ablation Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Run a 6-run Phase-1 IMF horizon/action-step experiment grid across available GPUs, monitor progress and collect best rollout metrics, then use the best horizon setting for a Phase-2 visual-attnres ablation.

**Architecture:** Use the current IMF training code as-is for Phase-1 by sweeping explicit `(pred_horizon, num_action_steps)` overrides while keeping emb=384, layer=12, and max_steps=50k fixed. Maintain a local experiment suite directory with a manifest and machine-readable status snapshots so progress can be resumed and summarized across turns. After Phase-1 completes, compare the current head-only attnres setup against a variant that also adds attnres into the visual ResNet path.

**Tech Stack:** Python, Hydra/OmegaConf, PyTorch, SSH/Tailscale, JSON/CSV status files, SwanLab.

---
### Task 1: Prepare the experiment suite manifest and state tracking
**Files:**
- Create: `experiment_suites/2026-04-04-imf-horizon-grid/manifest.json`
- Create: `experiment_suites/2026-04-04-imf-horizon-grid/status.json`
- Create: `experiment_suites/2026-04-04-imf-horizon-grid/notes.md`
- [ ] Define the 6 legal Phase-1 combinations: `(8,8)`, `(16,8)`, `(16,16)`, `(32,8)`, `(32,16)`, `(32,32)`.
- [ ] Record for each run: name, host, GPU slot, command, log path, SwanLab run name, and completion criteria.
- [ ] Define the comparison metric as the maximum rollout average reward seen during training (`max avg_reward`), preferably read from the best-checkpoint metadata and cross-checked against logs.
- [ ] Keep `status.json` updated with per-run state: queued / running / finished / failed plus latest parsed progress.
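
The manifest and status steps above can be sketched as follows. This is a minimal illustration, not the suite's actual tooling: it assumes the "legal" rule is `num_action_steps <= pred_horizon` (the six listed pairs are exactly the pairs over `{8, 16, 32}` that satisfy it), and the run naming scheme, the manifest field names, and the `write_json_atomic` helper are all hypothetical.

```python
import json
import os
from itertools import product
from pathlib import Path

HORIZONS = (8, 16, 32)


def legal_combinations(horizons=HORIZONS):
    """(pred_horizon, num_action_steps) pairs with num_action_steps <= pred_horizon.

    ASSUMPTION: this rule reproduces the six pairs listed in the plan.
    """
    return [(h, a) for h, a in product(horizons, horizons) if a <= h]


def build_manifest(suite="2026-04-04-imf-horizon-grid"):
    """Build one manifest entry per run; unknown fields stay None until launch."""
    runs = []
    for slot, (h, a) in enumerate(legal_combinations()):
        name = f"imf_h{h}_a{a}"  # hypothetical naming scheme
        runs.append({
            "name": name,
            "pred_horizon": h,
            "num_action_steps": a,
            "host": None,        # filled in once the remote target is verified
            "gpu_slot": slot,    # provisional; real slot is assigned at launch
            "command": None,
            "log_path": None,
            "swanlab_run_name": name,
            "completion_criteria": {"max_steps": 50_000},
            "state": "queued",   # queued / running / finished / failed
        })
    return {"suite": suite, "comparison_metric": "max avg_reward", "runs": runs}


def write_json_atomic(path, payload):
    """Write JSON via a temp file and os.replace so readers never see a partial file."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_suffix(path.suffix + ".tmp")
    tmp.write_text(json.dumps(payload, indent=2))
    os.replace(tmp, path)
```

Writing through a temporary file keeps `status.json` readable even if a monitoring turn is interrupted mid-write.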
### Task 2: Prepare the remote 8-GPU execution target
**Files:**
- Remote working directory under `/home/droid/`
- Reuse or create a synced code directory for this suite
- [ ] Verify the remote dataset path and environment path.
- [ ] Verify GPU availability and reserve 6 GPUs for Phase-1 launches.
- [ ] Sync the required code to a dedicated remote suite directory.
- [ ] Record exact remote paths back into the local suite manifest.
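
One way to check GPU availability before reserving slots is to parse `nvidia-smi` query output. The sketch below assumes the host is reachable over plain `ssh`; the idleness thresholds and the helper names are illustrative assumptions, not values taken from the plan.

```python
import subprocess

# Standard nvidia-smi query; with "nounits" the memory and utilization
# columns are printed as bare integers (MiB and percent).
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_gpu_csv(text):
    """Turn 'index, memory.used, utilization.gpu' CSV rows into dicts."""
    gpus = []
    for line in text.strip().splitlines():
        idx, mem, util = (field.strip() for field in line.split(","))
        gpus.append({
            "index": int(idx),
            "memory_used_mib": int(mem),
            "utilization_pct": int(util),
        })
    return gpus


def pick_idle_gpus(gpus, needed=6, max_mem_mib=500, max_util_pct=5):
    """Return the first `needed` GPU indices that look idle.

    ASSUMPTION: the thresholds are arbitrary placeholders; tune per host.
    """
    idle = [
        g["index"] for g in gpus
        if g["memory_used_mib"] <= max_mem_mib and g["utilization_pct"] <= max_util_pct
    ]
    if len(idle) < needed:
        raise RuntimeError(f"only {len(idle)} idle GPUs, need {needed}")
    return idle[:needed]


def query_remote(host):
    """Run the query on `host` over ssh and parse the result."""
    out = subprocess.run(
        ["ssh", host, *QUERY], check=True, capture_output=True, text=True
    ).stdout
    return parse_gpu_csv(out)
```

The selected indices can then be recorded in the local manifest next to the remote paths.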
### Task 3: Launch the 6 Phase-1 experiments in parallel
**Files:**
- Reuse: `roboimi/demos/vla_scripts/train_vla.py`
- Modify only local suite tracking files unless a launch bug is discovered
- [ ] Launch 6 runs concurrently with fixed settings: IMF, emb=384, layer=12, max_steps=50k.
- [ ] Keep all other relevant training hyperparameters aligned to the current strong baseline unless a concrete blocker appears.
- [ ] Assign one GPU per run on the 8xL20 host.
- [ ] Capture PID, log path, and SwanLab URL for each run in `status.json`.
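
A launch command for each run might be assembled roughly as below. The Hydra override key paths in `OVERRIDE_KEYS` are hypothetical placeholders, since this plan does not state where `pred_horizon`, `num_action_steps`, emb, layer, and max_steps live in the config tree; they must be checked against the real config before use. The run naming scheme and log directory are likewise assumptions.

```python
import shlex

TRAIN_SCRIPT = "roboimi/demos/vla_scripts/train_vla.py"

# HYPOTHETICAL override keys: replace with the real paths from the Hydra config.
OVERRIDE_KEYS = {
    "pred_horizon": "agent.pred_horizon",
    "num_action_steps": "agent.num_action_steps",
    "emb": "agent.n_emb",
    "layer": "agent.n_layer",
    "max_steps": "train.max_steps",
}

# Settings held fixed across the whole Phase-1 grid.
FIXED = {"emb": 384, "layer": 12, "max_steps": 50_000}


def build_launch(pred_horizon, num_action_steps, gpu, log_dir="logs"):
    """Describe one run: argv with key=value overrides, pinned GPU, and log path."""
    settings = {**FIXED, "pred_horizon": pred_horizon, "num_action_steps": num_action_steps}
    overrides = [f"{OVERRIDE_KEYS[k]}={v}" for k, v in settings.items()]
    name = f"imf_h{pred_horizon}_a{num_action_steps}"  # hypothetical naming scheme
    return {
        "name": name,
        "argv": ["python", TRAIN_SCRIPT, *overrides],
        "env": {"CUDA_VISIBLE_DEVICES": str(gpu)},  # one GPU per run
        "log_path": f"{log_dir}/{name}.log",
    }


def as_shell(launch):
    """Render a background shell command that echoes the PID for status.json."""
    env = " ".join(f"{k}={shlex.quote(v)}" for k, v in launch["env"].items())
    cmd = shlex.join(launch["argv"])
    log = shlex.quote(launch["log_path"])
    return f"{env} nohup {cmd} > {log} 2>&1 & echo $!"
```

Echoing `$!` lets the caller capture the PID from the SSH output and store it alongside the log path.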
### Task 4: Monitor and summarize Phase-1 until all 6 finish
**Files:**
- Update: `experiment_suites/2026-04-04-imf-horizon-grid/status.json`
- Update: `experiment_suites/2026-04-04-imf-horizon-grid/notes.md`
- [ ] Periodically parse each run's log and checkpoints to extract the latest step, the latest rollout reward, and the best rollout reward so far.
- [ ] Keep a resumable local summary so progress can be continued in later turns without rediscovery.
- [ ] After all 6 runs finish, rank them by `max avg_reward` and write a compact Phase-1 summary.
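
The monitoring and ranking steps could look like the sketch below. The log line format it parses (`step=<int> ... avg_reward=<float>` on one line) is an assumption made for illustration; the real training logs may differ, and the regular expression would need to be adapted accordingly.

```python
import re

# ASSUMED rollout line format, e.g. "step=2000 rollout avg_reward=0.55".
ROLLOUT_RE = re.compile(r"step[=:]\s*(\d+).*?avg_reward[=:]\s*(-?\d+(?:\.\d+)?)")


def summarize_log(text):
    """Extract latest step, latest rollout reward, and best rollout reward so far."""
    points = []
    for line in text.splitlines():
        match = ROLLOUT_RE.search(line)
        if match:
            points.append((int(match.group(1)), float(match.group(2))))
    if not points:
        # No rollout has been logged yet.
        return {"latest_step": None, "latest_avg_reward": None, "max_avg_reward": None}
    latest_step, latest_reward = points[-1]
    return {
        "latest_step": latest_step,
        "latest_avg_reward": latest_reward,
        "max_avg_reward": max(reward for _, reward in points),
    }


def rank_runs(summaries):
    """Order run names by max avg_reward, best first; runs without rollouts are skipped."""
    scored = {
        name: s["max_avg_reward"]
        for name, s in summaries.items()
        if s["max_avg_reward"] is not None
    }
    return sorted(scored, key=scored.get, reverse=True)
```

Per the metric definition in Task 1, values parsed this way would serve as the cross-check against the best-checkpoint metadata rather than as the primary source.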
### Task 5: Prepare the Phase-2 visual-attnres ablation
**Files:**
- Likely modify: vision backbone implementation and config files (to be confirmed after code inspection)
- Add/update targeted tests for the visual backbone path if code changes are needed
- [ ] Use the best Phase-1 `(pred_horizon, num_action_steps)` combination as the fixed rollout setting for Phase-2.
- [ ] Compare:
    1. current setup: attnres only in the IMF head
    2. ablation setup: attnres in both IMF head and visual encoder path
- [ ] Keep the rest of the training settings fixed.
- [ ] Launch and monitor the Phase-2 pair after Phase-1 summary is complete.
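
To keep the Phase-2 comparison controlled, the two run specs can be derived from one shared base and checked to differ in a single switch. In the sketch below the flag name `visual_attnres`, the run names, and the spec layout are hypothetical; the actual switch depends on the code inspection noted under **Files**.

```python
def phase2_pair(best_pred_horizon, best_num_action_steps, fixed=None):
    """Build the head-only and head+visual attnres run specs from one shared base."""
    base = {
        "pred_horizon": best_pred_horizon,
        "num_action_steps": best_num_action_steps,
        **(fixed or {}),  # every other training setting, held constant
    }
    # HYPOTHETICAL flag name; the real toggle is to be confirmed in the code.
    current = {**base, "name": "head_only_attnres", "visual_attnres": False}
    ablation = {**base, "name": "head_plus_visual_attnres", "visual_attnres": True}
    return current, ablation


def differing_keys(a, b, ignore=("name",)):
    """List the settings on which two run specs disagree, ignoring labels."""
    keys = a.keys() | b.keys()
    return sorted(k for k in keys if k not in ignore and a.get(k) != b.get(k))
```

A launcher can refuse to start the pair unless `differing_keys` returns exactly the one ablated setting, which guards against accidental drift in the other hyperparameters.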