# IMF Horizon Grid and AttnRes Ablation Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Run a 6-run Phase-1 IMF horizon/action-step experiment grid across available GPUs, monitor progress and collect best rollout metrics, then use the best horizon setting for a Phase-2 visual-attnres ablation.

**Architecture:** Use the current IMF training code as-is for Phase-1 by sweeping explicit `(pred_horizon, num_action_steps)` overrides while keeping emb=384, layer=12, and max_steps=50k fixed. Maintain a local experiment suite directory with a manifest and machine-readable status snapshots so progress can be resumed and summarized across turns. After Phase-1 completes, compare the current head-only attnres setup against a variant that also adds attnres into the visual ResNet path.

**Tech Stack:** Python, Hydra/OmegaConf, PyTorch, SSH/Tailscale, JSON/CSV status files, SwanLab.

---
### Task 1: Prepare the experiment suite manifest and state tracking

**Files:**
- Create: `experiment_suites/2026-04-04-imf-horizon-grid/manifest.json`
- Create: `experiment_suites/2026-04-04-imf-horizon-grid/status.json`
- Create: `experiment_suites/2026-04-04-imf-horizon-grid/notes.md`

- [ ] Define the 6 legal Phase-1 combinations: `(8,8)`, `(16,8)`, `(16,16)`, `(32,8)`, `(32,16)`, `(32,32)`.
- [ ] Record for each run: name, host, GPU slot, command, log path, SwanLab run name, and completion criteria.
- [ ] Define the comparison metric as the maximum rollout average reward seen during training (`max avg_reward`), preferably read from the best-checkpoint metadata and cross-checked against logs.
- [ ] Keep `status.json` updated with per-run state: queued / running / finished / failed plus latest parsed progress.
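A minimal sketch of how the manifest and the initial status file for this task might be generated. It assumes that "legal" means `num_action_steps <= pred_horizon` over the values 8, 16, and 32, which reproduces the six pairs listed in the checklist. The run-name pattern and every JSON field name are illustrative and not taken from the existing code.

```python
import json
from pathlib import Path

# Assumption: a pair is "legal" when num_action_steps <= pred_horizon.
HORIZON_VALUES = (8, 16, 32)


def legal_combinations(values=HORIZON_VALUES):
    """Return (pred_horizon, num_action_steps) pairs with action steps <= horizon."""
    return [(h, a) for h in values for a in values if a <= h]


def build_manifest(combos):
    """Build one manifest entry per run. All field names are hypothetical."""
    runs = []
    for slot, (horizon, action_steps) in enumerate(combos):
        name = f"imf_h{horizon}_a{action_steps}"  # illustrative naming scheme
        runs.append({
            "name": name,
            "host": None,        # filled in once the remote target is verified
            "gpu_slot": slot,
            "command": None,     # filled in at launch time
            "log_path": None,
            "swanlab_run_name": name,
            "pred_horizon": horizon,
            "num_action_steps": action_steps,
            "completion_criteria": {"max_steps": 50_000},
        })
    return {
        "suite": "2026-04-04-imf-horizon-grid",
        "metric": "max avg_reward",
        "runs": runs,
    }


def initial_status(manifest):
    """Every run starts as queued with no parsed progress yet."""
    return {
        run["name"]: {
            "state": "queued",  # queued / running / finished / failed
            "latest_step": None,
            "latest_avg_reward": None,
            "max_avg_reward": None,
        }
        for run in manifest["runs"]
    }


def write_suite(suite_dir, manifest, status):
    """Write manifest.json and status.json into the suite directory."""
    suite_dir = Path(suite_dir)
    suite_dir.mkdir(parents=True, exist_ok=True)
    (suite_dir / "manifest.json").write_text(json.dumps(manifest, indent=2))
    (suite_dir / "status.json").write_text(json.dumps(status, indent=2))
```

Calling `write_suite("experiment_suites/2026-04-04-imf-horizon-grid", manifest, initial_status(manifest))` would create both JSON files. `notes.md` is free-form and is left out of the sketch.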
### Task 2: Prepare the remote 8-GPU execution target

**Files:**
- Remote working directory under `/home/droid/`
- Reuse or create a synced code directory for this suite

- [ ] Verify the remote dataset path and environment path.
- [ ] Verify GPU availability and reserve 6 GPUs for Phase-1 launches.
- [ ] Sync the required code to a dedicated remote suite directory.
- [ ] Record exact remote paths back into the local suite manifest.
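The GPU availability check could be sketched as follows. It parses the CSV output of `nvidia-smi --query-gpu=index,memory.used,utilization.gpu --format=csv,noheader,nounits` and picks six GPUs that look idle. The idle thresholds, the function names, and the SSH host alias are assumptions made for illustration; actual reservation on a shared host may need coordination that this sketch does not cover.

```python
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_gpu_rows(text):
    """Parse 'index, memory.used, utilization.gpu' CSV rows into dicts."""
    rows = []
    for line in text.strip().splitlines():
        index, mem_used, util = (field.strip() for field in line.split(","))
        rows.append({
            "index": int(index),
            "memory_used_mib": int(mem_used),
            "util_pct": int(util),
        })
    return rows


def pick_idle_gpus(rows, needed=6, max_mem_mib=500, max_util_pct=5):
    """Return the first `needed` GPU indices under both (assumed) idle thresholds."""
    idle = [
        row["index"]
        for row in rows
        if row["memory_used_mib"] <= max_mem_mib and row["util_pct"] <= max_util_pct
    ]
    if len(idle) < needed:
        raise RuntimeError(f"only {len(idle)} idle GPUs found, need {needed}")
    return idle[:needed]


def query_remote_gpus(host):
    """Run nvidia-smi over SSH; `host` is a hypothetical alias for the 8-GPU machine."""
    result = subprocess.run(
        ["ssh", host, " ".join(QUERY)],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_rows(result.stdout)
```

The selected indices can then be written into the manifest's GPU slot field so that later launches do not have to repeat the discovery step.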
### Task 3: Launch the 6 Phase-1 experiments in parallel

**Files:**
- Reuse: `roboimi/demos/vla_scripts/train_vla.py`
- Modify only local suite tracking files unless a launch bug is discovered

- [ ] Launch 6 runs concurrently with fixed settings: IMF, emb=384, layer=12, max_steps=50k.
- [ ] Keep all other relevant training hyperparameters aligned to the current strong baseline unless a concrete blocker appears.
- [ ] Assign one GPU per run on the 8xL20 host.
- [ ] Capture PID, log path, and SwanLab URL for each run in `status.json`.
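One way the per-run launch might look, pinning each process to a single GPU through `CUDA_VISIBLE_DEVICES`. The Hydra override keys below (`pred_horizon`, `num_action_steps`, `emb`, `layer`, `max_steps`) simply reuse the shorthand from this plan and are placeholders; the real key paths have to be read from the training config before launch. Capturing the SwanLab URL is not covered by this sketch.

```python
import os
import shlex
import subprocess

TRAIN_SCRIPT = "roboimi/demos/vla_scripts/train_vla.py"


def build_command(pred_horizon, num_action_steps):
    """Build the training command with Hydra-style key=value overrides.

    The override keys are placeholders taken from the plan's shorthand,
    not verified config paths.
    """
    overrides = [
        f"pred_horizon={pred_horizon}",
        f"num_action_steps={num_action_steps}",
        "emb=384",
        "layer=12",
        "max_steps=50000",
    ]
    return ["python", TRAIN_SCRIPT, *overrides]


def launch(run_name, gpu_index, pred_horizon, num_action_steps, log_dir="logs"):
    """Start one run on one GPU and return the fields to store in status.json."""
    cmd = build_command(pred_horizon, num_action_steps)
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_index))
    os.makedirs(log_dir, exist_ok=True)
    log_path = os.path.join(log_dir, f"{run_name}.log")
    with open(log_path, "w") as log_file:
        # The child process keeps its own copy of the file descriptor
        # after this block closes the parent's handle.
        proc = subprocess.Popen(
            cmd, env=env, stdout=log_file, stderr=subprocess.STDOUT
        )
    return {
        "state": "running",
        "pid": proc.pid,
        "gpu": gpu_index,
        "log_path": log_path,
        "command": shlex.join(cmd),
    }
```

Calling `launch` once per manifest entry starts the six runs concurrently, since `subprocess.Popen` returns without waiting for the child to finish.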
### Task 4: Monitor and summarize Phase-1 until all 6 finish

**Files:**
- Update: `experiment_suites/2026-04-04-imf-horizon-grid/status.json`
- Update: `experiment_suites/2026-04-04-imf-horizon-grid/notes.md`

- [ ] Periodically parse each run’s log/checkpoints to extract latest step, latest rollout reward, and best rollout reward so far.
- [ ] Keep a resumable local summary so progress can be continued in later turns without rediscovery.
- [ ] After all 6 runs finish, rank them by `max avg_reward` and write a compact Phase-1 summary.
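The log parsing and final ranking could be sketched like this. The log line format (`step=<int>` and `avg_reward=<float>` tokens) is an assumption, so the two regular expressions will need adjusting to whatever the training script actually prints. The sketch reads logs only; cross-checking against best-checkpoint metadata is left out.

```python
import re

# Assumed log tokens, e.g. "step=2000 avg_reward=0.40"; adjust to the real format.
STEP_RE = re.compile(r"step[=:\s]+(\d+)")
REWARD_RE = re.compile(r"avg_reward[=:\s]+(-?\d+(?:\.\d+)?)")


def parse_progress(log_text):
    """Extract latest step, latest rollout reward, and best rollout reward so far."""
    latest_step = None
    latest_reward = None
    best_reward = None
    for line in log_text.splitlines():
        step_match = STEP_RE.search(line)
        if step_match:
            latest_step = int(step_match.group(1))
        reward_match = REWARD_RE.search(line)
        if reward_match:
            latest_reward = float(reward_match.group(1))
            if best_reward is None or latest_reward > best_reward:
                best_reward = latest_reward
    return {
        "latest_step": latest_step,
        "latest_avg_reward": latest_reward,
        "max_avg_reward": best_reward,
    }


def rank_runs(progress_by_run):
    """Return run names sorted by max avg_reward, best first.

    Runs without any parsed rollout reward are skipped.
    """
    scored = {
        name: progress["max_avg_reward"]
        for name, progress in progress_by_run.items()
        if progress["max_avg_reward"] is not None
    }
    return sorted(scored, key=scored.get, reverse=True)
```

Merging the output of `parse_progress` into each run's entry in `status.json` on every pass keeps the summary resumable, because the file alone is enough to rebuild the ranking.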
### Task 5: Prepare the Phase-2 visual-attnres ablation

**Files:**
- Likely modify: vision backbone implementation and config files (to be confirmed after code inspection)
- Add/update targeted tests for the visual backbone path if code changes are needed

- [ ] Use the best Phase-1 `(pred_horizon, num_action_steps)` combination as the fixed rollout setting for Phase-2.
- [ ] Compare:
  1. current setup: attnres only in the IMF head
  2. ablation setup: attnres in both IMF head and visual encoder path
- [ ] Keep the rest of the training settings fixed.
- [ ] Launch and monitor the Phase-2 pair after Phase-1 summary is complete.
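A small sketch of how the Phase-2 pair might be derived so that the two runs differ in exactly one setting. The `visual_attnres` flag and the run names are hypothetical; the real toggle depends on how the visual backbone change is implemented after code inspection.

```python
def phase2_pair(best_combo, base_settings):
    """Build the two Phase-2 run specs from the best Phase-1 combination.

    `visual_attnres` is a hypothetical flag name used only for illustration.
    """
    pred_horizon, num_action_steps = best_combo
    shared = dict(
        base_settings,
        pred_horizon=pred_horizon,
        num_action_steps=num_action_steps,
    )
    return [
        {"name": "attnres_head_only", **shared, "visual_attnres": False},
        {"name": "attnres_head_and_visual", **shared, "visual_attnres": True},
    ]


def differing_keys(run_a, run_b, ignore=("name",)):
    """List the settings on which two run specs disagree, ignoring bookkeeping keys."""
    keys = run_a.keys() | run_b.keys()
    return sorted(
        key for key in keys
        if key not in ignore and run_a.get(key) != run_b.get(key)
    )
```

Checking that `differing_keys(*phase2_pair(best, base))` returns only the attnres toggle before launch is a cheap guard against accidentally changing a second setting between the two runs.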