# IMF Rollout Trajectory Images and Short-Horizon Training Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Add training-time export of front-view rollout trajectory images and log them to SwanLab as image media, then start a new local IMF training run with `emb=384`, `layer=12`, `pred_horizon=8`, `num_action_steps=4`, `max_steps=50000`.

**Architecture:** Extend `eval_vla.py` so a rollout can emit one static front-view image per episode, with the end-effector (EE) trajectory overlaid in red. Extend `train_vla.py` so rollout validation forces image export on, forces video off, and uploads those per-episode images to SwanLab. Launch the requested new run through explicit command-line overrides rather than by changing the branch-default config.

**Tech Stack:** Python, PyTorch, Hydra/OmegaConf, MuJoCo, OpenCV, SwanLab.

---

### Task 1: Add and validate rollout image tests

**Files:**
- Modify: `tests/test_eval_vla_rollout_artifacts.py`
- Modify: `tests/test_train_vla_swanlab_logging.py`
- Modify: `tests/test_train_vla_rollout_validation.py`

- [ ] Add/adjust eval tests so they assert per-episode trajectory image paths are produced without requiring video export.
- [ ] Add/adjust training tests so they assert training-time rollout validation forces `record_video=false`.
- [ ] Add/adjust training tests so they assert trajectory image paths flow from eval summary into SwanLab media logging.
- [ ] Add/adjust training tests so they assert image media is logged, not only scalar reward metrics.
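
The eval-side assertions could take roughly the following shape. This is a minimal sketch under stated assumptions: the summary layout (an `episodes` list whose entries carry a `trajectory_image` path) and the helper name `check_rollout_artifacts` are hypothetical, chosen for illustration; only the `video_mp4` field name comes from this plan.

```python
from typing import Any, Dict, List


def check_rollout_artifacts(summary: Dict[str, Any], num_episodes: int) -> List[str]:
    """Validate a rollout summary and return its per-episode image paths.

    Assumed (hypothetical) layout: summary["episodes"] is a list of dicts,
    each holding a "trajectory_image" path and a "video_mp4" entry.
    """
    episodes = summary["episodes"]
    assert len(episodes) == num_episodes, "expected one summary entry per episode"

    paths = [ep["trajectory_image"] for ep in episodes]
    assert all(p and p.endswith(".png") for p in paths), "every episode needs a PNG"
    assert len(set(paths)) == num_episodes, "image paths must be distinct"
    # Images must be produced without requiring video export.
    assert all(ep["video_mp4"] is None for ep in episodes), "video must stay off"
    return paths


# Stand-in for the summary a stubbed rollout would return inside a unit test.
fake_summary = {
    "episodes": [
        {"trajectory_image": f"rollout/episode_{i:03d}_front.png", "video_mp4": None}
        for i in range(5)
    ]
}
image_paths = check_rollout_artifacts(fake_summary, num_episodes=5)
```

In a real test the summary would come from the eval entry point with the environment and policy stubbed out, so the assertions exercise the artifact plumbing rather than the simulator.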

### Task 2: Implement per-episode front trajectory image export in eval

**Files:**
- Modify: `roboimi/demos/vla_scripts/eval_vla.py`
- Reuse/Read: `roboimi/utils/raw_action_trajectory_viewer.py`
- Modify: `roboimi/vla/conf/eval/eval.yaml`

- [ ] Add config plumbing for `save_trajectory_image` and `trajectory_image_camera_name`.
- [ ] Ensure the code path that resolves the trajectory camera during training defaults to `front`.
- [ ] Implement distinct per-episode image naming so 5 rollout episodes create 5 distinct PNGs.
- [ ] Reuse the existing red trajectory representation logic when composing the PNG.
- [ ] Ensure headless eval works under EGL even on machines with `DISPLAY` set.
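
Two of the steps above, distinct per-episode file names and forced EGL rendering, can be sketched without touching the simulator. The helper names and the file-name pattern below are hypothetical. `MUJOCO_GL` and `PYOPENGL_PLATFORM` are the environment variables that MuJoCo and PyOpenGL read to select a rendering backend; the sketch assumes they are set before the rendering stack is imported, since the backend is typically chosen at import time.

```python
from pathlib import Path
from typing import MutableMapping


def trajectory_image_path(output_dir: str, episode_idx: int,
                          camera_name: str = "front") -> Path:
    """Build a distinct PNG path per episode (hypothetical naming scheme)."""
    return Path(output_dir) / f"episode_{episode_idx:03d}_{camera_name}_trajectory.png"


def force_headless_egl(env: MutableMapping[str, str]) -> None:
    """Select EGL rendering unconditionally.

    The backend is set explicitly rather than inferred from DISPLAY, so a
    machine that has DISPLAY set still renders offscreen.
    """
    env["MUJOCO_GL"] = "egl"
    env["PYOPENGL_PLATFORM"] = "egl"


# Five rollout episodes map to five distinct files.
paths = [trajectory_image_path("rollout_out", i) for i in range(5)]

# Demonstrated on a plain dict; real code would pass os.environ instead.
fake_env = {"DISPLAY": ":0"}
force_headless_egl(fake_env)
```

Leaving `DISPLAY` untouched and overriding only the backend variables keeps the change local to rendering and avoids side effects on other GUI tooling in the same process.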

### Task 3: Implement SwanLab rollout image logging in training

**Files:**
- Modify: `roboimi/demos/vla_scripts/train_vla.py`
- Modify: `tests/test_train_vla_swanlab_logging.py`
- Modify: `tests/test_train_vla_rollout_validation.py`

- [ ] Make `run_rollout_validation()` force `record_video=false`.
- [ ] Make `run_rollout_validation()` force `save_trajectory_image=true` and `trajectory_image_camera_name=front`.
- [ ] Ensure rollout validation still uses 5 episodes per validation event for the requested run.
- [ ] Add a best-effort helper that converts per-episode image paths into SwanLab image media payloads.
- [ ] Keep image-upload failures non-fatal and warning-only.
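
The forced overrides and the best-effort upload helper might look like the sketch below. It is not the code in `train_vla.py`: the function names and the `rollout/trajectory/...` log keys are hypothetical, and the image constructor and log function are injected as parameters so that the logic can be tested without SwanLab installed. In real use, `swanlab.Image` and `swanlab.log` would presumably be passed in.

```python
import warnings
from typing import Any, Callable, Dict, Mapping, Sequence


def force_rollout_overrides(eval_cfg: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a copy of the eval config with training-time settings forced."""
    cfg = dict(eval_cfg)
    cfg["record_video"] = False
    cfg["save_trajectory_image"] = True
    cfg["trajectory_image_camera_name"] = "front"
    return cfg


def build_image_payload(image_paths: Sequence[str],
                        make_image: Callable[[str], Any],
                        key_prefix: str = "rollout/trajectory") -> Dict[str, Any]:
    """Convert per-episode image paths into a media payload, skipping failures."""
    payload: Dict[str, Any] = {}
    for idx, path in enumerate(image_paths):
        try:
            payload[f"{key_prefix}/episode_{idx}"] = make_image(path)
        except Exception as exc:  # best-effort: one bad image must not stop the rest
            warnings.warn(f"Skipping rollout image {path!r}: {exc}")
    return payload


def log_rollout_images(image_paths: Sequence[str], step: int,
                       make_image: Callable[[str], Any],
                       log_fn: Callable[..., Any]) -> bool:
    """Upload images; return False instead of raising if the upload fails."""
    try:
        payload = build_image_payload(image_paths, make_image)
        if payload:
            log_fn(payload, step=step)
        return bool(payload)
    except Exception as exc:  # upload failures are warning-only
        warnings.warn(f"Rollout image upload failed: {exc}")
        return False


# Stubs standing in for the real image constructor and logger.
logged = []

def flaky_image(path: str) -> Any:
    if "bad" in path:
        raise OSError("unreadable file")
    return ("image", path)

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    ok = log_rollout_images(
        ["ep0.png", "bad.png", "ep2.png"], step=1000,
        make_image=flaky_image,
        log_fn=lambda data, step: logged.append((step, data)),
    )
```

Catching exceptions at both the per-image and the per-upload level means a single unreadable PNG drops only that episode, while a logger outage drops the whole batch without interrupting training.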

### Task 4: Verify action-chunk semantics for the new run

**Files:**
- Verify: `roboimi/vla/agent.py`
- Verify: `roboimi/vla/agent_imf.py`
- Test: `tests/test_imf_vla_agent.py`

- [ ] Confirm the existing queue logic still means “predict 8, execute first 4”.
- [ ] Do not change branch defaults unless strictly necessary; prefer launch-time overrides.
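
For reference, "predict 8, execute first 4" can be modelled with a small queue. The class below is an illustrative stand-in, not the implementation in `roboimi/vla/agent.py` or `agent_imf.py`; the class name, constructor arguments, and the `num_predictions` counter are assumptions made for this sketch.

```python
from collections import deque
from typing import Any, Callable, Deque, Sequence


class ActionChunkQueue:
    """Re-predict only when the queue is empty; keep the first k of H actions."""

    def __init__(self, predict_chunk: Callable[[Any], Sequence[Any]],
                 pred_horizon: int = 8, num_action_steps: int = 4) -> None:
        if not 0 < num_action_steps <= pred_horizon:
            raise ValueError("num_action_steps must be in (0, pred_horizon]")
        self._predict_chunk = predict_chunk
        self.pred_horizon = pred_horizon
        self.num_action_steps = num_action_steps
        self._queue: Deque[Any] = deque()
        self.num_predictions = 0

    def next_action(self, observation: Any) -> Any:
        if not self._queue:
            chunk = self._predict_chunk(observation)
            if len(chunk) != self.pred_horizon:
                raise ValueError("policy must return pred_horizon actions")
            # Execute only the first num_action_steps; discard the tail.
            self._queue.extend(chunk[: self.num_action_steps])
            self.num_predictions += 1
        return self._queue.popleft()


# Fake policy: each action is tagged (prediction index, position within chunk).
calls = []

def fake_policy(observation: Any) -> Sequence[Any]:
    calls.append(observation)
    call_id = len(calls) - 1
    return [(call_id, i) for i in range(8)]

queue = ActionChunkQueue(fake_policy)
executed = [queue.next_action(observation=t) for t in range(8)]
```

Over eight environment steps the fake policy is queried twice, at steps 0 and 4, and positions 4 through 7 of each predicted chunk are never executed. A test in `tests/test_imf_vla_agent.py` could assert the same pattern against the real agent.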

### Task 5: Verify and launch the requested local training run

**Files:**
- Use: `roboimi/demos/vla_scripts/train_vla.py`
- Use: `roboimi/demos/vla_scripts/eval_vla.py`

- [ ] Run the targeted verification suite.
- [ ] Run one real headless smoke eval and confirm a front trajectory PNG is produced while `video_mp4` stays null.
- [ ] Launch the new local training run with explicit overrides including:
  - `agent=resnet_imf_attnres`
  - `agent.head.n_emb=384`
  - `agent.head.n_layer=12`
  - `agent.pred_horizon=8`
  - `agent.num_action_steps=4`
  - `train.max_steps=50000`
  - `train.rollout_num_episodes=5`
  - `train.use_swanlab=true`
  - current local baseline dataset/camera/CUDA/batch/lr/num_workers/backbone settings
- [ ] Verify PID, GPU allocation, log tail, and SwanLab run URL.
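
The launch step can be assembled programmatically, as in the sketch below. The override strings are the ones listed above; the helper names are hypothetical, and the baseline dataset/camera/CUDA/batch/lr/num_workers/backbone settings are left as a caller-supplied argument because this plan does not enumerate them.

```python
import subprocess
import sys
from typing import List, Sequence

TRAIN_SCRIPT = "roboimi/demos/vla_scripts/train_vla.py"

REQUESTED_OVERRIDES = [
    "agent=resnet_imf_attnres",
    "agent.head.n_emb=384",
    "agent.head.n_layer=12",
    "agent.pred_horizon=8",
    "agent.num_action_steps=4",
    "train.max_steps=50000",
    "train.rollout_num_episodes=5",
    "train.use_swanlab=true",
]


def build_train_command(baseline_overrides: Sequence[str] = (),
                        python: str = sys.executable) -> List[str]:
    """Assemble the launch command from explicit overrides only."""
    return [python, TRAIN_SCRIPT, *baseline_overrides, *REQUESTED_OVERRIDES]


def launch_detached(cmd: Sequence[str], log_path: str) -> int:
    """Start training in its own session and return the PID for verification."""
    with open(log_path, "ab") as log_file:
        proc = subprocess.Popen(
            list(cmd),
            stdout=log_file,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # keep running after the launching shell exits
        )
    return proc.pid


# Building the command has no side effects; launch_detached is not called here.
cmd = build_train_command()
```

Keeping every setting on the command line leaves branch defaults untouched and makes the exact run configuration recoverable from the process list or the log header.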