# Rollout Artifacts Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Extend rollout evaluation so one selected checkpoint can be run once with video capture, timing breakdown, and saved EE trajectory artifacts.

**Architecture:** Keep the implementation centered in `eval_vla.py` so existing training-time rollout validation remains compatible. Add config-gated artifact capture helpers, serialize outputs under the eval run directory, and add lightweight tests for helper behavior and summary wiring; default eval behavior must remain unchanged when artifact capture is off.

**Tech Stack:** Python, Hydra/OmegaConf, NumPy, OpenCV, JSON, PyTorch unittest/mocking.

---

### Task 1: Add artifact capture configuration and helper wiring

**Files:**
- Modify: `roboimi/demos/vla_scripts/eval_vla.py`
- Modify: `roboimi/vla/conf/eval/eval.yaml`
- Test: `tests/test_eval_vla_rollout_artifacts.py`

- [ ] **Step 1: Write failing tests for optional artifact config / summary wiring**
- [ ] **Step 2: Implement config-backed artifact flags and output paths with defaults that write nothing**
- [ ] **Step 3: Verify existing eval call sites still work with defaults**

### Task 2: Add timing breakdown, video recording, and trajectory export

**Files:**
- Modify: `roboimi/demos/vla_scripts/eval_vla.py`
- Test: `tests/test_eval_vla_rollout_artifacts.py`

- [ ] **Step 1: Write failing tests for timing aggregation, trajectory serialization, and summary schema**
- [ ] **Step 2: Implement per-step timing capture for `obs_read_ms`, `preprocess_ms`, `inference_ms`, `env_step_ms`, `loop_total_ms`**
- [ ] **Step 3: Implement MP4 recording from a chosen camera stream and canonical `trajectory.npz` export using `left_link7/right_link7` executed poses after `env.step`**
- [ ] **Step 4: Run focused tests and fix issues**

### Task 3: Stop training safely and execute one real rollout

**Files:**
- Use: `roboimi/demos/vla_scripts/eval_vla.py`
- Output: `runs/.../eval_artifacts/...`

- [ ] **Step 1: Stop the active training process, wait for exit, and confirm the target checkpoint is readable**
- [ ] **Step 2: Select the latest completed checkpoint if an explicit one is not provided; fall back to prior completed / best checkpoint if needed**
- [ ] **Step 3: Run one headless rollout with artifact capture enabled**
- [ ] **Step 4: Verify the MP4 / timing summary / trajectory files exist and summarize findings**
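
The timing and trajectory steps in Task 2 could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual `eval_vla.py` code: `StepTimer`, `save_artifacts`, the `timing_summary.json` filename, and the chosen statistics (mean, p95, max) are hypothetical. Only the timing keys, the `trajectory.npz` filename, and the `left_link7` / `right_link7` names come from the plan. Poses are assumed to be fixed-length vectors, for example position plus quaternion.

```python
import json
import time
from collections import defaultdict
from contextlib import contextmanager
from pathlib import Path

import numpy as np

# Timing keys named in Task 2, Step 2 of the plan.
TIMING_KEYS = (
    "obs_read_ms",
    "preprocess_ms",
    "inference_ms",
    "env_step_ms",
    "loop_total_ms",
)


class StepTimer:
    """Hypothetical helper: collects per-step wall-clock durations in ms."""

    def __init__(self):
        self.samples = defaultdict(list)

    @contextmanager
    def measure(self, key):
        # perf_counter is monotonic, so it is safe for measuring intervals.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.samples[key].append((time.perf_counter() - start) * 1000.0)

    def summary(self):
        # Aggregate only the keys that were actually measured.
        out = {}
        for key in TIMING_KEYS:
            values = np.asarray(self.samples.get(key, []), dtype=np.float64)
            if values.size == 0:
                continue
            out[key] = {
                "count": int(values.size),
                "mean": float(values.mean()),
                "p95": float(np.percentile(values, 95)),
                "max": float(values.max()),
            }
        return out


def save_artifacts(out_dir, timer, left_poses, right_poses):
    """Hypothetical helper: write trajectory.npz and a JSON timing summary."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # One array per end effector; rows are steps recorded after env.step.
    np.savez(
        out_dir / "trajectory.npz",
        left_link7=np.asarray(left_poses, dtype=np.float64),
        right_link7=np.asarray(right_poses, dtype=np.float64),
    )
    # Assumed filename; the plan only asks for a "timing summary".
    summary_path = out_dir / "timing_summary.json"
    summary_path.write_text(json.dumps(timer.summary(), indent=2))
    return out_dir
```

In a rollout loop, each stage would be wrapped as `with timer.measure("inference_ms"): ...`, with `loop_total_ms` wrapping the whole iteration. Calling `save_artifacts` only when the artifact flag is enabled would keep default eval behavior unchanged, as the architecture note requires.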