# Rollout Artifacts Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Extend rollout evaluation so one selected checkpoint can be run once with video capture, timing breakdown, and saved EE trajectory artifacts.

**Architecture:** Keep the implementation centered in `eval_vla.py` so existing training-time rollout validation remains compatible. Add config-gated artifact capture helpers, serialize outputs under the eval run directory, and add lightweight tests for helper behavior and summary wiring; default eval behavior must remain unchanged when artifact capture is off.

**Tech Stack:** Python, Hydra/OmegaConf, NumPy, OpenCV, JSON, PyTorch unittest/mocking.

---

### Task 1: Add artifact capture configuration and helper wiring
**Files:**
- Modify: `roboimi/demos/vla_scripts/eval_vla.py`
- Modify: `roboimi/vla/conf/eval/eval.yaml`
- Test: `tests/test_eval_vla_rollout_artifacts.py`

- [ ] **Step 1: Write failing tests for optional artifact config / summary wiring**
- [ ] **Step 2: Implement config-backed artifact flags and output paths with defaults that write nothing**
- [ ] **Step 3: Verify existing eval call sites still work with defaults**
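
The "defaults that write nothing" requirement in Step 2 could be sketched roughly as follows. This is only an illustration under assumptions: `ArtifactConfig`, its field names, `artifact_config_from_mapping`, `resolve_artifact_paths`, and the file names `rollout.mp4` and `timing_summary.json` are hypothetical and not taken from `eval_vla.py` or `eval.yaml`. A plain dataclass stands in for the Hydra/OmegaConf node so the sketch runs with the standard library alone.

```python
from dataclasses import dataclass, fields
from pathlib import Path
from typing import Any, Dict, Mapping, Optional


@dataclass(frozen=True)
class ArtifactConfig:
    """Hypothetical artifact flags; every flag defaults to off."""
    record_video: bool = False
    save_timing: bool = False
    save_trajectory: bool = False
    subdir: str = "eval_artifacts"


def artifact_config_from_mapping(raw: Optional[Mapping[str, Any]]) -> ArtifactConfig:
    """Build the config from a dict-like eval config section.

    Missing keys fall back to the defaults above, and unknown keys are
    ignored, so call sites that predate the new section keep working.
    """
    known = {f.name for f in fields(ArtifactConfig)}
    picked = {k: v for k, v in dict(raw or {}).items() if k in known}
    return ArtifactConfig(**picked)


def resolve_artifact_paths(cfg: ArtifactConfig, run_dir: str) -> Dict[str, Path]:
    """Return output paths only for enabled artifacts (empty dict when all off)."""
    root = Path(run_dir) / cfg.subdir
    paths: Dict[str, Path] = {}
    if cfg.record_video:
        paths["video"] = root / "rollout.mp4"  # assumed file name
    if cfg.save_timing:
        paths["timing"] = root / "timing_summary.json"  # assumed file name
    if cfg.save_trajectory:
        paths["trajectory"] = root / "trajectory.npz"
    return paths
```

Because the path resolver returns an empty dict when every flag is off, a caller can loop over the result without any extra guard, which is one way to keep default eval behavior unchanged.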
### Task 2: Add timing breakdown, video recording, and trajectory export
**Files:**
- Modify: `roboimi/demos/vla_scripts/eval_vla.py`
- Test: `tests/test_eval_vla_rollout_artifacts.py`

- [ ] **Step 1: Write failing tests for timing aggregation, trajectory serialization, and summary schema**
- [ ] **Step 2: Implement per-step timing capture for `obs_read_ms`, `preprocess_ms`, `inference_ms`, `env_step_ms`, `loop_total_ms`**
- [ ] **Step 3: Implement MP4 recording from a chosen camera stream and canonical `trajectory.npz` export using `left_link7/right_link7` executed poses after `env.step`**
- [ ] **Step 4: Run focused tests and fix issues**
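
One possible shape for the per-step timing capture in Step 2 is a small context-manager timer that collects millisecond samples per key and aggregates them for the summary. The five key names come from the plan; the `StepTimer` class, its injectable `clock` parameter, and the `count`/`mean`/`max` summary fields are assumptions made for this sketch, not the schema used by `eval_vla.py`.

```python
import json
import time
from collections import defaultdict
from contextlib import contextmanager
from statistics import mean

# Timing keys named in the plan.
TIMING_KEYS = (
    "obs_read_ms",
    "preprocess_ms",
    "inference_ms",
    "env_step_ms",
    "loop_total_ms",
)


class StepTimer:
    """Hypothetical helper: collects per-step durations in milliseconds."""

    def __init__(self, clock=time.perf_counter):
        # The clock is injectable so tests can drive it deterministically.
        self._clock = clock
        self.samples = defaultdict(list)

    @contextmanager
    def measure(self, key):
        if key not in TIMING_KEYS:
            raise KeyError(f"unknown timing key: {key}")
        start = self._clock()
        try:
            yield
        finally:
            # Record the sample even if the timed block raises.
            self.samples[key].append((self._clock() - start) * 1000.0)

    def summary(self):
        """Aggregate samples per key; keys with no samples are omitted."""
        return {
            key: {"count": len(vals), "mean": mean(vals), "max": max(vals)}
            for key, vals in self.samples.items()
            if vals
        }


# Usage: nest the stage timers inside the loop-total timer.
timer = StepTimer()
for _ in range(3):
    with timer.measure("loop_total_ms"):
        with timer.measure("inference_ms"):
            sum(range(1000))  # stand-in for policy inference
print(json.dumps(timer.summary(), indent=2))
```

Injecting the clock keeps the aggregation test independent of wall-clock time, which fits the "lightweight tests for helper behavior" goal stated in the architecture note.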
### Task 3: Stop training safely and execute one real rollout
**Files:**
- Use: `roboimi/demos/vla_scripts/eval_vla.py`
- Output: `runs/.../eval_artifacts/...`

- [ ] **Step 1: Stop the active training process, wait for exit, and confirm the target checkpoint is readable**
- [ ] **Step 2: Select the latest completed checkpoint if an explicit one is not provided; fall back to prior completed / best checkpoint if needed**
- [ ] **Step 3: Run one headless rollout with artifact capture enabled**
- [ ] **Step 4: Verify the MP4 / timing summary / trajectory files exist and summarize findings**
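
The checkpoint selection in Step 2 and the existence check in Step 4 might look something like the sketch below. It rests on several assumptions: "completed" is approximated as "a non-empty file that can be opened", the newest checkpoint is chosen by modification time, and the `*.pt` glob pattern plus the helper names `is_readable_checkpoint`, `select_checkpoint`, and `missing_artifacts` are all hypothetical. A real check would likely also try loading the checkpoint with the training framework.

```python
from pathlib import Path
from typing import Iterable, List, Optional


def is_readable_checkpoint(path: Path) -> bool:
    """Cheap readability probe: the file opens and holds at least one byte."""
    try:
        with path.open("rb") as fh:
            return len(fh.read(1)) == 1
    except OSError:
        return False


def select_checkpoint(
    ckpt_dir: str,
    explicit: Optional[str] = None,
    pattern: str = "*.pt",  # assumed naming; adjust to the real run layout
) -> Path:
    """Prefer the explicit checkpoint; otherwise take the newest readable one."""
    if explicit is not None:
        chosen = Path(explicit)
        if not is_readable_checkpoint(chosen):
            raise FileNotFoundError(f"checkpoint not readable: {chosen}")
        return chosen
    # Newest first, so an unreadable latest file falls back to a prior one.
    candidates = sorted(
        Path(ckpt_dir).glob(pattern),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    for candidate in candidates:
        if is_readable_checkpoint(candidate):
            return candidate
    raise FileNotFoundError(f"no readable checkpoint under {ckpt_dir}")


def missing_artifacts(paths: Iterable[Path]) -> List[Path]:
    """Return the expected artifact files that are absent or empty."""
    return [
        p for p in map(Path, paths)
        if not p.is_file() or p.stat().st_size == 0
    ]
```

An empty list from `missing_artifacts` would indicate that the MP4, timing summary, and trajectory files were all written; anything else is worth reporting in the findings.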