docs(plan): cover rollout entrypoint and eval regressions

2026-04-23 17:14:49 +08:00
parent fce6839daa
commit 06ac6c6d18
1 changed file with 6 additions and 3 deletions
@@ -204,6 +204,7 @@ Run:
 **Files:**
 - Create: `roboimi/demos/diana_air_insert_policy.py`
 - Modify: `roboimi/demos/diana_record_sim_episodes.py`
 - Modify: `tests/test_air_insert_env.py`
 - Optionally Modify: `roboimi/demos/vla_scripts/eval_vla.py` (only if integration gaps remain after Task 1)
@@ -217,9 +218,11 @@ Add tests covering:
 Keep the tests unit-level; do not require a full MuJoCo rollout for every assertion.
-- [ ] **Step 2: Write a real failing headless smoke test for the new task path**
+- [ ] **Step 2: Write failing tests for the scripted rollout entrypoint and a real headless smoke path**
-Add a deterministic integration/smoke test that instantiates `make_sim_env('sim_air_insert_ring_bar', headless=True)`, resets with sampled named task state, and steps a few actions or scripted-policy outputs. Use the real task XML and task-specific environment wiring so broken includes, joint names, or dispatch mismatches are caught.
+Add coverage for both:
+- the standard scripted rollout entrypoint (`roboimi/demos/diana_record_sim_episodes.py`) can select the new task sampler/policy instead of remaining sim_transfer-only
+- a deterministic integration/smoke test that instantiates `make_sim_env('sim_air_insert_ring_bar', headless=True)`, resets with sampled named task state, and steps a few actions or scripted-policy outputs using the real task XML and task-specific wiring
 - [ ] **Step 3: Run the scripted-policy tests and verify they fail**
@@ -252,7 +255,7 @@ Expected:
 - [ ] **Step 6: Run the combined verification suite for this feature**
 Run:
-`/home/droid/.conda/envs/roboimi/bin/python -m unittest tests.test_air_insert_env tests.test_eval_vla_headless tests.test_robot_asset_paths -v`
+`/home/droid/.conda/envs/roboimi/bin/python -m unittest tests.test_air_insert_env tests.test_eval_vla_headless tests.test_eval_vla_rollout_artifacts tests.test_train_vla_rollout_validation tests.test_robot_asset_paths -v`
 Expected:
 - PASS with 0 failures
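
For reference, the headless smoke path described in Step 2 might look roughly like the sketch below. It is only an illustration under stated assumptions: `run_headless_smoke`, `make_stub_env`, and `_StubEnv` are hypothetical names, the `reset(task_state)` / `step(action)` signatures are guesses at the environment interface, and the stub stands in for `make_sim_env` so the example runs without MuJoCo. In the repository, the real factory, sampler output, and action shape would replace the placeholders.

```python
import unittest


class _StubEnv:
    """Hypothetical stand-in for the real environment (no MuJoCo required)."""

    def __init__(self, task_name, headless):
        self.task_name = task_name
        self.headless = headless
        self.task_state = None
        self.steps = 0

    def reset(self, task_state):
        # Assumed signature: the real env may take the sampled state differently.
        self.task_state = task_state
        self.steps = 0

    def step(self, action):
        # Only counts steps; the real env would advance the simulation.
        self.steps += 1


def make_stub_env(task_name, headless=True):
    # Hypothetical replacement for make_sim_env; swap in the real factory.
    return _StubEnv(task_name, headless)


def run_headless_smoke(make_env, task_name, task_state, actions):
    """Build the env headlessly, reset to the given state, and step each action."""
    env = make_env(task_name, headless=True)
    env.reset(task_state)
    for action in actions:
        env.step(action)
    return env


class HeadlessSmokeTest(unittest.TestCase):
    def test_ring_bar_task_steps_without_error(self):
        env = run_headless_smoke(
            make_stub_env,
            "sim_air_insert_ring_bar",
            task_state={"ring_pos": [0.0, 0.0, 0.0]},  # hypothetical named state
            actions=[[0.0] * 7 for _ in range(3)],  # hypothetical action size
        )
        self.assertEqual(env.steps, 3)
        self.assertTrue(env.headless)
```

Passing the factory in as an argument keeps the helper unit-testable with a stub, while the real smoke test can pass `make_sim_env` so that broken includes, joint names, or dispatch mismatches surface when the environment is constructed.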