docs(plan): cover rollout entrypoint and eval regressions

Logic
2026-04-23 17:14:49 +08:00
parent fce6839daa
commit 06ac6c6d18

@@ -204,6 +204,7 @@ Run:
 **Files:**
 - Create: `roboimi/demos/diana_air_insert_policy.py`
+- Modify: `roboimi/demos/diana_record_sim_episodes.py`
 - Modify: `tests/test_air_insert_env.py`
 - Optionally Modify: `roboimi/demos/vla_scripts/eval_vla.py` (only if integration gaps remain after Task 1)
@@ -217,9 +218,11 @@ Add tests covering:
 Keep the tests unit-level; do not require a full MuJoCo rollout for every assertion.
-- [ ] **Step 2: Write a real failing headless smoke test for the new task path**
+- [ ] **Step 2: Write failing tests for the scripted rollout entrypoint and a real headless smoke path**
-Add a deterministic integration/smoke test that instantiates `make_sim_env('sim_air_insert_ring_bar', headless=True)`, resets with sampled named task state, and steps a few actions or scripted-policy outputs. Use the real task XML and task-specific environment wiring so broken includes, joint names, or dispatch mismatches are caught.
+Add coverage for both:
+- the standard scripted rollout entrypoint (`roboimi/demos/diana_record_sim_episodes.py`) can select the new task sampler/policy instead of remaining sim_transfer-only
+- a deterministic integration/smoke test that instantiates `make_sim_env('sim_air_insert_ring_bar', headless=True)`, resets with sampled named task state, and steps a few actions or scripted-policy outputs using the real task XML and task-specific wiring
 - [ ] **Step 3: Run the scripted-policy tests and verify they fail**
@@ -252,7 +255,7 @@ Expected:
 - [ ] **Step 6: Run the combined verification suite for this feature**
 Run:
-`/home/droid/.conda/envs/roboimi/bin/python -m unittest tests.test_air_insert_env tests.test_eval_vla_headless tests.test_robot_asset_paths -v`
+`/home/droid/.conda/envs/roboimi/bin/python -m unittest tests.test_air_insert_env tests.test_eval_vla_headless tests.test_eval_vla_rollout_artifacts tests.test_train_vla_rollout_validation tests.test_robot_asset_paths -v`
 Expected:
 - PASS with 0 failures