diff --git a/docs/superpowers/plans/2026-04-23-sim-air-insert-ring-bar.md b/docs/superpowers/plans/2026-04-23-sim-air-insert-ring-bar.md index 184a1ab..54c7f28 100644 --- a/docs/superpowers/plans/2026-04-23-sim-air-insert-ring-bar.md +++ b/docs/superpowers/plans/2026-04-23-sim-air-insert-ring-bar.md @@ -204,6 +204,7 @@ Run: **Files:** - Create: `roboimi/demos/diana_air_insert_policy.py` +- Modify: `roboimi/demos/diana_record_sim_episodes.py` - Modify: `tests/test_air_insert_env.py` - Optionally Modify: `roboimi/demos/vla_scripts/eval_vla.py` (only if integration gaps remain after Task 1) @@ -217,9 +218,11 @@ Add tests covering: Keep the tests unit-level; do not require a full MuJoCo rollout for every assertion. -- [ ] **Step 2: Write a real failing headless smoke test for the new task path** +- [ ] **Step 2: Write failing tests for the scripted rollout entrypoint and a real headless smoke path** -Add a deterministic integration/smoke test that instantiates `make_sim_env('sim_air_insert_ring_bar', headless=True)`, resets with sampled named task state, and steps a few actions or scripted-policy outputs. Use the real task XML and task-specific environment wiring so broken includes, joint names, or dispatch mismatches are caught. +Add coverage for both: +- the standard scripted rollout entrypoint (`roboimi/demos/diana_record_sim_episodes.py`) can select the new task sampler/policy instead of remaining sim_transfer-only +- a deterministic integration/smoke test that instantiates `make_sim_env('sim_air_insert_ring_bar', headless=True)`, resets with sampled named task state, and steps a few actions or scripted-policy outputs using the real task XML and task-specific wiring - [ ] **Step 3: Run the scripted-policy tests and verify they fail** @@ -252,7 +255,7 @@ Expected: - [ ] **Step 6: Run the combined verification suite for this feature** Run: -`/home/droid/.conda/envs/roboimi/bin/python -m unittest tests.test_air_insert_env tests.test_eval_vla_headless tests.test_robot_asset_paths -v` +`/home/droid/.conda/envs/roboimi/bin/python -m unittest tests.test_air_insert_env tests.test_eval_vla_headless tests.test_eval_vla_rollout_artifacts tests.test_train_vla_rollout_validation tests.test_robot_asset_paths -v` Expected: - PASS with 0 failures