sim_air_insert_ring_bar Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Add an independent dual-Diana MuJoCo task sim_air_insert_ring_bar with a square ring block, a square bar block, staged rewards, strict finite-geometry in-air insertion success detection, and a task-specific scripted policy.

Architecture: Reuse the current dual-Diana EE-control stack and environment factory, but add a task-specific scene XML, robot asset entrypoint, sampling helpers, and a new task-specific environment module. Keep sim_transfer untouched while introducing pure-Python geometry helpers and focused tests so reward/success behavior can be regression tested without requiring a full MuJoCo rollout in every test.

Tech Stack: Python, unittest, MuJoCo XML assets, existing dual-Diana environment classes, Hydra-compatible task naming/config patterns.


File Structure / Responsibilities

  • Create: roboimi/assets/models/manipulators/DianaMed/ring_bar_objects.xml
    • Defines the rigid ring body and bar body, each with a free joint and stable box-based geoms.
  • Create: roboimi/assets/models/manipulators/DianaMed/bi_diana_ring_bar_ee.xml
    • Scene entrypoint that includes the shared world/table/robot assets plus the new object XML.
  • Modify: roboimi/assets/robots/diana_med.py
    • Add a task-specific robot asset class for the new scene XML without changing existing BiDianaMed behavior.
  • Modify: roboimi/utils/act_ex_utils.py
    • Add deterministic helpers to sample left/right planar placement regions for ring and bar objects.
  • Modify: roboimi/utils/constants.py
    • Register the new task name and default metadata.
  • Create: roboimi/envs/double_air_insert_env.py
    • New task-specific environment, finite-geometry success helpers, reset logic, reward logic, and task factory branch.
  • Modify: roboimi/envs/double_pos_ctrl_env.py
    • Route make_sim_env() to the new task-specific environment while keeping current sim_transfer logic unchanged.
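  • Create: roboimi/demos/diana_air_insert_policy.py
    • Task-specific waypoint/open-loop scripted policy for grasp-lift-align-insert.
  • Modify: roboimi/demos/diana_record_sim_episodes.py
    • Let the standard scripted rollout entrypoint select the new task sampler/policy instead of remaining sim_transfer-only.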
  • Modify: roboimi/demos/vla_scripts/eval_vla.py
    • Reset the new task with the correct sampled task state instead of assuming a single transfer box pose.
  • Create: tests/test_air_insert_env.py
    • Focused unit tests for sampling, reset helpers, reward progression, and strict success detection.
  • Modify: tests/test_eval_vla_headless.py
    • Add coverage that headless evaluation dispatches the correct reset sampler for the new task.
  • Modify: tests/test_robot_asset_paths.py
    • Verify the new robot asset class resolves its XML path correctly independent of cwd.

Task 1: Add failing tests for task registration, samplers, and asset wiring

Files:

  • Create: tests/test_air_insert_env.py

  • Modify: tests/test_eval_vla_headless.py

  • Modify: tests/test_robot_asset_paths.py

  • Modify: roboimi/utils/act_ex_utils.py (later in implementation)

  • Modify: roboimi/utils/constants.py (later in implementation)

  • Modify: roboimi/assets/robots/diana_med.py (later in implementation)

  • Modify: roboimi/envs/double_pos_ctrl_env.py (later in implementation)

  • Modify: roboimi/demos/vla_scripts/eval_vla.py (later in implementation)

  • Create: roboimi/envs/double_air_insert_env.py (minimal stub in this task)

  • Step 1: Write failing tests for task config and sampling helpers

Add tests in tests/test_air_insert_env.py covering:

  • SIM_TASK_CONFIGS['sim_air_insert_ring_bar'] exists

  • sample_air_insert_ring_bar_pose() (or equivalent helper) returns ring/bar positions with fixed z and correct left/right planar ranges

  • the output is a named mapping (separate ring and bar pose entries) that reset/eval code can consume without relying on positional indexing
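
A minimal sketch of a sampler that would satisfy these three checks is shown here. The numeric ranges, the fixed height, the choice of ring on the left and bar on the right, and the dictionary keys are all assumptions made for illustration; only the function name comes from this plan.

```python
import random

# Hypothetical placement ranges in metres; the real values belong to the task design.
RING_X_RANGE = (-0.20, -0.05)  # assumed: ring spawns on the left half of the table
BAR_X_RANGE = (0.05, 0.20)     # assumed: bar spawns on the right half of the table
Y_RANGE = (0.80, 0.95)
OBJECT_Z = 0.47                # assumed fixed spawn height
IDENTITY_QUAT = (1.0, 0.0, 0.0, 0.0)  # MuJoCo quaternion order: w, x, y, z


def sample_air_insert_ring_bar_pose(rng=None):
    """Return a named task state; pass a seeded rng for deterministic tests."""
    rng = rng if rng is not None else random.Random()
    ring_pos = (rng.uniform(*RING_X_RANGE), rng.uniform(*Y_RANGE), OBJECT_Z)
    bar_pos = (rng.uniform(*BAR_X_RANGE), rng.uniform(*Y_RANGE), OBJECT_Z)
    return {
        "ring_pos": ring_pos,
        "ring_quat": IDENTITY_QUAT,
        "bar_pos": bar_pos,
        "bar_quat": IDENTITY_QUAT,
    }
```

Passing a seeded `random.Random` keeps the tests deterministic without touching global random state.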

  • Step 2: Write failing tests for environment factory dispatch and robot asset resolution

Add tests covering:

  • make_sim_env('sim_air_insert_ring_bar', headless=True) dispatches to the new environment with rendering disabled

  • a new robot asset class resolves the new XML path independent of cwd, similar to the existing BiDianaMed test pattern
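
For the cwd-independence check, one common approach is to anchor the XML path on the module file rather than on the working directory. The helper below is a hypothetical sketch (`resolve_asset_path` is not an existing repository function); the relative path is derived from the file locations listed in this plan.

```python
from pathlib import Path


def resolve_asset_path(module_file, relative_xml):
    """Resolve an asset path relative to the module that declares it.

    module_file would normally be that module's __file__, so the result
    does not change when the process is started from another directory.
    """
    return (Path(module_file).resolve().parent / relative_xml).resolve()


# Relative location of the new scene XML as seen from roboimi/assets/robots/.
RING_BAR_SCENE_XML = "../models/manipulators/DianaMed/bi_diana_ring_bar_ee.xml"
```

A test can call the helper from two different working directories and assert that both results are equal and absolute.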

  • Step 3: Write failing tests for eval reset helper dispatch

Extend tests/test_eval_vla_headless.py so headless eval can reset the new task using the new sampler instead of hard-coding sample_transfer_pose().
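
One way to avoid the hard-coded sampler is a small task-name lookup. The sketch below is illustrative only: both samplers are stand-ins with placeholder return values, and `RESET_SAMPLERS` and `sample_reset_state` are hypothetical names.

```python
def sample_transfer_pose():
    """Stand-in for the existing sim_transfer sampler (placeholder values)."""
    return {"box_pos": (0.0, 0.9, 0.47)}


def sample_air_insert_ring_bar_pose():
    """Stand-in for the new ring/bar sampler (placeholder values)."""
    return {"ring_pos": (-0.1, 0.9, 0.47), "bar_pos": (0.1, 0.9, 0.47)}


# Hypothetical registry mapping task names to their reset samplers.
RESET_SAMPLERS = {
    "sim_transfer": sample_transfer_pose,
    "sim_air_insert_ring_bar": sample_air_insert_ring_bar_pose,
}


def sample_reset_state(task_name):
    """Pick the sampler for task_name and fail loudly for unknown tasks."""
    try:
        sampler = RESET_SAMPLERS[task_name]
    except KeyError:
        raise ValueError(f"no reset sampler registered for task {task_name!r}") from None
    return sampler()
```

In a test, the registry entries can be swapped for mocks to assert which sampler headless eval actually calls.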

  • Step 4: Run the targeted tests to verify they fail for the expected missing-feature reasons

Run: /home/droid/.conda/envs/roboimi/bin/python -m unittest tests.test_air_insert_env tests.test_eval_vla_headless tests.test_robot_asset_paths -v

Expected:

  • FAIL because the new task config/helper/class/dispatch branch does not exist yet

  • Step 5: Implement the minimal production code to satisfy the new task registration and helper tests

Implement only enough to make the new tests pass:

  • add new task config entry

  • add the new placement sampler

  • add the new robot asset class

  • create a minimal importable double_air_insert_env.py stub and class/function surface needed for factory dispatch tests

  • add the factory dispatch branch / headless wiring

  • update eval reset dispatch for the new task

  • Step 6: Re-run the targeted tests to verify they pass

Run: /home/droid/.conda/envs/roboimi/bin/python -m unittest tests.test_air_insert_env tests.test_eval_vla_headless tests.test_robot_asset_paths -v

Expected:

  • PASS for the new registration/sampler/dispatch/asset tests

  • Step 7: Commit Task 1

Run: git add tests/test_air_insert_env.py tests/test_eval_vla_headless.py tests/test_robot_asset_paths.py roboimi/utils/act_ex_utils.py roboimi/utils/constants.py roboimi/assets/robots/diana_med.py roboimi/envs/double_pos_ctrl_env.py roboimi/envs/double_air_insert_env.py roboimi/demos/vla_scripts/eval_vla.py && git commit -m "feat(env): register sim air insert ring bar task"


Task 2: Add the MuJoCo ring+bar scene assets and reset helpers

Files:

  • Create: roboimi/assets/models/manipulators/DianaMed/ring_bar_objects.xml

  • Create: roboimi/assets/models/manipulators/DianaMed/bi_diana_ring_bar_ee.xml

  • Create or Modify: roboimi/envs/double_air_insert_env.py

  • Modify: tests/test_air_insert_env.py

  • Step 1: Write failing tests for object reset helpers and scene-specific joint naming assumptions

In tests/test_air_insert_env.py, add unit tests for helper functions that:

  • write ring pose to ring_block_joint from the named task-state mapping
  • write bar pose to bar_block_joint from the named task-state mapping
  • read back env_state as a stable 14D vector [ring_pos, ring_quat, bar_pos, bar_quat]

Use fake mj_data objects so tests stay fast and deterministic.
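
A sketch of what such a fake and the helpers under test could look like follows. It assumes the helpers reach joints through `mj_data.joint(name).qpos`, as MuJoCo's named-access bindings allow, and relies on a free joint storing 7 values (position, then quaternion in w, x, y, z order). The helper names and task-state keys are hypothetical.

```python
class FakeJoint:
    """Mimics the qpos slot of a MuJoCo free joint (3 position + 4 quaternion)."""

    def __init__(self):
        self.qpos = [0.0] * 7


class FakeMjData:
    """Tiny stand-in exposing only the joint(name) accessor the helpers need."""

    def __init__(self, joint_names):
        self._joints = {name: FakeJoint() for name in joint_names}

    def joint(self, name):
        return self._joints[name]


def write_task_state(mj_data, task_state):
    """Write ring and bar poses into their free joints (hypothetical helper)."""
    for prefix in ("ring", "bar"):
        pose = list(task_state[f"{prefix}_pos"]) + list(task_state[f"{prefix}_quat"])
        # Slice assignment works for plain lists and for numpy views alike.
        mj_data.joint(f"{prefix}_block_joint").qpos[:] = pose


def read_env_state(mj_data):
    """Return the 14D vector [ring_pos, ring_quat, bar_pos, bar_quat]."""
    ring = mj_data.joint("ring_block_joint").qpos
    bar = mj_data.joint("bar_block_joint").qpos
    return list(ring) + list(bar)
```

Because the fake stores plain lists, a round-trip test needs neither MuJoCo nor the scene XML.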

  • Step 2: Run the focused test slice and verify it fails

Run: /home/droid/.conda/envs/roboimi/bin/python -m unittest tests.test_air_insert_env -v

Expected:

  • FAIL because reset/state helper functions and joint conventions are not implemented yet

  • Step 3: Implement the scene XML files and reset/state helper code

Implement:

  • the object XML with one rigid ring body and one rigid bar body

  • the task scene XML entrypoint using the shared world/table/robot includes

  • reset helper(s) in double_air_insert_env.py that set qpos for both free joints with fixed quaternions

  • task-state accessor(s) returning both object poses in a stable structure

  • Step 4: Re-run the focused test slice and verify it passes

Run: /home/droid/.conda/envs/roboimi/bin/python -m unittest tests.test_air_insert_env -v

Expected:

  • PASS for reset/state helper tests

  • Step 5: Commit Task 2

Run: git add roboimi/assets/models/manipulators/DianaMed/ring_bar_objects.xml roboimi/assets/models/manipulators/DianaMed/bi_diana_ring_bar_ee.xml roboimi/envs/double_air_insert_env.py tests/test_air_insert_env.py && git commit -m "feat(scene): add ring and bar insertion scene assets"


Task 3: Implement strict reward and finite-geometry success detection

Files:

  • Modify: roboimi/envs/double_air_insert_env.py

  • Modify: tests/test_air_insert_env.py

  • Step 1: Write failing tests for reward stages and strict success detection

Add tests in tests/test_air_insert_env.py for:

  • left contact stage reward
  • right contact stage reward
  • ring lifted off table stage
  • bar lifted off table stage
  • positive success case where a finite bar truly passes through the aperture
  • negative case where the centerline would pass but the finite square body would clip
  • negative case where the bar has not crossed the ring thickness direction enough
  • negative case where one/both objects are still on the table

Structure the tests around pure helper functions and light fake contact/state objects so the geometry logic is directly regression tested.
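
To illustrate the difference between a centerline check and a finite-geometry check, here is one possible pure-Python predicate. It is a sketch under stated assumptions, not the plan's mandated algorithm: the ring-local x axis is taken as the thickness (insertion) direction, the aperture is a square of half-width `aperture_half` in the ring-local y/z plane, and the bar is a box whose long axis is its local x axis. All function and parameter names are hypothetical.

```python
import itertools


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


def quat_rotate(q, v):
    """Rotate vector v by unit quaternion q given as (w, x, y, z)."""
    w, u = q[0], q[1:]
    t = tuple(2.0 * c for c in _cross(u, v))
    ut = _cross(u, t)
    return tuple(v[i] + w * t[i] + ut[i] for i in range(3))


def bar_corners_in_ring_frame(ring_pos, ring_quat, bar_pos, bar_quat, half_len, half_width):
    """Express the 8 corners of the bar box in ring-local coordinates."""
    ring_inv = (ring_quat[0], -ring_quat[1], -ring_quat[2], -ring_quat[3])
    corners = []
    for sx, sy, sz in itertools.product((-1.0, 1.0), repeat=3):
        local = (sx * half_len, sy * half_width, sz * half_width)
        world = tuple(b + r for b, r in zip(bar_pos, quat_rotate(bar_quat, local)))
        rel = tuple(w - p for w, p in zip(world, ring_pos))
        corners.append(quat_rotate(ring_inv, rel))
    return corners


def bar_is_inserted(corners, ring_half_thickness, aperture_half, tol=1e-9):
    """True if the box spans the ring thickness and fits the aperture inside it."""
    xs = [c[0] for c in corners]
    if min(xs) > -ring_half_thickness or max(xs) < ring_half_thickness:
        return False  # the bar has not crossed the full ring thickness
    for i, j in itertools.combinations(range(8), 2):
        if (i ^ j) not in (1, 2, 4):
            continue  # corner indices differing in one bit form a box edge
        p0, p1 = corners[i], corners[j]
        dx = p1[0] - p0[0]
        if abs(dx) < 1e-12:
            if abs(p0[0]) > ring_half_thickness:
                continue  # edge lies entirely outside the ring slab
            lo, hi = 0.0, 1.0
        else:
            s_a = (-ring_half_thickness - p0[0]) / dx
            s_b = (ring_half_thickness - p0[0]) / dx
            lo, hi = max(0.0, min(s_a, s_b)), min(1.0, max(s_a, s_b))
            if lo > hi:
                continue  # edge does not intersect the ring slab
        for s in (lo, hi):  # endpoints of the edge segment clipped to the slab
            y = p0[1] + s * (p1[1] - p0[1])
            z = p0[2] + s * (p1[2] - p0[2])
            if abs(y) > aperture_half + tol or abs(z) > aperture_half + tol:
                return False  # the finite body would clip the ring frame
    return True
```

The part of the box inside the ring slab is convex, and its vertices are either box corners inside the slab or points where box edges meet the slab faces, so checking the clipped edge endpoints is sufficient. A bar whose center lies inside the aperture but whose side face overlaps the frame is rejected, which is the clipping case listed above.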

  • Step 2: Run the focused tests and verify they fail for missing reward/success logic

Run: /home/droid/.conda/envs/roboimi/bin/python -m unittest tests.test_air_insert_env -v

Expected:

  • FAIL because the staged reward and finite-geometry insertion logic are not implemented yet

  • Step 3: Implement minimal strict success helpers and reward logic

Implement in roboimi/envs/double_air_insert_env.py:

  • pure helper(s) for transforming bar geometry into ring-local coordinates

  • finite-geometry insertion predicate (not centerline-only)

  • table-contact / airborne checks

  • staged reward function returning the highest achieved stage with max_reward = 5
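
The staged reward can be sketched as a lookup over boolean conditions. The stage ordering and the pairing of left gripper with ring and right gripper with bar are assumptions; the plan fixes only the set of stages and `max_reward = 5`.

```python
MAX_REWARD = 5


def staged_reward(left_contact, right_contact, ring_airborne, bar_airborne, inserted):
    """Return the highest achieved stage, from 0 (nothing) to MAX_REWARD (success)."""
    stages = (
        (1, left_contact),                                  # left gripper touches its object
        (2, right_contact),                                 # right gripper touches its object
        (3, left_contact and ring_airborne),                # ring lifted off the table
        (4, right_contact and bar_airborne),                # bar lifted off the table
        (MAX_REWARD, ring_airborne and bar_airborne and inserted),  # in-air insertion
    )
    return max((value for value, achieved in stages if achieved), default=0)
```

Requiring both objects to be airborne for the final stage keeps an insertion that happens while one object still rests on the table from scoring as success.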

  • Step 4: Re-run the focused tests to verify the logic passes

Run: /home/droid/.conda/envs/roboimi/bin/python -m unittest tests.test_air_insert_env -v

Expected:

  • PASS for reward and success-detection regression tests

  • Step 5: Commit Task 3

Run: git add roboimi/envs/double_air_insert_env.py tests/test_air_insert_env.py && git commit -m "feat(env): add strict air insertion reward and success logic"


Task 4: Add the scripted policy and integration smoke coverage

Files:

  • Create: roboimi/demos/diana_air_insert_policy.py

  • Modify: roboimi/demos/diana_record_sim_episodes.py

  • Modify: tests/test_air_insert_env.py

  • Optionally Modify: roboimi/demos/vla_scripts/eval_vla.py (only if integration gaps remain after Task 1)

  • Step 1: Write failing tests for scripted-policy action shape and basic generation

Add tests covering:

  • the new policy produces a 16D action
  • trajectory generation accepts sampled named task state without error
  • the first action is a valid open-gripper safe pose command
  • a deterministic nominal smoke path (using a canonical sampled state or a fake env shim) runs through its final step with action shapes and reward values that match the environment's interface

Keep the tests unit-level; do not require a full MuJoCo rollout for every assertion.

  • Step 2: Write failing tests for the scripted rollout entrypoint and a real headless smoke path

Add coverage for both:

  • the standard scripted rollout entrypoint (roboimi/demos/diana_record_sim_episodes.py) can select the new task sampler/policy instead of remaining sim_transfer-only

  • a deterministic integration/smoke test that instantiates make_sim_env('sim_air_insert_ring_bar', headless=True), resets with sampled named task state, and steps a few actions or scripted-policy outputs using the real task XML and task-specific wiring

  • Step 3: Run the scripted-policy tests and verify they fail

Run: /home/droid/.conda/envs/roboimi/bin/python -m unittest tests.test_air_insert_env -v

Expected:

  • FAIL because the new scripted policy does not exist yet

  • Step 4: Implement the waypoint-based scripted policy

Implement a conservative open-loop policy with phases:

  • safe wait pose
  • above-target approach
  • descend + grasp
  • dual lift
  • airborne meeting alignment
  • bar push-through insertion

Use fixed orientations for version 1 and follow the existing repository style from diana_policy.py.
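
The phases above can be sketched as per-arm waypoint lists that are linearly interpolated at each timestep. The waypoint dictionary keys, the gripper convention (1.0 = open), and the 16D layout (per arm: 3 position + 4 quaternion + 1 gripper, left arm first) are assumptions for illustration and may differ from diana_policy.py.

```python
FIXED_QUAT = (1.0, 0.0, 0.0, 0.0)  # fixed orientation for version 1 (assumed value)


def interpolate(curr_wp, next_wp, t):
    """Linearly blend position and gripper command between two waypoints."""
    frac = (t - curr_wp["t"]) / (next_wp["t"] - curr_wp["t"])
    xyz = [c + frac * (n - c) for c, n in zip(curr_wp["xyz"], next_wp["xyz"])]
    gripper = curr_wp["gripper"] + frac * (next_wp["gripper"] - curr_wp["gripper"])
    return xyz, gripper


def arm_command(trajectory, t):
    """Return the 8D command [x, y, z, qw, qx, qy, qz, gripper] at timestep t."""
    if t <= trajectory[0]["t"]:
        first = trajectory[0]
        return list(first["xyz"]) + list(FIXED_QUAT) + [first["gripper"]]
    for curr_wp, next_wp in zip(trajectory, trajectory[1:]):
        if curr_wp["t"] <= t <= next_wp["t"]:
            xyz, gripper = interpolate(curr_wp, next_wp, t)
            return xyz + list(FIXED_QUAT) + [gripper]
    last = trajectory[-1]  # hold the final waypoint once the trajectory ends
    return list(last["xyz"]) + list(FIXED_QUAT) + [last["gripper"]]


def action_at(left_trajectory, right_trajectory, t):
    """Concatenate both arm commands into one 16D action."""
    return arm_command(left_trajectory, t) + arm_command(right_trajectory, t)
```

Each phase then becomes one or more waypoints per arm, and the unit tests can assert on `len(action)` and on the gripper entries of the first action without stepping MuJoCo.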

  • Step 5: Re-run the scripted-policy tests to verify they pass

Run: /home/droid/.conda/envs/roboimi/bin/python -m unittest tests.test_air_insert_env -v

Expected:

  • PASS for scripted-policy tests

  • Step 6: Run the combined verification suite for this feature

Run: /home/droid/.conda/envs/roboimi/bin/python -m unittest tests.test_air_insert_env tests.test_eval_vla_headless tests.test_eval_vla_rollout_artifacts tests.test_train_vla_rollout_validation tests.test_robot_asset_paths -v

Expected:

  • PASS with 0 failures

  • Step 6b: Run the mandatory real headless smoke check

Run a focused smoke command that instantiates the real task, resets with sampled state, and steps a few actions using the new scripted policy or a deterministic action sequence.

Example command (adjust module/test helper if needed): /home/droid/.conda/envs/roboimi/bin/python -m unittest tests.test_air_insert_env.AirInsertEnvSmokeTest -v

Expected:

  • PASS, proving the real XML/assets/env wiring instantiate and step correctly in headless mode

  • Step 7: Commit Task 4

Run: git add roboimi/demos/diana_air_insert_policy.py roboimi/demos/diana_record_sim_episodes.py tests/test_air_insert_env.py tests/test_eval_vla_headless.py tests/test_robot_asset_paths.py roboimi/demos/vla_scripts/eval_vla.py && git commit -m "feat(policy): add scripted air insertion policy"


Task 5: Final verification and implementation review

Files:

  • Review all files touched above

  • Step 1: Run fresh end-to-end verification before claiming completion

Run: /home/droid/.conda/envs/roboimi/bin/python -m unittest tests.test_air_insert_env tests.test_eval_vla_headless tests.test_eval_vla_rollout_artifacts tests.test_train_vla_rollout_validation tests.test_robot_asset_paths -v

Expected:

  • PASS with 0 failures

  • Step 2: Inspect git status and recent commits

Run: git status --short && git log --oneline --decorate -n 8

Expected:

  • only intended feature files modified / committed

  • Step 3: Request final code review for the completed feature

Use the requesting-code-review skill against the full diff from the feature branch starting point to current HEAD.

  • Step 4: Address any review findings and re-run verification if code changes

If fixes are made, repeat the unittest command from Step 1.

  • Step 5: Hand off using finishing-a-development-branch

After verification and review, use the finishing-a-development-branch skill to decide merge / PR / cleanup.