docs(spec): add sim air insert ring bar design

2026-04-23 16:33:58 +08:00
parent b76bcd8b37
commit 27f4a07632
1 changed files with 306 additions and 0 deletions
--- a/docs/superpowers/specs/2026-04-23-sim-air-insert-ring-bar-design.md
+++ b/docs/superpowers/specs/2026-04-23-sim-air-insert-ring-bar-design.md
@@ -0,0 +1,306 @@
+# sim_air_insert_ring_bar Design
+
+## Summary
+
+Add a new independent MuJoCo simulation task named `sim_air_insert_ring_bar` that keeps the existing dual-Diana tabletop setup but replaces the single transfer box with two randomized objects:
+
+- a square ring block grasped by the left arm
+- a square bar block grasped by the right arm
+
+The task is to pick both objects off the table and complete an in-air insertion where the bar truly passes through the ring aperture. The existing `sim_transfer` task must remain unchanged.
+
+## Goals
+
+- Reuse the current dual-Diana EE-control simulation stack
+- Keep the same table and robot-base arrangement as the existing `sim_transfer` task
+- Add an independent task entrypoint and scene definition
+- Randomize planar placement of both objects within left/right task-specific regions
+- Implement reward staging for contact, lift, and successful in-air insertion
+- Add a scripted policy that performs pick, lift, align, and in-air insertion
+- Preserve compatibility with existing environment creation, evaluation, and rollout patterns
+
+## Non-Goals
+
+- No random yaw in the first version
+- No visual servoing or closed-loop insertion controller
+- No general multi-task environment framework refactor
+- No guarantee that the VLA training stack is immediately tuned for this new task
+- No replacement or behavior change for `sim_transfer`
+
+## Task Name
+
+Use a new task name:
+
+- `sim_air_insert_ring_bar`
+
+This task should be exposed alongside `sim_transfer`, not as a replacement.
+
+## Scene Geometry
+
+### Shared Base Scene
+
+Keep the dual Diana robot, the table, and the existing camera layout conceptually unchanged.
+
+### Ring Block
+
+Represent the square ring as a rigid free body composed from simple MuJoCo box geoms rather than an external mesh.
+
+Dimensions:
+
+- outer side length: 68 mm
+- inner aperture side length: 32 mm
+- thickness (along the aperture axis): 18 mm
+- ring wall width: 18 mm
+
+The ring should behave as a single object body with a single free joint.
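
As a quick consistency check on the dimensions above, the sketch below tiles the ring with four boxes and prints the matching MuJoCo `<geom>` elements. The geom names, the choice of local z as the aperture axis, and the split into two long and two short walls are assumptions made for illustration, not part of this spec. MuJoCo box `size` values are half-extents.

```python
# Dimensions from the spec, converted to metres.
OUTER = 0.068      # outer side length
INNER = 0.032      # inner aperture side length
THICKNESS = 0.018  # extent along the aperture axis
WALL = (OUTER - INNER) / 2  # should equal the stated 18 mm wall width


def ring_box_geoms():
    """Return (name, pos, half_size) for four boxes that tile the ring.

    Assumed layout: the aperture axis is local z; two full-length walls
    run along x, and two shorter walls fill the gaps along y.
    """
    offset = (INNER + WALL) / 2  # distance from ring centre to wall centre
    hz = THICKNESS / 2
    return [
        ("ring_wall_pos_y", (0.0, +offset, 0.0), (OUTER / 2, WALL / 2, hz)),
        ("ring_wall_neg_y", (0.0, -offset, 0.0), (OUTER / 2, WALL / 2, hz)),
        ("ring_wall_pos_x", (+offset, 0.0, 0.0), (WALL / 2, INNER / 2, hz)),
        ("ring_wall_neg_x", (-offset, 0.0, 0.0), (WALL / 2, INNER / 2, hz)),
    ]


def to_xml(geoms):
    """Format the geoms as MuJoCo box elements (size = half-extents)."""
    lines = []
    for name, pos, size in geoms:
        p = " ".join(f"{v:.4f}" for v in pos)
        s = " ".join(f"{v:.4f}" for v in size)
        lines.append(f'<geom name="{name}" type="box" pos="{p}" size="{s}"/>')
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_xml(ring_box_geoms()))
```

The four boxes do not overlap, and their combined footprint equals the outer square minus the aperture, which confirms that the 68 mm, 32 mm, and 18 mm figures agree.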
+
+### Bar Block
+
+Represent the bar as a rigid free body with a single box geom.
+
+Dimensions:
+
+- length: 90 mm
+- cross-section: 18 mm x 18 mm
+
+The bar should also be a single free-joint body.
+
+## Initial Placement / Reset
+
+The first version uses position-only randomization with fixed orientation.
+
+- ring block: randomized only in a left-side planar sampling region
+- bar block: randomized only in a right-side planar sampling region
+- both objects start flat on the table
+- both objects use fixed orientation at reset
+- no random yaw, tilt, or flip in this version
+
+The sampling regions should be chosen conservatively so that:
+
+- the left arm can comfortably reach and grasp the ring
+- the right arm can comfortably reach and grasp the bar
+- scripted open-loop pick trajectories remain feasible
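
The reset behavior described above could look roughly like the following sketch. The region bounds, table height, and function names are placeholders invented for illustration; real values have to come from the scene and from reachability checks. Each pose is laid out as `xyz` followed by a `wxyz` quaternion, which is the layout of a MuJoCo free joint in `qpos`.

```python
import numpy as np

# Hypothetical planar regions (x_min, x_max, y_min, y_max) in metres and a
# hypothetical table-top height; none of these values come from the spec.
RING_REGION = (-0.20, -0.10, 0.45, 0.55)
BAR_REGION = (0.10, 0.20, 0.45, 0.55)
TABLE_Z = 0.47
FIXED_QUAT = np.array([1.0, 0.0, 0.0, 0.0])  # identity, wxyz order


def sample_pose(region, rest_height, rng):
    """Sample a 7D pose: random planar xy, fixed z and fixed orientation."""
    x_min, x_max, y_min, y_max = region
    xy = rng.uniform([x_min, y_min], [x_max, y_max])
    pos = np.array([xy[0], xy[1], TABLE_Z + rest_height])
    return np.concatenate([pos, FIXED_QUAT])


def sample_ring_bar_poses(rng):
    """Return (ring_pose, bar_pose); both objects rest flat on the table."""
    ring = sample_pose(RING_REGION, 0.009, rng)  # half of the 18 mm thickness
    bar = sample_pose(BAR_REGION, 0.009, rng)    # half of the 18 mm section
    return ring, bar


if __name__ == "__main__":
    ring_pose, bar_pose = sample_ring_bar_poses(np.random.default_rng(0))
    print(ring_pose, bar_pose)
```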
+
+## Control / Action Interface
+
+Reuse the current 16D EE-space action convention already used by the dual-Diana position-control environment:
+
+- left arm EE pose: 7D (`xyz + quat`)
+- right arm EE pose: 7D (`xyz + quat`)
+- left gripper command: 1D
+- right gripper command: 1D
+
+The new task should continue using EE targets transformed through the existing IK-based control path.
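
To make the 16D layout concrete, here is a small packing helper. The index order (left pose, right pose, left gripper, right gripper) follows the list above but is an assumption about the actual array layout, and `pack_action` is a hypothetical name; check both against the existing environment before relying on them.

```python
import numpy as np


def pack_action(left_xyz, left_quat, right_xyz, right_quat,
                left_grip, right_grip):
    """Pack per-arm EE targets into a single 16D action vector.

    Assumed index layout (not confirmed by the spec):
      [0:3]  left xyz    [3:7]   left quat
      [7:10] right xyz   [10:14] right quat
      [14]   left gripper        [15] right gripper
    """
    action = np.concatenate([
        np.asarray(left_xyz, dtype=float),
        np.asarray(left_quat, dtype=float),
        np.asarray(right_xyz, dtype=float),
        np.asarray(right_quat, dtype=float),
        [float(left_grip)],
        [float(right_grip)],
    ])
    if action.shape != (16,):
        raise ValueError(f"expected a 16D action, got shape {action.shape}")
    return action


if __name__ == "__main__":
    a = pack_action([0.1, 0.2, 0.3], [1, 0, 0, 0],
                    [0.4, 0.5, 0.6], [1, 0, 0, 0], 1.0, 0.0)
    print(a.shape)
```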
+
+## Environment Structure
+
+Implement this as a new task-specific environment path while reusing the existing dual-Diana simulation base where possible.
+
+Expected responsibilities:
+
+- scene instantiation for the ring+bar setup
+- task reset for randomized object placement
+- environment-state accessors for both objects
+- reward computation
+- in-air insertion success detection
+
+The environment factory must dispatch by task name and leave the `sim_transfer` branch unchanged.
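
One way to keep the `sim_transfer` branch untouched is a name-keyed registry, sketched below. The class names, the `make_sim_env` name, and the registry itself are placeholders for illustration; the existing factory may be structured differently (for example as an `if`/`elif` chain), in which case the new task would simply add one more branch.

```python
class SimTransferEnv:
    """Placeholder standing in for the existing transfer environment."""
    task_name = "sim_transfer"


class SimAirInsertRingBarEnv:
    """Placeholder standing in for the new ring+bar environment."""
    task_name = "sim_air_insert_ring_bar"


# Adding a task means adding one entry; existing entries are not edited.
_TASK_REGISTRY = {
    "sim_transfer": SimTransferEnv,
    "sim_air_insert_ring_bar": SimAirInsertRingBarEnv,
}


def make_sim_env(task_name):
    """Create the environment registered under task_name."""
    try:
        env_cls = _TASK_REGISTRY[task_name]
    except KeyError:
        raise ValueError(f"unknown task: {task_name!r}") from None
    return env_cls()
```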
+
+## Observation / Environment State
+
+The task should retain the current observation structure style used by the dual-Diana environment:
+
+- `qpos`
+- multi-camera images
+
+For task state access, the environment should expose at least the pose information needed to reason about both objects:
+
+- ring position
+- ring orientation if needed for insertion checks / debugging
+- bar position
+- bar orientation if needed for insertion checks / debugging
+
+This state should be sufficient for scripted-policy debugging and future rollout analysis.
+
+## Reward Design
+
+Use staged rewards in the same spirit as the current task: the reward is the highest stage achieved, not a sum of one-time sparse bonuses per event.
+
+Maximum reward:
+
+- `max_reward = 5`
+
+Reward stages:
+
+1. left gripper touches the ring block
+2. right gripper touches the bar block
+3. ring block is lifted off the table
+4. bar block is lifted off the table
+5. while both objects are off the table, the bar truly passes through the ring aperture
+
+Notes:
+
+- contact rewards are intended as grasp-progress stages
+- lift rewards require the object to be off the table, not merely touched
+- final success reward only applies when both objects are airborne
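
The stage list above can be read as a highest-stage-wins function, sketched below. The boolean inputs are assumed to be computed elsewhere (contacts, object heights, and the geometric insertion check), and the function name is hypothetical. Note one ambiguity this reading resolves by assumption: a later stage is reported even if an earlier-numbered stage on the other arm is not currently satisfied.

```python
MAX_REWARD = 5


def staged_reward(left_touches_ring, right_touches_bar,
                  ring_lifted, bar_lifted, inserted):
    """Return the highest stage currently satisfied (0 if none)."""
    # Stage 5 is only granted while both objects are off the table.
    if inserted and ring_lifted and bar_lifted:
        return MAX_REWARD
    if bar_lifted:
        return 4
    if ring_lifted:
        return 3
    if right_touches_bar:
        return 2
    if left_touches_ring:
        return 1
    return 0
```

If the existing task instead requires stages to be reached strictly in order, the checks would need to be nested rather than independent.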
+
+## Success Detection
+
+Success must **not** be based on a centerline-only check.
+
+A centerline-only test is insufficient because:
+
+- the bar has thickness, so a centerline can pass through while the body cannot
+- a square bar with imperfect orientation can have its centerline inside the aperture while its corners still collide with the ring
+
+### Required Success Semantics
+
+A successful insertion requires all of the following:
+
+1. the ring is off the table
+2. the bar is off the table
+3. the bar has actually passed through the ring along its thickness direction
+4. the bar’s finite square cross-section fits through the square aperture during that crossing
+
+### Recommended Detection Approach
+
+Use a task-level geometric check in Python rather than relying on contact alone.
+
+Implementation intent:
+
+- transform the bar geometry into the ring’s local frame
+- reason about the bar as a finite oriented box (not a line)
+- verify that the bar has passed through the ring along its thickness direction
+- verify that the portion of the bar passing the aperture fits within the inner square opening, accounting for the bar’s cross-section and orientation
+
+This geometric check is the primary success test.
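
A minimal sketch of such a finite-box check follows. It assumes the ring's aperture axis and the bar's long axis are both local z, quaternions are in `wxyz` order, and the off-table conditions are tested separately; all function names are hypothetical. The bar's eight corners are moved into the ring frame, the bar must extend past both ring faces, and every bar edge is clipped to the ring's thickness slab so that the clipped endpoints (which bound the bar's footprint inside the slab) can be tested against the aperture.

```python
import itertools
import numpy as np

APERTURE = 0.032        # inner opening side length (m)
RING_THICKNESS = 0.018  # ring extent along its local z (assumed axis)
BAR_HALF = np.array([0.009, 0.009, 0.045])  # bar half-extents, long axis = z

SIGNS = np.array(list(itertools.product([-1, 1], repeat=3)))
# Box edges connect corners that differ in exactly one coordinate sign.
EDGES = [(i, j) for i in range(8) for j in range(i + 1, 8)
         if np.sum(SIGNS[i] != SIGNS[j]) == 1]


def quat_to_mat(q):
    """Rotation matrix from a wxyz quaternion."""
    w, x, y, z = np.asarray(q, dtype=float) / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])


def bar_corners_in_ring_frame(ring_pos, ring_quat, bar_pos, bar_quat):
    """Bar corners: bar frame -> world -> ring frame (row vectors)."""
    world = np.asarray(bar_pos) + (SIGNS * BAR_HALF) @ quat_to_mat(bar_quat).T
    return (world - np.asarray(ring_pos)) @ quat_to_mat(ring_quat)


def clip_to_slab(a, b, h):
    """Clip segment a-b to |z| <= h; return endpoints or None if outside."""
    dz = b[2] - a[2]
    if abs(dz) < 1e-12:
        return (a, b) if abs(a[2]) <= h else None
    t0, t1 = sorted(((-h - a[2]) / dz, (h - a[2]) / dz))
    t0, t1 = max(t0, 0.0), min(t1, 1.0)
    if t0 > t1:
        return None
    return a + t0 * (b - a), a + t1 * (b - a)


def is_inserted(ring_pos, ring_quat, bar_pos, bar_quat, tol=1e-9):
    """True if the finite bar spans the ring and fits the aperture there."""
    c = bar_corners_in_ring_frame(ring_pos, ring_quat, bar_pos, bar_quat)
    h = RING_THICKNESS / 2
    if c[:, 2].min() > -h or c[:, 2].max() < h:
        return False  # the bar does not reach past both ring faces
    for i, j in EDGES:
        seg = clip_to_slab(c[i], c[j], h)
        if seg is None:
            continue
        for p in seg:
            if max(abs(p[0]), abs(p[1])) > APERTURE / 2 + tol:
                return False  # part of the bar would overlap a ring wall
    return True
```

With the bar centred and aligned this returns `True`; shifting the bar 10 mm sideways keeps its centerline inside the 32 mm opening but pushes a face to 19 mm from the axis, so the check returns `False`. A production version may want a small margin instead of the exact tolerance used here.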
+
+### Role of Contacts
+
+Contacts may still be used for:
+
+- grasp-stage rewards
+- debugging / diagnostics
+
+But contact should **not** be the sole criterion for insertion success, since:
+
+- a true clean insertion may have limited aperture-wall contact
+- persistent contact can also happen while the bar is jammed and not actually inserted
+
+## Scripted Policy
+
+Add a new task-specific scripted policy for `sim_air_insert_ring_bar`.
+
+### Policy Intent
+
+The first version prioritizes a conservative, reliable open-loop demonstration rather than an optimized trajectory.
+
+### Action Phases
+
+The scripted policy should follow these phases:
+
+1. move both arms to safe initial / waiting poses with grippers open
+2. move left arm above the ring and right arm above the bar
+3. descend and grasp the assigned objects
+4. lift both objects clear of the table
+5. move both objects to an airborne meeting region above the table
+6. hold the ring stably while aligning the bar with the aperture
+7. push the bar along the intended insertion direction until the geometric success condition is met
+
+### Grasp Assignment
+
+- left arm: ring only
+- right arm: bar only
+
+### Motion Style
+
+Keep the current repository style:
+
+- waypoint-based trajectory definition
+- open-loop interpolation between waypoints
+- fixed grasp orientation in the first version
+
+No adaptive replanning is required for the first version.
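
The waypoint style described above can be sketched for a single arm as follows. The waypoint dictionary keys and the `interpolate` name are assumptions for illustration and may not match the repository's existing scripted policy. Position and gripper command are interpolated linearly; because the grasp orientation is fixed in the first version, the quaternion is simply held from the preceding waypoint.

```python
import numpy as np


def interpolate(waypoints, t):
    """Return an 8D per-arm target (xyz + quat + gripper) at step t.

    waypoints: list of dicts with keys "t", "xyz", "quat", "gripper",
    sorted by "t". Outside the covered range the nearest waypoint is held.
    """
    if t <= waypoints[0]["t"]:
        prev = nxt = waypoints[0]
    elif t >= waypoints[-1]["t"]:
        prev = nxt = waypoints[-1]
    else:
        idx = next(i for i, wp in enumerate(waypoints) if wp["t"] > t)
        prev, nxt = waypoints[idx - 1], waypoints[idx]
    span = nxt["t"] - prev["t"]
    frac = 0.0 if span == 0 else (t - prev["t"]) / span
    xyz = (1 - frac) * np.asarray(prev["xyz"], dtype=float) \
        + frac * np.asarray(nxt["xyz"], dtype=float)
    grip = (1 - frac) * prev["gripper"] + frac * nxt["gripper"]
    # Fixed grasp orientation: hold the previous waypoint's quaternion.
    return np.concatenate([xyz, np.asarray(prev["quat"], dtype=float), [grip]])


if __name__ == "__main__":
    wps = [
        {"t": 0, "xyz": [0.0, 0.0, 0.3], "quat": [1, 0, 0, 0], "gripper": 1.0},
        {"t": 10, "xyz": [0.1, 0.0, 0.1], "quat": [1, 0, 0, 0], "gripper": 0.0},
    ]
    print(interpolate(wps, 5))
```

Concatenating the left-arm and right-arm outputs, with the gripper entries moved to the end, would yield the 16D action, subject to the same layout assumption as the action interface.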
+
+## Files / Integration Scope
+
+The implementation is expected to add task-specific files rather than broadly refactoring the codebase.
+
+Likely additions / changes:
+
+- a new MuJoCo scene XML for the ring+bar task
+- one or more XML fragments defining the two new objects
+- a new task-specific dual-Diana environment file
+- robot asset wiring for the new scene XML
+- reset sampling helpers for the new task
+- task registration in constants / environment factory paths
+- a new scripted policy file
+- focused tests for task creation, reset, rewards, success detection, and the scripted policy (action shape and smoke behavior)
+
+## Testing Requirements
+
+At minimum, add regression coverage for:
+
+### Environment Creation
+
+- the new task can be created via the task factory
+- the existing `sim_transfer` task remains unchanged
+
+### Reset / Sampling
+
+- ring reset positions are inside the left sampling region
+- bar reset positions are inside the right sampling region
+- reset orientation is fixed as intended
+
+### Environment State
+
+- environment-state access returns both object poses in the expected structure
+
+### Success Detection
+
+Must include both positive and negative cases.
+
+Positive case:
+
+- a configuration where the finite bar truly passes through the ring aperture is detected as success
+
+Negative cases:
+
+- the centerline is inside the aperture, but the finite body would clip the ring wall
+- insufficient depth: the bar has not actually passed through the ring along its thickness direction
+- one or both objects still on the table
+
+### Reward Logic
+
+- left contact stage
+- right contact stage
+- ring lift stage
+- bar lift stage
+- final success stage with `max_reward = 5`
+
+### Scripted Policy
+
+At minimum:
+
+- policy emits valid 16D actions
+- trajectory generation does not error
+- rollout smoke path can step through the new environment
+
+## Risks / Constraints
+
+- MuJoCo contact naming must remain stable enough for stage rewards
+- geometric insertion checks must be strict enough to avoid false positives but not so brittle that numerically valid insertions are missed
+- scripted open-loop insertion may require conservative alignment and lift heights to keep the first version reliable
+
+## Acceptance Criteria
+
+The feature is complete when all of the following are true:
+
+- `sim_air_insert_ring_bar` is creatable as an independent task
+- the scene contains the dual-Diana robot, table, ring block, and bar block
+- reset randomizes ring and bar positions in left/right planar regions with fixed orientation
+- the environment exposes task state for both objects
+- staged rewards progress to `max_reward = 5`
+- final success is based on finite-geometry insertion semantics, not a centerline-only shortcut
+- a new scripted policy can execute the intended pick-lift-align-insert behavior in the new environment
+- existing `sim_transfer` behavior is preserved