diffusion_policy/docs/superpowers/plans/2026-03-27-pusht-imf-fullattn-implementation.md
2026-03-27 22:02:31 +08:00

# PushT iMF Full-Attention Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use `superpowers:subagent-driven-development` (recommended) or `superpowers:executing-plans` to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

Goal: Add a separate full-attention PushT image iMF config, commit/push it on a new branch, and launch the 9-run 350-epoch architecture sweep across 3 GPUs.

Architecture: Keep the existing causal iMF path untouched and add a standalone full-attention config that only flips policy.causal_attn=false while retaining one-step iMF inference and SwanLab-safe naming. Reuse the previous 9-run architecture matrix and balanced 3-queue scheduling across local 5090 plus 5880 GPU0/GPU1.

Tech Stack: Hydra, Diffusion Policy iMF image workspace, SwanLab, uv env, local shell + trusted remote 5880 over SSH.
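Concretely, the standalone config described above amounts to a thin Hydra overlay on the existing causal config. The sketch below is illustrative only: the base config name and the `exp_name` value are assumptions, not verified file contents.

```yaml
# Hypothetical sketch of image_pusht_diffusion_policy_dit_imf_fullattn.yaml.
# The base config name below is an assumption about the existing causal config.
defaults:
  - image_pusht_diffusion_policy_dit_imf
  - _self_

# SwanLab-safe name: letters, digits, underscores/hyphens only.
exp_name: pusht_imf_fullattn

# The only behavioral change: disable causal masking in the DiT policy.
policy:
  causal_attn: false
```

Everything else (one-step iMF inference, logging, dataset paths) is inherited unchanged from the causal config, which is what keeps the causal path untouched.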


## Task 1: Add full-attention iMF config with TDD

Files:

- Create: `image_pusht_diffusion_policy_dit_imf_fullattn.yaml`
- Modify: `tests/test_pusht_swanlab_config.py`

Steps:

- [ ] Write a failing config regression test asserting that the new config uses SwanLab-safe naming and sets `policy.causal_attn == False`.
- [ ] Run the targeted pytest command and verify it fails because the config does not exist yet.
- [ ] Add the minimal full-attention config by composing from the existing PushT image iMF config and overriding only `exp_name` and `policy.causal_attn=false`.
- [ ] Re-run the targeted pytest and verify it passes.
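The regression test's assertions might look like the following sketch. The config is stubbed here as a plain dict so the logic is self-contained; in the real test it would come from composing the new YAML file, and the exact key names and naming rule are assumptions based on the plan.

```python
import re

# Stand-in for the composed Hydra config; in the real test this would be
# loaded from image_pusht_diffusion_policy_dit_imf_fullattn.yaml.
config = {
    "exp_name": "pusht_imf_fullattn",                     # hypothetical value
    "logging": {"name": "pusht_imf_fullattn_e128_l6_s42"},  # hypothetical value
    "policy": {"causal_attn": False},
}

# Assumed "SwanLab-safe" rule: letters, digits, underscores, hyphens only.
SWANLAB_SAFE = re.compile(r"^[A-Za-z0-9_-]+$")

def check_fullattn_config(cfg: dict) -> None:
    # Full attention: the causal flag must be explicitly disabled.
    assert cfg["policy"]["causal_attn"] is False
    # SwanLab-safe naming: no spaces or exotic characters.
    assert SWANLAB_SAFE.match(cfg["exp_name"])
    assert SWANLAB_SAFE.match(cfg["logging"]["name"])
    # The variant must be distinguishable from the causal runs.
    assert "fullattn" in cfg["exp_name"]

check_fullattn_config(config)
print("config checks passed")
```

Running this against the stub passes; in the TDD flow the same assertions first fail while the YAML file does not yet exist.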

## Task 2: Verify the new config

Files:

- Read: `image_pusht_diffusion_policy_dit_imf_fullattn.yaml`

Steps:

- [ ] Run `train.py --help` for the new config.
- [ ] Run a real `training.debug=true` smoke test locally to confirm the training path is valid.

## Task 3: Commit and push the new branch

Files:

- Commit only the new config/test/plan files needed for the full-attention experiment chain.

Steps:

- [ ] Run the verification commands again before committing.
- [ ] Commit with a focused message.
- [ ] Push `feat/pusht-imf-fullattn` to origin.

## Task 4: Launch the 9-run sweep

Files:

- Write queue scripts and logs under `data/run_logs/` locally and on the 5880.
- Write outputs under `data/outputs/` locally and on the 5880.

Steps:

- [ ] Use the same matrix as the prior iMF sweep: `n_emb` ∈ {128, 256, 384}, `n_layer` ∈ {6, 12, 18}, `seed=42`.
- [ ] Set `training.num_epochs=350` for all 9 runs.
- [ ] Encode `fullattn` in every `exp_name`, `logging.name`, and run directory to avoid collisions.
- [ ] Balance the 9 runs across the local 5090, 5880 GPU0, and 5880 GPU1 as three serial queues.
- [ ] Sync the new config to the remote smoke repo before launching the remote queues.
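The 3×3 matrix and the three balanced serial queues can be generated mechanically. The run-name scheme below is a hypothetical illustration of the "encode fullattn everywhere" rule, not the repo's actual naming code.

```python
from itertools import product

SEED = 42
QUEUES = ["local_5090", "5880_gpu0", "5880_gpu1"]  # three serial queues

# Full 3x3 matrix from the prior iMF sweep: 9 runs total.
runs = [
    f"pusht_imf_fullattn_e{n_emb}_l{n_layer}_s{SEED}"
    for n_emb, n_layer in product([128, 256, 384], [6, 12, 18])
]
assert len(runs) == 9

# Round-robin the 9 runs into three balanced queues of 3 each.
queues = {name: runs[i::len(QUEUES)] for i, name in enumerate(QUEUES)}

for name, queue in queues.items():
    print(name, queue)
```

Each queue then runs its three jobs serially on its GPU, so no host ever has more than one training process per device.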

## Task 5: Monitor and auto-summarize

Files:

- Read local and remote pid files, logs, outputs, and checkpoints.

Steps:

- [ ] Start an xhigh monitoring agent that polls all three queues.
- [ ] On completion, parse all 9 `logs.json.txt` files and rank runs by max `test_mean_score`.
- [ ] Report embedding/layer trends and the best configuration.
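The ranking step can be sketched as follows. It assumes the JSON-lines layout of Diffusion Policy's `logs.json.txt` (one JSON object per line, with `test_mean_score` present only on evaluation epochs); the sample log contents are fabricated for illustration.

```python
import json

def best_score(log_text: str) -> float:
    """Max test_mean_score across all JSON lines of one logs.json.txt."""
    scores = []
    for line in log_text.splitlines():
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or corrupt lines
        if "test_mean_score" in record:
            scores.append(record["test_mean_score"])
    return max(scores) if scores else float("-inf")

# Hypothetical example: two runs' log contents keyed by run name.
logs = {
    "pusht_imf_fullattn_e256_l12_s42": '{"epoch": 0, "test_mean_score": 0.71}\n'
                                       '{"epoch": 50, "test_mean_score": 0.88}',
    "pusht_imf_fullattn_e128_l6_s42":  '{"epoch": 0, "test_mean_score": 0.64}',
}

# Rank runs by their best evaluation score, highest first.
ranking = sorted(logs, key=lambda run: best_score(logs[run]), reverse=True)
print(ranking[0], best_score(logs[ranking[0]]))
```

With all 9 logs loaded the same way, grouping the ranked names by their `e{n_emb}` and `l{n_layer}` fields gives the embedding/layer trend report directly.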