diffusion_policy/docs/superpowers/plans/2026-03-27-pusht-imf-fullattn-implementation.md
2026-03-27 22:02:31 +08:00

# PushT iMF Full-Attention Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use `superpowers:subagent-driven-development` (recommended) or `superpowers:executing-plans` to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

Goal: Add a separate full-attention PushT image iMF config, commit/push it on a new branch, and launch the 9-run 350-epoch architecture sweep across 3 GPUs.

Architecture: Keep the existing causal iMF path untouched and add a standalone full-attention config that only flips policy.causal_attn=false while retaining one-step iMF inference and SwanLab-safe naming. Reuse the previous 9-run architecture matrix and balanced 3-queue scheduling across local 5090 plus 5880 GPU0/GPU1.

Tech Stack: Hydra, Diffusion Policy iMF image workspace, SwanLab, uv env, local shell + trusted remote 5880 over SSH.
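Concretely, the standalone config described above amounts to a thin Hydra overlay on the existing causal config. The sketch below is illustrative only: the base config name and the `exp_name` value are assumptions, not verified file contents.

```yaml
# Hypothetical sketch of image_pusht_diffusion_policy_dit_imf_fullattn.yaml.
# The base config name below is an assumption about the existing causal config.
defaults:
  - image_pusht_diffusion_policy_dit_imf
  - _self_

# SwanLab-safe name: letters, digits, underscores/hyphens only.
exp_name: pusht_imf_fullattn

# The only behavioral change: disable causal masking in the DiT policy.
policy:
  causal_attn: false
```

Everything else (one-step iMF inference, logging, dataset paths) is inherited unchanged from the causal config, which is what keeps the causal path untouched.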


## Task 1: Add full-attention iMF config with TDD

Files:

- Create: `image_pusht_diffusion_policy_dit_imf_fullattn.yaml`
- Modify: `tests/test_pusht_swanlab_config.py`

Steps:

- [ ] Write a failing config regression test asserting that the new config uses SwanLab-safe naming and sets `policy.causal_attn == False`.
- [ ] Run the targeted pytest command and verify it fails because the config does not exist yet.
- [ ] Add the minimal full-attention config by composing from the existing PushT image iMF config and overriding only `exp_name` and `policy.causal_attn=false`.
- [ ] Re-run the targeted pytest and verify it passes.
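The regression test's assertions might look like the following sketch. The config is stubbed here as a plain dict so the logic is self-contained; in the real test it would come from composing the new YAML file, and the exact key names and naming rule are assumptions based on the plan.

```python
import re

# Stand-in for the composed Hydra config; in the real test this would be
# loaded from image_pusht_diffusion_policy_dit_imf_fullattn.yaml.
config = {
    "exp_name": "pusht_imf_fullattn",                     # hypothetical value
    "logging": {"name": "pusht_imf_fullattn_e128_l6_s42"},  # hypothetical value
    "policy": {"causal_attn": False},
}

# Assumed "SwanLab-safe" rule: letters, digits, underscores, hyphens only.
SWANLAB_SAFE = re.compile(r"^[A-Za-z0-9_-]+$")

def check_fullattn_config(cfg: dict) -> None:
    # Full attention: the causal flag must be explicitly disabled.
    assert cfg["policy"]["causal_attn"] is False
    # SwanLab-safe naming: no spaces or exotic characters.
    assert SWANLAB_SAFE.match(cfg["exp_name"])
    assert SWANLAB_SAFE.match(cfg["logging"]["name"])
    # The variant must be distinguishable from the causal runs.
    assert "fullattn" in cfg["exp_name"]

check_fullattn_config(config)
print("config checks passed")
```

Running this against the stub passes; in the TDD flow the same assertions first fail while the YAML file does not yet exist.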

## Task 2: Verify the new config

Files:

- Read: `image_pusht_diffusion_policy_dit_imf_fullattn.yaml`

Steps:

- [ ] Run `train.py --help` for the new config.
- [ ] Run a real `training.debug=true` smoke test locally to confirm the training path is valid.

## Task 3: Commit and push the new branch

Files:

- Commit only the new config/test/plan files needed for the full-attention experiment chain.

Steps:

- [ ] Run the verification commands again before committing.
- [ ] Commit with a focused message.
- [ ] Push `feat/pusht-imf-fullattn` to origin.

## Task 4: Launch the 9-run sweep

Files:

- Write queue scripts and logs under `data/run_logs/` locally and on the 5880.
- Write outputs under `data/outputs/` locally and on the 5880.

Steps:

- [ ] Use the same matrix as the prior iMF sweep: `n_emb` ∈ {128, 256, 384}, `n_layer` ∈ {6, 12, 18}, `seed=42`.
- [ ] Set `training.num_epochs=350` for all 9 runs.
- [ ] Encode `fullattn` in every `exp_name`, `logging.name`, and run directory to avoid collisions.
- [ ] Balance the 9 runs across the local 5090, 5880 GPU0, and 5880 GPU1 as three serial queues.
- [ ] Sync the new config to the remote smoke repo before launching the remote queues.
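The 3×3 matrix and the three balanced serial queues can be generated mechanically. The run-name scheme below is a hypothetical illustration of the "encode fullattn everywhere" rule, not the repo's actual naming code.

```python
from itertools import product

SEED = 42
QUEUES = ["local_5090", "5880_gpu0", "5880_gpu1"]  # three serial queues

# Full 3x3 matrix from the prior iMF sweep: 9 runs total.
runs = [
    f"pusht_imf_fullattn_e{n_emb}_l{n_layer}_s{SEED}"
    for n_emb, n_layer in product([128, 256, 384], [6, 12, 18])
]
assert len(runs) == 9

# Round-robin the 9 runs into three balanced queues of 3 each.
queues = {name: runs[i::len(QUEUES)] for i, name in enumerate(QUEUES)}

for name, queue in queues.items():
    print(name, queue)
```

Each queue then runs its three jobs serially on its GPU, so no host ever has more than one training process per device.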

## Task 5: Monitor and auto-summarize

Files:

- Read local and remote pid files, logs, outputs, and checkpoints.

Steps:

- [ ] Start an xhigh monitoring agent that polls all three queues.
- [ ] On completion, parse all 9 `logs.json.txt` files and rank runs by max `test_mean_score`.
- [ ] Report embedding/layer trends and the best configuration.
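The ranking step can be sketched as follows. It assumes the JSON-lines layout of Diffusion Policy's `logs.json.txt` (one JSON object per line, with `test_mean_score` present only on evaluation epochs); the sample log contents are fabricated for illustration.

```python
import json

def best_score(log_text: str) -> float:
    """Max test_mean_score across all JSON lines of one logs.json.txt."""
    scores = []
    for line in log_text.splitlines():
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or corrupt lines
        if "test_mean_score" in record:
            scores.append(record["test_mean_score"])
    return max(scores) if scores else float("-inf")

# Hypothetical example: two runs' log contents keyed by run name.
logs = {
    "pusht_imf_fullattn_e256_l12_s42": '{"epoch": 0, "test_mean_score": 0.71}\n'
                                       '{"epoch": 50, "test_mean_score": 0.88}',
    "pusht_imf_fullattn_e128_l6_s42":  '{"epoch": 0, "test_mean_score": 0.64}',
}

# Rank runs by their best evaluation score, highest first.
ranking = sorted(logs, key=lambda run: best_score(logs[run]), reverse=True)
print(ranking[0], best_score(logs[ranking[0]]))
```

With all 9 logs loaded the same way, grouping the ranked names by their `e{n_emb}` and `l{n_layer}` fields gives the embedding/layer trend report directly.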