docs: refine pusht imf spec scope

This commit is contained in:
Logic
2026-03-26 17:02:17 +08:00
parent 15a0c41cbf
commit 23374a4cd2


@@ -26,6 +26,10 @@ The work is split into two verified phases:
- Replace diffusion training with the iMeanFlow training objective.
- Use one-step inference for validation/rollout in the iMF path.
The implementation planning boundary for this spec is:
- code changes through a smoke-tested, pushed iMF branch
- not the full 3x3 sweep execution/monitoring workflow, which should be planned separately after the code path is verified and pushed
## Logging Design
### Scope
Only the PushT image DiT experiment chain is changed:
@@ -74,6 +78,7 @@ The iMF transformer mirrors the current transformer policy structure closely enough
- `u`: average velocity field
The same function is reused at two evaluation points:
- canonical signature: `fn(z, r, t, cond)`
- `fn(z_t, r, t, cond)` predicts average velocity `u`
- `fn(z_t, t, t, cond)` predicts the instantaneous velocity surrogate `v`
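The shared-signature convention above can be sketched with a toy stand-in. `fn` here is an illustrative placeholder, not the repository's transformer; the real network would condition on `(r, t, cond)` through learned embeddings rather than this linear combination:

```python
import numpy as np

# Toy stand-in for the iMF transformer; the real network conditions on
# (r, t, cond) via learned embeddings, not this linear form.
def fn(z, r, t, cond):
    return (t - r) * z + cond

z_t = np.ones(3)         # noised latent
cond = np.full(3, 0.5)   # conditioning features

u = fn(z_t, 0.0, 1.0, cond)  # fn(z_t, r, t, cond): average velocity over [r, t]
v = fn(z_t, 1.0, 1.0, cond)  # fn(z_t, t, t, cond): instantaneous surrogate (r == t)
```

The point is that no second head is needed: collapsing the interval to `r == t` turns the same function into the instantaneous-velocity surrogate.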
@@ -103,7 +108,7 @@ There is **no auxiliary `v` loss** in the initial implementation. The implementa
Inference uses a single step starting from noise:
- initialize `z_1 ~ N(0, I)`
- set `t = 1.0`, `r = 0.0`
- predict `u(z_1, t, r, cond)`
- predict `u = fn(z_1, r, t, cond)`
- produce the action sample with one update:
- `x_hat = z_1 - (t - r) * u`
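The one-step sampling procedure above can be sketched as follows. The helper name and the toy velocity field are illustrative only; the identity field `u = z` is chosen because it makes the update trivially checkable (`x_hat = z_1 - z_1 = 0`):

```python
import numpy as np

def one_step_sample(fn, cond, shape, rng):
    z_1 = rng.standard_normal(shape)  # initialize z_1 ~ N(0, I)
    t, r = 1.0, 0.0                   # full interval [0, 1]
    u = fn(z_1, r, t, cond)           # predicted average velocity
    return z_1 - (t - r) * u          # single update to the action sample

# Toy velocity field u = z: the update maps every sample exactly to zero.
rng = np.random.default_rng(0)
x_hat = one_step_sample(lambda z, r, t, cond: z, None, (2, 4), rng)
```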
@@ -139,17 +144,18 @@ This matches the time direction in the reference iMeanFlow sampling logic.
3. continue with the iMF implementation
4. once iMF smoke tests pass, create/preserve a dedicated feature branch for the experiment code and push it to Gitea
## Experiment Plan
After the iMF path is smoke-tested and pushed:
## Post-Implementation Experiment Plan
After the iMF path is smoke-tested and pushed, a separate experiment-execution plan should cover:
- run a 3x3 grid over:
- `n_emb ∈ {128, 256, 384}`
- `n_layer ∈ {6, 12, 18}`
- keep the rest of the setup fixed
- use a fixed single-seed setting for comparability unless a later explicit experiment plan expands that scope
- run each experiment for 300 epochs
- primary comparison metric: `test_mean_score`
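The sweep definition above is small enough to enumerate directly. A minimal sketch, assuming the fixed seed value and config-dict shape are illustrative (only `n_emb`, `n_layer`, and the 300-epoch budget come from the spec):

```python
from itertools import product

# Only n_emb and n_layer vary; everything else in the setup stays fixed.
N_EMB = (128, 256, 384)
N_LAYER = (6, 12, 18)

grid = [
    {"n_emb": e, "n_layer": l, "seed": 0, "epochs": 300}  # seed value illustrative
    for e, l in product(N_EMB, N_LAYER)
]
```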
## Resource Allocation
Three concurrent runs should be scheduled continuously until the matrix is complete:
## Post-Implementation Resource Allocation
The separate experiment-execution plan should schedule three concurrent runs until the matrix is complete:
- local machine: 1 GPU
- `5880`: 2 GPUs
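A round-robin assignment of the nine sweep runs to the three concurrent slots can be sketched as below. The slot names are illustrative; the spec only fixes the counts (1 local GPU plus 2 GPUs on `5880` gives 3 concurrent runs):

```python
# Slot names are hypothetical; only the counts come from the spec.
slots = ["local:gpu0", "5880:gpu0", "5880:gpu1"]
runs = [f"n_emb={e}-n_layer={l}" for e in (128, 256, 384) for l in (6, 12, 18)]

# Round-robin: each slot ends up with an equal share of the 3x3 matrix.
schedule = {s: [] for s in slots}
for i, run in enumerate(runs):
    schedule[slots[i % len(slots)]].append(run)
```

With three slots and nine runs, each slot carries exactly three sequential runs until the matrix is complete.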