docs: refine pusht imf spec scope
@@ -26,6 +26,10 @@ The work is split into two verified phases:
 - Replace diffusion training with the iMeanFlow training objective.
 - Use one-step inference for validation/rollout in the iMF path.
 
+The implementation planning boundary for this spec is:
+- code changes through a smoke-tested, pushed iMF branch
+- not the full 3x3 sweep execution/monitoring workflow, which should be planned separately after the code path is verified and pushed
+
 ## Logging Design
 ### Scope
 Only the PushT image DiT experiment chain is changed:
@@ -74,6 +78,7 @@ The iMF transformer mirrors the current transformer policy structure closely eno
 - `u`: average velocity field
 
 The same function is reused at two evaluation points:
+- canonical signature: `fn(z, r, t, cond)`
 - `fn(z_t, r, t, cond)` predicts average velocity `u`
 - `fn(z_t, t, t, cond)` predicts the instantaneous velocity surrogate `v`
 
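As a reading aid for the hunk above, the two evaluation points can be sketched as thin wrappers around one callable. This is only an illustration: `toy_fn`, `predict_u`, and `predict_v` are hypothetical names that do not appear in the spec, and the closed form inside `toy_fn` is arbitrary, chosen so the example runs without a trained network.

```python
from typing import Callable, List, Sequence

# Canonical argument order from the spec: fn(z, r, t, cond).
VelocityFn = Callable[[Sequence[float], float, float, Sequence[float]], List[float]]


def toy_fn(z: Sequence[float], r: float, t: float, cond: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for the iMF network (not the real model)."""
    # Arbitrary formula; it only needs to depend on (z, r, t, cond).
    return [zi * (1.0 + t - r) + ci for zi, ci in zip(z, cond)]


def predict_u(fn: VelocityFn, z_t: Sequence[float], r: float, t: float,
              cond: Sequence[float]) -> List[float]:
    """Average velocity u: evaluate fn over the interval [r, t]."""
    return fn(z_t, r, t, cond)


def predict_v(fn: VelocityFn, z_t: Sequence[float], t: float,
              cond: Sequence[float]) -> List[float]:
    """Instantaneous velocity surrogate v: same fn with r set equal to t."""
    return fn(z_t, t, t, cond)
```

Keeping both paths behind the same `fn` makes an argument-order slip (passing `t` before `r`) a one-place fix, which seems to be the motivation for stating the canonical signature explicitly.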
@@ -103,7 +108,7 @@ There is **no auxiliary `v` loss** in the initial implementation. The implementa
 Inference uses a single step starting from noise:
 - initialize `z_1 ~ N(0, I)`
 - set `t = 1.0`, `r = 0.0`
-- predict `u(z_1, t, r, cond)`
+- predict `u = fn(z_1, r, t, cond)`
 - produce the action sample with one update:
   - `x_hat = z_1 - (t - r) * u`
 
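The one-step inference described in this hunk could look roughly like the following. It is a minimal pure-Python sketch under stated assumptions: `one_step_sample` and `toy_fn` are hypothetical names, the latent is a flat list of floats rather than an action tensor, and `toy_fn` simply treats `cond` as the target so that the update has a known answer.

```python
import random
from typing import Callable, List, Sequence


def toy_fn(z: Sequence[float], r: float, t: float, cond: Sequence[float]) -> List[float]:
    """Hypothetical stand-in network: returns u = z - cond (cond acts as the target)."""
    return [zi - ci for zi, ci in zip(z, cond)]


def one_step_sample(fn: Callable, cond: Sequence[float], dim: int,
                    rng: random.Random) -> List[float]:
    """Single-step sampling following the steps listed in the spec."""
    # initialize z_1 ~ N(0, I)
    z_1 = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    # set t = 1.0, r = 0.0
    t, r = 1.0, 0.0
    # predict u with the canonical argument order (z, r, t, cond)
    u = fn(z_1, r, t, cond)
    # one update: x_hat = z_1 - (t - r) * u
    return [zi - (t - r) * ui for zi, ui in zip(z_1, u)]


sample = one_step_sample(toy_fn, cond=[0.25, -0.5], dim=2, rng=random.Random(0))
```

With this toy function the update recovers `cond` up to floating-point rounding, which makes it a convenient smoke check that `r` and `t` are passed in the right order.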
@@ -139,17 +144,18 @@ This matches the time direction in the reference iMeanFlow sampling logic.
 3. continue with the iMF implementation
 4. once iMF smoke tests pass, create/preserve a dedicated feature branch for the experiment code and push it to Gitea
 
-## Experiment Plan
-After the iMF path is smoke-tested and pushed:
+## Post-Implementation Experiment Plan
+After the iMF path is smoke-tested and pushed, a separate experiment-execution plan should launch:
 - run a 3x3 grid over:
   - `n_emb ∈ {128, 256, 384}`
   - `n_layer ∈ {6, 12, 18}`
 - keep the rest of the setup fixed
+- use a fixed single-seed setting for comparability unless a later explicit experiment plan expands that scope
 - run each experiment for 300 epochs
 - primary comparison metric: `test_mean_score`
 
-## Resource Allocation
-Three concurrent runs should be scheduled continuously until the matrix is complete:
+## Post-Implementation Resource Allocation
+The separate experiment-execution plan should schedule three concurrent runs until the matrix is complete:
 - local machine: 1 GPU
 - `5880`: 2 GPUs
 
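For orientation, the 3x3 matrix and the three GPU slots from this hunk can be enumerated as below. This is a hedged sketch, not the experiment launcher: the seed value, the slot labels, and the static round-robin assignment are all assumptions made for illustration. The spec only says that three runs should be active at a time until the matrix is complete, which a real scheduler would likely handle dynamically.

```python
from itertools import product
from typing import Dict, List, Sequence

N_EMB = (128, 256, 384)
N_LAYER = (6, 12, 18)
EPOCHS = 300
SEED = 0  # assumption: the spec fixes a single seed but does not name its value

# Hypothetical slot labels: 1 GPU on the local machine, 2 GPUs on `5880`.
GPU_SLOTS = ("local:gpu0", "5880:gpu0", "5880:gpu1")


def build_grid() -> List[Dict[str, int]]:
    """Enumerate all n_emb x n_layer combinations with the fixed settings."""
    return [
        {"n_emb": n_emb, "n_layer": n_layer, "epochs": EPOCHS, "seed": SEED}
        for n_emb, n_layer in product(N_EMB, N_LAYER)
    ]


def assign_round_robin(runs: Sequence[dict], slots: Sequence[str]) -> Dict[str, list]:
    """Static round-robin split of runs across slots (a simplification)."""
    return {slot: list(runs[i::len(slots)]) for i, slot in enumerate(slots)}


grid = build_grid()
plan = assign_round_robin(grid, GPU_SLOTS)
```

Each slot ends up with three runs, so the nine configurations finish in three rounds if run times are similar. Larger `n_emb` and `n_layer` settings will likely take longer, which is one reason a queue-based scheduler may be preferable in practice.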