Removes the `dt_head` network and its associated configuration parameters (`dt_min`, `dt_max`, `lambda_nfe`, `warmup_epochs`). Replaces predicted time steps with a fixed value derived from sequence length. Eliminates the warmup phase and the NFE loss calculation.
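
The description does not say how the fixed step is derived from sequence length, so the following is only a minimal sketch of one plausible reading: a unit integration interval divided evenly across the sequence. The names `fixed_dt`, `total_time`, and `integrate` are hypothetical and do not come from this change.

```python
from typing import Callable


def fixed_dt(seq_len: int, total_time: float = 1.0) -> float:
    """Return a constant step size for a sequence of length seq_len.

    Assumption (not stated in the change description): the step is
    total_time / seq_len, so seq_len steps cover the whole interval.
    """
    if seq_len <= 0:
        raise ValueError("seq_len must be positive")
    return total_time / seq_len


def integrate(x0: float, seq_len: int,
              step_fn: Callable[[float, float], float]) -> float:
    """Apply step_fn seq_len times with the same fixed dt.

    No learned head predicts dt, so there are no dt_min/dt_max bounds
    to clamp against, no warmup phase, and no NFE penalty term.
    """
    dt = fixed_dt(seq_len)
    x = x0
    for _ in range(seq_len):
        x = step_fn(x, dt)
    return x
```

With a fixed step the number of function evaluations is determined by `seq_len` alone, which is consistent with dropping the NFE loss: there is no longer a learned quantity for that term to regularize.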