PLAN
Goal
Train a 50k-step IMF baseline with the original ResNet vision backbone (no full-AttnRes vision replacement), using only top and front cameras as image conditioning.
Fixed comparison contract
- Agent: resnet_imf_attnres
- Vision backbone mode: resnet
- pred_horizon=16
- num_action_steps=8
- n_emb=384, n_layer=12, n_head=1, n_kv_head=1
- inference_steps=1
- batch_size=80, lr=2.5e-4, cosine scheduler, warmup 2000
- dataset: /home/droid/project/diana_sim/sim_transfer
- cameras: [top, front] only
- training budget: max_steps=50000
- rollout validation: every 5 epochs, 5 episodes, headless
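
One way to keep the contract from drifting between runs is to hold it in a single frozen object and diff each run config against it before launch. The sketch below is only an illustration: the values come from the list above, but the class, the function, and several field names (vision_backbone_mode, scheduler, warmup_steps, dataset, cameras, and the three rollout_* fields) are hypothetical and may not match the project's real config keys.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ComparisonContract:
    # Values copied from the plan; several field names are assumed, not project keys.
    agent: str = "resnet_imf_attnres"
    vision_backbone_mode: str = "resnet"
    pred_horizon: int = 16
    num_action_steps: int = 8
    n_emb: int = 384
    n_layer: int = 12
    n_head: int = 1
    n_kv_head: int = 1
    inference_steps: int = 1
    batch_size: int = 80
    lr: float = 2.5e-4
    scheduler: str = "cosine"
    warmup_steps: int = 2000
    dataset: str = "/home/droid/project/diana_sim/sim_transfer"
    cameras: tuple = ("top", "front")
    max_steps: int = 50000
    rollout_every_epochs: int = 5
    rollout_episodes: int = 5
    rollout_headless: bool = True


def contract_violations(run_cfg, allowed_overrides=()):
    """Return {key: (expected, actual)} for every contract key the run config breaks."""
    violations = {}
    for key, expected in asdict(ComparisonContract()).items():
        if key in allowed_overrides:
            continue  # e.g. batch_size and lr when a documented fallback is active
        actual = run_cfg.get(key)
        if isinstance(actual, list):
            actual = tuple(actual)  # YAML/JSON configs load camera lists as lists
        if actual != expected:
            violations[key] = (expected, actual)
    return violations
```

Calling contract_violations(cfg) before the smoke test and refusing to launch on a non-empty result would catch, for example, a third camera slipping into the config.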
Resource plan
- Host: local
- GPU: RTX 5090 (GPU 0)
Execution path
- Run a short 2-step smoke test on GPU with the exact 2-camera config.
- If smoke passes, launch the 50k main run with durable log redirection.
- Record run name, pid, log path, and SwanLab URL into suite status.
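
The launch-and-record steps above could be sketched roughly as follows, using only the Python standard library. The helper name launch_with_log, the JSON layout of the status file, and all paths are assumptions for illustration; the actual training command and the suite status format are not given in this plan, and the SwanLab URL is passed in by the caller once it is known.

```python
import json
import subprocess
import sys
import time
from pathlib import Path


def launch_with_log(cmd, run_name, log_dir, status_path, swanlab_url=None):
    """Start cmd detached from the terminal, append its output to a log file,
    and record run name, pid, log path, and SwanLab URL in a JSON status file."""
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    log_path = log_dir / f"{run_name}.log"

    # Append mode keeps earlier attempts; stderr is merged into the same file.
    with open(log_path, "ab") as log_file:
        proc = subprocess.Popen(
            cmd,
            stdout=log_file,
            stderr=subprocess.STDOUT,
            stdin=subprocess.DEVNULL,
            start_new_session=True,  # POSIX: survive the launching shell closing
        )

    entry = {
        "run_name": run_name,
        "pid": proc.pid,
        "log_path": str(log_path),
        "swanlab_url": swanlab_url,  # may be filled in later
        "started_at": time.strftime("%Y-%m-%d %H:%M:%S"),
    }

    # Merge into the existing status file instead of overwriting other runs.
    status_path = Path(status_path)
    status = json.loads(status_path.read_text()) if status_path.exists() else {}
    status[run_name] = entry
    status_path.write_text(json.dumps(status, indent=2))
    return proc, entry
```

The same helper can serve both phases: call it once with the 2-step smoke command, wait on the returned process and check its return code, and only then call it again with the 50k command.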
Fallbacks
- If batch 80 OOMs, fall back to batch 64 with scaled lr 2.0e-4.
- If dataloader startup is unstable, reduce num_workers from 12 to 8.
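
The fallback learning rate is consistent with linear scaling by batch size: 2.5e-4 × 64 / 80 = 2.0e-4. A minimal sketch of both fallbacks, assuming the linear rule is the intended one and using hypothetical function and key names:

```python
def scaled_lr(base_lr, base_batch, new_batch):
    """Linear scaling rule: lr changes in proportion to batch size."""
    return base_lr * new_batch / base_batch


def fallback_config(oom=False, dataloader_unstable=False):
    """Apply the plan's fallbacks on top of the default settings."""
    cfg = {"batch_size": 80, "lr": 2.5e-4, "num_workers": 12}
    if oom:
        # Batch 80 ran out of memory: drop to 64 and rescale lr to about 2.0e-4.
        cfg["lr"] = scaled_lr(cfg["lr"], cfg["batch_size"], 64)
        cfg["batch_size"] = 64
    if dataloader_unstable:
        # Fewer workers to stabilise dataloader startup.
        cfg["num_workers"] = 8
    return cfg
```

The two fallbacks are independent, so either can be applied alone. If the batch fallback is used, the run no longer matches the fixed comparison contract on batch_size and lr, which is worth noting alongside the run name in suite status.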