feat: add vision transfer backbones and IMF variants
69
experiment_suites/2026-04-05-camera-ablation-summary.md
Normal file
@@ -0,0 +1,69 @@
# Camera Ablation Summary (`pred_horizon=16`, `num_action_steps=8`, ResNet IMF)

- Generated: 2026-04-05
- Common setup: original ResNet vision backbone, `n_emb=384`, `n_layer=12`, `batch_size=80`, `lr=2.5e-4`, `max_steps=50k`, rollout every 5 epochs with 5 episodes, headless eval.
- Metric for comparison: `checkpoints/vla_model_best.pt -> rollout_avg_reward`.

## Leaderboard

| Rank | Cameras | Best avg_reward | Best step | Final loss | Run name |
|---:|---|---:|---:|---:|---|
| 1 | `top + front` | **274.8** | 48124 | 0.0056 | `imf-resnet-topfront-2cam-ph16-ex08-emb384-l12-ms50k-5090-20260405-085023` |
| 2 | `top` | **271.2** | 43749 | 0.0052 | `imf-resnet-top-1cam-ph16-ex08-emb384-l12-ms50k-l20g4-20260405-125844` |
| 3 | `r_vis + front` | **244.0** | 21874 | 0.0043 | `imf-resnet-frontrvis-2cam-ph16-ex08-emb384-l12-ms50k-l20g1-20260405-102029` |
| 4 | `r_vis` | **6.4** | 17499 | 0.0047 | `imf-resnet-rvis-1cam-ph16-ex08-emb384-l12-ms50k-l20g3-20260405-125844` |
| 5 | `r_vis + top` | **1.2** | 4374 | 0.0047 | `imf-resnet-rvistop-2cam-ph16-ex08-emb384-l12-ms50k-l20g2-20260405-125844` |
| 6 | `front` | **0.0** | 4374 | 0.0074 | `imf-resnet-front-1cam-ph16-ex08-emb384-l12-ms50k-l20g0-20260405-095607` |

## Main takeaways
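
1. **`top` is the most important single camera view**: `top only = 271.2`, nearly level with `top + front = 274.8`.
2. **`front` alone is almost useless**: `front only = 0.0`.
3. **`r_vis` alone is also largely ineffective**: `r_vis only = 6.4`.
4. **`r_vis + front` clearly outperforms `front` or `r_vis` alone**, which suggests the two views are partly complementary, but it is still clearly weaker than every `top`-containing configuration that trained normally.
5. **`r_vis + top` is anomalously poor**: only `1.2`, far below `top only = 271.2`. Simply adding `r_vis` does not guarantee a gain and may even disrupt learning under the current setup.
6. **Training loss and rollout reward clearly disagree**: for example, `r_vis + top` and `r_vis only` both reach a low final loss (0.0047) yet earn very poor reward, so model selection in this experiment group must rely on rollout reward rather than loss.
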
## Horizontal comparison views

### Single-camera comparison

- `top`: **271.2**
- `r_vis`: **6.4**
- `front`: **0.0**

Conclusion: **`top >>> r_vis > front`**.

### Two-camera comparison

- `top + front`: **274.8**
- `r_vis + front`: **244.0**
- `r_vis + top`: **1.2**

Conclusions:

- **The safest two-camera combination is `top + front`.**
- `r_vis + front` works, but not as well as `top + front`.
- `r_vis + top` almost completely fails under the current setup.

### Incremental effect of adding a second view

- Adding `front` to `top`: `271.2 -> 274.8`, **a very small gain**.
- Adding `r_vis` to `front`: `0.0 -> 244.0`, **a very large gain**.
- Adding `r_vis` to `top`: `271.2 -> 1.2`, **a severe regression**.

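The incremental effects listed above can be recomputed from the leaderboard values. The following is a minimal sketch, not part of the training code: the dictionary layout and the helper name `reward_delta` are assumptions made for illustration, and the numbers are copied from the leaderboard.

```python
# Best rollout_avg_reward per camera set, copied from the leaderboard.
BEST_REWARD = {
    frozenset({"top", "front"}): 274.8,
    frozenset({"top"}): 271.2,
    frozenset({"r_vis", "front"}): 244.0,
    frozenset({"r_vis"}): 6.4,
    frozenset({"r_vis", "top"}): 1.2,
    frozenset({"front"}): 0.0,
}


def reward_delta(base: str, added: str) -> float:
    """Change in best reward when `added` joins the single-camera `base` run."""
    before = BEST_REWARD[frozenset({base})]
    after = BEST_REWARD[frozenset({base, added})]
    return round(after - before, 1)


for base, added in [("top", "front"), ("front", "r_vis"), ("top", "r_vis")]:
    print(f"{base} + {added}: {reward_delta(base, added):+.1f}")
```

Each configuration was run once with 5 rollout episodes per evaluation, so small deltas such as `top -> top + front` should be read with that in mind.
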
## Practical recommendation

If choosing only among these 6 experiments:

- **First choice**: `top + front`
- **Second choice**: `top only`
- If `top` cannot be used: `r_vis + front` is clearly better than `front only` or `r_vis only`
- **Not recommended**: `r_vis + top`

## Note relative to previous 3-camera baseline
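
The previous 3-camera run `[r_vis, top, front]` reached a best reward of **610.8**.
Compared against that, the best result of these 6 camera ablations (`top + front = 274.8`) indicates:

- In the current training batch, **every configuration with one or more views removed scores far below the earlier 3-camera best**;
- but when views must be removed, **`top` remains the most important one to keep**.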
@@ -0,0 +1,8 @@
# CHECKLIST

- [x] Confirm remote free GPU
- [x] Create front-only run contract
- [x] Remote smoke test passes
- [x] Launch 50k run on remote GPU0
- [x] Record pid / log / SwanLab
- [x] Report status back to user
28
experiment_suites/2026-04-05-front-only-resnet-1cam/PLAN.md
Normal file
@@ -0,0 +1,28 @@
# PLAN

## Goal
Train a 50k-step IMF baseline with the original ResNet vision backbone, using only the `front` camera as image conditioning.

## Fixed comparison contract
- Same as the active `top/front` run except image input is reduced to `[front]`
- Agent: `resnet_imf_attnres`
- Vision backbone mode: `resnet`
- `pred_horizon=16`, `num_action_steps=8`
- `n_emb=384`, `n_layer=12`, `n_head=1`, `n_kv_head=1`
- `inference_steps=1`
- `batch_size=80`, `lr=2.5e-4`, cosine scheduler, warmup 2000
- dataset: `/home/droid/sim_dataset/sim_transfer`
- cameras: `[front]` only
- rollout every 5 epochs with 5 episodes, headless

## Resource plan
- Host: `100.119.99.14`
- GPU: `0`

## Important dimension override
- Single-camera visual cond dim = `64 + 16 = 80`, so override `agent.head.cond_dim=80` and `agent.num_cams=1`.

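The dimension override generalizes across the camera ablations: the plans use `64 + 16 = 80` for one camera and `64*2 + 16 = 144` for two. The sketch below only reproduces that arithmetic. The constant and function names are hypothetical, and the plans do not say what the camera-independent `16` represents, so it is named neutrally here.

```python
PER_CAMERA_FEATURE_DIM = 64  # per-view term used in the plans' arithmetic
NON_VISUAL_COND_DIM = 16     # camera-independent term; its meaning is not stated in the plans


def head_cond_dim(camera_names: list[str]) -> int:
    """Conditioning width for the head, following the plans' formula."""
    return PER_CAMERA_FEATURE_DIM * len(camera_names) + NON_VISUAL_COND_DIM


def dimension_overrides(camera_names: list[str]) -> dict[str, int]:
    """Config overrides that must change together when the camera set changes."""
    return {
        "agent.num_cams": len(camera_names),
        "agent.head.cond_dim": head_cond_dim(camera_names),
    }


print(dimension_overrides(["front"]))
print(dimension_overrides(["r_vis", "front"]))
```

Deriving both overrides from the same camera list avoids launching a run where `agent.num_cams` and `agent.head.cond_dim` disagree.
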
## Execution path
1. 2-step smoke test on remote GPU0.
2. If smoke passes, launch 50k main run with SwanLab.
3. Record pid / run_dir / log / URL locally.
@@ -0,0 +1,6 @@
# Notes

- 2026-04-05 09:55:27: remote 2-step smoke passed on `100.119.99.14` GPU0 with `front` only, batch=80, no OOM.
- 2026-04-05 09:56:26: launched main run `imf-resnet-front-1cam-ph16-ex08-emb384-l12-ms50k-l20g0-20260405-095607`.
- 2026-04-05 09:57:36: confirmed training is stable through step 200, latest loss 0.2830.
- SwanLab: https://swanlab.cn/@game-loader/roboimi-vla/runs/7kdii8oc6tjkcyu5y0lwq
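The run names recorded in these notes appear to encode the main hyperparameters (`ph16`, `ex08`, `emb384`, `l12`, `ms50k`, a device tag, and a timestamp). The sketch below parses that pattern for the ResNet IMF ablation runs. The field meanings are inferred by matching the name against the status files, not taken from a documented convention, and `parse_run_name` is a hypothetical helper.

```python
import re

# Field names are guesses based on the matching keys in the status files.
RUN_NAME_PATTERN = re.compile(
    r"^imf-resnet-(?P<camera_tag>[a-z]+)-(?P<num_cams>\d+)cam"
    r"-ph(?P<pred_horizon>\d+)-ex(?P<num_action_steps>\d+)"
    r"-emb(?P<n_emb>\d+)-l(?P<n_layer>\d+)-ms(?P<max_steps_k>\d+)k"
    r"-(?P<device_tag>[a-z0-9]+)-(?P<date>\d{8})-(?P<time>\d{6})$"
)

INT_FIELDS = ("num_cams", "pred_horizon", "num_action_steps", "n_emb", "n_layer")


def parse_run_name(run_name: str) -> dict:
    """Split an ablation run name into its (assumed) hyperparameter fields."""
    match = RUN_NAME_PATTERN.match(run_name)
    if match is None:
        raise ValueError(f"unrecognized run name: {run_name!r}")
    fields = match.groupdict()
    for key in INT_FIELDS:
        fields[key] = int(fields[key])
    fields["max_steps"] = int(fields.pop("max_steps_k")) * 1000
    return fields


info = parse_run_name(
    "imf-resnet-front-1cam-ph16-ex08-emb384-l12-ms50k-l20g0-20260405-095607"
)
print(info["camera_tag"], info["num_cams"], info["max_steps"], info["device_tag"])
```
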
@@ -0,0 +1,51 @@
{
  "suite_name": "2026-04-05-front-only-resnet-1cam",
  "updated_at": "2026-04-05 09:57:36",
  "phase": "running",
  "baseline_reference": {
    "source_run": "imf-resnet-topfront-2cam-ph16-ex08-emb384-l12-ms50k-5090-20260405-085023",
    "notes": "Same hyperparameters as the active top/front run, but image input is reduced to [front] only."
  },
  "smoke_test": {
    "status": "passed",
    "host": "100.119.99.14",
    "gpu": 0,
    "run_dir": "/home/droid/roboimi_suite_20260404/runs/smoke-frontonly-resnet-ph16-ex08-20260405-095509",
    "batch_size": 80,
    "max_steps": 2,
    "note": "2-step remote CUDA smoke passed on L20 GPU0 without OOM."
  },
  "main_run": {
    "status": "running",
    "host": "100.119.99.14",
    "gpu": 0,
    "launch_pid": 158874,
    "pid": 158877,
    "run_name": "imf-resnet-front-1cam-ph16-ex08-emb384-l12-ms50k-l20g0-20260405-095607",
    "run_dir": "/home/droid/roboimi_suite_20260404/runs/imf-resnet-front-1cam-ph16-ex08-emb384-l12-ms50k-l20g0-20260405-095607",
    "log_path": "/home/droid/roboimi_suite_20260404/runs/imf-resnet-front-1cam-ph16-ex08-emb384-l12-ms50k-l20g0-20260405-095607/train_vla.log",
    "launch_log": "/home/droid/roboimi_suite_20260404/experiment_suite_launch_logs/imf-resnet-front-1cam-ph16-ex08-emb384-l12-ms50k-l20g0-20260405-095607.launch.log",
    "dataset_dir": "/home/droid/sim_dataset/sim_transfer",
    "camera_names": [
      "front"
    ],
    "pred_horizon": 16,
    "num_action_steps": 8,
    "head_cond_dim": 80,
    "head_n_emb": 384,
    "head_n_layer": 12,
    "vision_backbone_mode": "resnet",
    "pretrained_backbone_weights": null,
    "freeze_backbone": false,
    "batch_size": 80,
    "lr": 0.00025,
    "num_workers": 12,
    "max_steps": 50000,
    "rollout_val_freq_epochs": 5,
    "rollout_num_episodes": 5,
    "swanlab_url": "https://swanlab.cn/@game-loader/roboimi-vla/runs/7kdii8oc6tjkcyu5y0lwq",
    "latest_step": 200,
    "latest_loss": 0.283,
    "process_running": true
  }
}
@@ -0,0 +1,8 @@
# CHECKLIST

- [x] Confirm camera mapping (`right` -> `r_vis`)
- [x] Create front+r_vis run contract
- [x] Remote smoke test passes
- [x] Launch 50k run on remote GPU1
- [x] Record pid / log / SwanLab
- [x] Report status back to user
23
experiment_suites/2026-04-05-front-rvis-resnet-2cam/PLAN.md
Normal file
@@ -0,0 +1,23 @@
# PLAN

## Goal
Train a 50k-step IMF baseline with the original ResNet vision backbone, using `front` + `r_vis` cameras only.

## Fixed comparison contract
- Same hyperparameters as the active top/front and front-only runs
- Agent: `resnet_imf_attnres`
- Vision backbone mode: `resnet`
- `pred_horizon=16`, `num_action_steps=8`
- `n_emb=384`, `n_layer=12`, `n_head=1`, `n_kv_head=1`
- `inference_steps=1`
- `batch_size=80`, `lr=2.5e-4`, cosine scheduler, warmup 2000
- dataset: `/home/droid/sim_dataset/sim_transfer`
- cameras: `[r_vis, front]`
- rollout every 5 epochs with 5 episodes, headless

## Important dimension override
- Two-camera visual cond dim = `64*2 + 16 = 144`, so set `agent.num_cams=2`, `agent.head.cond_dim=144`.

## Resource plan
- Host: `100.119.99.14`
- GPU: `1`
@@ -0,0 +1,6 @@
# Notes

- 2026-04-05 10:20:09: remote 2-step smoke passed on `100.119.99.14` GPU1 with `r_vis + front`, batch=80, no OOM.
- 2026-04-05 10:20:49: launched main run `imf-resnet-frontrvis-2cam-ph16-ex08-emb384-l12-ms50k-l20g1-20260405-102029`.
- 2026-04-05 10:22:03: confirmed training is stable through step 200, latest loss 0.3321.
- SwanLab: https://swanlab.cn/@game-loader/roboimi-vla/runs/3fyzjfdcbiq7frtbqv6ss
@@ -0,0 +1,55 @@
{
  "suite_name": "2026-04-05-front-rvis-resnet-2cam",
  "updated_at": "2026-04-05 10:22:03",
  "phase": "running",
  "interpretation": {
    "right_camera_name": "r_vis"
  },
  "baseline_reference": {
    "source_run": "imf-resnet-topfront-2cam-ph16-ex08-emb384-l12-ms50k-5090-20260405-085023",
    "notes": "Same hyperparameters as the active top/front run, replacing top with r_vis."
  },
  "smoke_test": {
    "status": "passed",
    "host": "100.119.99.14",
    "gpu": 1,
    "run_dir": "/home/droid/roboimi_suite_20260404/runs/smoke-frontrvis-resnet-ph16-ex08-20260405-102001",
    "batch_size": 80,
    "max_steps": 2,
    "note": "2-step remote CUDA smoke passed on L20 GPU1 without OOM."
  },
  "main_run": {
    "status": "running",
    "host": "100.119.99.14",
    "gpu": 1,
    "launch_pid": 159910,
    "pid": 159913,
    "run_name": "imf-resnet-frontrvis-2cam-ph16-ex08-emb384-l12-ms50k-l20g1-20260405-102029",
    "run_dir": "/home/droid/roboimi_suite_20260404/runs/imf-resnet-frontrvis-2cam-ph16-ex08-emb384-l12-ms50k-l20g1-20260405-102029",
    "log_path": "/home/droid/roboimi_suite_20260404/runs/imf-resnet-frontrvis-2cam-ph16-ex08-emb384-l12-ms50k-l20g1-20260405-102029/train_vla.log",
    "launch_log": "/home/droid/roboimi_suite_20260404/experiment_suite_launch_logs/imf-resnet-frontrvis-2cam-ph16-ex08-emb384-l12-ms50k-l20g1-20260405-102029.launch.log",
    "dataset_dir": "/home/droid/sim_dataset/sim_transfer",
    "camera_names": [
      "r_vis",
      "front"
    ],
    "pred_horizon": 16,
    "num_action_steps": 8,
    "head_cond_dim": 144,
    "head_n_emb": 384,
    "head_n_layer": 12,
    "vision_backbone_mode": "resnet",
    "pretrained_backbone_weights": null,
    "freeze_backbone": false,
    "batch_size": 80,
    "lr": 0.00025,
    "num_workers": 12,
    "max_steps": 50000,
    "rollout_val_freq_epochs": 5,
    "rollout_num_episodes": 5,
    "swanlab_url": "https://swanlab.cn/@game-loader/roboimi-vla/runs/3fyzjfdcbiq7frtbqv6ss",
    "latest_step": 200,
    "latest_loss": 0.3321,
    "process_running": true
  }
}
73
experiment_suites/2026-04-05-lewm-vit-transfer/manifest.json
Normal file
@@ -0,0 +1,73 @@
{
  "date": "2026-04-06",
  "branch": "feat-imf-attnres-policy",
  "worktree": "/home/droid/project/roboimi/.worktrees/feat-imf-attnres-policy",
  "model": "LEWM ViT frozen visual encoder + IMF AttnRes diffusion head",
  "checkpoint_path": "/home/droid/le-wm/lewm-sim-transfer/pa1w85md8jop6bvol8oxp/checkpoints/epoch=99-step=47800.ckpt",
  "visual_contract": {
    "input_camera_names": ["r_vis", "top", "front"],
    "fused_camera_names": ["front", "top", "r_vis"],
    "joint_output_dim": 192,
    "freeze_backbone": true,
    "dataset_image_resize_shape": null,
    "eval_image_resize_shape": [256, 256],
    "fused_short_side_resize": 224
  },
  "training_contract": {
    "pred_horizon": 16,
    "num_action_steps": 8,
    "max_steps": 50000,
    "rollout_val_freq_epochs": 5,
    "rollout_num_episodes": 10,
    "batch_size": 80,
    "lr": 0.00025,
    "num_workers": 12,
    "scheduler_type": "cosine",
    "warmup_steps": 2000,
    "min_lr": 1e-06,
    "weight_decay": 1e-05,
    "grad_clip": 1.0
  },
  "verification": {
    "local_tests": "38 passed",
    "remote_dataset_shape": [2, 3, 256, 256],
    "remote_eval_prepared_shape": [3, 256, 256],
    "remote_smoke_run": {
      "run_name": "smoke-lewm-imf-rawpath-emb384-20260406-002002",
      "result": "passed",
      "details": "2-step train + checkpoint-triggered 1-episode headless rollout succeeded with corrected raw256 path"
    }
  },
  "superseded_runs": [
    {
      "run_name": "lewm-vit-imf-sim-transfer-emb384-l12-ph16-ex08-step50k-roll10-5880g0-20260405-201914",
      "reason": "stopped due to incorrect early per-camera 224 resize"
    },
    {
      "run_name": "lewm-vit-imf-sim-transfer-emb256-l12-ph16-ex08-step50k-roll10-5880g1-20260405-201914",
      "reason": "stopped due to incorrect early per-camera 224 resize"
    }
  ],
  "full_runs": [
    {
      "host": "100.73.14.65",
      "gpu": 0,
      "run_name": "lewm-vit-imf-raw256fix-sim-transfer-emb384-l12-ph16-ex08-step50k-roll10-5880g0-20260406-002124",
      "pid": 1058589,
      "log_path": "/home/droid/roboimi_suite_20260404/experiment_suite_launch_logs/lewm-vit-imf-raw256fix-sim-transfer-emb384-l12-ph16-ex08-step50k-roll10-5880g0-20260406-002124.launch.log",
      "swanlab_url": "https://swanlab.cn/@game-loader/roboimi-vla/runs/y5tzgqe0u966w9ak41i31",
      "head_n_emb": 384,
      "head_n_layer": 12
    },
    {
      "host": "100.73.14.65",
      "gpu": 1,
      "run_name": "lewm-vit-imf-raw256fix-sim-transfer-emb256-l12-ph16-ex08-step50k-roll10-5880g1-20260406-002124",
      "pid": 1058590,
      "log_path": "/home/droid/roboimi_suite_20260404/experiment_suite_launch_logs/lewm-vit-imf-raw256fix-sim-transfer-emb256-l12-ph16-ex08-step50k-roll10-5880g1-20260406-002124.launch.log",
      "swanlab_url": "https://swanlab.cn/@game-loader/roboimi-vla/runs/2esr9y7t2dgesstgrn5i6",
      "head_n_emb": 256,
      "head_n_layer": 12
    }
  ]
}
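The `training_contract` in this manifest fixes `lr=0.00025`, `scheduler_type="cosine"`, `warmup_steps=2000`, `min_lr=1e-06`, and `max_steps=50000`. The sketch below shows one common way such a schedule is computed: linear warmup followed by cosine decay to `min_lr`. The exact formula used by the training code is not shown in this commit, so the shape of the curve and the function name `lr_at` are assumptions for illustration.

```python
import math


def lr_at(step: int,
          base_lr: float = 2.5e-4,
          min_lr: float = 1e-6,
          warmup_steps: int = 2000,
          max_steps: int = 50_000) -> float:
    """Learning rate at a 0-indexed step under linear warmup + cosine decay.

    Assumed schedule shape; the repository's scheduler may differ in details.
    """
    if step < warmup_steps:
        # Linear ramp that reaches base_lr on the last warmup step.
        return base_lr * (step + 1) / warmup_steps
    progress = min(1.0, (step - warmup_steps) / max(1, max_steps - warmup_steps))
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


for step in (0, 1999, 26_000, 50_000):
    print(step, f"{lr_at(step):.3e}")
```
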
25
experiment_suites/2026-04-05-lewm-vit-transfer/notes.md
Normal file
@@ -0,0 +1,25 @@
# 2026-04-06 LEWM ViT Transfer Notes

## Root-cause fix

The first LEWM runs were stopped because the data path still resized each camera view to `224x224` **before** multiview fusion. That preserved the final tensor shape but broke the original LEWM geometry.

The corrected path is now:

- **Training dataset**: keep stored per-view `256x256` images (`data.image_resize_shape=null` at launch; the dataset instantiation override is `None` for LEWM)
- **Eval rollout input**: resize live MuJoCo `480x640` camera images to `256x256` per view
- **Backbone**: fuse `front, top, r_vis` on the LEWM axis, then resize fused short side to `224`

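The remark that the early resize "preserved the final tensor shape" can be checked with shape arithmetic alone. The sketch below compares the output shape of the corrected order (fuse, then short-side resize) with the superseded order (per-view resize, then fuse). It assumes the three views are concatenated along the width; the notes only say "the LEWM axis", so that choice, like the helper names, is hypothetical.

```python
def short_side_resize_shape(height: int, width: int, target: int) -> tuple[int, int]:
    """(height, width) after scaling so the shorter side equals `target`."""
    scale = target / min(height, width)
    return round(height * scale), round(width * scale)


def fused_shape(num_views: int, view_hw: tuple[int, int], axis: str = "width") -> tuple[int, int]:
    """(height, width) after concatenating `num_views` equal views along `axis` (assumed)."""
    height, width = view_hw
    if axis == "width":
        return height, width * num_views
    return height * num_views, width


# Corrected order: fuse the raw 256x256 views, then resize the fused short side to 224.
corrected = short_side_resize_shape(*fused_shape(3, (256, 256)), target=224)

# Superseded order: resize every view to 224x224 first, then fuse.
superseded = fused_shape(3, (224, 224))

print(corrected, superseded)
```

Under these assumptions both orders yield the same shape, which is consistent with the notes: a shape check alone would not have caught the bug, so the verification also inspects the per-view shapes entering the pipeline.
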
## Verification

- Local tests passed (`38 passed` across the focused suite)
- Remote check:
  - dataset sample image shape: `(2, 3, 256, 256)`
  - eval-prepared live frame shape: `(3, 256, 256)`
- Remote smoke passed with real checkpoint:
  - `smoke-lewm-imf-rawpath-emb384-20260406-002002`

## Current runs

- `lewm-vit-imf-raw256fix-sim-transfer-emb384-l12-ph16-ex08-step50k-roll10-5880g0-20260406-002124`
- `lewm-vit-imf-raw256fix-sim-transfer-emb256-l12-ph16-ex08-step50k-roll10-5880g1-20260406-002124`
19
experiment_suites/2026-04-05-lewm-vit-transfer/status.json
Normal file
@@ -0,0 +1,19 @@
{
  "status": "running",
  "updated_at": "2026-04-06T00:22:10+08:00",
  "remote_host": "100.73.14.65",
  "runs": [
    {
      "run_name": "lewm-vit-imf-raw256fix-sim-transfer-emb384-l12-ph16-ex08-step50k-roll10-5880g0-20260406-002124",
      "pid": 1058589,
      "gpu": 0,
      "state": "running"
    },
    {
      "run_name": "lewm-vit-imf-raw256fix-sim-transfer-emb256-l12-ph16-ex08-step50k-roll10-5880g1-20260406-002124",
      "pid": 1058590,
      "gpu": 1,
      "state": "running"
    }
  ]
}
@@ -0,0 +1,7 @@
# CHECKLIST

- [x] Create run contract
- [x] Remote smoke test passes
- [x] Launch 50k main run
- [x] Record pid / log / SwanLab
- [x] Report status back to user
12
experiment_suites/2026-04-05-rvis-only-resnet-1cam/PLAN.md
Normal file
@@ -0,0 +1,12 @@
# PLAN

## Goal
Train a 50k-step IMF baseline with the original ResNet vision backbone, using only `r_vis` as image conditioning.

## Fixed comparison contract
- same hyperparameters as the active top/front run
- cameras: ['r_vis']
- num_cams=1
- head.cond_dim=80
- host: 100.119.99.14
- gpu: 3
@@ -0,0 +1,6 @@
# Notes

- 2026-04-05 12:58:22: smoke passed for ['r_vis'] on 100.119.99.14 GPU3.
- 2026-04-05 12:59:24: launched main run `imf-resnet-rvis-1cam-ph16-ex08-emb384-l12-ms50k-l20g3-20260405-125844`.
- 2026-04-05 13:01:20: latest confirmed progress step=400, loss=0.1165.
- SwanLab: https://swanlab.cn/@game-loader/roboimi-vla/runs/qnuh7vln9mqomxxldyecq
@@ -0,0 +1,47 @@
{
  "suite_name": "2026-04-05-rvis-only-resnet-1cam",
  "updated_at": "2026-04-05 13:01:20",
  "phase": "running",
  "smoke_test": {
    "status": "passed",
    "host": "100.119.99.14",
    "gpu": 3,
    "run_dir": "/home/droid/roboimi_suite_20260404/runs/smoke-rvisonly-resnet-ph16-ex08-20260405-125812",
    "batch_size": 80,
    "max_steps": 2,
    "note": "2-step remote CUDA smoke passed without OOM."
  },
  "main_run": {
    "status": "running",
    "host": "100.119.99.14",
    "gpu": 3,
    "launch_pid": 164812,
    "pid": 164816,
    "run_name": "imf-resnet-rvis-1cam-ph16-ex08-emb384-l12-ms50k-l20g3-20260405-125844",
    "run_dir": "/home/droid/roboimi_suite_20260404/runs/imf-resnet-rvis-1cam-ph16-ex08-emb384-l12-ms50k-l20g3-20260405-125844",
    "log_path": "/home/droid/roboimi_suite_20260404/runs/imf-resnet-rvis-1cam-ph16-ex08-emb384-l12-ms50k-l20g3-20260405-125844/train_vla.log",
    "launch_log": "/home/droid/roboimi_suite_20260404/experiment_suite_launch_logs/imf-resnet-rvis-1cam-ph16-ex08-emb384-l12-ms50k-l20g3-20260405-125844.launch.log",
    "dataset_dir": "/home/droid/sim_dataset/sim_transfer",
    "camera_names": [
      "r_vis"
    ],
    "pred_horizon": 16,
    "num_action_steps": 8,
    "head_cond_dim": 80,
    "head_n_emb": 384,
    "head_n_layer": 12,
    "vision_backbone_mode": "resnet",
    "pretrained_backbone_weights": null,
    "freeze_backbone": false,
    "batch_size": 80,
    "lr": 0.00025,
    "num_workers": 12,
    "max_steps": 50000,
    "rollout_val_freq_epochs": 5,
    "rollout_num_episodes": 5,
    "swanlab_url": "https://swanlab.cn/@game-loader/roboimi-vla/runs/qnuh7vln9mqomxxldyecq",
    "latest_step": 400,
    "latest_loss": 0.1165,
    "process_running": true
  }
}
@@ -0,0 +1,7 @@
# CHECKLIST

- [x] Create run contract
- [x] Remote smoke test passes
- [x] Launch 50k main run
- [x] Record pid / log / SwanLab
- [x] Report status back to user
12
experiment_suites/2026-04-05-rvistop-resnet-2cam/PLAN.md
Normal file
@@ -0,0 +1,12 @@
# PLAN

## Goal
Train a 50k-step IMF baseline with the original ResNet vision backbone, using only `r_vis` + `top` as image conditioning.

## Fixed comparison contract
- same hyperparameters as the active top/front run
- cameras: ['r_vis', 'top']
- num_cams=2
- head.cond_dim=144
- host: 100.119.99.14
- gpu: 2
@@ -0,0 +1,6 @@
# Notes

- 2026-04-05 12:58:22: smoke passed for ['r_vis', 'top'] on 100.119.99.14 GPU2.
- 2026-04-05 12:59:24: launched main run `imf-resnet-rvistop-2cam-ph16-ex08-emb384-l12-ms50k-l20g2-20260405-125844`.
- 2026-04-05 13:01:20: latest confirmed progress step=200, loss=0.2845.
- SwanLab: https://swanlab.cn/@game-loader/roboimi-vla/runs/umsm6402eb81et7wx7z4a
48
experiment_suites/2026-04-05-rvistop-resnet-2cam/status.json
Normal file
@@ -0,0 +1,48 @@
{
  "suite_name": "2026-04-05-rvistop-resnet-2cam",
  "updated_at": "2026-04-05 13:01:20",
  "phase": "running",
  "smoke_test": {
    "status": "passed",
    "host": "100.119.99.14",
    "gpu": 2,
    "run_dir": "/home/droid/roboimi_suite_20260404/runs/smoke-rvistop-resnet-ph16-ex08-20260405-125812",
    "batch_size": 80,
    "max_steps": 2,
    "note": "2-step remote CUDA smoke passed without OOM."
  },
  "main_run": {
    "status": "running",
    "host": "100.119.99.14",
    "gpu": 2,
    "launch_pid": 164745,
    "pid": 164749,
    "run_name": "imf-resnet-rvistop-2cam-ph16-ex08-emb384-l12-ms50k-l20g2-20260405-125844",
    "run_dir": "/home/droid/roboimi_suite_20260404/runs/imf-resnet-rvistop-2cam-ph16-ex08-emb384-l12-ms50k-l20g2-20260405-125844",
    "log_path": "/home/droid/roboimi_suite_20260404/runs/imf-resnet-rvistop-2cam-ph16-ex08-emb384-l12-ms50k-l20g2-20260405-125844/train_vla.log",
    "launch_log": "/home/droid/roboimi_suite_20260404/experiment_suite_launch_logs/imf-resnet-rvistop-2cam-ph16-ex08-emb384-l12-ms50k-l20g2-20260405-125844.launch.log",
    "dataset_dir": "/home/droid/sim_dataset/sim_transfer",
    "camera_names": [
      "r_vis",
      "top"
    ],
    "pred_horizon": 16,
    "num_action_steps": 8,
    "head_cond_dim": 144,
    "head_n_emb": 384,
    "head_n_layer": 12,
    "vision_backbone_mode": "resnet",
    "pretrained_backbone_weights": null,
    "freeze_backbone": false,
    "batch_size": 80,
    "lr": 0.00025,
    "num_workers": 12,
    "max_steps": 50000,
    "rollout_val_freq_epochs": 5,
    "rollout_num_episodes": 5,
    "swanlab_url": "https://swanlab.cn/@game-loader/roboimi-vla/runs/umsm6402eb81et7wx7z4a",
    "latest_step": 200,
    "latest_loss": 0.2845,
    "process_running": true
  }
}
@@ -0,0 +1,8 @@
# CHECKLIST

- [x] Confirm baseline hyperparameters from trusted prior run
- [x] Confirm local GPU availability
- [x] Smoke test with `top/front` cameras only
- [x] Launch 50k run
- [x] Record pid / run dir / log path / SwanLab URL
- [x] Report status back to user
30
experiment_suites/2026-04-05-top-front-resnet-2cam/PLAN.md
Normal file
@@ -0,0 +1,30 @@
# PLAN

## Goal
Train a 50k-step IMF baseline with the original ResNet vision backbone (no full-AttnRes vision replacement), using only `top` and `front` cameras as image conditioning.

## Fixed comparison contract
- Agent: `resnet_imf_attnres`
- Vision backbone mode: `resnet`
- `pred_horizon=16`
- `num_action_steps=8`
- `n_emb=384`, `n_layer=12`, `n_head=1`, `n_kv_head=1`
- `inference_steps=1`
- `batch_size=80`, `lr=2.5e-4`, cosine scheduler, warmup 2000
- dataset: `/home/droid/project/diana_sim/sim_transfer`
- cameras: `[top, front]` only
- training budget: `max_steps=50000`
- rollout validation: every 5 epochs, 5 episodes, headless

## Resource plan
- Host: local
- GPU: RTX 5090 (GPU 0)

## Execution path
1. Run a short 2-step smoke test on GPU with the exact 2-camera config.
2. If smoke passes, launch the 50k main run with durable log redirection.
3. Record run name, pid, log path, and SwanLab URL into suite status.

## Fallbacks
- If batch 80 OOMs, fall back to batch 64 with scaled lr 2.0e-4.
- If dataloader startup is unstable, reduce num_workers from 12 to 8.
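The fallback learning rate in the plan above is consistent with scaling the rate linearly with batch size: `2.5e-4 * 64 / 80 = 2.0e-4`. A minimal sketch of that arithmetic follows; the helper name `scaled_lr` is hypothetical, and linear scaling is inferred from the numbers rather than stated in the plan.

```python
def scaled_lr(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Scale the learning rate in proportion to the batch size (linear scaling rule)."""
    return base_lr * new_batch / base_batch


fallback_lr = scaled_lr(base_lr=2.5e-4, base_batch=80, new_batch=64)
print(f"{fallback_lr:.1e}")
```
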
@@ -0,0 +1,5 @@
# Notes

- 2026-04-05 08:50:04: 2-step smoke test passed locally on RTX 5090 with `top/front` cameras, batch=80, no OOM.
- 2026-04-05 08:50:42: launched main run `imf-resnet-topfront-2cam-ph16-ex08-emb384-l12-ms50k-5090-20260405-085023` on local GPU0.
- SwanLab: https://swanlab.cn/@game-loader/roboimi-vla/runs/vi77mn5dwd19z4nttxab8
@@ -0,0 +1,51 @@
{
  "suite_name": "2026-04-05-top-front-resnet-2cam",
  "updated_at": "2026-04-05 08:52:12",
  "phase": "running",
  "baseline_reference": {
    "source_run": "imf-p1-ph16-ex08-emb384-l12-ms50k-5880g1-20260404-131223",
    "best_rollout_avg_reward": 610.8,
    "best_step": 21874,
    "notes": "Same IMF baseline as Phase-1 best, but switch cameras from [r_vis, top, front] to [top, front] and keep the original ResNet vision backbone."
  },
  "smoke_test": {
    "status": "passed",
    "run_dir": "/home/droid/project/roboimi/.worktrees/feat-imf-attnres-policy/runs/smoke-topfront-resnet-ph16-ex08-20260405-085000",
    "batch_size": 80,
    "num_workers": 4,
    "max_steps": 2,
    "note": "2-step local CUDA smoke passed without OOM using top/front only."
  },
  "main_run": {
    "status": "running",
    "host": "local",
    "gpu": 0,
    "pid": 1693348,
    "run_name": "imf-resnet-topfront-2cam-ph16-ex08-emb384-l12-ms50k-5090-20260405-085023",
    "run_dir": "/home/droid/project/roboimi/.worktrees/feat-imf-attnres-policy/runs/imf-resnet-topfront-2cam-ph16-ex08-emb384-l12-ms50k-5090-20260405-085023",
    "log_path": "/home/droid/project/roboimi/.worktrees/feat-imf-attnres-policy/runs/imf-resnet-topfront-2cam-ph16-ex08-emb384-l12-ms50k-5090-20260405-085023/train_vla.log",
    "launch_log": "/home/droid/project/roboimi/.worktrees/feat-imf-attnres-policy/experiment_suites/2026-04-05-top-front-resnet-2cam/launch_logs/imf-resnet-topfront-2cam-ph16-ex08-emb384-l12-ms50k-5090-20260405-085023.launch.log",
    "dataset_dir": "/home/droid/project/diana_sim/sim_transfer",
    "camera_names": [
      "top",
      "front"
    ],
    "pred_horizon": 16,
    "num_action_steps": 8,
    "head_n_emb": 384,
    "head_n_layer": 12,
    "vision_backbone_mode": "resnet",
    "pretrained_backbone_weights": null,
    "freeze_backbone": false,
    "batch_size": 80,
    "lr": 0.00025,
    "num_workers": 12,
    "max_steps": 50000,
    "rollout_val_freq_epochs": 5,
    "rollout_num_episodes": 5,
    "swanlab_url": "https://swanlab.cn/@game-loader/roboimi-vla/runs/vi77mn5dwd19z4nttxab8",
    "latest_step": 500,
    "latest_loss": 0.0978,
    "process_running": true
  }
}
@@ -0,0 +1,7 @@
# CHECKLIST

- [x] Create run contract
- [x] Remote smoke test passes
- [x] Launch 50k main run
- [x] Record pid / log / SwanLab
- [x] Report status back to user
12
experiment_suites/2026-04-05-top-only-resnet-1cam/PLAN.md
Normal file
@@ -0,0 +1,12 @@
# PLAN

## Goal
Train a 50k-step IMF baseline with the original ResNet vision backbone, using only `top` as image conditioning.

## Fixed comparison contract
- same hyperparameters as the active top/front run
- cameras: ['top']
- num_cams=1
- head.cond_dim=80
- host: 100.119.99.14
- gpu: 4
@@ -0,0 +1,6 @@
# Notes

- 2026-04-05 12:58:22: smoke passed for ['top'] on 100.119.99.14 GPU4.
- 2026-04-05 12:59:24: launched main run `imf-resnet-top-1cam-ph16-ex08-emb384-l12-ms50k-l20g4-20260405-125844`.
- 2026-04-05 13:01:20: latest confirmed progress step=400, loss=0.1233.
- SwanLab: https://swanlab.cn/@game-loader/roboimi-vla/runs/egzo29l3z9ftsaunhf025
@@ -0,0 +1,47 @@
{
  "suite_name": "2026-04-05-top-only-resnet-1cam",
  "updated_at": "2026-04-05 13:01:20",
  "phase": "running",
  "smoke_test": {
    "status": "passed",
    "host": "100.119.99.14",
    "gpu": 4,
    "run_dir": "/home/droid/roboimi_suite_20260404/runs/smoke-toponly-resnet-ph16-ex08-20260405-125812",
    "batch_size": 80,
    "max_steps": 2,
    "note": "2-step remote CUDA smoke passed without OOM."
  },
  "main_run": {
    "status": "running",
    "host": "100.119.99.14",
    "gpu": 4,
    "launch_pid": 164808,
    "pid": 164813,
    "run_name": "imf-resnet-top-1cam-ph16-ex08-emb384-l12-ms50k-l20g4-20260405-125844",
    "run_dir": "/home/droid/roboimi_suite_20260404/runs/imf-resnet-top-1cam-ph16-ex08-emb384-l12-ms50k-l20g4-20260405-125844",
    "log_path": "/home/droid/roboimi_suite_20260404/runs/imf-resnet-top-1cam-ph16-ex08-emb384-l12-ms50k-l20g4-20260405-125844/train_vla.log",
    "launch_log": "/home/droid/roboimi_suite_20260404/experiment_suite_launch_logs/imf-resnet-top-1cam-ph16-ex08-emb384-l12-ms50k-l20g4-20260405-125844.launch.log",
    "dataset_dir": "/home/droid/sim_dataset/sim_transfer",
    "camera_names": [
      "top"
    ],
    "pred_horizon": 16,
    "num_action_steps": 8,
    "head_cond_dim": 80,
    "head_n_emb": 384,
    "head_n_layer": 12,
    "vision_backbone_mode": "resnet",
    "pretrained_backbone_weights": null,
    "freeze_backbone": false,
    "batch_size": 80,
    "lr": 0.00025,
    "num_workers": 12,
    "max_steps": 50000,
    "rollout_val_freq_epochs": 5,
    "rollout_num_episodes": 5,
    "swanlab_url": "https://swanlab.cn/@game-loader/roboimi-vla/runs/egzo29l3z9ftsaunhf025",
    "latest_step": 400,
    "latest_loss": 0.1233,
    "process_running": true
  }
}