fix bugs

2025-04-11 13:24:51 +08:00
parent 56ff957581
commit 609cf377cb
2 changed files with 16 additions and 5 deletions
--- a/README.md
+++ b/README.md
@@ -19,8 +19,12 @@ We decouple diffusion transformer into encoder-decoder design, and surprisingly
 ## Visualizations
 ![](./figs/teaser.png)
 ## Checkpoints
-Waiting for release.
+We take the off-the-shelf [VAE](https://huggingface.co/stabilityai/sd-vae-ft-ema) to encode images into latent space, and train the decoder with DDT.
 | Dataset     | Model             | Params    | FID  | HuggingFace                                              |
 |-------------|-------------------|-----------|------|----------------------------------------------------------|
 | ImageNet256 | DDT-XL/2(22en6de) | 675M | 1.26 | [🤗](https://huggingface.co/MCG-NJU/DDT-XL-22en6de-R256) |
 | ImageNet512 | DDT-XL/2(22en6de) | 675M | 1.28 | [🤗](https://huggingface.co/MCG-NJU/DDT-XL-22en6de-R512) |
 ## Online Demos
 Coming soon.
@@ -30,16 +34,21 @@ We use ADM evaluation suite to report FID.
 # for installation
 pip install -r requirements.txt
 ```
 ```bash
 # for inference
 python main.py predict -c configs/repa_improved_ddt_xlen22de6_256.yaml --ckpt_path=XXX.ckpt
 ```
 ```bash 
 # extract image latent (optional)
 python3 tools/cache_imlatent4.py
 ```
 ```bash
 # for training
 python main.py fit -c configs/repa_improved_ddt_xlen22de6_256.yaml
 ```
-```bash
+
 # for inference
 python main.py predict -c configs/repa_improved_ddt_xlen22de6_256.yaml --ckpt_path=XXX.ckpt
 ```
 ## Reference
 ```bibtex
@ARTICLE{ddt,
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,3 +1,5 @@
 lightning==2.5.0.post0
 omegaconf==2.3.0
 torch==2.5.0
 diffusers==0.30.0
 jsonargparse[signatures]>=4.27.7
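
Note on the README change above: the added line says images are encoded into latent space with the off-the-shelf `stabilityai/sd-vae-ft-ema` VAE. A minimal sketch of that encoding step, using the `AutoencoderKL` class from the `diffusers` package pinned in `requirements.txt`, might look like the following. The helper names `load_vae` and `encode_to_latent` are hypothetical, and multiplying by `vae.config.scaling_factor` is an assumption borrowed from common latent-diffusion practice; this is not necessarily what `tools/cache_imlatent4.py` does.

```python
def load_vae(repo_id="stabilityai/sd-vae-ft-ema"):
    """Load the pretrained VAE; weights are downloaded on first use."""
    # Imported lazily so the helper below can be used without diffusers installed.
    from diffusers import AutoencoderKL
    return AutoencoderKL.from_pretrained(repo_id).eval()


def encode_to_latent(vae, images, scale=None):
    """Encode a batch of images into scaled VAE latents.

    images: float tensor of shape (B, 3, H, W) with values in [-1, 1].
    scale:  latent scaling factor; defaults to the value stored in the VAE
            config (assumption: the training code may use a different
            normalization).
    """
    if scale is None:
        scale = vae.config.scaling_factor
    # encode() returns an output whose latent_dist is a diagonal Gaussian;
    # sample() draws one latent per image, of shape (B, 4, H/8, W/8).
    posterior = vae.encode(images).latent_dist
    return posterior.sample() * scale
```

In practice the call would be wrapped in `torch.no_grad()`, for example `latents = encode_to_latent(load_vae(), batch)`, which for 256x256 inputs gives latents of spatial size 32x32.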