diff --git a/.idea/DDT.iml b/.idea/DDT.iml
index 8b8c395..0f38725 100644
--- a/.idea/DDT.iml
+++ b/.idea/DDT.iml
@@ -2,7 +2,7 @@
-
+
diff --git a/README.md b/README.md
index 3b12d43..0942ec0 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,28 @@
-# DDT
+# DDT: Decoupled Diffusion Transformer
+## Introduction
+We decouple the diffusion transformer into an encoder-decoder design, and surprisingly find that a **more substantial encoder yields performance improvements as model size increases**.
+![](./figs/main.png)
+## Visualizations
+![](./figs/teaser.png)
+## Usage
+```bash
+# for training
+python main.py fit -c configs/repa_improved_ddt_xlen22de6_256.yaml
+```
+
+```bash
+# for inference
+python main.py predict -c configs/repa_improved_ddt_xlen22de6_256.yaml --ckpt_path=XXX.ckpt
+```
+## Reference
+```bibtex
+@ARTICLE{ddt,
+  title = "DDT: Decoupled Diffusion Transformer",
+  author = "Wang, Shuai and Tian, Zhi and Huang, Weilin and Wang, Limin",
+  month = apr,
+  year = 2025,
+  archivePrefix = "arXiv",
+  primaryClass = "cs.CV",
+  eprint = "2504.05741"
+}
+```
\ No newline at end of file
diff --git a/figs/main.png b/figs/main.png
new file mode 100644
index 0000000..8e5a459
Binary files /dev/null and b/figs/main.png differ
diff --git a/figs/teaser.png b/figs/teaser.png
new file mode 100644
index 0000000..d732cac
Binary files /dev/null and b/figs/teaser.png differ