# VideoLDM - High-Resolution Text-to-Video Synthesis with Latent Diffusion Models

> Clean Markdown view of GeekNews topic #9016. Use the original source for factual precision when an external source URL is present.

## Metadata

- GeekNews HTML: [https://news.hada.io/topic?id=9016](https://news.hada.io/topic?id=9016)
- GeekNews Markdown: [https://news.hada.io/topic/9016.md](https://news.hada.io/topic/9016.md)
- Type: news
- Author: [xguru](https://news.hada.io/@xguru)
- Published: 2023-04-22T10:18:01+09:00
- Updated: 2023-04-22T10:18:01+09:00
- Original source: [research.nvidia.com](https://research.nvidia.com/labs/toronto-ai/VideoLDM/)
- Points: 7
- Comments: 0

## Topic Body

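- An LDM (Latent Diffusion Model) trains the diffusion model in a compressed, lower-dimensional latent space, which makes high-resolution image synthesis possible without large amounts of compute
- This NVIDIA paper applies the LDM approach to high-resolution video
- The LDM is first pre-trained on images only; a temporal dimension is then introduced and the model is fine-tuned on encoded image sequences, turning the image generator into a video generator
- Diffusion model upsamplers are temporally aligned in the same way, turning them into temporally consistent video super-resolution models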
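
The step of turning an image generator into a video generator can be illustrated with a toy sketch. This is not the paper's implementation. The function name `temporal_mix`, the neighbour-averaging "temporal layer", and the blend factor `alpha` are all hypothetical choices made for illustration. The sketch only shows the general idea: per-frame outputs from an image model are kept as they are, and an added step mixes information across the frame axis.

```python
from typing import List

Frame = List[float]  # one frame's features, flattened to a list for simplicity


def temporal_mix(spatial_out: List[Frame], alpha: float) -> List[Frame]:
    """Blend each frame's per-frame features with a temporally smoothed version.

    alpha = 1.0 leaves the per-frame (image-model) output unchanged;
    smaller alpha lets more cross-frame information through.
    Both the blend and the smoothing are illustrative assumptions.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    num_frames = len(spatial_out)
    mixed: List[Frame] = []
    for t in range(num_frames):
        # Toy "temporal layer": average over the frame and its direct neighbours.
        window = spatial_out[max(0, t - 1): min(num_frames, t + 2)]
        smoothed = [sum(values) / len(window) for values in zip(*window)]
        mixed.append(
            [alpha * s + (1.0 - alpha) * m for s, m in zip(spatial_out[t], smoothed)]
        )
    return mixed


frames = [[0.0], [3.0], [6.0]]           # three frames, one feature each
print(temporal_mix(frames, alpha=1.0))   # [[0.0], [3.0], [6.0]]
print(temporal_mix(frames, alpha=0.0))   # [[1.5], [3.0], [4.5]]
```

With `alpha=1.0` the output matches the image-only model frame by frame, so the pre-trained behaviour is preserved. Lowering `alpha` pulls neighbouring frames closer together, which is the intuition behind temporal consistency. In a real model the temporal step would be a learned layer rather than a fixed average.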

## Comments


_No public comments on this page._