# AudioGen: Generating Audio from Descriptive Text

> Clean Markdown view of GeekNews topic #7532. Use the original source for factual precision when an external source URL is present.

## Metadata

- GeekNews HTML: [https://news.hada.io/topic?id=7532](https://news.hada.io/topic?id=7532)
- GeekNews Markdown: [https://news.hada.io/topic/7532.md](https://news.hada.io/topic/7532.md)
- Type: news
- Author: [xguru](https://news.hada.io/@xguru)
- Published: 2022-10-04T10:28:48+09:00
- Updated: 2022-10-04T10:28:48+09:00
- Original source: [felixkreuk.github.io](https://felixkreuk.github.io/text2audio_arxiv_samples/)
- Points: 12
- Comments: 0

## Topic Body

- Can generate sounds such as "a dog barking in a park", "whistling while the wind blows", and "a man giving a speech in front of a large cheering crowd"
- Audio generation poses several challenges
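  - Separating the objects that produce sound is difficult, the variety of real-world recording conditions complicates it further, and text annotations for such scenes are scarce, which makes it hard to scale models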
- To alleviate these problems, the authors propose an augmentation technique that mixes different audio samples so that the model internally learns to separate multiple sources
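
The mixing augmentation described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `mix_examples`, the `snr_db` and `offset` parameters, and joining the two captions with "and" are all assumptions made for illustration. It only shows the general idea of overlaying two clips and pairing the result with a combined description.

```python
import math


def mix_examples(wave_a, caption_a, wave_b, caption_b, snr_db=0.0, offset=0):
    """Overlay wave_b on wave_a and join the two captions (illustrative only).

    wave_a, wave_b: lists of float samples at the same sample rate.
    snr_db: how much louder wave_a is than wave_b in the mix, in dB (assumed knob).
    offset: sample index in wave_a where wave_b starts (assumed knob).
    """
    def rms(wave):
        return math.sqrt(sum(s * s for s in wave) / len(wave)) if wave else 0.0

    # Scale the second clip so the RMS ratio between the clips matches snr_db.
    rms_a, rms_b = rms(wave_a), rms(wave_b)
    if rms_a == 0.0 or rms_b == 0.0:
        gain = 1.0  # a silent clip gives no meaningful ratio; leave wave_b as-is
    else:
        gain = rms_a / (rms_b * 10 ** (snr_db / 20))

    # Sum the two clips sample by sample, padding with silence where needed.
    length = max(len(wave_a), offset + len(wave_b))
    mixed = [0.0] * length
    for i, s in enumerate(wave_a):
        mixed[i] += s
    for i, s in enumerate(wave_b):
        mixed[offset + i] += gain * s

    # Rescale only if the summed signal would clip outside [-1, 1].
    peak = max((abs(s) for s in mixed), default=0.0)
    if peak > 1.0:
        mixed = [s / peak for s in mixed]

    # Assumption: the combined caption is a simple conjunction of both captions.
    return mixed, f"{caption_a} and {caption_b}"


# Toy usage with synthetic tones standing in for real recordings.
tone_a = [0.5 * math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
tone_b = [0.2 * math.sin(2 * math.pi * 90 * t / 16000) for t in range(8000)]
mixed, caption = mix_examples(
    tone_a, "a dog barking in a park", tone_b, "wind blowing",
    snr_db=5.0, offset=4000,
)
```

Training on such mixed pairs exposes the model to overlapping sources together with text that names each of them, which is the intuition behind learning source separation internally rather than through an explicit separation step.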

## Comments


_No public comments on this page._