# Full CUDA GPU acceleration added to llama.cpp

> Clean Markdown view of GeekNews topic #9390. Use the original source for factual precision when an external source URL is present.

## Metadata

- GeekNews HTML: [https://news.hada.io/topic?id=9390](https://news.hada.io/topic?id=9390)
- GeekNews Markdown: [https://news.hada.io/topic/9390.md](https://news.hada.io/topic/9390.md)
- Type: news
- Author: [xguru](https://news.hada.io/@xguru)
- Published: 2023-06-14T10:46:02+09:00
- Updated: 2023-06-14T10:46:02+09:00
- Original source: [github.com/ggerganov](https://github.com/ggerganov/llama.cpp/pull/1827)
- Points: 8
- Comments: 0

## Topic Body

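- A PR that adds GPU acceleration to all remaining ggml tensors
- On an RTX 3090, prompt processing is 2x faster and token generation is 1.3x to 1.8x faster
- On a 4090 + i9 system, a 7B q4 model generates 109 tokens per second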

## Comments


_No public comments on this page._