# LLaMA-CPU - A fork that runs LLaMA on the CPU

> Clean Markdown view of GeekNews topic #8655. Use the original source for factual precision when an external source URL is present.

## Metadata

- GeekNews HTML: [https://news.hada.io/topic?id=8655](https://news.hada.io/topic?id=8655)
- GeekNews Markdown: [https://news.hada.io/topic/8655.md](https://news.hada.io/topic/8655.md)
- Type: news
- Author: [xguru](https://news.hada.io/@xguru)
- Published: 2023-03-09T11:20:01+09:00
- Updated: 2023-03-09T11:20:01+09:00
- Original source: [github.com/markasoftware](https://github.com/markasoftware/llama-cpu)
- Points: 4
- Comments: 0

## Topic Body

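- Runs Meta's LLaMA model on the CPU
- Setup is nearly the same as for the original LLaMA
- In testing with the 7B model, loading requires swap/zram even on a machine with 32 GiB of RAM
- During actual inference, it uses only about 20 GiB of RAM or less
- On a Ryzen 7900X, the 7B model can infer a few words per second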
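
As a rough back-of-the-envelope check on the memory figures above, the sketch below estimates how much RAM the weights alone would occupy at different precisions. It is illustrative only: the parameter count takes "7B" at face value, the helper name `weight_memory_gib` is hypothetical, and the topic does not state which precision the fork actually uses.

```python
GIB = 1024 ** 3  # bytes per GiB


def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / GIB


# Assumption: "7B" taken as exactly 7e9 parameters (the real count may differ).
NOMINAL_PARAMS = 7e9

# Assumption: these are the two candidate precisions; the source names neither.
estimates = {
    "float32": weight_memory_gib(NOMINAL_PARAMS, 4),
    "float16": weight_memory_gib(NOMINAL_PARAMS, 2),
}

for dtype, gib in estimates.items():
    print(f"{dtype}: ~{gib:.1f} GiB for weights alone")
```

Under these assumptions, float32 weights come to about 26 GiB and float16 weights to about 13 GiB. Temporary copies made while loading could push peak usage above the steady-state figure, which would be consistent with needing swap at load time, but that is speculation rather than something the source confirms.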
