# Numbers Every LLM Developer Should Know

> Clean Markdown view of GeekNews topic #9206. Use the original source for factual precision when an external source URL is present.

## Metadata

- GeekNews HTML: [https://news.hada.io/topic?id=9206](https://news.hada.io/topic?id=9206)
- GeekNews Markdown: [https://news.hada.io/topic/9206.md](https://news.hada.io/topic/9206.md)
- Type: news
- Author: [kuroneko](https://news.hada.io/@kuroneko)
- Published: 2023-05-18T10:45:09+09:00
- Updated: 2023-05-18T10:45:09+09:00
- Original source: [github.com/ray-project](https://github.com/ray-project/llm-numbers)
- Points: 42
- Comments: 2

## Topic Body

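- A summary of the numbers that matter when working with LLMs.
- Adding "be concise" to the prompt can cut costs by 40-90%.
- GPT-3.5 Turbo is roughly 50 times cheaper than GPT-4.
- Using OpenAI embeddings for vector search is roughly 20 times cheaper than GPT-3.5 Turbo.
- Training a LLaMA-class LLM costs about $1 million (roughly 1.3 billion KRW).
- GPU memory sizes: V100 16 GB, A10G 24 GB, A100 40/80 GB, H100 80 GB.
- Serving a model typically needs about 2 bytes of GPU memory per parameter, so a 7B model needs roughly 14 GB.
- Embedding models usually use 1 GB of memory or less.
- Batching LLM requests can make serving more than 10 times faster in terms of throughput.
- A 13B model needs about 1 MB of memory per token, so batching requests sharply increases the memory requirement.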
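
The memory rules of thumb above can be combined into a quick back-of-the-envelope estimate. The sketch below is only an illustration: the function names are made up for this example, the 1 MB-per-token figure is the one quoted for a 13B model, and multiplying it by batch size and tokens per request is a simplifying assumption rather than a measured result.

```python
# Rules of thumb taken from the list above; every value is an approximation.
BYTES_PER_PARAM = 2        # about 2 bytes of GPU memory per parameter
MB_PER_TOKEN_13B = 1.0     # about 1 MB per token, quoted for a 13B model
GPU_MEMORY_GB = {"V100": [16], "A10G": [24], "A100": [40, 80], "H100": [80]}


def weights_gb(params_billion):
    """Approximate memory for the weights: 1e9 params * 2 bytes = 2 GB per billion."""
    return params_billion * BYTES_PER_PARAM


def batch_tokens_gb(batch_size, tokens_per_request, mb_per_token=MB_PER_TOKEN_13B):
    """Approximate extra memory for the tokens of a batch (assumes 1 GB = 1000 MB)."""
    return batch_size * tokens_per_request * mb_per_token / 1000


def gpus_that_fit(required_gb):
    """List the GPU variants from the table whose memory covers the requirement."""
    return [
        f"{name} {size}GB"
        for name, sizes in GPU_MEMORY_GB.items()
        for size in sizes
        if size >= required_gb
    ]


if __name__ == "__main__":
    # Hypothetical workload: a 13B model serving 16 requests of 512 tokens each.
    total = weights_gb(13) + batch_tokens_gb(16, 512)
    print(f"estimated memory: {total:.1f} GB")
    print("candidate GPUs:", gpus_that_fit(total))
```

Under these assumptions the weights alone take about 26 GB for a 13B model, and a batch of 16 requests with 512 tokens each adds roughly another 8 GB, which is why batching raises the memory requirement even though it improves throughput.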

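The cost ratios in the list can be compared the same way. This is a minimal sketch that works in relative units only, with GPT-3.5 Turbo as the baseline of 1.0. It does not use real prices, the dictionary keys are illustrative labels rather than API model names, and treating the "be concise" saving as a flat fraction of total cost is an assumption.

```python
# Relative cost per token, with GPT-3.5 Turbo as the baseline (1.0).
# These are the approximate ratios from the list, not actual prices.
RELATIVE_COST = {
    "gpt-3.5-turbo": 1.0,
    "gpt-4": 50.0,               # about 50x the cost of GPT-3.5 Turbo
    "openai-embedding": 1 / 20,  # about 20x cheaper than GPT-3.5 Turbo
}


def relative_cost(model, tokens, concise_saving=0.0):
    """Cost in baseline units; concise_saving is the fraction saved (0.4 to 0.9 per the list)."""
    if not 0.0 <= concise_saving < 1.0:
        raise ValueError("concise_saving must be in [0, 1)")
    return RELATIVE_COST[model] * tokens * (1.0 - concise_saving)


if __name__ == "__main__":
    tokens = 1000  # hypothetical request size
    for model in RELATIVE_COST:
        print(model, relative_cost(model, tokens))
    # The same GPT-4 request with a 40% saving from asking for concise output.
    print("gpt-4, concise:", relative_cost("gpt-4", tokens, concise_saving=0.4))
```

Because everything is expressed relative to one baseline, swapping in a real per-token price only requires multiplying the results by that price.
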
## Comments


### Comment 16133

- Author: xguru
- Created: 2023-05-18T18:12:01+09:00
- Points: 1

I've tried asking for shorter answers many times, but I should also try adding the "be concise" that the article mentions.

### Comment 16147

- Author: wedding
- Created: 2023-05-20T15:01:14+09:00
- Points: 1
- Parent comment: 16133
- Depth: 1

I should try combining it with "let's think step by step" as well.