# SlowLlama - Fine-tuning Llama2-70B and CodeLlama on M1/M2 without quantization

> Clean Markdown view of GeekNews topic #11245. Use the original source for factual precision when an external source URL is present.

## Metadata

- GeekNews HTML: [https://news.hada.io/topic?id=11245](https://news.hada.io/topic?id=11245)
- GeekNews Markdown: [https://news.hada.io/topic/11245.md](https://news.hada.io/topic/11245.md)
- Type: news
- Author: [xguru](https://news.hada.io/@xguru)
- Published: 2023-10-09T10:32:01+09:00
- Updated: 2023-10-09T10:32:01+09:00
- Original source: [github.com/okuvshynov](https://github.com/okuvshynov/slowllama)
- Points: 9
- Comments: 0

## Topic Body

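- Fine-tunes models such as Llama2-70B on Apple M1/M2 and consumer NVIDIA GPUs
- Instead of using quantization, it offloads parts of the model to SSD or main memory during both the forward and backward passes
- The current version uses LoRA to restrict updates to a smaller set of parameters
  - The first version also supported full fine-tuning, but that has since been removed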
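
The two ideas above, offloading frozen weights and training only a LoRA adapter, can be sketched together in a few lines. The following is a minimal illustration using only the Python standard library, not slowllama's actual code: the `OffloadedLoRALinear` class, the `train_step` helper, the pickle files in a temporary directory standing in for the SSD, and the toy dimensions are all assumptions made for this example. Each layer keeps its frozen weight `W` on disk and loads it only while that layer is running, in both the forward and the backward pass, while the small LoRA factors `A` and `B` stay in memory and are the only parameters updated.

```python
import os
import pickle
import random
import tempfile

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def add(a, b, scale=1.0):
    """Return a + scale * b, element-wise."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

class OffloadedLoRALinear:
    """Hypothetical layer computing y = x W + x A B, with frozen W kept on disk."""

    def __init__(self, weight, rank, path, rng):
        self.path = path
        with open(path, "wb") as f:
            pickle.dump(weight, f)  # offload the frozen weight
        d_in, d_out = len(weight), len(weight[0])
        self.a = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(d_in)]
        self.b = [[0.0] * d_out for _ in range(rank)]  # zero init: adapter starts as a no-op

    def load_weight(self):
        with open(self.path, "rb") as f:
            return pickle.load(f)

    def forward(self, x):
        w = self.load_weight()  # W is in memory only for the duration of this call
        return add(matmul(x, w), matmul(matmul(x, self.a), self.b))

    def backward(self, x, grad_y, lr):
        w = self.load_weight()  # reload W for the backward pass
        grad_h = matmul(grad_y, transpose(self.b))  # gradient w.r.t. h = x A
        grad_a = matmul(transpose(x), grad_h)
        grad_b = matmul(transpose(matmul(x, self.a)), grad_y)
        grad_x = add(matmul(grad_y, transpose(w)), matmul(grad_h, transpose(self.a)))
        self.a = add(self.a, grad_a, -lr)  # only the LoRA factors are updated
        self.b = add(self.b, grad_b, -lr)
        return grad_x

def train_step(layers, x, target, lr):
    """One SGD step on 0.5 * ||y - target||^2; returns the loss before the update."""
    inputs, h = [], x
    for layer in layers:
        inputs.append(h)  # keep each layer's input for the backward pass
        h = layer.forward(h)
    grad = add(h, target, -1.0)
    loss = 0.5 * sum(g * g for row in grad for g in row)
    for layer, layer_in in zip(reversed(layers), reversed(inputs)):
        grad = layer.backward(layer_in, grad, lr)
    return loss

rng = random.Random(0)
dim, rank = 4, 2
offload_dir = tempfile.mkdtemp()  # stands in for the SSD
weights = [[[rng.gauss(0.0, 0.5) for _ in range(dim)] for _ in range(dim)] for _ in range(2)]
layers = [
    OffloadedLoRALinear(w, rank, os.path.join(offload_dir, f"layer{i}.pkl"), rng)
    for i, w in enumerate(weights)
]
x = [[1.0, -1.0, 0.5, 2.0]]
target = [[0.0, 1.0, 0.0, -1.0]]
losses = [train_step(layers, x, target, lr=0.05) for _ in range(50)]
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

In this toy setup each adapter holds `2 * dim * rank` values against `dim * dim` for the frozen weight; the saving only becomes meaningful when `rank` is much smaller than `dim`, which is the regime LoRA targets. The trade-off of this style of offloading is extra disk reads on every pass in exchange for a much smaller resident memory footprint.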

## Comments


_No public comments on this page._