# Llama 2 Chat 70B outperforms ChatGPT (3.5) in model evaluation

> Clean Markdown view of GeekNews topic #10090. Use the original source for factual precision when an external source URL is present.

## Metadata

- GeekNews HTML: [https://news.hada.io/topic?id=10090](https://news.hada.io/topic?id=10090)
- GeekNews Markdown: [https://news.hada.io/topic/10090.md](https://news.hada.io/topic/10090.md)
- Type: news
- Author: [xguru](https://news.hada.io/@xguru)
- Published: 2023-07-31T10:17:01+09:00
- Updated: 2023-07-31T10:17:01+09:00
- Original source: [tatsu-lab.github.io](https://tatsu-lab.github.io/alpaca_eval/)
- Points: 10
- Comments: 0

## Topic Body

- Based on the AlpacaEval Leaderboard, which automatically evaluates instruction-following language models
- Win rates: GPT-4 95.28% > Llama 2 Chat 70B 92.66% > Claude 2 91.36% > ChatGPT 89.37%
- AlpacaEval uses the AlpacaFarm evaluation set and runs the evaluation automatically, with GPT-4 comparing each model's responses against reference responses
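
A pairwise win rate like the ones listed above can be sketched in a few lines of Python. This is only an illustration, not the AlpacaEval implementation: the function names `win_rate` and `rank`, the judgment labels (`"model"`, `"reference"`, `"tie"`), and the rule that a tie counts as half a win are all assumptions made for this example. The only data taken from the page are the four leaderboard percentages.

```python
from typing import Dict, Iterable, List

# Hypothetical labels for one pairwise judgment (an assumption of this sketch).
VALID_LABELS = {"model", "reference", "tie"}


def win_rate(judgments: Iterable[str]) -> float:
    """Return the percentage of comparisons the model wins.

    Assumption: a tie counts as half a win.
    """
    labels = list(judgments)
    if not labels:
        raise ValueError("need at least one judgment")
    unknown = set(labels) - VALID_LABELS
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    score = sum(1.0 if j == "model" else 0.5 if j == "tie" else 0.0 for j in labels)
    return 100.0 * score / len(labels)


def rank(scores: Dict[str, float]) -> List[str]:
    """Order model names from highest to lowest win rate."""
    return sorted(scores, key=scores.get, reverse=True)


# Win rates (in percent) as quoted in the list above.
leaderboard = {
    "ChatGPT": 89.37,
    "GPT-4": 95.28,
    "Claude 2": 91.36,
    "Llama 2 Chat 70B": 92.66,
}

if __name__ == "__main__":
    # Toy example: 2 wins, 1 loss, 1 tie out of 4 comparisons.
    print(win_rate(["model", "model", "reference", "tie"]))
    print(rank(leaderboard))
```

Calling `rank(leaderboard)` reproduces the ordering in the list above, with Llama 2 Chat 70B ahead of ChatGPT.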

## Comments


_No public comments on this page._