# Llama Stack 0.21 Release - Llama 4 Support

> Clean Markdown view of GeekNews topic #20242. Use the original source for factual precision when an external source URL is present.

## Metadata

- GeekNews HTML: [https://news.hada.io/topic?id=20242](https://news.hada.io/topic?id=20242)
- GeekNews Markdown: [https://news.hada.io/topic/20242.md](https://news.hada.io/topic/20242.md)
- Type: news
- Author: [xguru](https://news.hada.io/@xguru)
- Published: 2025-04-10T09:31:01+09:00
- Updated: 2025-04-10T09:31:01+09:00
- Original source: [github.com/meta-llama](https://github.com/meta-llama/llama-stack)
- Points: 5
- Comments: 0

## Summary

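**Llama Stack** is a standardized framework for generative AI applications that provides a unified API layer on top of implementations from multiple service providers. It offers unified APIs for **inference, RAG, agents, tools, safety, evals, and telemetry**, and supports a range of environments through a **plugin architecture**. It consists of a **server and client SDKs**, with client SDKs available for several programming languages and platforms. Its **Safety API**, which has multiple supported implementations, helps ensure the safety of AI responses.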

## Topic Body

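- Meta's **Llama Stack** is a framework that standardizes the core building blocks of generative AI applications  
- Provides a unified API layer built on implementations from multiple service providers  
- Keeps the **developer experience consistent** when moving from development to production  
- Key components:  
  - Unified APIs for **inference, RAG, agents, tools, safety, evals, and telemetry**  
  - A **plugin architecture** that supports diverse environments (local, on-premises, cloud, mobile)  
  - **Verified distributions** for getting started quickly and reliably  
  - Multiple developer interfaces, including a **CLI and SDKs (Python, Node.js, iOS, Android)**  
  - Production-grade example applications  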
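
The idea of one API layer over interchangeable providers can be illustrated with a minimal sketch. This is not Llama Stack's actual code: `InferenceProvider`, `LocalEchoProvider`, `HostedUpperProvider`, and `Stack` are hypothetical names invented for illustration, and the "providers" are trivial stand-ins rather than real model backends.

```python
from typing import Protocol


class InferenceProvider(Protocol):
    """Hypothetical interface that every inference backend must satisfy."""

    def complete(self, prompt: str) -> str: ...


class LocalEchoProvider:
    """Stand-in for a local backend; it simply echoes the prompt."""

    def complete(self, prompt: str) -> str:
        return f"[local] {prompt}"


class HostedUpperProvider:
    """Stand-in for a hosted backend; it upper-cases the prompt."""

    def complete(self, prompt: str) -> str:
        return f"[hosted] {prompt.upper()}"


class Stack:
    """Routes calls on a single API to whichever provider is registered."""

    def __init__(self) -> None:
        self._providers: dict[str, InferenceProvider] = {}

    def register(self, name: str, provider: InferenceProvider) -> None:
        self._providers[name] = provider

    def complete(self, provider_name: str, prompt: str) -> str:
        if provider_name not in self._providers:
            raise KeyError(f"unknown provider: {provider_name}")
        return self._providers[provider_name].complete(prompt)


stack = Stack()
stack.register("local", LocalEchoProvider())
stack.register("hosted", HostedUpperProvider())

# The calling code is identical; only the provider name changes.
dev_answer = stack.complete("local", "hello")
prod_answer = stack.complete("hosted", "hello")
```

Because the caller depends only on the shared interface, switching from a development backend to a production one becomes a configuration change rather than a code change, which is the consistency the list above describes.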
  
### How Llama Stack Works  
  
- Llama Stack consists of a **server plus client SDKs**  
  - The server can be deployed locally, on-premises, or in the cloud  
  - Client SDKs are available for Python, Swift, Node.js, and Kotlin  
  
### Client SDKs  
  
- **Python**: [`llama-stack-client-python`](https://github.com/meta-llama/llama-stack-client-python)  
- **Swift**: [`llama-stack-client-swift`](https://github.com/meta-llama/llama-stack-client-swift/tree/latest-release)  
- **Node.js**: [`llama-stack-client-node`](https://github.com/meta-llama/llama-stack-client-node)  
- **Kotlin**: [`llama-stack-client-kotlin`](https://github.com/meta-llama/llama-stack-client-kotlin/tree/latest-release)  
  
### Supported Llama Stack Implementations  
  
#### Inference API  
  
- Supports inference providers across hosted and local environments  
  - Meta Reference, Ollama, Fireworks, Together, NVIDIA NIM, vLLM, TGI, AWS Bedrock, OpenAI, Anthropic, Gemini, and others  
  
#### Vector IO API  
  
- Provides a vector store interface  
- Supported implementations:  
  - FAISS, SQLite-Vec, Chroma, Milvus, Postgres (PGVector), Weaviate, and others  
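
As a rough illustration of what a vector store interface abstracts, the sketch below defines a hypothetical `VectorStore` base class and a toy in-memory implementation that ranks entries by cosine similarity. The class and method names are assumptions made for this example, not the Vector IO API's real signatures; a backend such as FAISS or PGVector would sit behind the same kind of interface.

```python
import math
from abc import ABC, abstractmethod


class VectorStore(ABC):
    """Hypothetical minimal interface for a vector store backend."""

    @abstractmethod
    def insert(self, doc_id: str, embedding: list[float]) -> None: ...

    @abstractmethod
    def query(self, embedding: list[float], k: int) -> list[tuple[str, float]]: ...


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryVectorStore(VectorStore):
    """Toy backend: keeps embeddings in a dict and scans all of them."""

    def __init__(self) -> None:
        self._rows: dict[str, list[float]] = {}

    def insert(self, doc_id: str, embedding: list[float]) -> None:
        self._rows[doc_id] = embedding

    def query(self, embedding: list[float], k: int) -> list[tuple[str, float]]:
        scored = [(doc_id, cosine(embedding, vec)) for doc_id, vec in self._rows.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)  # most similar first
        return scored[:k]


store = InMemoryVectorStore()
store.insert("doc-a", [1.0, 0.0])
store.insert("doc-b", [0.0, 1.0])
store.insert("doc-c", [0.7, 0.7])

# Retrieve the two entries closest to the query embedding.
top = store.query([1.0, 0.1], k=2)
```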
  
#### Safety API  
  
- Helps ensure the safety of AI responses through checks such as prompt and code scanning  
- Supported implementations:  
  - Llama Guard, Prompt Guard, Code Scanner, AWS Bedrock, and others  
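
The general pattern of screening both the prompt and the generated output can be sketched as follows. This is a toy under stated assumptions: `Violation`, `secret_shield`, `run_shields`, and `guarded_generate` are hypothetical names, and the substring check merely stands in for real implementations such as Llama Guard, which rely on model-based classification rather than string matching.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Violation:
    shield: str
    reason: str


# A shield inspects text and returns a Violation, or None if the text passes.
Shield = Callable[[str], Optional[Violation]]


def secret_shield(text: str) -> Optional[Violation]:
    """Toy check: flags text that appears to contain a hard-coded key."""
    if "api_key=" in text.lower():
        return Violation("secret_shield", "possible hard-coded credential")
    return None


def run_shields(text: str, shields: list[Shield]) -> Optional[Violation]:
    """Return the first violation found, or None if every shield passes."""
    for shield in shields:
        violation = shield(text)
        if violation is not None:
            return violation
    return None


def guarded_generate(prompt: str, generate: Callable[[str], str], shields: list[Shield]) -> str:
    """Check the prompt, call the model, then check the model's output."""
    violation = run_shields(prompt, shields)
    if violation is None:
        answer = generate(prompt)
        violation = run_shields(answer, shields)
        if violation is None:
            return answer
    return f"[blocked by {violation.shield}: {violation.reason}]"


def fake_model(prompt: str) -> str:
    """Stand-in for an inference call; it just echoes the prompt."""
    return "echo: " + prompt


safe_result = guarded_generate("hi", fake_model, [secret_shield])
blocked_result = guarded_generate("use API_KEY=123", fake_model, [secret_shield])
```

Running the checks on both sides of the model call means an unsafe prompt never reaches the model, and an unsafe completion never reaches the user.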
  
### Developer Resources  
  
- To get started quickly: [Quick Start](https://llama-stack.readthedocs.io/en/latest/getting_started/index.html)  
- To contribute: [Contributing](https://llama-stack.readthedocs.io/en/latest/contributing/index.html)  
  
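Llama Stack is a general-purpose framework designed to let developers integrate and deploy a variety of AI technologies with ease, with broad support for different environments and languages.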

## Comments


_No public comments on this page._