xguru 6 months ago | on: Alibaba Cloud cuts Nvidia GPU usage by 82% with GPU pooling system ‘Aegaeon’ (tomshardware.com)
Paper: Aegaeon: Effective GPU Pooling and Scheduling for Multi-LLM Inference