vLLM Project
Easy, fast, and cheap LLM serving for everyone
High-throughput inference with PagedAttention
Open Source