This notebook explores vLLM's usage data in 2024, covering the period from 04-01-2024 to 12-15-2024. The data collection covers only a small sample of users, is non-identifiable, and is typically opt-out. Real usage is likely much larger than what is shown here, so the focus should be on relative trends.
Over those 259 days, we observed a total of 534,333,432 hours of compute running on vLLM, which is equivalent to about 85,960 GPUs running non-stop!
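As a quick sanity check on that headline figure, here is a minimal sketch (using the totals quoted above) that converts the total compute hours into an equivalent count of GPUs running continuously over the observation window:

```python
# Convert total compute hours into an equivalent number of GPUs
# running 24/7 over the observation window (figures taken from the text above).
total_gpu_hours = 534_333_432   # total compute hours observed in the sample
num_days = 259                  # 04-01-2024 through 12-15-2024

equivalent_gpus = total_gpu_hours / (num_days * 24)
print(f"~{equivalent_gpus:,.0f} GPU-equivalents running non-stop")
# roughly 86k GPU-equivalents, consistent with the ~85,960 figure above
```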
Key takeaways