Fast, Cheap, and Accurate: Optimizing LLM Inference with vLLM and Quantization (Conference session, Intermediate) - Legare Kerrison, Red Hat
Scaling AI on Hybrid Cloud for Production LLM Inference at Scale (Conference session, Intermediate) - Roberto Carratala, Red Hat