Revolutionizing Java-based LLMs: Unleashing the Power of GPUs with TornadoVM
Tools-in-Action (INTERMEDIATE level)
Exec Centre
Recent advancements in the Java Virtual Machine (JVM), such as Project Panama and the Vector API, bring near-native performance within reach of Java applications. This aligns with the growing demand for large language model (LLM) inference across diverse platforms. Efficient SIMD vectorization and off-heap data types narrow the performance gap with native code. Yet untapped potential remains in heterogeneous hardware such as GPUs and FPGAs. TornadoVM, an open-source technology, aims to fill this gap by enabling OpenJDK and other JDK distributions to offload specific parts of a Java application for parallel execution on diverse hardware.
This presentation illustrates how TornadoVM works alongside the latest OpenJDK advancements, including the Panama API and the Vector API, to achieve high-performance inference in Java-based LLM implementations. With an emphasis on accessibility and community involvement in open source, we will first clarify TornadoVM's fundamental concepts. We will then demonstrate how to add GPU support to an LLM implementation written entirely in Java using TornadoVM. Finally, we will experiment with off-the-shelf LLMs to show how TornadoVM and Java work together.
Michalis Papadimitriou
University of Manchester
Michalis Papadimitriou is a Research Fellow at the University of Manchester and a staff software engineer on the TornadoVM team.
His primary expertise lies in open-source software, hardware abstractions for high-level languages, and compiler optimizations for GPU computing.
Michalis is dedicated to enabling GPU acceleration for Machine Learning workloads within the JVM through the TornadoVM framework.
Prior to joining the University of Manchester, he contributed to various software stacks at Huawei Technologies and to Apache TVM through OctoML.