FPGA-Accelerated Neural Network Inference via a μkernel-Enabled Bytecode Interpreter

Abstract

AI inference accelerators are typically complex and model-specific, often requiring hardware redesign for each new network. This paper presents an FPGA-based interpreter and hardware architecture that directly executes VM bytecode generated by the IREE compiler. Our interpreter supports IREE's μkernel option to accelerate instruction processing, and the hardware augments an RV32IM RISC-V core with dedicated floating-point addition and multiplication units. To accommodate large-scale models, we employ a dual-channel memory architecture. Implemented on a Xilinx FPGA, the system successfully performs end-to-end AI inference. In matrix multiplication benchmarks, the μkernel-enabled interpreter achieves a 34% performance improvement over the non-μkernel design. The proposed architecture can support a wide range of neural networks without requiring any hardware redesign. © 2025 IEEE.
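
The abstract does not describe the interpreter's internals, so the following is only a conceptual sketch of why routing a matrix multiplication through a single μkernel call might reduce interpreter overhead compared with executing it as many scalar bytecode instructions. Everything here is an assumption made for illustration: the opcode names (`LOAD`, `FMUL`, `CALL_UKERNEL_MATMUL`, and so on), the tuple encoding, and the helper functions are hypothetical and are not IREE's VM bytecode format or the authors' design. The sketch only counts dispatched instructions; it says nothing about the reported 34% figure.

```python
# Toy illustration only. Opcode names and the tuple encoding are hypothetical;
# this is NOT IREE's VM bytecode format and NOT the paper's implementation.


def matmul_ukernel(a, b, n):
    """Dense n x n matmul over flat row-major lists, done in one native call."""
    c = [0.0] * (n * n)
    for i in range(n):
        for j in range(n):
            acc = 0.0
            for k in range(n):
                acc += a[i * n + k] * b[k * n + j]
            c[i * n + j] = acc
    return c


def run(program, bufs):
    """Interpret (opcode, *operands) tuples; return how many were dispatched."""
    regs = {}
    dispatched = 0
    for op, *args in program:
        dispatched += 1  # one decode/dispatch step per instruction
        if op == "CONST":
            dst, value = args
            regs[dst] = value
        elif op == "LOAD":
            dst, name, idx = args
            regs[dst] = bufs[name][idx]
        elif op == "FMUL":
            dst, x, y = args
            regs[dst] = regs[x] * regs[y]
        elif op == "FADD":
            dst, x, y = args
            regs[dst] = regs[x] + regs[y]
        elif op == "STORE":
            name, idx, src = args
            bufs[name][idx] = regs[src]
        elif op == "CALL_UKERNEL_MATMUL":
            out, lhs, rhs, n = args
            bufs[out] = matmul_ukernel(bufs[lhs], bufs[rhs], n)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return dispatched


def scalar_matmul_program(n):
    """Fully unrolled n x n matmul expressed only as scalar instructions."""
    prog = []
    for i in range(n):
        for j in range(n):
            prog.append(("CONST", "acc", 0.0))
            for k in range(n):
                prog.append(("LOAD", "x", "A", i * n + k))
                prog.append(("LOAD", "y", "B", k * n + j))
                prog.append(("FMUL", "t", "x", "y"))
                prog.append(("FADD", "acc", "acc", "t"))
            prog.append(("STORE", "C", i * n + j, "acc"))
    return prog


def ukernel_matmul_program(n):
    """The same matmul expressed as a single (hypothetical) μkernel call."""
    return [("CALL_UKERNEL_MATMUL", "C", "A", "B", n)]


def make_bufs():
    """Fresh 2 x 2 example buffers in flat row-major layout."""
    return {"A": [1.0, 2.0, 3.0, 4.0], "B": [5.0, 6.0, 7.0, 8.0], "C": [0.0] * 4}


if __name__ == "__main__":
    scalar_bufs, ukernel_bufs = make_bufs(), make_bufs()
    print(run(scalar_matmul_program(2), scalar_bufs), scalar_bufs["C"])
    print(run(ukernel_matmul_program(2), ukernel_bufs), ukernel_bufs["C"])
```

Both toy programs produce the same result matrix, but the scalar version pays one dispatch step per arithmetic or memory instruction while the μkernel version pays one dispatch for the whole operation. On real hardware the balance would also depend on memory bandwidth and on the floating-point units, which this sketch does not model.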

Keywords

AI inference; FPGA; IREE; RISC-V
Title
FPGA-Accelerated Neural Network Inference via a μkernel-Enabled Bytecode Interpreter
Authors
Park, Sangcheol; Park, Suhwan; Kang, Jin-Ku; Kim, Yongwoo
DOI
10.1109/ISOCC66390.2025.11329603
Publication date
2025
Type
Conference paper
Source title
International SoC Design Conference 2025, ISOCC 2025 - Proceedings of Technical Papers