FPGA-Accelerated Neural Network Inference via a μkernel-Enabled Bytecode Interpreter

Park, Sangcheol; Park, Suhwan; Kang, Jin-Ku; Kim, Yongwoo

doi:10.1109/ISOCC66390.2025.11329603

Citations (SCOPUS): 0

Abstract

AI inference accelerators are typically complex and model-specific, often requiring hardware redesign for each new network. This paper presents an FPGA-based interpreter and hardware architecture that directly executes VM bytecode generated by the IREE compiler. Our interpreter supports IREE's μkernel option to accelerate instruction processing, and the hardware augments an RV32IM RISC-V core with dedicated floating-point addition and multiplication units. To accommodate large-scale models, we employ a dual-channel memory architecture. Implemented on a Xilinx FPGA, the system successfully performs end-to-end AI inference. In matrix multiplication benchmarks, the μkernel-enabled interpreter achieves a 34% performance improvement over the non-μkernel design. The proposed architecture can support a wide range of neural networks without requiring any hardware redesign. © 2025 IEEE.

Keywords

AI inference; FPGA; IREE; RISC-V

Title: FPGA-Accelerated Neural Network Inference via a μkernel-Enabled Bytecode Interpreter

Authors: Park, Sangcheol; Park, Suhwan; Kang, Jin-Ku; Kim, Yongwoo

DOI: 10.1109/ISOCC66390.2025.11329603

Publication date: 2025

Type: Conference paper

Journal: International SoC Design Conference 2025, ISOCC 2025 - Proceedings of Technical Papers