An IREE Compiler-Based SoC Design for Efficient on-Device AI Inference Acceleration

Abstract

As on-device inference gains traction for privacy protection and low-latency AI services in edge environments, the demand for flexible and lightweight SoC architectures is increasing. This paper proposes a novel SoC design that directly executes IREE (Intermediate Representation Execution Environment) VM bytecode through an interpreter-based approach implemented on a RISC-V core. The system is validated on a Zynq ZC706 FPGA board and consists of a PS (Processing System) that hosts the compiler and a PL (Programmable Logic) that executes the interpreter. To improve performance, DMA and cache modules are integrated to optimize data movement and memory access during inference. The proposed system successfully executes various machine learning models, including MUL, MMT, and MNIST, and achieves up to a 29% reduction in inference time, depending on the model characteristics. This work demonstrates the feasibility of compiler-driven, reconfigurable inference architectures for embedded AI applications without the need for model-specific hardware redesign. © 2025 IEEE.

Keywords

Bytecode Interpreter; FPGA Implementation; On-Device Inference; SoC Design
Title
An IREE Compiler-Based SoC Design for Efficient on-Device AI Inference Acceleration
Authors
Park, Suhwan; Park, Sangcheol; Kang, Jin-Ku; Kim, Yongwoo
DOI
10.1109/APCCAS67402.2025.11377061
Publication Date
2025
Type
Proceedings Paper
Journal
Proceedings - 2025 21st IEEE Asia Pacific Conference on Circuits and Systems, APCCAS 2025