DNN Model Partitioning in AI-Based Mobile Services

Citations

SCOPUS: 0

Abstract

Advances in wireless network technology have brought real-time mobile vision applications, such as object detection and image analysis on mobile devices, into use in various fields. Such applications leverage mobile edge computing (MEC) to run high-accuracy deep learning models, but suffer from low quality of experience (QoE) because of network overhead. To tackle this, deep model partitioning has emerged, which splits inference processing between a mobile device and an MEC server. Existing works proposed deep learning model partitioning algorithms that improve only one or two of three metrics, namely end-to-end latency, energy consumption, and processed frames per second (fps), to enhance the QoE of mobile vision applications. In this paper, we propose an algorithm, RT-DMP, that jointly controls (i) the model partitioning point, (ii) the number of frames to be processed among the input frames, and (iii) the GPU clock frequency of the mobile device to improve all three metrics. Using trace-driven simulation, we verify that RT-DMP reduces energy consumption by 90.2% compared with a mobile processing algorithm and improves processed fps by 91.8% compared with an MEC algorithm. © 2022, Korean Institute of Communications and Information Sciences. All rights reserved.

Keywords

DNN model partitioning; mobile edge computing; mobile vision application; optimization; quality of user experience
Title
DNN Model Partitioning in AI-Based Mobile Services
Authors
Lim, Jeong-A; Kim, Yeongjin
DOI
10.7840/kics.2022.47.6.818
Publication date
2022
Type
Article
Journal
한국통신학회논문지 (The Journal of Korean Institute of Communications and Information Sciences)
Volume
47
Issue
6
Pages
818-825