Compressing Neural Networks on Limited Computing Resources

Lee, Seunghyun; Lee, Dongjun; Hyun, Minju; Kim, Heeje; Song, Byung Cheol

doi:10.1109/ACCESS.2025.3567102

Abstract

Network compression is a crucial technique for deploying deep learning models on edge or mobile devices. However, the cost of achieving higher benchmark performance through compression keeps increasing, which makes network compression a significant burden, especially for small industries focused on developing compact models. Specifically, existing network compression techniques often require extensive computational resources, rendering them impractical for edge devices and small-scale applications. To democratize network compression, we propose a general-purpose framework that combines novel filter pruning and knowledge distillation techniques. First, unlike conventional filter pruning methods that rely on static heuristics, and unlike costly neural architecture search (NAS)-based approaches, our method uses meta-learning to examine the importance of each gate quickly and at fine granularity. This enables rapid and stable sub-network discovery, significantly improving the pruning process. Second, to minimize the computational cost of knowledge distillation, we introduce a synthetic teacher assistant that leverages precomputed fixed knowledge, that is, the stored feature maps/logits of the teacher network. Using fixed knowledge removes the cost of repeatedly running the teacher network, and the synthetic teacher assistants transmit that fixed knowledge to the student, thereby preventing distribution collapse. Our proposed framework dramatically reduces the compression overhead while maintaining high accuracy: it achieves a 55.2% reduction in the FLOPs of ResNet-50 trained on ImageNet while preserving 76.2% top-1 accuracy with only 199 GPU hours, significantly fewer than previous state-of-the-art methods. Overall, our framework democratizes deep learning compression by offering a cost-effective and computationally feasible solution, enabling broader adoption in low-resource environments.
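
The idea of distilling from precomputed fixed knowledge can be illustrated with a minimal sketch. This is not the authors' implementation: it shows only the standard temperature-scaled soft-target distillation loss, evaluated against teacher logits that were stored ahead of time, and it omits the synthetic teacher assistant entirely. The names `teacher_cache`, `distillation_loss`, and `batch_loss`, the sample ids, and the temperature value are all assumptions made for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of floats."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, cached_teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(cached_teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Hypothetical cache: teacher logits computed once and stored per sample id,
# so the teacher network never has to run during student training.
teacher_cache = {
    "img_0001": [2.0, 0.5, -1.0],
    "img_0002": [0.1, 3.0, 0.2],
}

def batch_loss(student_outputs, temperature=4.0):
    """Mean distillation loss over a dict of {sample_id: student_logits}."""
    losses = [
        distillation_loss(logits, teacher_cache[sample_id], temperature)
        for sample_id, logits in student_outputs.items()
    ]
    return sum(losses) / len(losses)
```

In this sketch the teacher's cost is paid once when the cache is built; each training step then needs only a lookup. The abstract indicates that feeding such fixed knowledge to the student directly risks distribution collapse, which is the problem the synthetic teacher assistants are introduced to address.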

Keywords

Costs; Knowledge engineering; Logic gates; Filtering algorithms; Graphics processing units; Computational modeling; Accuracy; Image coding; Deep learning; Workstations; Deep neural network compression; filter pruning; knowledge transfer; efficient neural networks; lightweight deep learning models

Title: Compressing Neural Networks on Limited Computing Resources

Authors: Lee, Seunghyun; Lee, Dongjun; Hyun, Minju; Kim, Heeje; Song, Byung Cheol

DOI: 10.1109/ACCESS.2025.3567102

Publication date: 2025

Type: Article

Journal: IEEE Access

Volume: 13

Pages: 80063-80075