Network Splitting Techniques and Their Optimization for Lightweight Ternary Neural Networks

  • Karimah, Hasna Nur
  • Prihatiningrum, Novi
  • Gong, Young-Ho
  • Jin, Jonghoon
  • Seo, Yeongkyo
Abstract

Running a high-performing deep convolutional neural network (CNN) typically requires substantial memory and computational resources. To address this, we propose a method for optimizing a ternary neural network (TNN) by applying network splitting techniques, yielding an even more lightweight model. A TNN offers a favorable trade-off between accuracy and computational savings compared to a binary quantized network, which often suffers from higher accuracy loss due to extreme quantization. Our network splitting technique combines grouped convolution and pointwise convolution: the convolution operations are computed in separate groups, and the resulting features are fused in a later step. The proposed technique has the advantage of being easy to implement in a lightweight hardware design. For example, when implementing Processing-In-Memory (PIM) hardware, each convolution layer can be set to the same size, which enables the design of lightweight neural network accelerators by eliminating the need for analog-to-digital conversion. Our experiments show that the proposed method achieves up to 4.53x memory compression with minimal impact on accuracy.
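The effect of splitting a convolution into a grouped convolution followed by a pointwise convolution can be illustrated with a simple weight count. The following is a minimal sketch, not the authors' architecture: the layer shape (128 channels, 3 x 3 kernel, 4 groups), the function names, and the ternarization threshold of 0.7 times the mean absolute weight are all assumptions chosen for illustration, and the result is not intended to reproduce the 4.53x figure reported above.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution split into `groups` groups (bias omitted)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    return (c_in // groups) * c_out * k * k


def split_block_params(c_in, c_out, k, groups):
    """Grouped k x k convolution followed by a 1 x 1 pointwise convolution.

    The pointwise layer mixes channels across groups, i.e. it fuses the
    features that the grouped layer computed separately.
    """
    grouped = conv_params(c_in, c_out, k, groups)
    pointwise = conv_params(c_out, c_out, 1)
    return grouped + pointwise


def ternarize(weights, delta_scale=0.7):
    """Map each weight to -1, 0, or +1 with a magnitude threshold.

    The threshold delta_scale * mean(|w|) is an assumed heuristic,
    not a value taken from the paper.
    """
    delta = delta_scale * sum(abs(w) for w in weights) / len(weights)
    return [0 if abs(w) <= delta else (1 if w > 0 else -1) for w in weights]


# Hypothetical layer: 128 -> 128 channels, 3 x 3 kernel, 4 groups.
standard = conv_params(128, 128, 3)
split = split_block_params(128, 128, 3, 4)
print(f"standard: {standard} weights, split: {split} weights")
print(f"reduction: {standard / split:.2f}x")
print(ternarize([0.9, -0.05, 0.4, -0.8]))
```

In this hypothetical layer the grouped convolution holds a quarter of the original 3 x 3 weights, and the pointwise layer adds back a comparatively small number of 1 x 1 weights. Ternary weights additionally need only about two bits each instead of a 32-bit float, so the two savings compound.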

Keywords

ternary neural network; network quantization; network splitting; grouped convolution
DOI
10.3390/electronics14183651
Publication date
2025-09
Type
Article
Journal
ELECTRONICS
Volume
14
Issue
18