Zero-Shot Knowledge Distillation Using Label-Free Adversarial Perturbation With Taylor Approximation

Citations

Web of Science: 4
Scopus: 5

Abstract

Knowledge distillation (KD) is one of the most effective techniques for making neural networks lightweight when training data is available. However, KD is seldom applicable in environments where training data is difficult or impossible to access. To solve this problem, complete zero-shot KD (C-ZSKD) based on adversarial learning has recently been proposed, but the so-called biased sample generation problem limits its performance. To overcome this limitation, this paper proposes a novel C-ZSKD algorithm that uses a label-free adversarial perturbation. Using the convolution of probability distributions and a second-order Taylor series approximation, the proposed perturbation yields a constraint in the form of a squared gradient norm. This constraint increases the variance of the adversarial sample distribution, which lets the student model learn the decision boundary of the teacher model more accurately without labeled data. By analyzing the distribution of adversarial samples in the embedded space, this paper also provides insight into the characteristics of adversarial samples that are effective for adversarial learning-based C-ZSKD.

Keywords

Perturbation methods; Training data; Training; Probability distribution; Generators; Neural networks; Convolution; Zero-shot learning; knowledge distillation; adversarial learning
Title
Zero-Shot Knowledge Distillation Using Label-Free Adversarial Perturbation With Taylor Approximation
Authors
Lee, Kang Il; Lee, Seunghyun; Song, Byung Cheol
DOI
10.1109/ACCESS.2021.3066513
Publication Date
2021
Type
Article
Journal
IEEE Access, vol. 9
Pages
45454-45461