DTKD: Diffusion-to-Transformer Heterogeneous Knowledge Distillation for Efficient and Perceptually Enhanced Super-Resolution

Park, Jeong Hyeok; Song, Byung Cheol

doi:10.3390/electronics15101986

Abstract

Single-image super-resolution (SISR) aims to reconstruct high-resolution (HR) images from low-resolution (LR) inputs and remains fundamentally ill-posed due to the inherent ambiguity of missing high-frequency details. While diffusion-based SR models achieve superior perceptual quality through iterative denoising, their multi-step sampling process results in substantial computational cost and latency. In contrast, transformer-based SR models offer efficient single-forward inference but are typically optimized for distortion-oriented objectives, limiting perceptual realism. In this paper, we propose DTKD, a diffusion-to-transformer heterogeneous knowledge distillation framework that transfers the perceptual prior of a diffusion teacher into an efficient transformer student. To effectively bridge the representational gap between generative diffusion outputs and deterministic transformer reconstructions, we introduce a frequency-group-aware distillation loss based on two-level discrete wavelet transform (DWT). The loss decomposes images into structured frequency sub-bands and assigns non-uniform weights to emphasize discrepancy-sensitive mid-frequency components. Furthermore, we adopt a progressive scheduling strategy that gradually increases the distillation weight during training to stabilize optimization and balance structural fidelity with perceptual enhancement. Extensive experiments on real-world SR benchmarks demonstrate that the proposed framework consistently improves perceptual quality over a standalone transformer student while maintaining transformer-level inference efficiency. Ablation studies further validate the importance of moderate frequency decomposition, discrepancy-aware weighting, and progressive distillation scheduling. These results suggest that heterogeneous distillation provides an effective and practical approach for transferring diffusion-based generative priors into efficient super-resolution models.
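The frequency-group-aware loss and the progressive schedule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Haar wavelet, the grouping of the seven two-level sub-bands into low/mid/high groups, the L1 distance, the values in `GROUP_WEIGHTS`, and the linear ramp in `distill_weight` are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def haar_dwt2(x):
    """One level of a 2-D orthonormal Haar DWT (height and width must be even)."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    approx = (a + b + c + d) / 2.0   # low-pass in both directions
    det_1 = (a + b - c - d) / 2.0    # detail across rows
    det_2 = (a - b + c - d) / 2.0    # detail across columns
    det_3 = (a - b - c + d) / 2.0    # diagonal detail
    return approx, det_1, det_2, det_3

def two_level_bands(img):
    """Two-level DWT, with sub-bands collected into assumed frequency groups."""
    ll1, *details1 = haar_dwt2(img)
    ll2, *details2 = haar_dwt2(ll1)
    return {"low": [ll2], "mid": details2, "high": details1}

# Hypothetical weights: the mid-frequency group is emphasized, as the abstract
# describes, but the actual values used in the paper are not given there.
GROUP_WEIGHTS = {"low": 0.5, "mid": 2.0, "high": 1.0}

def frequency_group_loss(student, teacher, weights=GROUP_WEIGHTS):
    """Weighted sum of mean absolute sub-band differences (L1 is an assumption)."""
    s_bands, t_bands = two_level_bands(student), two_level_bands(teacher)
    total = 0.0
    for group, w in weights.items():
        for sb, tb in zip(s_bands[group], t_bands[group]):
            total += w * np.mean(np.abs(sb - tb))
    return float(total)

def distill_weight(step, ramp_steps, lam_max=1.0):
    """Assumed linear ramp: grows from 0 to lam_max over ramp_steps, then holds."""
    return lam_max * min(1.0, step / ramp_steps)
```

Under these assumptions, a training step would add `distill_weight(step, ramp_steps) * frequency_group_loss(student_sr, teacher_sr)` to the student's usual reconstruction loss, so the diffusion teacher's influence starts at zero and grows as training stabilizes.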

Keywords

diffusion; image super-resolution; knowledge distillation; transformer

Title: DTKD: Diffusion-to-Transformer Heterogeneous Knowledge Distillation for Efficient and Perceptually Enhanced Super-Resolution

Authors: Park, Jeong Hyeok; Song, Byung Cheol

DOI: 10.3390/electronics15101986

Publication date: 2026-05

Type: Article

Journal: ELECTRONICS

Volume: 15

Issue: 10