DTKD: Diffusion-to-Transformer Heterogeneous Knowledge Distillation for Efficient and Perceptually Enhanced Super-Resolution

Abstract

Single-image super-resolution (SISR) aims to reconstruct high-resolution (HR) images from low-resolution (LR) inputs and remains fundamentally ill-posed due to the inherent ambiguity of missing high-frequency details. While diffusion-based SR models achieve superior perceptual quality through iterative denoising, their multi-step sampling process results in substantial computational cost and latency. In contrast, transformer-based SR models offer efficient single-forward inference but are typically optimized for distortion-oriented objectives, limiting perceptual realism. In this paper, we propose DTKD, a diffusion-to-transformer heterogeneous knowledge distillation framework that transfers the perceptual prior of a diffusion teacher into an efficient transformer student. To effectively bridge the representational gap between generative diffusion outputs and deterministic transformer reconstructions, we introduce a frequency-group-aware distillation loss based on two-level discrete wavelet transform (DWT). The loss decomposes images into structured frequency sub-bands and assigns non-uniform weights to emphasize discrepancy-sensitive mid-frequency components. Furthermore, we adopt a progressive scheduling strategy that gradually increases the distillation weight during training to stabilize optimization and balance structural fidelity with perceptual enhancement. Extensive experiments on real-world SR benchmarks demonstrate that the proposed framework consistently improves perceptual quality over a standalone transformer student while maintaining transformer-level inference efficiency. Ablation studies further validate the importance of moderate frequency decomposition, discrepancy-aware weighting, and progressive distillation scheduling. These results suggest that heterogeneous distillation provides an effective and practical approach for transferring diffusion-based generative priors into efficient super-resolution models.
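The frequency-group-aware loss and the progressive schedule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a Haar wavelet, single-channel images whose height and width are divisible by 4, an L1 distance per sub-band, a linear ramp for the distillation weight, and placeholder group weights. The names `haar_dwt2`, `frequency_groups`, `frequency_distill_loss`, `distill_weight`, `total_loss`, and `GROUP_WEIGHTS` are all hypothetical.

```python
import numpy as np

# Illustrative group weights (an assumption, not values from the paper):
# the mid-frequency group is weighted most heavily.
GROUP_WEIGHTS = {"low": 0.5, "mid": 2.0, "high": 1.0}


def haar_dwt2(x):
    """One level of an orthonormal 2-D Haar DWT on an even-sized 2-D array.

    Returns the approximation band and a tuple of three detail bands.
    """
    a, b = x[0::2, 0::2], x[0::2, 1::2]  # top-left, top-right of each 2x2 block
    c, d = x[1::2, 0::2], x[1::2, 1::2]  # bottom-left, bottom-right
    ll = (a + b + c + d) / 2.0
    lh = (a + b - c - d) / 2.0  # difference between rows
    hl = (a - b + c - d) / 2.0  # difference between columns
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, (lh, hl, hh)


def frequency_groups(img):
    """Two-level DWT, grouped as low (LL2), mid (level-2 details), high (level-1 details)."""
    ll1, high = haar_dwt2(img)
    ll2, mid = haar_dwt2(ll1)
    return {"low": (ll2,), "mid": mid, "high": high}


def frequency_distill_loss(student, teacher, weights=GROUP_WEIGHTS):
    """Weighted sum over frequency groups of the mean L1 sub-band discrepancy."""
    s, t = frequency_groups(student), frequency_groups(teacher)
    loss = 0.0
    for name, w in weights.items():
        band_err = [np.mean(np.abs(sb - tb)) for sb, tb in zip(s[name], t[name])]
        loss += w * float(np.mean(band_err))
    return loss


def distill_weight(step, ramp_steps, max_weight=1.0):
    """Linear ramp from 0 to max_weight over ramp_steps (assumed schedule shape)."""
    return max_weight * min(1.0, step / ramp_steps)


def total_loss(student, teacher, target, step, ramp_steps):
    """Reconstruction L1 against the HR target plus the scheduled distillation term."""
    rec = float(np.mean(np.abs(student - target)))
    return rec + distill_weight(step, ramp_steps) * frequency_distill_loss(student, teacher)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.random((8, 8))                              # stand-in HR image
    teacher = target + 0.05 * rng.standard_normal((8, 8))    # stand-in diffusion output
    student = target + 0.10 * rng.standard_normal((8, 8))    # stand-in transformer output
    for step in (0, 500, 1000):
        print(step, total_loss(student, teacher, target, step, ramp_steps=1000))
```

In an actual training loop the same structure would be written with a differentiable framework so that gradients flow through the wavelet transform; NumPy is used here only to keep the sketch short.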

Keywords

diffusion; image super-resolution; knowledge distillation; transformer
Authors
Park, Jeong Hyeok; Song, Byung Cheol
DOI
10.3390/electronics15101986
Publication date
2026-05
Type
Article
Journal
Electronics, vol. 15, no. 10