Text-Guided Patch Scoring and Local Distortion Guidance for Image Quality Assessment

  • Park, Juyong
  • Song, Jihun
  • Kim, Gye Wan
  • Hyun, Yoonsuk

Abstract

Image Quality Assessment (IQA) aims to provide objective quality scores for images by imitating the Human Visual System (HVS). Several IQA studies have produced promising results with patch-wise prediction, which predicts an image's quality score as a weighted average of patch scores. However, these studies have implemented patch-wise prediction using only visual information. With the success of Vision-Language Models (VLMs), we aim to develop a patch-wise prediction specialized for VLMs. To achieve this, we propose Text-Guided Patch Scoring via Multi-Level Features of Vision-Language Models for NR-IQA (TeMu-IQA). Specifically, our model aggregates multi-level features from the image encoder of a frozen VLM and performs patch-wise prediction with text-guided patch scoring (TPS). TeMu-IQA achieves state-of-the-art performance on various IQA datasets, even with few trainable parameters, and exhibits consistency by maintaining superior performance across diverse VLMs. Moreover, we propose Local Distortion Guidance (LDG), a novel methodology that addresses the over-localization problem of patch-wise prediction, in which the image's overall structural characteristics are not sufficiently considered. By generating a locally distorted image and guiding it to receive a lower quality score than the original, LDG strengthens the model's ability to reflect the image's logical coherence in its quality assessment. © 2025 IEEE.
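
The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how patch weights are computed or which loss LDG uses, so the function names, the plain weighted average, and the hinge-style ranking penalty with a `margin` parameter are all assumptions chosen for illustration.

```python
from typing import Sequence


def weighted_patch_score(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Image-level score as a weighted average of per-patch scores.

    Assumption: weights are non-negative importance values, one per patch.
    How TeMu-IQA derives them from text guidance is not given in the abstract.
    """
    if len(scores) != len(weights) or len(scores) == 0:
        raise ValueError("scores and weights must be non-empty and of equal length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(scores, weights)) / total


def ranking_hinge_loss(score_original: float, score_distorted: float,
                       margin: float = 0.1) -> float:
    """Hypothetical LDG-style penalty.

    Returns zero when the original image outscores its locally distorted
    copy by at least `margin`, and grows linearly otherwise.
    """
    return max(0.0, margin - (score_original - score_distorted))


# Two patches: the first is weighted three times as heavily as the second.
image_score = weighted_patch_score([0.8, 0.4], [3.0, 1.0])

# The distorted copy scores lower by more than the margin, so no penalty.
penalty_ok = ranking_hinge_loss(score_original=0.7, score_distorted=0.5)

# The distorted copy scores higher than the original, so the penalty is positive.
penalty_bad = ranking_hinge_loss(score_original=0.5, score_distorted=0.7)
```

A ranking-style penalty is one common way to encode "the distorted image should score lower than the original"; the paper may use a different formulation.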

Keywords

image generation; image quality assessment
DOI
10.1109/CVPRW67362.2025.00083
Publication date
2025
Type
Conference paper
Journal
IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
Pages
781-790