Text-Guided Patch Scoring and Local Distortion Guidance for Image Quality Assessment

Park, Juyong; Song, Jihun; Kim, Gye Wan; Hyun, Yoonsuk

doi:10.1109/CVPRW67362.2025.00083

Citations (SCOPUS): 0

Abstract

Image Quality Assessment (IQA) aims to provide objective quality scores for images by imitating the Human Visual System (HVS). Several IQA studies have produced promising results with patch-wise prediction, which predicts an image's quality score as a weighted average of patch scores. However, these studies have implemented patch-wise prediction using only visual information. With the success of Vision-Language Models (VLMs), we aim to develop a patch-wise prediction specialized for VLMs. To achieve this, we propose Text-Guided Patch Scoring via Multi-Level Features of Vision-Language Models for NR-IQA (TeMu-IQA). Specifically, our model aggregates multi-level features from the image encoder of the frozen VLM and applies patch-wise prediction with text-guided patch scoring (TPS). TeMu-IQA achieves state-of-the-art performance on various IQA datasets, even with few trainable parameters, and exhibits consistency by maintaining superior performance across diverse VLMs. Moreover, we propose Local Distortion Guidance (LDG), a novel methodology that addresses the over-localization problem of patch-wise prediction, in which the image's overall structural characteristics are not sufficiently considered. By generating a locally distorted image and guiding it to receive a lower quality score than the original, LDG strengthens the model's ability to reflect the image's logical coherence in its quality assessment. © 2025 IEEE.
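
The two ideas named in the abstract, weighted averaging of patch scores and guiding a locally distorted image toward a lower score than the original, can be illustrated with a minimal sketch. The abstract does not say how the patch weights are computed or what form the LDG objective takes, so the softmax weighting, the hinge-style loss, the margin value, and all function names below are assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def image_score(patch_scores, patch_logits):
    """Image-level score as a weighted average of per-patch scores.

    Assumption: weights are a softmax over per-patch logits. In the paper
    the patch scoring is text-guided; here the logits are plain inputs.
    """
    if not patch_scores or len(patch_scores) != len(patch_logits):
        raise ValueError("need one logit per patch score, and at least one patch")
    weights = softmax(patch_logits)
    return sum(w * s for w, s in zip(weights, patch_scores))


def ldg_hinge_loss(score_original, score_distorted, margin=0.1):
    """Hypothetical ranking penalty: zero once the original image outscores
    its locally distorted copy by at least `margin`."""
    return max(0.0, margin - (score_original - score_distorted))
```

With equal logits the weighting reduces to a plain mean, so `image_score([0.2, 0.8], [0.0, 0.0])` gives 0.5. The penalty is zero when the original scores sufficiently higher than the distorted copy and grows linearly when the ordering is violated.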

Keywords

image generation; image quality assessment

Title: Text-Guided Patch Scoring and Local Distortion Guidance for Image Quality Assessment

Authors: Park, Juyong; Song, Jihun; Kim, Gye Wan; Hyun, Yoonsuk

DOI: 10.1109/CVPRW67362.2025.00083

Published: 2025

Type: Conference paper

Journal: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops

Pages: 781-790