TY - JOUR
T1 - IQA Vision Transformed
T2 - A Survey of Transformer Architectures in Perceptual Image Quality Assessment
AU - Rehman, Mobeen Ur
AU - Nizami, Imran Fareed
AU - Ullah, Farman
AU - Hussain, Irfan
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - In an era dominated by visual content, perceptual image quality assessment (IQA) is crucial for enhancing user experiences and driving technological advancements across various domains. This survey reviews the integration of Vision Transformers (ViTs) into both no-reference (NR) and full-reference (FR) IQA methods, highlighting their promise as alternatives to traditional techniques. ViTs leverage attention mechanisms to focus selectively on relevant image patches, aligning more closely with human perceptual judgments. We identify key limitations of conventional IQA methods and trace the evolution from early learning-based approaches to contemporary deep learning models, with a specific focus on ViTs. We discuss the performance of Transformer-based models in capturing image distortions and their strong correlation with subjective quality scores. We also discuss potential breakthroughs, including hybrid architectures that combine Capsule Networks and Transformers, adaptive IQA through meta-learning, and scalable solutions using quantum-inspired computing. These advances stand to enhance perceptual quality assessment, with substantial implications for fields such as medical imaging and multimedia applications. This study aims to lay the groundwork for future research in Transformer-based methodologies, offering new insights into the transformative impact of these models on IQA.
AB - In an era dominated by visual content, perceptual image quality assessment (IQA) is crucial for enhancing user experiences and driving technological advancements across various domains. This survey reviews the integration of Vision Transformers (ViTs) into both no-reference (NR) and full-reference (FR) IQA methods, highlighting their promise as alternatives to traditional techniques. ViTs leverage attention mechanisms to focus selectively on relevant image patches, aligning more closely with human perceptual judgments. We identify key limitations of conventional IQA methods and trace the evolution from early learning-based approaches to contemporary deep learning models, with a specific focus on ViTs. We discuss the performance of Transformer-based models in capturing image distortions and their strong correlation with subjective quality scores. We also discuss potential breakthroughs, including hybrid architectures that combine Capsule Networks and Transformers, adaptive IQA through meta-learning, and scalable solutions using quantum-inspired computing. These advances stand to enhance perceptual quality assessment, with substantial implications for fields such as medical imaging and multimedia applications. This study aims to lay the groundwork for future research in Transformer-based methodologies, offering new insights into the transformative impact of these models on IQA.
KW - Attention Mechanisms
KW - Capsule Networks
KW - Cross-Domain Evaluation
KW - Deep Learning for IQA
KW - Hybrid Architectures
KW - Meta-Learning for IQA
KW - Multimedia Applications
KW - Perceptual Image Quality Assessment (IQA)
KW - Quantum-Inspired Computing
KW - Transformer Architectures
KW - Vision Transformers (ViTs)
UR - http://www.scopus.com/inward/record.url?scp=85210961436&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2024.3506273
DO - 10.1109/ACCESS.2024.3506273
M3 - Article
AN - SCOPUS:85210961436
SN - 2169-3536
JO - IEEE Access
JF - IEEE Access
ER -