TY - JOUR
T1 - Multiscale convolutional transformer for robust detection of aquaculture defects
AU - Khan, Wilayat
AU - Hassan, Taimur
AU - Rehman, Mobeen Ur
AU - Alsaffar, Mohammad
AU - Hussain, Irfan
N1 - Publisher Copyright: © 2025
PY - 2025/5/10
Y1 - 2025/5/10
N2 - Accurate identification of aquatic defects is paramount for ensuring the safety of marine life within aquaculture environments. However, due to large disparities between photographic and underwater imagery, conventional deep learning models employed to monitor aquatic defects produce inadequate recognition performance. Furthermore, they require extensive ground-truth supervision on large-scale datasets, which limits their scalability in the real world. To overcome these issues, this paper proposes a novel convolutional transformer architecture that combines multi-scale convolutional feature representations with attentional projections to robustly recognize aquatic defects from underwater imagery irrespective of background clutter, color distortion, and scanner specifications. Moreover, unlike conventional fully supervised methods, the proposed model leverages self-supervision through its prior-learned experiences to perform aquatic defect extraction tasks across different datasets without incurring additional ground-truth labeling and re-training costs. The proposed model consistently outperforms state-of-the-art methods, achieving mean average precision scores of 0.72, 0.74, 0.80, and 0.82 on the NDv1, NDv2, LABUST, and KU datasets, respectively. These results reflect the effectiveness of the proposed approach in accurately identifying and delineating aquaculture defects across diverse underwater environments.
AB - Accurate identification of aquatic defects is paramount for ensuring the safety of marine life within aquaculture environments. However, due to large disparities between photographic and underwater imagery, conventional deep learning models employed to monitor aquatic defects produce inadequate recognition performance. Furthermore, they require extensive ground-truth supervision on large-scale datasets, which limits their scalability in the real world. To overcome these issues, this paper proposes a novel convolutional transformer architecture that combines multi-scale convolutional feature representations with attentional projections to robustly recognize aquatic defects from underwater imagery irrespective of background clutter, color distortion, and scanner specifications. Moreover, unlike conventional fully supervised methods, the proposed model leverages self-supervision through its prior-learned experiences to perform aquatic defect extraction tasks across different datasets without incurring additional ground-truth labeling and re-training costs. The proposed model consistently outperforms state-of-the-art methods, achieving mean average precision scores of 0.72, 0.74, 0.80, and 0.82 on the NDv1, NDv2, LABUST, and KU datasets, respectively. These results reflect the effectiveness of the proposed approach in accurately identifying and delineating aquaculture defects across diverse underwater environments.
KW - Aquaculture
KW - Biofouling
KW - Self-supervised learning
KW - Transformers
KW - Vegetation
UR - https://www.scopus.com/pages/publications/85218124438
U2 - 10.1016/j.eswa.2025.126820
DO - 10.1016/j.eswa.2025.126820
M3 - Article
AN - SCOPUS:85218124438
SN - 0957-4174
VL - 273
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 126820
ER -