
Lightweight image super-resolution based on retentive network

    • Shanghai Jiao Tong University
    • Assiut University

    Research output: Contribution to journal › Article › peer-review

    1 Scopus citations

    Abstract

    Recent advancements have demonstrated the effectiveness of retentive networks in natural language processing and high-level vision tasks. However, their potential in low-level vision tasks, such as image super-resolution (SR), remains underexplored. In this paper, we introduce RetNetSR, a novel retentive network architecture designed specifically for image super-resolution. Our approach leverages a spatial prior derived from the Manhattan distance to enhance the Self-Attention mechanism, effectively translating the temporal decay concept of RetNet into the spatial domain. At the core of RetNetSR is the Manhattan Self-Attention module, which integrates Self-Attention with multi-layer perceptrons and depthwise convolution. Building on this, we propose the Manhattan Self-Attention Block and the Manhattan Self-Attention Group, the latter further enriched with 3×3 convolutions and Enhanced Spatial Attention modules for more effective deep feature extraction. Experimental evaluations across multiple runs on standard image SR benchmarks show that RetNetSR consistently achieves competitive or superior results. Statistical analysis (paired t-test, p < 0.05) confirms that the improvements, such as raising PSNR from 32.76 to 32.86 on Urban100 (×2 scale) and SSIM from 0.9151 to 0.9156 on Manga109 (×4 scale), are statistically significant.
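
The idea of turning a temporal decay into a spatial prior can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formula, so the decay form `gamma ** manhattan_distance`, the value of `gamma`, the choice to multiply the mask into the softmax weights, and all function names below are assumptions made for illustration. Learned projections, multiple heads, and the depthwise convolution are left out.

```python
import math


def manhattan_decay_mask(height, width, gamma=0.9):
    """Build an (H*W) x (H*W) mask D with D[i][j] = gamma ** d(i, j),
    where d is the Manhattan distance between grid positions i and j.
    The decay form and gamma are illustrative assumptions."""
    coords = [(r, c) for r in range(height) for c in range(width)]
    return [
        [gamma ** (abs(r1 - r2) + abs(c1 - c2)) for (r2, c2) in coords]
        for (r1, c1) in coords
    ]


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def manhattan_self_attention(q, k, v, height, width, gamma=0.9):
    """Single-head attention over H*W tokens (row-major order) whose
    softmax weights are scaled elementwise by the Manhattan decay mask.
    The weights are not renormalized after masking (an assumption)."""
    scale = 1.0 / math.sqrt(len(q[0]))
    mask = manhattan_decay_mask(height, width, gamma)
    out = []
    for i, q_i in enumerate(q):
        scores = [scale * sum(a * b for a, b in zip(q_i, k_j)) for k_j in k]
        weights = [w * d for w, d in zip(softmax(scores), mask[i])]
        out.append([
            sum(w * v_j[c] for w, v_j in zip(weights, v))
            for c in range(len(v[0]))
        ])
    return out
```

With this mask, each pixel attends most strongly to itself and to its nearest neighbours along rows and columns, and the contribution of a distant pixel shrinks geometrically with its Manhattan distance, which mirrors how RetNet down-weights tokens that are far apart in time.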

    Original language: British English
    Article number: 130
    Journal: Neural Computing and Applications
    Volume: 38
    Issue number: 5
    DOIs
    State: Published - Mar 2026

    Keywords

    • Explicit spatial prior
    • Image super-resolution
    • Manhattan self-attention
    • Retentive network
    • Transformer model
