A Multiplier-Free RNS-Based CNN Accelerator Exploiting Bit-Level Sparsity

Vasilis Sakellariou, Vassilis Paliouras, Ioannis Kouretas, Hani Saleh, Thanos Stouraitis

    Research output: Contribution to journal › Article › peer-review

    3 Scopus citations

    Abstract

    In this work, a Residue Numbering System (RNS)-based Convolutional Neural Network (CNN) accelerator utilizing a multiplier-free distributed-arithmetic Processing Element (PE) is proposed. A method for maximizing the utilization of the arithmetic hardware resources is presented. It leads to an increase of the system's throughput, by exploiting bit-level sparsity within the weight vectors. The proposed PE design takes advantage of the properties of RNS and Canonical Signed Digit (CSD) encoding to achieve higher energy efficiency and effective processing rate, without requiring any compression mechanism or introducing any approximation. An extensive design space exploration for various parameters (RNS base, PE micro-architecture, encoding) using analytical models as well as experimental results from CNN benchmarks is conducted and the various trade-offs are analyzed. A complete end-to-end RNS accelerator is developed based on the proposed PE. The introduced accelerator is compared to traditional binary and RNS counterparts as well as to other state-of-the-art systems. Implementation results in a 22-nm process show that the proposed PE can lead to 1.85× and 1.54× more energy-efficient processing compared to binary and conventional RNS, respectively, with a 1.88× maximum increase of effective throughput for the employed benchmarks. Compared to a state-of-the-art, all-digital, RNS-based system, the proposed accelerator is 8.87× and 1.11× more energy- and area-efficient, respectively.
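The multiplier-free idea in the abstract can be illustrated with a small software sketch. This is not the authors' PE design: the moduli set `(31, 32, 33)`, the function names, and the sample vectors are all assumptions chosen for illustration. The sketch recodes each weight into CSD digits, then forms each modular product using only conditional add/subtract and doubling steps. Zero digits cost no arithmetic, which is the bit-level sparsity that hardware of this kind could exploit.

```python
from math import prod

# Assumed (hypothetical) RNS base {2^5 - 1, 2^5, 2^5 + 1}; pairwise coprime.
MODULI = (31, 32, 33)

def to_csd(n):
    """CSD (non-adjacent form) digits of integer n, LSB first, each in {-1, 0, 1}."""
    digits = []
    while n != 0:
        if n & 1:
            d = 2 - (n & 3)  # n mod 4 == 1 gives +1, n mod 4 == 3 gives -1
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits

def mul_mod_shift_add(x_res, w_digits, m):
    """Compute (x * w) mod m from CSD digits of w, without a multiplier."""
    acc = 0
    term = x_res % m
    for d in w_digits:
        if d == 1:
            acc = (acc + term) % m
        elif d == -1:
            acc = (acc - term) % m
        # Zero digits are skipped: no add or subtract is performed.
        term = (term + term) % m  # doubling; a shift in hardware
    return acc

def from_rns(residues, moduli=MODULI):
    """Chinese Remainder Theorem reconstruction into a signed range."""
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)  # modular inverse, Python 3.8+
    x %= M
    return x - M if x >= M // 2 else x

def rns_dot(activations, weights, moduli=MODULI):
    """Dot product computed independently in each residue channel."""
    residues = []
    for m in moduli:
        acc = 0
        for a, w in zip(activations, weights):
            acc = (acc + mul_mod_shift_add(a % m, to_csd(w), m)) % m
        residues.append(acc)
    return from_rns(residues, moduli)

def nonzero_digits(digits):
    """Number of add/subtract operations a digit string requires."""
    return sum(1 for d in digits if d != 0)
```

For example, `rns_dot([5, 9, 2, 11], [7, -3, 12, 0])` returns 32, matching the ordinary integer dot product. The weight 7 needs three additions in plain binary (`111`) but only two add/subtract steps in CSD (8 - 1), which shows why CSD recoding tends to increase the number of skippable zero digits.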

    Original language: British English
    Pages (from-to): 667-683
    Number of pages: 17
    Journal: IEEE Transactions on Emerging Topics in Computing
    Volume: 12
    Issue number: 2
    DOIs
    State: Published - 1 Apr 2024

    Keywords

    • AI hardware accelerator
    • canonical signed digit
    • RNS
