Efficient CNN Inference using Spatial Local Input Data Similarity

  • Fatmah Ali Alantali

Student thesis: Master's Thesis

Abstract

With the continuous rise in the demand for autonomous systems, many edge devices need to implement Artificial Intelligence (AI) techniques for a wide range of applications. However, these edge devices face many challenges when executing AI algorithms, especially when they must meet real-time requirements with limited energy and computing resources. Convolutional Neural Networks (CNNs) are a prevalent machine learning method that has revolutionized computer vision and achieved state-of-the-art performance in applications such as image processing, object recognition, and video classification. However, the intensive processing that CNNs require hinders their adoption in IoT nodes. This research focuses on exploiting the error-tolerant nature of deep learning systems and the similarity of spatially associated inputs. To improve throughput and energy efficiency, this work uses computational reuse and value prediction: part of the multiply-and-accumulate (MAC) operations for adjacent data is skipped, and the result is set equal to a previously computed value. The proposed techniques target minimal similarity matching and clustering of the input data. Two methods are proposed and tested. The first method, Spatial Locality in Input Data (SLID), exploits spatial locality in the input and incurs no overhead. It saves up to 34.9%, 49.84%, and 31.5% of MAC operations while reducing accuracy by 8%, 3.7%, and 5.0% for the LeNet, CIFAR-10, and AlexNet models, respectively. The second method involves a preprocessing step and minor comparisons to mitigate the accuracy loss in later layers. It saves 26.2%, 31.9%, and 33.15% of MAC operations and reduces accuracy by 1.9%, 1.6%, and 6.6% for the same three models, respectively. A simulation of the accuracy using RRAM technology is explored for the realization of the proposed approach. In addition, a mobile platform based on a Raspberry Pi is used to confirm the savings on actual hardware.
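The general idea of reusing a previously computed result for similar adjacent inputs can be illustrated with a minimal sketch. This is not the thesis's SLID algorithm or its second method, whose details the abstract does not give. It is a hypothetical 1-D example in which the function name `conv1d_with_reuse`, the `threshold` parameter, and the element-wise similarity test are all assumptions made for illustration. The cost of the comparisons themselves is ignored here.

```python
def conv1d_with_reuse(signal, kernel, threshold):
    """Illustrative 1-D sliding dot product with computation reuse.

    Hypothetical sketch: if the current window is element-wise within
    `threshold` of the last window that was actually computed, its MAC
    operations are skipped and the earlier output is reused.
    """
    k = len(kernel)
    outputs = []
    macs_done = 0
    macs_skipped = 0
    ref_window = None   # last window whose output was really computed
    ref_output = None   # the output computed for that window

    for start in range(len(signal) - k + 1):
        window = signal[start:start + k]
        similar = ref_window is not None and all(
            abs(a - b) <= threshold for a, b in zip(window, ref_window)
        )
        if similar:
            # Value prediction: equate the output to the stored one.
            outputs.append(ref_output)
            macs_skipped += k
        else:
            # Full multiply-and-accumulate for this window.
            ref_output = sum(w * x for w, x in zip(kernel, window))
            ref_window = window
            outputs.append(ref_output)
            macs_done += k

    return outputs, macs_done, macs_skipped


signal = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]
kernel = [1.0, 2.0, 1.0]
outputs, macs_done, macs_skipped = conv1d_with_reuse(signal, kernel, threshold=0.0)
```

With `threshold=0.0` only identical windows are reused, so the outputs match an ordinary sliding dot product. A larger threshold skips more MAC operations at the price of approximation error, which mirrors the savings-versus-accuracy trade-off reported in the abstract.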
Date of Award: Dec 2021
Original language: American English

Keywords

  • CNN
  • accelerator
  • convolutional
  • computation reuse
  • input similarity
  • preprocessing
  • approximate computing
  • in memory computing
  • RRAM
