TY - JOUR
T1 - SLID
T2 - Exploiting Spatial Locality in Input Data as a Computational Reuse Method for Efficient CNN
AU - Alantali, Fatmah
AU - Halawani, Yasmin
AU - Mohammad, Baker
AU - Al-Qutayri, Mahmoud
N1 - Funding Information:
This work was supported in part by Khalifa University, Abu Dhabi, United Arab Emirates, through the CIRA Project under Grant CIRA-2018-026 and the System-on-Chip Center under Grant RC2-2018-020.
Publisher Copyright:
© 2013 IEEE.
PY - 2021
Y1 - 2021
N2 - Convolutional Neural Networks (CNNs) have revolutionized computer vision and reached state-of-the-art performance in image processing, object recognition, and video classification. Although CNN inference is notoriously compute-intensive, with convolutions accounting for >90% of the total operations, the ability to trade off accuracy, performance, power, and latency to meet the requirements of a target application keeps it an open research topic. This paper proposes the Spatial Locality Input Data (SLID) method for computational reuse during the inference stage of a pre-trained network. The method exploits the spatial locality of input data by skipping part of the multiply-and-accumulate (MAC) operations for adjacent data and setting their results equal to previously computed ones. SLID improves the throughput of resource-constrained devices (Internet-of-Things, edge devices) and accelerates computation during the inference phase by reducing the number of MAC operations. This approximate computing scheme requires neither a similarity quantification step nor any modification to the training stage. The method was evaluated on three well-known, distinct CNN structures and data sets with alternating layer selections: LeNet, CIFAR-10, and AlexNet. It saves up to 34.9%, 49.84%, and 31.5% of MAC operations while reducing accuracy by 8%, 3.7%, and 5.0% for these three models, respectively. In addition, the proposed method reduces memory accesses by eliminating data fetches for skipped inputs. Furthermore, the effects of filter size, stride, and padding on accuracy and operation savings are analyzed. SLID is the first work to exploit input spatial locality to save CNN convolution operations with minimal accuracy loss and without memory or computational overhead. This makes it a strong option for supporting intelligence at the edge.
AB - Convolutional Neural Networks (CNNs) have revolutionized computer vision and reached state-of-the-art performance in image processing, object recognition, and video classification. Although CNN inference is notoriously compute-intensive, with convolutions accounting for >90% of the total operations, the ability to trade off accuracy, performance, power, and latency to meet the requirements of a target application keeps it an open research topic. This paper proposes the Spatial Locality Input Data (SLID) method for computational reuse during the inference stage of a pre-trained network. The method exploits the spatial locality of input data by skipping part of the multiply-and-accumulate (MAC) operations for adjacent data and setting their results equal to previously computed ones. SLID improves the throughput of resource-constrained devices (Internet-of-Things, edge devices) and accelerates computation during the inference phase by reducing the number of MAC operations. This approximate computing scheme requires neither a similarity quantification step nor any modification to the training stage. The method was evaluated on three well-known, distinct CNN structures and data sets with alternating layer selections: LeNet, CIFAR-10, and AlexNet. It saves up to 34.9%, 49.84%, and 31.5% of MAC operations while reducing accuracy by 8%, 3.7%, and 5.0% for these three models, respectively. In addition, the proposed method reduces memory accesses by eliminating data fetches for skipped inputs. Furthermore, the effects of filter size, stride, and padding on accuracy and operation savings are analyzed. SLID is the first work to exploit input spatial locality to save CNN convolution operations with minimal accuracy loss and without memory or computational overhead. This makes it a strong option for supporting intelligence at the edge.
KW - accelerator
KW - approximate computing
KW - CNN
KW - computation reuse
KW - in-memory computing
KW - input similarity
KW - RRAM
UR - http://www.scopus.com/inward/record.url?scp=85103912300&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2021.3071409
DO - 10.1109/ACCESS.2021.3071409
M3 - Article
AN - SCOPUS:85103912300
SN - 2169-3536
VL - 9
SP - 57179
EP - 57187
JO - IEEE Access
JF - IEEE Access
M1 - 9395591
ER -