TY - GEN
T1 - Efficient CNN Hardware Architecture Based on Linear Approximation and Computation Reuse Technique
AU - Tolba, Mohammed F.
AU - Saleh, Hani
AU - Al-Qutayri, Mahmoud
AU - Hroub, Ayman
AU - Stouraitis, Thanos
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
AB - Large deep neural network (DNN) models pose significant computational and memory challenges, particularly when they are deployed on edge devices. To address this, techniques such as pruning, quantization, data sparsity, and data reuse have been applied to DNNs, mitigating memory and computational complexity at the cost of some accuracy loss. This paper introduces an efficient hardware accelerator tailored for Convolutional Neural Networks (CNNs). The proposed architecture results from a co-optimized approach encompassing both algorithms and hardware. It leverages linear approximation of pre-trained network weights with minimal accuracy loss. A novel computation-reuse method, integrated into the dedicated elements of the CNN design, curtails the number of multiplications, additions, and memory accesses. To validate the effectiveness of this architecture, we conducted experiments on a gem5-based RISC-V simulator, employing the VGG16 model on the CIFAR-100 dataset and the AlexNet model on the Tiny ImageNet dataset. The results showed a speedup of approximately 2× on AlexNet over the reference model. Additionally, the proposed CNN design was implemented on a Xilinx Kintex-7 Field Programmable Gate Array (FPGA), using fewer hardware resources than prior designs. This work serves as a versatile framework for evaluating trade-offs among accuracy, latency, power consumption, and cost across different CNN architectures.
KW - AI accelerator
KW - approximate computing
KW - computational reuse
KW - deep neural network
KW - hardware acceleration
UR - http://www.scopus.com/inward/record.url?scp=85183318272&partnerID=8YFLogxK
U2 - 10.1109/ICM60448.2023.10378935
DO - 10.1109/ICM60448.2023.10378935
M3 - Conference contribution
AN - SCOPUS:85183318272
T3 - Proceedings of the International Conference on Microelectronics, ICM
SP - 7
EP - 10
BT - 2023 International Conference on Microelectronics, ICM 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 International Conference on Microelectronics, ICM 2023
Y2 - 17 November 2023 through 20 November 2023
ER -