TY - GEN
T1 - Enhanced CNN Performance without Retraining Via Weight Approximation and Data Reuse
AU - Mohammed Tolba, Mohammed
AU - Saleh, Hani
AU - Mohammad, Baker
AU - Al Qutayri, Mahmoud
AU - Stouraitis, Thanos
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - This paper introduces an efficient CNN algorithm to address key limitations in Deep Neural Networks (DNNs) used for image recognition, focusing particularly on model size and retraining time. Traditional methods often require significant training durations; however, applying approximation techniques during retraining can exacerbate these time demands. We present an approach that enhances approximation techniques while eliminating the need for model retraining, thus enabling DNN compression with minimal accuracy loss. The proposed method integrates three core strategies: weight arrangement, approximation, and data reuse. The DNN weights are initially arranged in ascending order to optimize subsequent operations. During inference, the approximation is applied to reduce the model size and minimize computational complexity by reducing the number of operations required for each multiply-accumulate (MAC) unit. Then, the original weights are replaced with the approximated values, enabling the reuse of computations and data across different sets of weights. As a result, the method significantly reduces memory access, computational demands, and energy consumption. Experimental results on the CIFAR-10 and TinyImageNet datasets demonstrate that our method achieves a model reduction rate of approximately 198.6× while maintaining a minimal loss in accuracy. The proposed technique bypasses the need for retraining, offering a practical solution to the growing complexity of DNN models in modern applications.
AB - This paper introduces an efficient CNN algorithm to address key limitations in Deep Neural Networks (DNNs) used for image recognition, focusing particularly on model size and retraining time. Traditional methods often require significant training durations; however, applying approximation techniques during retraining can exacerbate these time demands. We present an approach that enhances approximation techniques while eliminating the need for model retraining, thus enabling DNN compression with minimal accuracy loss. The proposed method integrates three core strategies: weight arrangement, approximation, and data reuse. The DNN weights are initially arranged in ascending order to optimize subsequent operations. During inference, the approximation is applied to reduce the model size and minimize computational complexity by reducing the number of operations required for each multiply-accumulate (MAC) unit. Then, the original weights are replaced with the approximated values, enabling the reuse of computations and data across different sets of weights. As a result, the method significantly reduces memory access, computational demands, and energy consumption. Experimental results on the CIFAR-10 and TinyImageNet datasets demonstrate that our method achieves a model reduction rate of approximately 198.6× while maintaining a minimal loss in accuracy. The proposed technique bypasses the need for retraining, offering a practical solution to the growing complexity of DNN models in modern applications.
KW - approximate computing
KW - computational reuse
KW - Deep neural network
KW - Hardware acceleration
UR - https://www.scopus.com/pages/publications/105010647139
U2 - 10.1109/ISCAS56072.2025.11044254
DO - 10.1109/ISCAS56072.2025.11044254
M3 - Conference contribution
AN - SCOPUS:105010647139
T3 - Proceedings - IEEE International Symposium on Circuits and Systems
BT - ISCAS 2025 - IEEE International Symposium on Circuits and Systems, Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2025 IEEE International Symposium on Circuits and Systems, ISCAS 2025
Y2 - 25 May 2025 through 28 May 2025
ER -
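
Editorial note: the abstract above outlines three steps (arranging weights in ascending order, approximating them, and reusing computation across weights that share an approximated value) without giving implementation detail. The sketch below is not the authors' code; it is a minimal illustration of that general idea under assumptions of my own: NumPy, a uniform 16-level quantization as the approximation rule, and the function names arrange_and_approximate / mac_with_reuse, all of which are hypothetical and may differ from the paper's actual method.

import numpy as np

def arrange_and_approximate(w, n_levels=16):
    # Step 1 (arrangement): sort the weights in ascending order so that
    # weights sharing an approximated value fall into contiguous runs.
    order = np.argsort(w)
    w_sorted = w[order]
    # Step 2 (approximation): snap each weight to the nearest of n_levels
    # uniformly spaced values (an assumed scheme; the paper's rule may differ).
    levels = np.linspace(w_sorted[0], w_sorted[-1], n_levels)
    idx = np.abs(w_sorted[:, None] - levels[None, :]).argmin(axis=1)
    return order, levels, idx

def mac_with_reuse(x, order, levels, idx):
    # Step 3 (data reuse): for each group of weights that share the same
    # approximated value, sum the paired inputs first and multiply by that
    # value once, so a single multiply is reused across the whole group.
    acc = 0.0
    x_sorted = x[order]                  # align inputs with the sorted weights
    for k, v in enumerate(levels):
        group = x_sorted[idx == k]       # inputs paired with shared value v
        if group.size:
            acc += v * group.sum()       # one multiply per shared value
    return acc

# Toy usage: compare the approximated, reuse-based dot product with the exact one.
rng = np.random.default_rng(0)
w = rng.standard_normal(256).astype(np.float32)
x = rng.standard_normal(256).astype(np.float32)
order, levels, idx = arrange_and_approximate(w)
print(mac_with_reuse(x, order, levels, idx), float(x @ w))

With 16 shared values, the inner loop performs at most 16 multiplications per 256-element MAC instead of 256, which is the kind of operation-count and memory-access reduction the abstract attributes to replacing original weights with approximated values; the actual reduction rates and accuracy figures reported (e.g., the 198.6x model reduction) come from the paper's own method, not from this sketch.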