Schnöll, D., Wess, M., Bittner, M., Götzinger, M., & Jantsch, A. (2023). Fast, Quantization Aware DNN Training for Efficient HW Implementation. In 2023 26th Euromicro Conference on Digital System Design (DSD) (pp. 700–707). https://doi.org/10.1109/DSD60849.2023.00100
Published in:
2023 26th Euromicro Conference on Digital System Design (DSD)
-
ISBN:
979-8-3503-4419-6
-
Date (published):
2023
-
Event name:
26th Euromicro Conference on Digital System Design (DSD 2023)
-
Event period:
6-Sep-2023 to 8-Sep-2023
-
Event location:
Golem, Durres, Albania
-
Extent:
8 pages
-
Peer Reviewed:
Yes
-
Keywords:
Convolution; hardware-friendly; Neural networks; Quantization (signal); Quantization Aware Training; Training
-
Abstract:
Quantization of Deep Neural Networks (DNNs) is a central technique for reducing the computational load on embedded devices. Even in quantized DNNs, the scaler/rescaler following a convolution or dense layer often requires a high-bit-width multiplication and a shift. Previous work has proposed removing the multiplier by restricting the quantization method. We propose a Quantization Aware Training (QAT) approach that explicitly models the rescaler during training, avoiding restrictions on the quantization function while achieving a 30-35% improvement in training time and a significant reduction in memory requirements compared to the state of the art. GitHub: https://github.com/embedded-machine-learning/FastQATforPOTRescaler
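To illustrate the rescaling step the abstract refers to, here is a minimal sketch (not the authors' implementation; function names, bit widths, and constants are illustrative assumptions): a generic integer rescaler needs a high-bit-width multiply plus a shift, whereas a power-of-two (POT) rescaler reduces to a shift alone, which is the hardware simplification the paper's QAT approach targets.

    # Minimal sketch of integer rescaling after a quantized conv/dense layer.
    # Illustrative only; not the paper's code. Bit widths are assumptions.
    import numpy as np

    def rescale_mult_shift(acc, multiplier, shift):
        # Generic rescaler: widen the 32-bit accumulator, multiply by a
        # fixed-point multiplier, then round and arithmetic-right-shift.
        prod = acc.astype(np.int64) * multiplier
        return ((prod + (1 << (shift - 1))) >> shift).astype(np.int32)

    def rescale_pot(acc, shift):
        # POT rescaler: the scale is 2**(-shift), so the multiplier
        # disappears and only the shift (plus rounding) remains.
        return ((acc.astype(np.int64) + (1 << (shift - 1))) >> shift).astype(np.int32)

    acc = np.array([51234, -7777], dtype=np.int32)  # example accumulator values
    print(rescale_mult_shift(acc, multiplier=1_431_655_765, shift=32))  # approx. acc / 3
    print(rescale_pot(acc, shift=7))                                    # acc / 128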
-
Project (external):
Christian Doppler Forschungsgesellschaft
-
Research focus areas:
Mathematical and Algorithmic Foundations: 60%; Computer Science Foundations: 30%; Computational System Design: 10%