Wess, M., Schnöll, D., Dallinger, D., Bittner, M., & Jantsch, A. (2024). Conformal Prediction Based Confidence for Latency Estimation of DNN Accelerators: A Black-Box Approach. IEEE Access, 12, 109847–109860. https://doi.org/10.1109/ACCESS.2024.3439850
Today, a large number of embedded hardware platforms exist for accelerating the inference of Deep Neural Networks (DNNs). To enable rapid application development, several prediction frameworks have been proposed to estimate DNN inference latency across a wide range of hardware platforms. This work presents a novel smart padding benchmarking method that allows hardware platforms to be profiled without requiring detailed per-layer reports. To mitigate the measurement inaccuracies inherent in the black-box approach, a confidence framework comprising three metrics has been developed. These metrics not only aid the interpretation of prediction results but also contribute significantly to refining the estimation framework itself, since they help improve the training dataset's coverage of relevant layers and expose its weaknesses. Empirical results demonstrate the method's robustness: average prediction errors of smart padding benchmarking-based ANNETTE predictions fall below 10% for the Jetson Xavier, NXP i.MX93, and NXP i.MX8M+.
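
The abstract does not spell out the three confidence metrics, so the following is only a minimal sketch of generic split conformal prediction, the technique named in the title, applied to a latency estimate. It is not the authors' implementation: the function names `conformal_quantile` and `latency_interval`, the use of absolute residuals as the nonconformity score, and the millisecond units are all assumptions made for illustration.

```python
import math


def conformal_quantile(residuals, alpha):
    """Split-conformal quantile of nonconformity scores (hypothetical helper).

    Uses the finite-sample corrected rank ceil((n + 1) * (1 - alpha)).
    Returns infinity when the calibration set is too small for the
    requested coverage level.
    """
    n = len(residuals)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(residuals)[rank - 1]


def latency_interval(predicted_ms, calib_pred_ms, calib_meas_ms, alpha=0.1):
    """Interval around a latency prediction targeting 1 - alpha coverage.

    Assumption: the nonconformity score is the absolute error between
    predicted and measured latency on a held-out calibration set.
    """
    residuals = [abs(m - p) for p, m in zip(calib_pred_ms, calib_meas_ms)]
    q = conformal_quantile(residuals, alpha)
    # Latency cannot be negative, so the lower bound is clipped at zero.
    return max(0.0, predicted_ms - q), predicted_ms + q


# Illustrative calibration data (made up): nine layers, errors of 1..9 ms.
calib_pred = [10.0] * 9
calib_meas = [10.0 + k for k in range(1, 10)]
low, high = latency_interval(20.0, calib_pred, calib_meas, alpha=0.5)
```

A wide interval signals that the calibration measurements disagree strongly with the estimator, which is the kind of low-confidence case a black-box benchmarking flow would want to flag.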