Putra, R. V. W., Hanif, M. A., & Shafique, M. (2023). An Off-Chip Memory Access Optimization for Embedded Deep Learning Systems. In S. Pasricha & M. Shafique (Eds.), Embedded Machine Learning for Cyber-Physical, IoT, and Edge Computing: Hardware Architectures (pp. 175–198). Springer. https://doi.org/10.1007/978-3-031-19568-6_6
E191-02 - Research Unit Embedded Computing Systems
Published in:
Embedded Machine Learning for Cyber-Physical, IoT, and Edge Computing: Hardware Architectures
ISBN:
978-3-031-19568-6
Date (published):
1-Oct-2023
Number of Pages:
24
Publisher:
Springer, Cham
Peer reviewed:
Yes
Keywords:
deep learning; hardware accelerator; off-chip DRAM accesses; data partitioning and scheduling; energy efficiency; embedded systems
Abstract:
Implementations of Deep Neural Networks (DNNs) or Deep Learning (DL) for embedded applications may improve users' quality of life, as DL has become a prominent solution for many machine learning (ML) tasks, such as personalized healthcare assistance. Such implementations require high energy efficiency, since embedded applications usually operate under tight constraints, such as small memory and low power/energy budgets. Therefore, specialized hardware accelerators are typically employed to expedite DL inference. However, previous works have shown that DL accelerators still suffer from the high energy consumption of DRAM-based off-chip memory accesses, which hinders embedded DL implementations. In this chapter, we discuss our design methodology for optimizing the energy consumption of DRAM accesses in DL accelerators targeting embedded applications. The methodology employs an exploration technique to find the data partitioning and scheduling that incur the minimum number of DRAM accesses for a given DNN model, and it exploits low-latency DRAMs to perform these accesses with minimum DRAM access energy.
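As a rough illustration of the kind of exploration the abstract describes, the sketch below exhaustively searches tile sizes (a data-partitioning choice) for a single convolutional layer and selects the one with the fewest estimated DRAM accesses under an on-chip buffer budget. The layer dimensions, buffer size, access-count model, and all names are illustrative assumptions, not the chapter's actual formulation.

```python
# Hypothetical sketch: explore tilings of a conv layer to minimize estimated
# DRAM accesses under a buffer budget. All constants and the cost model are
# illustrative assumptions, not the chapter's actual methodology.
from itertools import product

BUFFER_BYTES = 128 * 1024  # assumed on-chip buffer capacity (128 KiB)
BYTES_PER_ELEM = 1         # assumed 8-bit quantized weights/activations

def footprint(tm, tn, th, tw, k):
    """On-chip bytes needed for one tile: inputs, weights, and outputs."""
    ins = tn * (th + k - 1) * (tw + k - 1)
    wts = tm * tn * k * k
    outs = tm * th * tw
    return (ins + wts + outs) * BYTES_PER_ELEM

def dram_accesses(M, N, H, W, k, tm, tn, th, tw):
    """Rough model: count elements fetched/written each time a tile loop
    body executes; outputs are accumulated on-chip across the n-loop."""
    sm, sn, sh, sw = -(-M // tm), -(-N // tn), -(-H // th), -(-W // tw)
    ins = sm * sn * sh * sw * tn * (th + k - 1) * (tw + k - 1)
    wts = sm * sn * sh * sw * tm * tn * k * k
    outs = sm * sh * sw * tm * th * tw
    return ins + wts + outs

def explore(M=64, N=32, H=56, W=56, k=3):
    """Exhaustively search tile sizes that fit the buffer; return the best."""
    best = None
    for tm, tn, th, tw in product((1, 2, 4, 8, 16, 32, 64), repeat=4):
        if tm > M or tn > N or th > H or tw > W:
            continue
        if footprint(tm, tn, th, tw, k) > BUFFER_BYTES:
            continue
        cost = dram_accesses(M, N, H, W, k, tm, tn, th, tw)
        if best is None or cost < best[0]:
            best = (cost, (tm, tn, th, tw))
    return best

if __name__ == "__main__":
    cost, tiles = explore()
    print(f"min DRAM accesses ~= {cost}, tile (tm, tn, th, tw) = {tiles}")
```

The chapter's actual exploration additionally considers scheduling (loop ordering and data reuse across tiles) and maps the resulting access pattern onto low-latency DRAMs; this sketch only shows the partitioning search in isolation.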
Research Areas:
Computer Engineering and Software-Intensive Systems: 100%