Putra, R. V. W., Hanif, M. A., & Shafique, M. (2023). An Off-Chip Memory Access Optimization for Embedded Deep Learning Systems. In S. Pasricha & M. Shafique (Eds.), Embedded Machine Learning for Cyber-Physical, IoT, and Edge Computing: Hardware Architectures (pp. 175–198). Springer. https://doi.org/10.1007/978-3-031-19568-6_6
E191-02 - Research Unit Embedded Computing Systems
Published in:
Embedded Machine Learning for Cyber-Physical, IoT, and Edge Computing: Hardware Architectures
ISBN:
978-3-031-19568-6
Date (published):
1-Oct-2023
Number of Pages:
24
Publisher:
Springer, Cham
Peer reviewed:
Yes
Keywords:
deep learning; hardware accelerator; off-chip DRAM accesses; data partitioning and scheduling; energy efficiency; embedded systems
Abstract:
Implementations of Deep Neural Networks (DNNs) or Deep Learning (DL) for embedded applications may improve users' quality of life, as DL has become a prominent solution for many machine learning (ML) tasks, such as personalized healthcare assistance. Such implementations require high energy efficiency, since embedded applications usually operate under tight constraints, such as small memory and low power/energy budgets. Therefore, specialized hardware accelerators are typically employed to expedite DL inference. However, previous works have shown that DL accelerators still suffer from the high energy consumption of DRAM-based off-chip memory accesses, which hinders embedded DL implementations. In this chapter, we discuss our design methodology for optimizing the energy consumption of DRAM accesses in DL accelerators targeting embedded applications. The methodology employs an exploration technique to find the data partitioning and scheduling that incur the minimum number of DRAM accesses for a given DNN model, and it exploits low-latency DRAMs to perform these accesses with minimum DRAM access energy.
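As a rough illustration of the kind of exploration the abstract describes, the sketch below exhaustively searches tile sizes (a data-partitioning choice) for a single convolutional layer and selects the one with the fewest estimated DRAM accesses under an on-chip buffer budget. The layer dimensions, buffer size, access-count model, and all names are illustrative assumptions, not the chapter's actual formulation.

```python
# Hypothetical sketch: explore tilings of a conv layer to minimize estimated
# DRAM accesses under a buffer budget. All constants and the cost model are
# illustrative assumptions, not the chapter's actual methodology.
from itertools import product

BUFFER_BYTES = 128 * 1024  # assumed on-chip buffer capacity (128 KiB)
BYTES_PER_ELEM = 1         # assumed 8-bit quantized weights/activations

def footprint(tm, tn, th, tw, k):
    """On-chip bytes needed for one tile: inputs, weights, and outputs."""
    ins = tn * (th + k - 1) * (tw + k - 1)
    wts = tm * tn * k * k
    outs = tm * th * tw
    return (ins + wts + outs) * BYTES_PER_ELEM

def dram_accesses(M, N, H, W, k, tm, tn, th, tw):
    """Rough model: count elements fetched/written each time a tile loop
    body executes; outputs are accumulated on-chip across the n-loop."""
    sm, sn, sh, sw = -(-M // tm), -(-N // tn), -(-H // th), -(-W // tw)
    ins = sm * sn * sh * sw * tn * (th + k - 1) * (tw + k - 1)
    wts = sm * sn * sh * sw * tm * tn * k * k
    outs = sm * sh * sw * tm * th * tw
    return ins + wts + outs

def explore(M=64, N=32, H=56, W=56, k=3):
    """Exhaustively search tile sizes that fit the buffer; return the best."""
    best = None
    for tm, tn, th, tw in product((1, 2, 4, 8, 16, 32, 64), repeat=4):
        if tm > M or tn > N or th > H or tw > W:
            continue
        if footprint(tm, tn, th, tw, k) > BUFFER_BYTES:
            continue
        cost = dram_accesses(M, N, H, W, k, tm, tn, th, tw)
        if best is None or cost < best[0]:
            best = (cost, (tm, tn, th, tw))
    return best

if __name__ == "__main__":
    cost, tiles = explore()
    print(f"min DRAM accesses ~= {cost}, tile (tm, tn, th, tw) = {tiles}")
```

The chapter's actual exploration additionally considers scheduling (loop ordering and data reuse across tiles) and maps the resulting access pattern onto low-latency DRAMs; this sketch only shows the partitioning search in isolation.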
Research Areas:
Computer Engineering and Software-Intensive Systems: 100%