Huang, Y., Qiao, X., Lai, W., Dustdar, S., Zhang, J., & Li, J. (2022). Enabling DNN Acceleration With Data and Model Parallelization Over Ubiquitous End Devices. IEEE Internet of Things Journal, 9(16), 15053–15065. https://doi.org/10.1109/JIOT.2021.3112715
Keywords:
Collaborative inference; cross-platform; deep learning (DL); distributed deep neural network (DNN); ubiquitous end devices
Language: en
Abstract:
Deep neural networks (DNNs) show great promise in bringing more intelligence to ubiquitous end devices. However, existing partition-offloading schemes adopt data-parallel or model-parallel collaboration between devices and the cloud, which does not make full use of end-device resources for deep-level parallel execution. This article proposes eDDNN (enabling Distributed DNN), a collaborative inference scheme over heterogeneous end devices built on cross-platform Web technology, which moves computation close to ubiquitous end devices, improves resource utilization, and reduces the computing pressure on data centers. eDDNN implements D2D communication and collaborative inference among heterogeneous end devices with the WebRTC protocol, divides the input data and the corresponding DNN model into pieces simultaneously, and then executes inference almost independently by establishing a layer dependency table. In addition, eDDNN provides a dynamic allocation algorithm based on deep reinforcement learning to minimize latency. We conduct experiments on various data sets and DNNs and further employ eDDNN in a mobile Web AR application to illustrate its effectiveness. The results show that eDDNN reduces latency by 2.98×, lowers mobile energy consumption by 1.8×, and relieves the computing pressure on the edge server by 2.57× against a typical partition-offloading approach.
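To make the partitioning idea in the abstract concrete, here is a minimal, purely illustrative Python sketch of a layer dependency table for a sequential model, with a naive round-robin device assignment standing in for the paper's deep-reinforcement-learning allocator. All names (`build_dependency_table`, `assign_round_robin`, the device labels) are invented for illustration and are not the authors' implementation.

```python
def build_dependency_table(num_layers):
    """Toy layer dependency table: in a sequential model,
    each layer depends only on its predecessor."""
    return {layer: ([] if layer == 0 else [layer - 1])
            for layer in range(num_layers)}

def assign_round_robin(num_layers, devices):
    """Assign layers to devices round-robin. A real scheduler
    (the paper uses deep reinforcement learning) would instead
    choose assignments that minimize end-to-end latency."""
    return {layer: devices[layer % len(devices)]
            for layer in range(num_layers)}

deps = build_dependency_table(4)
plan = assign_round_robin(4, ["phone", "tablet"])
```

With the dependency table in hand, each device can start a layer as soon as the layers it depends on have produced their outputs, which is what allows the near-independent execution the abstract describes.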
Funding:
National Key Research and Development Program of China (Grant 2019YFF0301500); Funds for International Cooperation and Exchange of NSFC (Grant 61720106007); 111 Project (Grant B18008)