<div class="csl-bib-body">
<div class="csl-entry">Reisinger, M. (2022). <i>System support &amp; orchestration mechanisms for distributed DNN inference</i> [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2022.87400</div>
</div>
-
dc.identifier.uri
https://doi.org/10.34726/hss.2022.87400
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/20215
-
dc.description.abstract
The use of edge computing as a platform for distributed DNN inference is an active area of research. Recent work proposes new neural network architectures that facilitate the distribution of DNN workloads in such environments. In addition to the classifier at a DNN’s final layer, these architectures introduce side-exit classifiers at intermediate layers. With this approach it is possible to obtain inference results at earlier points in the network and thereby reduce the compute overhead, which is critical for operation on resource-constrained devices. This thesis follows a recent line of research that uses this novel architecture to shift DNN computations towards less powerful devices at the edge of the network in order to improve user experience. In contrast to related work, which focuses more on algorithmic aspects of optimizing the distributed execution of DNNs, this thesis focuses on the design aspects that enable the implementation of an extensible orchestration framework for distributing inference of feed-forward DNN models. Each host in the compute hierarchy operates a runtime environment that offers APIs for orchestration and execution of DNNs, as well as a component for monitoring the node’s resource levels and network conditions. Compute nodes are required to register with a central controller, which maintains a global view of the compute hierarchy. Finally, a scheduler decides on the deployment and orchestration of a given DNN model over the available compute resources. From a software architecture perspective, the scheduler offers a plugin framework that allows system users to implement and apply their own algorithms for custom placement policies. The system also comes with a number of ready-made strategies that aim to minimize end-to-end latency of DNN inference. We show that, with respect to minimizing latency, the optimal placement of layers in the described system landscapes is an NP-hard combinatorial optimization problem.
Therefore, we provide an exact algorithm, in the form of an integer linear program, that solves the placement problem to optimality, as well as heuristic approaches for larger problem instances. Finally, experimental studies evaluate a prototypical system implementation in simulation-based scenarios and on a physical test-bed. On simulated compute hierarchies, the exact placement clearly outperforms the traditional cloud-centric placement. A feasibility study on a physical test-bed confirms that the system is able to identify efficient placements based on monitored environmental conditions.
en
dc.language
English
-
dc.language.iso
en
-
dc.rights.uri
http://rightsstatements.org/vocab/InC/1.0/
-
dc.subject
Artificial Intelligence
en
dc.subject
Deep Neural Networks
en
dc.subject
Edge Computing
en
dc.subject
Fog Computing
en
dc.title
System support & orchestration mechanisms for distributed DNN inference
en
dc.type
Thesis
en
dc.type
Hochschulschrift
de
dc.rights.license
In Copyright
en
dc.rights.license
Urheberrechtsschutz
de
dc.identifier.doi
10.34726/hss.2022.87400
-
dc.contributor.affiliation
TU Wien, Österreich
-
dc.rights.holder
Matthias Reisinger
-
dc.publisher.place
Wien
-
tuw.version
vor
-
tuw.thesisinformation
Technische Universität Wien
-
dc.contributor.assistant
Frangoudis, Pantelis
-
tuw.publication.orgunit
E194 - Institut für Information Systems Engineering