<div class="csl-bib-body">
<div class="csl-entry">Loghmani, M. R. (2020). <i>Object classification for robot vision through RGB-D recognition and domain adaptation</i> [Dissertation, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2020.80401</div>
</div>
-
dc.identifier.uri
https://doi.org/10.34726/hss.2020.80401
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/14974
-
dc.description
Title differs from the original, as translated by the author
-
dc.description.abstract
Object recognition, or object classification, is an essential skill for robot visual perception systems since it constitutes the foundation for higher-level tasks like object detection, pose estimation and manipulation. Nonetheless, recognizing objects in unconstrained environments remains arduous, with robots facing challenges such as intra-class variation, occlusion, clutter, viewpoint variation, and changes in light and scale. Deep convolutional neural networks (CNNs) have revolutionized object classification and computer vision as a whole. However, standard computer vision benchmarks often fail to address all the challenges of robot vision. This results in the development of classification models that perform poorly when deployed on a robot in the wild. In this thesis, we perform a systematic study of object recognition for robot vision and propose algorithmic innovations that tackle different aspects of this multifaceted problem. We first collect a robot-centric dataset called the autonomous robot indoor dataset and test the performance of well-known CNN architectures on it. This evaluation indicates two main lines of research for more reliable and robust object recognition: (i) the integration of geometric information, in the form of depth data, with the standard RGB data, and (ii) the use of domain adaptation to bridge the gap between the training (source) data and the real (target) data the robot encounters. To combine RGB and depth data, we propose recurrent convolutional fusion: a novel architecture that extracts features from different layers of a two-stream CNN and combines them using a recurrent neural network. To perform domain adaptation on RGB-D data, we propose a multi-task learning method that, in addition to the standard recognition task, learns to predict the relative rotation between the RGB and depth image of a sample.
We go one step further and consider the more realistic problem of open set domain adaptation (OSDA), which requires adapting two domains when the target contains not only the known classes of the source, but also unknown classes. We propose positive-unlabeled reconstruction encoding, an algorithm that uses the theoretical framework of positive-unlabeled learning and a novel loss based on sample reconstruction to recognize the unknown classes of the target. We further improve upon this algorithm by proposing rotation-based open set, which performs both the adaptation and the known/unknown recognition using the self-supervised task of relative rotation. Extensive quantitative and qualitative experiments on standard benchmarks and newly collected datasets empirically validate our algorithmic contributions. These methods push the state of the art in RGB-D object recognition and domain adaptation and bring us closer to building robotic systems with human-like recognition performance.
en
dc.language
English
-
dc.language.iso
en
-
dc.rights.uri
http://rightsstatements.org/vocab/InC/1.0/
-
dc.subject
Robot
en
dc.subject
Computer vision
en
dc.subject
recognition
en
dc.subject
real-world
en
dc.title
Object classification for robot vision through RGB-D recognition and domain adaptation
en
dc.title.alternative
Objektklassifizierung für Robotersehen mit RGB-D Erkennung und Domain Adaptierung
de
dc.type
Thesis
en
dc.type
Hochschulschrift
de
dc.rights.license
In Copyright
en
dc.rights.license
Urheberrechtsschutz
de
dc.identifier.doi
10.34726/hss.2020.80401
-
dc.contributor.affiliation
TU Wien, Österreich
-
dc.rights.holder
Mohammad Reza Loghmani
-
dc.publisher.place
Wien
-
tuw.version
vor
-
tuw.thesisinformation
Technische Universität Wien
-
dc.contributor.assistant
Patten, Timothy Michael
-
tuw.publication.orgunit
E376 - Institut für Automatisierungs- und Regelungstechnik