Title: Object classification for robot vision through RGB-D recognition and domain adaptation
Other Titles: Objektklassifizierung für Robotersehen mit RGB-D Erkennung und Domain Adaptierung
Language: English
Authors: Loghmani, Mohammad Reza 
Qualification level: Doctoral
Advisor: Vincze, Markus 
Assisting Advisor: Patten, Timothy Michael 
Issue Date: 2020
Number of Pages: 95
Object recognition, or object classification, is an essential skill for robot visual perception systems since it constitutes the foundation for higher-level tasks like object detection, pose estimation and manipulation. Nonetheless, recognizing objects in unconstrained environments remains arduous, with robots facing challenges such as intra-class variation, occlusion, clutter, viewpoint variation, and changes in lighting and scale. Deep convolutional neural networks (CNNs) have revolutionized object classification and computer vision as a whole. However, standard computer vision benchmarks often fail to address all the challenges of robot vision. This results in the development of classification models that perform poorly when deployed on a robot in the wild. In this thesis, we perform a systematic study of object recognition for robot vision and propose algorithmic innovations that tackle different aspects of this multifaceted problem. We first collect a robot-centric dataset called the Autonomous Robot Indoor Dataset and test the performance of well-known CNN architectures on it. This evaluation indicates two main lines of research for more reliable and robust object recognition: (i) the integration of geometric information, in the form of depth data, with the standard RGB data, and (ii) the use of domain adaptation to bridge the gap between the training (source) data and the real (target) data the robot encounters. To combine RGB and depth data, we propose recurrent convolutional fusion: a novel architecture that extracts features from different layers of a two-stream CNN and combines them using a recurrent neural network. To perform domain adaptation on RGB-D data, we propose a multi-task learning method that, in addition to the standard recognition task, learns to predict the relative rotation between the RGB and depth image of a sample.
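As a rough illustration of the relative-rotation pretext task described above, the following sketch shows how a self-supervised label can be generated from an RGB-D pair. This is a simplified stand-in, not the thesis implementation: the function name and the use of 90-degree rotations encoded as labels 0-3 are illustrative assumptions.

```python
import numpy as np

def make_relative_rotation_sample(rgb, depth, rng):
    """Illustrative sketch (not the thesis code): rotate the RGB and
    depth images independently by multiples of 90 degrees; the
    self-supervised label is their relative rotation (0..3, i.e.
    0/90/180/270 degrees)."""
    r_rgb = int(rng.integers(0, 4))    # rotation applied to RGB
    r_depth = int(rng.integers(0, 4))  # rotation applied to depth
    rot_rgb = np.rot90(rgb, k=r_rgb, axes=(0, 1))
    rot_depth = np.rot90(depth, k=r_depth, axes=(0, 1))
    label = (r_depth - r_rgb) % 4      # relative rotation = pretext label
    return rot_rgb, rot_depth, label

rng = np.random.default_rng(0)
rgb = np.zeros((32, 32, 3))
depth = np.zeros((32, 32, 1))
rot_rgb, rot_depth, label = make_relative_rotation_sample(rgb, depth, rng)
print(label in {0, 1, 2, 3})  # True
```

A network trained on this auxiliary task must align RGB and depth features to predict the label, which is what makes it useful as a secondary objective alongside recognition.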
We go one step further and consider the more realistic problem of open set domain adaptation (OSDA), which requires adapting two domains when the target contains not only the known classes of the source, but also unknown classes. We propose positive-unlabeled reconstruction encoding, an algorithm that uses the theoretical framework of positive-unlabeled learning and a novel loss based on sample reconstruction to recognize the unknown classes of the target. We further improve upon this algorithm by proposing rotation-based open set, which performs both the adaptation and the known/unknown recognition using the self-supervised task of relative rotation. Extensive quantitative and qualitative experiments on standard benchmarks and newly collected datasets empirically validate our algorithmic contributions. These methods push the state of the art in RGB-D object recognition and domain adaptation and bring us closer to building robotic systems with human-like recognition performance.
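To make the known/unknown recognition in open set domain adaptation concrete, here is a generic decision rule that any open-set classifier can apply at test time. This is a minimal, illustrative sketch of the problem setting only, not the thesis algorithms; the function name and the confidence threshold are assumptions.

```python
import numpy as np

def open_set_predict(probs, threshold=0.5):
    """Generic open-set decision rule (illustrative only): predict the
    most likely known class, or flag the sample as unknown (-1) when the
    classifier's confidence falls below a threshold."""
    pred = int(np.argmax(probs))
    if probs[pred] < threshold:
        return -1  # reject as unknown class
    return pred

print(open_set_predict(np.array([0.1, 0.8, 0.1])))    # 1 (confident known class)
print(open_set_predict(np.array([0.4, 0.35, 0.25])))  # -1 (unknown)
```

The hard part, which the thesis methods address, is producing scores that remain calibrated across the domain shift so that such a rejection rule separates known from unknown target samples.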
Keywords: robot; computer vision; recognition; real-world
URI: https://doi.org/10.34726/hss.2020.80401
DOI: 10.34726/hss.2020.80401
Library ID: AC15671506
Organisation: E376 - Institut für Automatisierungs- und Regelungstechnik 
Publication Type: Thesis
Appears in Collections:Thesis

Items in reposiTUm are protected by copyright, with all rights reserved, unless otherwise indicated.