Title: Object Classification for Robot Vision through RGB-D Recognition and Domain Adaptation
Language: English
Authors: Loghmani, Mohammad Reza 
Keywords: Robot; computer vision; recognition; real-world
Advisor: Vincze, Markus 
Assisting Advisor: Patten, Timothy Michael 
Issue Date: 2020
Number of Pages: 108
Qualification level: Doctoral
Object recognition, or object classification, is an essential skill for robot visual perception systems since it constitutes the foundation for higher-level tasks like object detection, pose estimation and manipulation. Nonetheless, recognizing objects in unconstrained environments remains difficult, with robots facing challenges such as intra-class variation, occlusion, clutter, viewpoint variation, and changes in light and scale.
Deep convolutional neural networks (CNNs) have revolutionized object classification and computer vision as a whole. However, standard computer vision benchmarks often fail to address all the challenges of robot vision. This results in classification models that perform poorly when deployed on a robot in the wild.
In this thesis, we perform a systematic study of object recognition for robot vision and propose algorithmic innovations that tackle different aspects of this multifaceted problem. We first collect a robot-centric dataset called autonomous robot indoor dataset and test the performance of well-known CNN architectures on it. This evaluation indicates two main lines of research for more reliable and robust object recognition: (i) the integration of geometric information, in the form of depth data, with the standard RGB data, and (ii) the use of domain adaptation to bridge the gap between the training (source) data and the real (target) data the robot encounters. To combine RGB and depth data, we propose recurrent convolutional fusion: a novel architecture that extracts features from different layers of a two-stream CNN and combines them using a recurrent neural network. To perform domain adaptation on RGB-D data, we propose a multi-task learning method that, in addition to the standard recognition task, learns to predict the relative rotation between the RGB and depth image of a sample. We go one step further and consider the more realistic problem of open set domain adaptation (OSDA), which requires adapting between two domains when the target contains not only the known classes of the source, but also unknown classes. We propose positive-unlabeled reconstruction encoding, an algorithm that uses the theoretical framework of positive-unlabeled learning and a novel loss based on sample reconstruction to recognize the unknown classes of the target. We further improve upon this algorithm by proposing rotation-based open set, which performs both the adaptation and the known/unknown recognition using the self-supervised task of relative rotation.
Extensive quantitative and qualitative experiments on standard benchmarks and newly collected datasets empirically validate our algorithmic contributions. These methods push the state of the art in RGB-D object recognition and domain adaptation and bring us closer to building robotic systems with human-like recognition performance.
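The relative rotation task mentioned in the abstract can be illustrated with a small sketch. It is not the thesis implementation: it assumes each modality is rotated independently by a multiple of 90 degrees and that the label is the difference between the two rotations modulo 4. The helper names `rot90` and `make_relative_rotation_sample` are hypothetical, and images are represented as plain nested lists for brevity.

```python
import random


def rot90(image, k):
    """Rotate a 2D grid (list of rows) counterclockwise by k * 90 degrees."""
    for _ in range(k % 4):
        # Transpose, then reverse the row order: one counterclockwise turn.
        image = [list(row) for row in zip(*image)][::-1]
    return image


def make_relative_rotation_sample(rgb, depth, rng):
    """Build one self-supervised sample (assumed formulation, see lead-in).

    Each modality gets its own random rotation. The label counts how many
    extra 90-degree turns the depth image received compared to the RGB image.
    """
    k_rgb = rng.randrange(4)
    k_depth = rng.randrange(4)
    label = (k_depth - k_rgb) % 4
    return rot90(rgb, k_rgb), rot90(depth, k_depth), label


if __name__ == "__main__":
    rng = random.Random(0)
    toy_rgb = [[1, 2], [3, 4]]    # stand-in for one RGB channel
    toy_depth = [[1, 2], [3, 4]]  # stand-in for the aligned depth map
    rgb_rot, depth_rot, label = make_relative_rotation_sample(toy_rgb, toy_depth, rng)
    print(rgb_rot, depth_rot, label)
```

A classifier trained on such pairs can only predict the label by relating the content of the two modalities, and the labels come for free on unlabeled target data.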
URI: https://doi.org/10.34726/hss.2020.80401
DOI: 10.34726/hss.2020.80401
Library ID: AC15671506
Organisation: E376 - Institut für Automatisierungs- und Regelungstechnik 
Publication Type: Thesis
Appears in Collections: Hochschulschrift | Thesis

Items in reposiTUm are protected by copyright, with all rights reserved, unless otherwise indicated.