Object classification for robot vision through RGB-D recognition and domain adaptation

Loghmani, Mohammad Reza

doi:10.34726/hss.2020.80401

DC Element

Value

Language

dc.contributor.advisor

Vincze, Markus

dc.contributor.author

Loghmani, Mohammad Reza

dc.date.accessioned

2020-07-20T13:53:11Z

dc.date.issued

2020

dc.date.submitted

2020-06

dc.identifier.citation

Loghmani, M. R. (2020). Object classification for robot vision through RGB-D recognition and domain adaptation [Dissertation, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2020.80401

dc.identifier.uri

https://doi.org/10.34726/hss.2020.80401

dc.identifier.uri

http://hdl.handle.net/20.500.12708/14974

dc.description

Alternative title as translated by the author

dc.description.abstract

Object recognition, or object classification, is an essential skill for robot visual perception systems since it constitutes the foundation for higher-level tasks like object detection, pose estimation and manipulation. Nonetheless, recognizing objects in unconstrained environments remains arduous, with robots facing challenges such as intra-class variation, occlusion, clutter, viewpoint variation, and changes in light and scale. Deep convolutional neural networks (CNNs) have revolutionized object classification and computer vision as a whole. However, standard computer vision benchmarks often fail to address all the challenges of robot vision. This results in the development of classification models that perform poorly when deployed on a robot in the wild. In this thesis, we perform a systematic study of object recognition for robot vision and propose algorithmic innovations that tackle different aspects of this multifaceted problem. We first collect a robot-centric dataset called autonomous robot indoor dataset and test the performance of well-known CNN architectures on it. This evaluation indicates two main lines of research for more reliable and robust object recognition: (i) the integration of geometric information, in the form of depth data, with the standard RGB data, and (ii) the use of domain adaptation to bridge the gap between the training (source) data and the real (target) data the robot encounters. To combine RGB and depth data, we propose recurrent convolutional fusion: a novel architecture that extracts features from different layers of a two-stream CNN and combines them using a recurrent neural network. To perform domain adaptation on RGB-D data, we propose a multi-task learning method that, in addition to the standard recognition task, learns to predict the relative rotation between the RGB and depth image of a sample.
We go one step further and consider the more realistic problem of open set domain adaptation (OSDA), which requires adapting between two domains when the target contains not only the known classes of the source, but also unknown classes. We propose positive-unlabeled reconstruction encoding, an algorithm that uses the theoretical framework of positive-unlabeled learning and a novel loss based on sample reconstruction to recognize the unknown classes of the target. We further improve upon this algorithm by proposing rotation-based openset, which performs both the adaptation and the known/unknown recognition using the self-supervised task of relative rotation. Extensive quantitative and qualitative experiments on standard benchmarks and newly collected datasets empirically validate our algorithmic contributions. These methods push the state of the art in RGB-D object recognition and domain adaptation and bring us closer to building robotic systems with human-like recognition performance.

dc.language

English

dc.language.iso

dc.rights.uri

http://rightsstatements.org/vocab/InC/1.0/

dc.subject

Robot

dc.subject

computer vision

dc.subject

recognition

dc.subject

real-world

dc.title

Object classification for robot vision through RGB-D recognition and domain adaptation

dc.title.alternative

Objektklassifizierung für Robotersehen mit RGB-D Erkennung und Domain Adaptierung

dc.type

Thesis

dc.type

Hochschulschrift

dc.rights.license

In Copyright

dc.rights.license

Urheberrechtsschutz

dc.identifier.doi

10.34726/hss.2020.80401

dc.contributor.affiliation

TU Wien, Österreich

dc.rights.holder

Mohammad Reza Loghmani

dc.publisher.place

Wien

tuw.version

vor

tuw.thesisinformation

Technische Universität Wien

dc.contributor.assistant

Patten, Timothy Michael

tuw.publication.orgunit

E376 - Institut für Automatisierungs- und Regelungstechnik

dc.type.qualificationlevel

Doctoral

dc.identifier.libraryid

AC15671506

dc.description.numberOfPages

dc.thesistype

Dissertation

dc.thesistype

Dissertation

tuw.author.orcid

0000-0002-2687-7877

dc.rights.identifier

In Copyright

dc.rights.identifier

Urheberrechtsschutz

tuw.advisor.staffStatus

staff

tuw.assistant.staffStatus

staff

item.languageiso639-1

item.openairetype

doctoral thesis

item.grantfulltext

open

item.fulltext

with Fulltext

item.cerifentitytype

Publications

item.mimetype

application/pdf

item.openairecristype

http://purl.org/coar/resource_type/c_db06

item.openaccessfulltext

Open Access

crisitem.author.dept

E376-02 - Forschungsbereich Komplexe Dynamische Systeme

crisitem.author.parentorg

E376 - Institut für Automatisierungs- und Regelungstechnik

Appears in collections:

Thesis

Full text (Version of Record (published version))

Adobe PDF

(3.41 MB)

In Copyright

Page views

324

checked on 2023-12-01

Download(s)

299

checked on 2023-12-01
