Title: 6D pose estimation of objects using limited training data
Other Titles: SixD
Language: English
Authors: Park, Kiru 
Qualification level: Doctoral
Advisor: Vincze, Markus 
Assisting Advisor: Patten, Timothy Michael 
Issue Date: 2020
Number of Pages: 109
Abstract: 
Pose estimation of objects is an important task for understanding the surrounding environment when interacting with objects in robot manipulation and augmented reality applications. Major computer vision tasks, such as object detection and classification, have improved significantly with Convolutional Neural Networks (CNNs). Likewise, recent pose estimation methods based on CNNs achieve high performance, but only given large amounts of training data, which are difficult to obtain from real environments. This thesis presents multiple methods that overcome the limited availability of training data in practical scenarios while solving common challenges in object pose estimation.

Symmetry and occlusion of objects are the most common challenges that make estimates inaccurate. This thesis introduces a method that regresses pixel-wise object coordinates while resolving ambiguous views arising from symmetric poses with a novel loss function during training. Coordinates of occluded regions are also predicted regardless of visibility, which makes the method robust to occlusion. The method achieves state-of-the-art performance in evaluations using only a limited number of real images.

Nevertheless, annotating object poses in images is a difficult and time-consuming task, which prevents pose estimation methods from learning a new object from cluttered real scenes. This thesis introduces an approach that leverages a few cluttered images of an object to learn its appearance in arbitrary poses. A novel refinement step updates the pose annotations of the input images to reduce the pose errors that are common when poses are self-annotated by camera tracking or annotated manually by humans. Evaluations show that the images generated by this method lead to state-of-the-art performance compared to methods using 13 times as many real training images.

Domains such as retail shops encounter new objects frequently, so it is inefficient to train pose estimators for every new object.
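The symmetry handling described above can be illustrated with a minimal sketch: for a symmetric object, several ground-truth coordinate maps are equally valid, so the loss takes the minimum error over all symmetry transforms. All function and variable names here are illustrative assumptions, not the thesis implementation.

```python
import numpy as np

def symmetry_aware_loss(pred_coords, gt_coords, sym_transforms):
    """Per-pixel 3D coordinate regression loss that resolves pose ambiguity.

    The network is never penalised for predicting a view that is
    indistinguishable from the labelled one: the loss is the minimum
    mean coordinate error over all symmetry transforms of the label.

    pred_coords, gt_coords: (N, 3) object-space coordinates per pixel.
    sym_transforms: list of (3, 3) rotation matrices, including identity.
    """
    errors = []
    for R in sym_transforms:
        transformed = gt_coords @ R.T  # apply the symmetry to the label
        errors.append(np.mean(np.linalg.norm(pred_coords - transformed, axis=1)))
    return min(errors)

# Example: an object with a 180-degree symmetry about the z-axis.
identity = np.eye(3)
rot180_z = np.diag([-1.0, -1.0, 1.0])
gt = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
pred = gt @ rot180_z.T  # prediction matching the *symmetric* view
loss = symmetry_aware_loss(pred, gt, [identity, rot180_z])
# loss is 0: the symmetric prediction is accepted as correct.
```

Without the symmetry transforms in the candidate set, the same prediction would be penalised even though the two views are visually identical.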
Furthermore, it is difficult to build precise 3D models of all instances in real-world environments. A template-based method in this thesis tackles these practical challenges by estimating the pose of a new object using previous observations of the same or similar objects. The nearest observations are used to determine the object's location, segmentation mask, and pose. The method is further extended to predict dense correspondences between the nearest observation and a target object in order to transfer grasp poses from similar experiences. Evaluations on public datasets show that the template-based method outperforms baseline methods in segmentation and pose estimation tasks. Grasp experiments with a robot show the benefit of leveraging successful grasp experiences, which significantly improves grasp performance for familiar objects.
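The template-based idea above can be sketched as a nearest-neighbour lookup over stored observations: embed each past observation once and, at test time, reuse the annotation (pose, mask, grasp) of the most similar template. Cosine similarity stands in for whatever learned metric the thesis uses; all names are illustrative assumptions.

```python
import numpy as np

def retrieve_nearest_template(query_feat, template_feats, template_poses):
    """Return the index and stored annotation of the nearest template.

    query_feat: (D,) feature vector of the observed object.
    template_feats: (T, D) feature vectors of stored observations.
    template_poses: annotations associated with each template.
    """
    q = query_feat / np.linalg.norm(query_feat)
    t = template_feats / np.linalg.norm(template_feats, axis=1, keepdims=True)
    sims = t @ q  # cosine similarity to every stored template
    best = int(np.argmax(sims))
    return best, template_poses[best]

# Tiny example with three stored templates and a query close to template 1.
feats = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
poses = ["pose_A", "pose_B", "pose_C"]
idx, pose = retrieve_nearest_template(np.array([0.1, 0.9]), feats, poses)
# idx == 1: the query matches template 1, so its pose annotation is reused.
```

The benefit sketched here matches the abstract's claim: no per-object retraining is needed, because new objects are handled by retrieving the most similar previous observation.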
Keywords: Robot; computer vision; object recognition; pose estimation
URI: https://doi.org/10.34726/hss.2020.85042
http://hdl.handle.net/20.500.12708/16270
DOI: 10.34726/hss.2020.85042
Library ID: AC16079462
Organisation: E376 - Institut für Automatisierungs- und Regelungstechnik 
Publication Type: Thesis
Appears in Collections: Thesis
