Richtsfeld, A. (2013). Robust object detection for robotics using perceptual organization in 2D and 3D [Dissertation, Technische Universität Wien]. reposiTUm. http://hdl.handle.net/20.500.12708/160128
E376 - Institut für Automatisierungs- und Regelungstechnik
Date (published): 2013
Number of Pages: 92
Keywords: Object detection; Computer Vision; Robotics
Abstract:
Since robots conquered the assembly lines in factories, the focus of robotics research has shifted from simple pick-and-place tasks to sophisticated robotic solutions. Interest in mobile, domestic robotics has grown considerably over the last decade. Bringing robots from assembly lines into our households means moving them from a clearly structured workspace into an unknown environment with many uncertainties. Hence, the need has increased for robust perception methods that can extract information from cluttered environments. The focus of this thesis is robust object detection from visual input data, first for 2D color image data and subsequently for range image data (RGB-D), by exploiting perceptual grouping. Perceptual grouping is a generic technique, inspired by human perception, for organizing visual primitives into meaningful groupings. Object detection for 2D image data starts with the extraction of edge primitives. A grouping framework for object detection by hierarchical data abstraction of visual input is introduced. Edge primitives extracted from 2D color images are grouped and parametrized at several data abstraction levels. Incremental indexing is used to iteratively group lower-level primitives into higher-level entities according to perceptual grouping rules, ending in the detection of proto-objects such as cuboids, cones, cylinders and spheres. Incremental indexing leads to anytime processing, a system behavior in which the best results generated so far are delivered whenever processing is stopped. It additionally avoids the use of parameters and thresholds. Furthermore, the proposed indexing method makes it easy to integrate an attention mechanism into the vision system. Processing in the grouping system ends with a 3D model reconstruction that exploits knowledge about the environment (supporting planes).
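The anytime behavior described above can be illustrated with a minimal sketch. This is not the thesis's incremental indexing scheme: it assumes, purely for illustration, that primitives are 1D positions, that proximity is the only grouping rule, and that the processing budget is counted in steps rather than wall-clock time so the result is reproducible. The function name `anytime_grouping` and its parameters are hypothetical.

```python
import heapq


def anytime_grouping(positions, max_steps):
    """Return index pairs grouped so far, best (closest) pairs first.

    Illustrative assumption: a candidate grouping is any pair of
    primitives, ranked by distance (a simple proximity rule).
    """
    # Build a min-heap of candidate pairs keyed by distance.
    heap = [(abs(a - b), i, j)
            for i, a in enumerate(positions)
            for j, b in enumerate(positions) if i < j]
    heapq.heapify(heap)

    groups = []
    steps = 0
    # Stop when the budget is spent or no candidates remain; whatever
    # has been collected up to that point is a valid partial result.
    while heap and steps < max_steps:
        _, i, j = heapq.heappop(heap)
        groups.append((i, j))
        steps += 1
    return groups


positions = [0.0, 1.0, 1.1, 5.0]
early = anytime_grouping(positions, max_steps=1)   # interrupted early
full = anytime_grouping(positions, max_steps=100)  # run to completion
```

Because candidates are processed best-first, stopping early still yields the most plausible groupings found so far, which is the property the abstract calls anytime processing.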
An application of the system in a generally applicable computer vision framework for mobile robotics is shown. Object detection for range image data (RGB-D) is likewise implemented in a hierarchical framework with several data abstraction levels, but starts from initially extracted surface primitives. After pre-segmentation of the image, surface patches are modeled as planes or B-spline surfaces. Model selection with Minimum Description Length (MDL) is used to find, for each surface patch, the model that best fits and therefore best represents the data. Inspired by the rules of perceptual grouping, relations between surface patches are defined and organized in a structured feature vector. To support the detection of a wide range of object types, feature vectors between surface models are learned with a support vector machine (SVM). In this way the importance of each relation for grouping is learned, and after a training stage the SVM predicts the grouping of surface models. To satisfy global scene properties, a graph-cut algorithm is finally employed to produce object hypotheses. The generality of the approach is demonstrated on different datasets with objects of different size, shape and appearance. The hierarchical structure of the implemented framework allows the developed system to be used in different types of applications and in different environments, which makes it well suited for use in mobile and domestic robotics.
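The MDL-based model selection step can be sketched as follows. The abstract does not give the thesis's formulation, so this example makes several assumptions: the surface patch is reduced to a 1D depth profile, a degree-1 polynomial stands in for a plane and a degree-3 polynomial for a B-spline surface, and the description length is approximated by a common two-part code (a parameter cost plus a residual cost). The names `description_length` and `select_model` are hypothetical.

```python
import numpy as np


def description_length(x, z, degree):
    """Approximate two-part code length of a polynomial fit (assumed form)."""
    n = len(x)
    coeffs = np.polyfit(x, z, degree)
    residuals = z - np.polyval(coeffs, x)
    # Small constant avoids log(0) for a perfect fit.
    rss = float(np.sum(residuals ** 2)) + 1e-12
    k = degree + 1  # number of model parameters
    model_cost = 0.5 * k * np.log(n)
    data_cost = 0.5 * n * np.log(rss / n)
    return model_cost + data_cost


def select_model(x, z, degrees=(1, 3)):
    """Pick the candidate degree with the shortest description length."""
    return min(degrees, key=lambda d: description_length(x, z, d))


x = np.linspace(-1.0, 1.0, 50)
noise = 0.01 * (-1.0) ** np.arange(50)  # deterministic stand-in for sensor noise

flat_choice = select_model(x, 0.5 * x + noise)   # nearly planar profile
curved_choice = select_model(x, x ** 2 + noise)  # clearly curved profile
```

The more flexible model is chosen only when its better fit outweighs the cost of its extra parameters, so nearly planar patches keep the simpler model.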