This thesis addresses the problem of invariance in image representations for object category-based image classification. Variations in object images caused by factors such as changes in the scale, position and orientation of the objects make object category-based image classification a challenging task. So-called invariant local descriptors are widely used to represent an image as a set of image patch descriptions and thereby achieve robustness to such variations. However, these descriptors are computed locally and are thus unable to capture the global image structure. To this end, the work presented in this thesis aims to develop discriminative global image representations by deriving the relationships among object features in a way that is highly insensitive to image variations such as changes in scale, position or in-plane rotation. These global image representations are applicable to object images with minimal or no background clutter as well as to those with severe background clutter. The image-based classification of ancient coins is taken as a motivating example for the first group, since coins are imaged on a homogeneous background. Nevertheless, coin images exhibit challenging variations, namely differences in object scale, position and orientation. A global invariant representation is developed to cope with such variations: as a first step, the coin images are automatically segmented, cropped and normalized to achieve scale and translation invariance. The segmented image is then partitioned into circular regions from which rotation-invariant local features are sampled, thus achieving rotation invariance both locally and globally. It is shown that circular partitioning of the segmented image is more robust to image rotations than other partitioning strategies such as rectangular and radial-polar partitioning.
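To illustrate why a circular partitioning is insensitive to in-plane rotation, the following is a minimal sketch and not the implementation used in the thesis. It assumes that each local feature has already been reduced to a position and an integer label (for example a visual-word index, which is an assumption made here for illustration), and that the object centre and radius are known from the segmentation step. The function names `ring_index`, `ring_histograms` and `rotate_about` are hypothetical.

```python
import math


def ring_index(x, y, cx, cy, radius, n_rings):
    """Return the concentric ring (0 = innermost) that contains point (x, y)."""
    r = math.hypot(x - cx, y - cy)
    # Clamp so that points on or beyond the outer boundary fall in the last ring.
    return min(int(n_rings * r / radius), n_rings - 1)


def ring_histograms(features, cx, cy, radius, n_rings, n_labels):
    """Count feature labels per ring.

    features: iterable of (x, y, label) with label in range(n_labels).
    The (x, y, label) layout is an illustrative assumption.
    """
    hist = [[0] * n_labels for _ in range(n_rings)]
    for x, y, label in features:
        hist[ring_index(x, y, cx, cy, radius, n_rings)][label] += 1
    return hist


def rotate_about(features, cx, cy, angle):
    """Rotate feature positions by `angle` radians about (cx, cy)."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (cx + c * (x - cx) - s * (y - cy), cy + s * (x - cx) + c * (y - cy), label)
        for x, y, label in features
    ]
```

Because a rotation about the centre preserves each feature's distance to the centre, every feature stays in its ring and the per-ring histograms do not change. A rectangular grid, by contrast, would in general reassign features to different cells after rotation.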
In the case of severe background clutter, automatic segmentation of the object is hard to achieve, for instance in natural images of butterflies, which may also contain flowers, leaves and trees. In this case, a global image representation is obtained by deriving invariant geometric relationships among the local invariant features. For this purpose, triangulation is performed on the positions of the local features in the 2D image space. Since the angles and side ratios of a triangle are unchanged by scaling, rotation and translation, the global image representation based on the triangulation of local features is also invariant to changes in object scale, position and orientation. The trained object model is made more discriminative by using local features from the foreground and rejecting background information. In the presence of image variations caused by changes in object scale, position and orientation, the image representation based on invariant geometric relationships of the local features results in improved performance.
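The invariance of triangle angles and side ratios can be checked numerically. The sketch below is illustrative only and does not reproduce the representation built in the thesis. It assumes three non-collinear, distinct feature positions, and the names `triangle_invariants` and `similarity_transform` are hypothetical.

```python
import math


def triangle_invariants(p, q, r):
    """Return (sorted interior angles, sorted side ratios) of triangle p, q, r.

    Assumes the three points are distinct and not collinear.
    """
    a = math.dist(q, r)  # side opposite p
    b = math.dist(p, r)  # side opposite q
    c = math.dist(p, q)  # side opposite r

    def angle(opposite, s1, s2):
        # Law of cosines; clamp against rounding slightly outside [-1, 1].
        cos_v = (s1 * s1 + s2 * s2 - opposite * opposite) / (2 * s1 * s2)
        return math.acos(max(-1.0, min(1.0, cos_v)))

    angles = sorted([angle(a, b, c), angle(b, a, c), angle(c, a, b)])
    shortest, middle, longest = sorted([a, b, c])
    # Ratios relative to the longest side do not depend on vertex order.
    return angles, (shortest / longest, middle / longest)


def similarity_transform(point, scale, theta, shift):
    """Scale, rotate by `theta` radians, then translate a 2D point."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (scale * (c * x - s * y) + shift[0], scale * (s * x + c * y) + shift[1])
```

Applying `similarity_transform` with the same scale, angle and shift to all three vertices and recomputing `triangle_invariants` should return the same angles and ratios up to floating-point rounding, which is the property the triangulation-based representation relies on.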
Title differs according to the author's translation.