Part-based Grouping and Recognition: A Model-Guided Approach
The recovery of generic solid parts is a fundamental step towards the realization of general-purpose vision systems. This thesis investigates issues in grouping, segmentation and recognition of parts from two-dimensional edge images. A new paradigm of part-based grouping of features is introduced that bridges the classical grouping and model-based approaches with the purpose of directly recovering parts from real images, and part-like models are used that both yield low theoretical complexity and reliably recover part-plausible groups of features. The part-like models used are statistical point distribution models, whose training set is built using random deformable superellipse. The computational approach that is proposed to perform model-guided part-based grouping consists of four distinct stages. In the first stage, codons, contour portions of similar curvature, are extracted from the raw edge image. They are considered to be indivisible image features because they have the desirable property of belonging either to single parts or joints. In the second stage, small seed groups (currently pairs, but further extension are proposed) of codons are found that give enough structural information for part hypotheses to be created. The third stage consists in initialising and pre-shaping the models to all the seed groups and then performing a full fitting to a large neighbourhood of the pre-shaped model. The concept of pre-shaping to a few significant features is a relatively new concept in deformable model fitting that has helped to dramatically increase robustness. The initialisations of the part models to the seed groups is performed by the first direct least-square ellipse fitting algorithm, which has been jointly discovered during this research; a full theoretical proof of the method is provided. The last stage pertains to the global filtering of all the hypotheses generated by the previous stages according to the Minimum Description Length criterion: the small number of grouping hypotheses that survive this filtering stage are the most economical representation of the image in terms of the part-like models. The filtering is performed by the maximisation of a boolean quadratic function by a genetic algorithm, which has resulted in the best trade-off between speed and robustness. Finally, images of parts can have a pronounced 3D structure, with ends or sides clearly visible. In order to recover this important information, the part-based grouping method is extended by employing parametrically deformable aspects models which, starting from the initial position provided by the previous stages, are fitted to the raw image by simulated annealing. These models are inspired by deformable superquadrics but are built by geometric construction, which render them two order of magnitudes faster to generate than in previous works. A large number of experiments is provided that validate the approach and, since several new issues have been opened by it, some future work is proposed.