2 research outputs found

    A gaussian mixture-based approach to synthesizing nonlinear feature functions for automated object detection

    Feature design is an important part of object detection, in which objects of interest are assigned to a known number of categories or classes. Automated feature synthesis, based on a depth-first search for higher-order feature functions, is generally considered to be the process of creating more effective features from raw feature data while the algorithm runs. This dynamic synthesis of nonlinear feature functions is a challenging problem in object detection. This thesis presents an approach that combines genetic programming with the expectation-maximization algorithm (GP-EM) to synthesize nonlinear feature functions automatically for given object detection tasks. The EM algorithm uses a Gaussian mixture to model the behaviour of the training samples during an optimal GP search strategy. Based on the Gaussian probability assumption, the GP-EM method can perform dynamic feature synthesis and model-based generalization simultaneously. The EM part of the approach leads to the application of the maximum likelihood (ML) operation, which provides protection against inter-cluster data separation and thus exhibits improved convergence. In addition, together with the GP-EM method, an innovative technique called the histogram region of interest by thresholds (HROIBT) is introduced for diagnosing protein conformation defects (PCD) from microscopic imagery. The experimental results show that the proposed approach improves the detection accuracy and efficiency of pattern object discovery compared with single GP-based feature synthesis methods and a number of other object detection systems. The GP-EM method projects the hyperspace of the raw data onto lower-dimensional spaces efficiently, resulting in faster classification.
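To illustrate the EM component the abstract refers to, here is a minimal sketch of expectation-maximization fitting a two-component, one-dimensional Gaussian mixture. It is not the thesis's GP-EM method: the genetic programming search is omitted entirely, and the function names (`gaussian_pdf`, `em_gmm_1d`), the initialization, and the fixed iteration count are assumptions made only for this example.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture with plain EM (illustrative only)."""
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n

    # Crude initialization (an assumption): means at the data extremes,
    # shared variance, equal mixing weights.
    means = [min(data), max(data)]
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]

    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p) + 1e-300  # guard against division by zero
            resp.append([pk / total for pk in p])

        # M-step: maximum-likelihood updates given the responsibilities.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # floor to avoid a collapsed component

    return weights, means, variances
```

In a setting like the one the abstract describes, the samples passed to such a routine would presumably be the outputs of a synthesized feature function on the training data, but that coupling is an assumption and is not shown here.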

    Street Scenes : towards scene understanding in still images

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. Includes bibliographical references (p. 171-182). This thesis describes an effort to construct a scene understanding system that is able to analyze the content of real images. While constructing the system we had to provide solutions to many of the fundamental questions that every student of object recognition deals with daily. These include the choice of data set, the choice of success measurement, the representation of the image content, the selection of inference engine, and the representation of the relations between objects. The main test-bed for our system is the CBCL StreetScenes database. It is a carefully labeled set of images, much larger than any similar data set available at the time it was collected. Each image in this data set was labeled for 9 common classes such as cars, pedestrians, roads and trees. Our system represents each image using a set of features that are based on a model of the human visual system constructed in our lab. We demonstrate that this biologically motivated image representation, along with its extensions, constitutes an effective representation for object detection, facilitating unprecedented levels of detection accuracy. Similarly to biological vision systems, our system uses hierarchical representations. We therefore explore the possible ways of combining information across the hierarchy into the final perception. Our system is trained using standard machine learning machinery, which was first applied to computer vision in earlier work of Prof. Poggio and others. We demonstrate how the same standard methods can be used to model relations between objects in images as well, capturing context information. The resulting system detects and localizes, using a unified set of tools and image representations, compact objects such as cars, amorphous objects such as trees and roads, and the relations between objects within the scene. The same representation also excels in identifying objects in clutter without scanning the image. Much of the work presented in the thesis was devoted to a rigorous comparison of our system to alternative object recognition systems. The results of these experiments support the effectiveness of simple feed-forward systems for the basic tasks involved in scene understanding. We make our results fully available to the public by publishing our code and data sets in hope that others may improve and extend our results. By Stanley Michael Bileschi. Ph.D.