891 research outputs found

    Real-time target and pose recognition for 3-D graphical overlay

    Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1997. Includes bibliographical references (leaves 47-48). By Jeffrey M. Levine. M.Eng.

    Statistical Analysis of Dynamic Actions

    Real-world action recognition applications require the development of systems which are fast, can handle a large variety of actions without a priori knowledge of the type of actions, need a minimal number of parameters, and require as short a learning stage as possible. In this paper, we suggest such an approach. We regard dynamic activities as long-term temporal objects, which are characterized by spatio-temporal features at multiple temporal scales. Based on this, we design a simple statistical distance measure between video sequences which captures the similarities in their behavioral content. This measure is nonparametric and can thus handle a wide range of complex dynamic actions. Having a behavior-based distance measure between sequences, we use it for a variety of tasks, including video indexing, temporal segmentation, and action-based video clustering. These tasks are performed without prior knowledge of the types of actions, their models, or their temporal extents.

    Evaluation of sets of oriented and non-oriented receptive fields as local descriptors

    Local descriptors are increasingly used for the task of object recognition because of their perceived robustness with respect to occlusions and to global geometrical deformations. We propose a performance criterion for a local descriptor based on the tradeoff between selectivity and invariance. In this paper, we evaluate several local descriptors with respect to selectivity and invariance. The descriptors that we evaluated are Gaussian derivatives up to the third order, gray image patches, and Laplacian-based descriptors with filters at either three scales or one scale. We compare selectivity and invariance to several affine changes such as rotation, scale, brightness, and viewpoint. Comparisons have been made keeping the dimensionality of the descriptors roughly constant. The overall results indicate a good performance by the descriptor based on a set of oriented Gaussian filters. It is interesting that oriented receptive fields similar to the Gaussian derivatives as well as receptive fields similar to the Laplacian are found in primate visual cortex.

    Rotation Invariant Object Recognition from One Training Example

    Local descriptors are increasingly used for the task of object recognition because of their perceived robustness with respect to occlusions and to global geometrical deformations. Such a descriptor, based on a set of oriented Gaussian derivative filters, is used in our recognition system. We report here an evaluation of several techniques for orientation estimation to achieve rotation invariance of the descriptor. We also describe feature selection based on a single training image. Virtual images are generated by rotating and rescaling the image and robust features are selected. The results confirm robust performance in cluttered scenes, in the presence of partial occlusions, and when the object is embedded in different backgrounds.

    Human activity recognition for the use in intelligent spaces

    The aim of this Graduation Project is to develop a generic, biologically inspired activity recognition system for use in intelligent spaces. Intelligent spaces form the context for this project. The goal is to develop a working prototype that can learn and recognize human activities from a limited training set in all kinds of spaces and situations. For testing purposes, the office environment is chosen as the intelligent space. The purpose of the intelligent space, in this case the office, is left out of the scope of the project. The scope is limited to the perceptive system of the intelligent space. The notion is that the prototype should not be bound to a specific space, but should be a generic perceptive system able to cope in any given space within the built environment. Because no two spaces are the same, the main challenge of this project is developing a prototype that can learn and recognize activities without any domain knowledge. In all layers of the prototype, the data processing is kept as abstract and low-level as possible to keep it as generic as possible. This is done by using local features, scale-invariant descriptors, and hidden Markov models for pattern recognition. The novel approach of the prototype is that it combines structure as well as motion features in one system, making it able to train and recognize a variety of activities in a variety of situations. Activities ranging from rhythmic expressive actions with a simple cyclic pattern to those where the movement is subtle and complex, like typing and reading, can all be trained and recognized. The prototype has been tested on two very different data sets. In the first set, the videos were shot in a controlled environment in which simple actions were performed. In the second set, the videos were shot in a normal office where daily office activities were captured and categorized afterwards. The prototype has given some promising results, showing that it can cope with very different spaces, actions and activities.

    Shape Representation in Primate Visual Area 4 and Inferotemporal Cortex

    The representation of contour shape is an essential component of object recognition, but the cortical mechanisms underlying it are incompletely understood, leaving it a fundamental open question in neuroscience. Such an understanding would be useful theoretically as well as in developing computer vision and Brain-Computer Interface applications. We ask two fundamental questions: “How is contour shape represented in cortex and how can neural models and computer vision algorithms more closely approximate this?” We begin by analyzing the statistics of contour curvature variation and develop a measure of salience based upon the arc length over which it remains within a constrained range. We create a population of V4-like cells – responsive to a particular local contour conformation located at a specific position on an object’s boundary – and demonstrate high recognition accuracies classifying handwritten digits in the MNIST database and objects in the MPEG-7 Shape Silhouette database. We compare the performance of the cells to the “shape-context” representation (Belongie et al., 2002) and achieve roughly comparable recognition accuracies using a small test set. We analyze the relative contributions of various feature sensitivities to recognition accuracy and robustness to noise. Local curvature appears to be the most informative for shape recognition. We create a population of IT-like cells, which integrate specific information about the 2-D boundary shapes of multiple contour fragments, and evaluate its performance on a set of real images as a function of the V4 cell inputs. We determine the sub-population of cells that are most effective at identifying a particular category. We classify based upon cell population response and obtain very good results. We use the Morris-Lecar neuronal model to more realistically illustrate the previously explored shape representation pathway in V4 – IT. We demonstrate recognition using spatiotemporal patterns within a winnerless competition network with FitzHugh-Nagumo model neurons. Finally, we use the Izhikevich neuronal model to produce an enhanced response in IT, correlated with recognition, via gamma synchronization in V4. Our results support the hypothesis that the response properties of V4 and IT cells, as well as our computer models of them, function as robust shape descriptors in the object recognition process.

    Melanoma Recognition using Kernel Classifiers

    Melanoma is the deadliest skin cancer. Early diagnosis is a current challenge for clinicians. Current algorithms for skin lesion classification focus mostly on segmentation and feature extraction. This paper instead puts the emphasis on the learning process, proposing two kernel-based classifiers: support vector machines and spin glass-Markov random fields. We benchmarked these algorithms against a state-of-the-art method on melanoma recognition. We show with extensive experiments that the support vector machine approach outperforms the other methods, proving to be an effective classification algorithm for computer-assisted diagnosis of melanoma.

    Invariance of visual operations at the level of receptive fields

    Receptive field profiles registered by cell recordings have shown that mammalian vision has developed receptive fields tuned to different sizes and orientations in the image domain as well as to different image velocities in space-time. This article presents a theoretical model by which families of idealized receptive field profiles can be derived mathematically from a small set of basic assumptions that correspond to structural properties of the environment. The article also presents a theory for how basic invariance properties to variations in scale, viewing direction and relative motion can be obtained from the output of such receptive fields, using complementary selection mechanisms that operate over the output of families of receptive fields tuned to different parameters. Thereby, the theory shows how basic invariance properties of a visual system can be obtained already at the level of receptive fields, and we can explain the different shapes of receptive field profiles found in biological vision from a requirement that the visual system should be invariant to the natural types of image transformations that occur in its environment. Comment: 40 pages, 17 figures