1,773 research outputs found

    Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields

    Full text link
    This work presents a first evaluation of using spatio-temporal receptive fields from a recently proposed time-causal spatio-temporal scale-space framework as primitives for video analysis. We propose a new family of video descriptors based on regional statistics of spatio-temporal receptive field responses and evaluate this approach on the problem of dynamic texture recognition. Our approach generalises a previously used method, based on joint histograms of receptive field responses, from the spatial to the spatio-temporal domain and from object recognition to dynamic texture recognition. The time-recursive formulation enables computationally efficient time-causal recognition. The experimental evaluation demonstrates competitive performance compared to state-of-the-art. Especially, it is shown that binary versions of our dynamic texture descriptors achieve improved performance compared to a large range of similar methods using different primitives either handcrafted or learned from data. Further, our qualitative and quantitative investigation into parameter choices and the use of different sets of receptive fields highlights the robustness and flexibility of our approach. Together, these results support the descriptive power of this family of time-causal spatio-temporal receptive fields, validate our approach for dynamic texture recognition and point towards the possibility of designing a range of video analysis methods based on these new time-causal spatio-temporal primitives.Comment: 29 pages, 16 figure

    Real-time target and pose recognition for 3-D graphical overlay

    Get PDF
    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1997.Includes bibliographical references (leaves 47-48).by Jeffrey M. Levine.M.Eng

    A Neural Model for Self Organizing Feature Detectors and Classifiers in a Network Hierarchy

    Full text link
    Many models of early cortical processing have shown how local learning rules can produce efficient, sparse-distributed codes in which nodes have responses that are statistically independent and low probability. However, it is not known how to develop a useful hierarchical representation, containing sparse-distributed codes at each level of the hierarchy, that incorporates predictive feedback from the environment. We take a step in that direction by proposing a biologically plausible neural network model that develops receptive fields, and learns to make class predictions, with or without the help of environmental feedback. The model is a new type of predictive adaptive resonance theory network called Receptive Field ARTMAP, or RAM. RAM self organizes internal category nodes that are tuned to activity distributions in topographic input maps. Each receptive field is composed of multiple weight fields that are adapted via local, on-line learning, to form smooth receptive ftelds that reflect; the statistics of the activity distributions in the input maps. When RAM generates incorrect predictions, its vigilance is raised, amplifying subtractive inhibition and sharpening receptive fields until the error is corrected. Evaluation on several classification benchmarks shows that RAM outperforms a related (but neurally implausible) model called Gaussian ARTMAP, as well as several standard neural network and statistical classifters. A topographic version of RAM is proposed, which is capable of self organizing hierarchical representations. Topographic RAM is a model for receptive field development at any level of the cortical hierarchy, and provides explanations for a variety of perceptual learning data.Defense Advanced Research Projects Agency and Office of Naval Research (N00014-95-1-0409

    Evaluation of sets of oriented and non-oriented receptive fields as local descriptors

    Get PDF
    Local descriptors are increasingly used for the task of object recognition because of their perceived robustness with respect to occlusions and to global geometrical deformations. We propose a performance criterion for a local descriptor based on the tradeoff between selectivity and invariance. In this paper, we evaluate several local descriptors with respect to selectivity and invariance. The descriptors that we evaluated are Gaussian derivatives up to the third order, gray image patches, and Laplacian-based descriptors with either three scales or one scale filters. We compare selectivity and invariance to several affine changes such as rotation, scale, brightness, and viewpoint. Comparisons have been made keeping the dimensionality of the descriptors roughly constant. The overall results indicate a good performance by the descriptor based on a set of oriented Gaussian filters. It is interesting that oriented receptive fields similar to the Gaussian derivatives as well as receptive fields similar to the Laplacian are found in primate visual cortex

    Statistical Analysis of Dynamic Actions

    Get PDF
    Real-world action recognition applications require the development of systems which are fast, can handle a large variety of actions without a priori knowledge of the type of actions, need a minimal number of parameters, and necessitate as short as possible learning stage. In this paper, we suggest such an approach. We regard dynamic activities as long-term temporal objects, which are characterized by spatio-temporal features at multiple temporal scales. Based on this, we design a simple statistical distance measure between video sequences which captures the similarities in their behavioral content. This measure is nonparametric and can thus handle a wide range of complex dynamic actions. Having a behavior-based distance measure between sequences, we use it for a variety of tasks, including: video indexing, temporal segmentation, and action-based video clustering. These tasks are performed without prior knowledge of the types of actions, their models, or their temporal extents

    Rotation Invariant Object Recognition from One Training Example

    Get PDF
    Local descriptors are increasingly used for the task of object recognition because of their perceived robustness with respect to occlusions and to global geometrical deformations. Such a descriptor--based on a set of oriented Gaussian derivative filters-- is used in our recognition system. We report here an evaluation of several techniques for orientation estimation to achieve rotation invariance of the descriptor. We also describe feature selection based on a single training image. Virtual images are generated by rotating and rescaling the image and robust features are selected. The results confirm robust performance in cluttered scenes, in the presence of partial occlusions, and when the object is embedded in different backgrounds

    Invariance of visual operations at the level of receptive fields

    Get PDF
    Receptive field profiles registered by cell recordings have shown that mammalian vision has developed receptive fields tuned to different sizes and orientations in the image domain as well as to different image velocities in space-time. This article presents a theoretical model by which families of idealized receptive field profiles can be derived mathematically from a small set of basic assumptions that correspond to structural properties of the environment. The article also presents a theory for how basic invariance properties to variations in scale, viewing direction and relative motion can be obtained from the output of such receptive fields, using complementary selection mechanisms that operate over the output of families of receptive fields tuned to different parameters. Thereby, the theory shows how basic invariance properties of a visual system can be obtained already at the level of receptive fields, and we can explain the different shapes of receptive field profiles found in biological vision from a requirement that the visual system should be invariant to the natural types of image transformations that occur in its environment.Comment: 40 pages, 17 figure
    • …
    corecore