
    Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields

    This work presents a first evaluation of using spatio-temporal receptive fields from a recently proposed time-causal spatio-temporal scale-space framework as primitives for video analysis. We propose a new family of video descriptors based on regional statistics of spatio-temporal receptive field responses and evaluate this approach on the problem of dynamic texture recognition. Our approach generalises a previously used method, based on joint histograms of receptive field responses, from the spatial to the spatio-temporal domain and from object recognition to dynamic texture recognition. The time-recursive formulation enables computationally efficient time-causal recognition. The experimental evaluation demonstrates competitive performance compared to the state of the art. In particular, it is shown that binary versions of our dynamic texture descriptors achieve improved performance compared to a large range of similar methods using different primitives, either handcrafted or learned from data. Further, our qualitative and quantitative investigation into parameter choices and the use of different sets of receptive fields highlights the robustness and flexibility of our approach. Together, these results support the descriptive power of this family of time-causal spatio-temporal receptive fields, validate our approach for dynamic texture recognition and point towards the possibility of designing a range of video analysis methods based on these new time-causal spatio-temporal primitives. Comment: 29 pages, 16 figure

    Space Object Identification Using Spatio-Temporal Pattern Recognition

    This thesis is part of a research effort to automate the task of characterizing space objects or satellites based on a sequence of images. The goal is to detect space object anomalies. Two algorithms are considered: the feature space trajectory neural network (FST NN) and the hidden Markov model (HMM) classifier. The FST NN was first presented by Leonard Neiberg and David P. Casasent in 1994 as a target identification tool. Kenneth H. Fielding and Dennis W. Ruck recently applied the hidden Markov model classifier to a 3D moving light display identification problem and a target recognition problem, using time history information to improve classification results. Time-sequenced images produced by a simulation program are used for developing and testing the anomaly detection algorithms. A variety of features are tested for this problem. Features are derived from the two-dimensional (2D) discrete Fourier transform (DFT) with various normalization schemes applied. The FST NN is found to be more robust than the HMM classifier. Both algorithms are capable of achieving perfect classification, but when shot noise is added to the images or when the image sample spacing is increased, the FST NN continues to perform well while the HMM performance declines. A new test is presented that measures how well a test sequence matches other sequences in the database. The FST NN is based strictly on feature space distance; but if the order of the sequence is important, the new test is useful. (AN
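The abstract above mentions features derived from the 2D DFT with normalization applied, without saying which schemes were used. As a rough illustration only, the sketch below computes 2D DFT magnitudes for a tiny image and divides them by the DC term. Both the DC normalization and the function names (`dft2_magnitude`, `dft_features`) are assumptions made for this example, not the thesis's method.

```python
import cmath


def dft2_magnitude(image):
    """Return the magnitude of the 2D DFT of a small image (list of rows)."""
    rows, cols = len(image), len(image[0])
    magnitudes = []
    for u in range(rows):
        out_row = []
        for v in range(cols):
            total = 0j
            for y in range(rows):
                for x in range(cols):
                    angle = -2 * cmath.pi * (u * y / rows + v * x / cols)
                    total += image[y][x] * cmath.exp(1j * angle)
            out_row.append(abs(total))
        magnitudes.append(out_row)
    return magnitudes


def dft_features(image):
    """Flatten the DFT magnitudes and divide by the DC term.

    Dividing by the DC term is an assumed normalization chosen for
    illustration; it makes the features insensitive to overall brightness.
    """
    magnitudes = dft2_magnitude(image)
    flat = [m for row in magnitudes for m in row]
    dc = flat[0]
    if dc == 0:
        return flat  # all-zero spectrum at DC: leave features unnormalized
    return [m / dc for m in flat]


if __name__ == "__main__":
    frame = [[1, 2], [3, 4]]
    print(dft_features(frame))
```

DFT magnitudes are unchanged by circular shifts of the image, and the assumed DC normalization removes a global intensity scale, which is one plausible reason to use such features when the object moves or brightens between frames.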

    ART-EMAP: A Neural Network Architecture for Object Recognition by Evidence Accumulation

    A new neural network architecture is introduced for the recognition of pattern classes after supervised and unsupervised learning. Applications include spatio-temporal image understanding and prediction and 3-D object recognition from a series of ambiguous 2-D views. The architecture, called ART-EMAP, achieves a synthesis of adaptive resonance theory (ART) and spatial and temporal evidence integration for dynamic predictive mapping (EMAP). ART-EMAP extends the capabilities of fuzzy ARTMAP in four incremental stages. Stage 1 introduces distributed pattern representation at a view category field. Stage 2 adds a decision criterion to the mapping between view and object categories, delaying identification of ambiguous objects when faced with a low confidence prediction. Stage 3 augments the system with a field where evidence accumulates in medium-term memory (MTM). Stage 4 adds an unsupervised learning process to fine-tune performance after the limited initial period of supervised network training. Each ART-EMAP stage is illustrated with a benchmark simulation example, using both noisy and noise-free data. A concluding set of simulations demonstrates ART-EMAP performance on a difficult 3-D object recognition problem. Advanced Research Projects Agency (ONR N00014-92-J-4015); National Science Foundation (IRI-90-00530); Office of Naval Research (N00014-91-J-4100); Air Force Office of Scientific Research (90-0083
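The general idea behind Stages 2 and 3 above, delaying a decision while the prediction is ambiguous and accumulating evidence across views, can be sketched in a few lines. This is not the ART-EMAP network itself: the running-sum accumulator, the `threshold` parameter, and the function name `accumulate_evidence` are all assumptions made for illustration.

```python
def accumulate_evidence(view_predictions, threshold=0.65):
    """Sum per-view class scores until one class dominates.

    view_predictions: list of dicts mapping class label -> score for one view.
    threshold: assumed decision criterion, the share of accumulated evidence
               the leading class must hold before a label is committed.

    Returns (label, views_used); label is None if no class ever reaches
    the threshold, in which case identification stays deferred.
    """
    totals = {}
    for views_used, scores in enumerate(view_predictions, start=1):
        for label, score in scores.items():
            totals[label] = totals.get(label, 0.0) + score
        mass = sum(totals.values())
        if mass > 0:
            best = max(totals, key=totals.get)
            if totals[best] / mass >= threshold:
                return best, views_used
    return None, len(view_predictions)


if __name__ == "__main__":
    views = [
        {"cube": 0.5, "pyramid": 0.5},  # ambiguous view: defer
        {"cube": 0.6, "pyramid": 0.4},  # still below the criterion
        {"cube": 0.9, "pyramid": 0.1},  # accumulated evidence now suffices
    ]
    print(accumulate_evidence(views))
```

A single ambiguous view does not force a label here; the decision is made only once the accumulated share of evidence for one class passes the criterion.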

    Enhanced tracking and recognition of moving objects by reasoning about spatio-temporal continuity.

    A framework for the logical and statistical analysis and annotation of dynamic scenes containing occlusion and other uncertainties is presented. This framework consists of three elements: an object tracker module, an object recognition/classification module, and a logical consistency, ambiguity and error reasoning engine. The principle behind the object tracker and object recognition modules is to reduce error by increasing ambiguity (by merging objects in close proximity and presenting multiple hypotheses). The reasoning engine deals with error, ambiguity and occlusion in a unified framework to produce a hypothesis that satisfies fundamental constraints on the spatio-temporal continuity of objects. Our algorithm finds a globally consistent model of an extended video sequence that is maximally supported by a voting function based on the output of a statistical classifier. The system results in an annotation that is significantly more accurate than what would be obtained by frame-by-frame evaluation of the classifier output. The framework has been implemented and applied successfully to the analysis of team sports with a single camera. Key words: Visua

    Spatio-temporal interactive fusion based visual object tracking method

    Visual object tracking tasks often struggle with utilizing inter-frame correlation information and handling challenges like local occlusion, deformations, and background interference. To address these issues, this paper proposes a spatio-temporal interactive fusion (STIF) based visual object tracking method. The goal is to fully utilize spatio-temporal background information, enhance feature representation for object recognition, improve tracking accuracy, adapt to object changes, and reduce model drift. The proposed method incorporates feature-enhanced networks in both temporal and spatial dimensions. It leverages spatio-temporal background information to extract salient features that contribute to improved object recognition and tracking accuracy. Additionally, the model’s adaptability to object changes is enhanced, and model drift is minimized. A spatio-temporal interactive fusion network is employed to learn a similarity metric between the memory frame and the query frame by utilizing feature enhancement. This fusion network effectively filters for stronger feature representations through the interactive fusion of information. The proposed tracking method is evaluated on four challenging public datasets. The results demonstrate that the method achieves state-of-the-art (SOTA) performance and significantly improves tracking accuracy in complex scenarios affected by local occlusion, deformations, and background interference. Finally, the method achieves a success rate of 78.8% on TrackingNet, a large-scale tracking dataset.

    The 4-D approach to visual control of autonomous systems

    Development of a 4-D approach to dynamic machine vision is described. Core elements of this method are spatio-temporal models oriented towards objects and laws of perspective projection in a forward mode. Integration of multi-sensory measurement data was achieved through spatio-temporal models as invariants for object recognition. Situation assessment and long-term predictions were allowed through maintenance of a symbolic 4-D image of processes involving objects. Behavioral capabilities were easily realized by state feedback and feed-forward control.