119,889 research outputs found

    Statistical Analysis of Dynamic Actions

    Real-world action recognition applications require systems that are fast, can handle a large variety of actions without a priori knowledge of the type of actions, need a minimal number of parameters, and require as short a learning stage as possible. In this paper, we suggest such an approach. We regard dynamic activities as long-term temporal objects, which are characterized by spatio-temporal features at multiple temporal scales. Based on this, we design a simple statistical distance measure between video sequences which captures the similarities in their behavioral content. This measure is nonparametric and can thus handle a wide range of complex dynamic actions. Having a behavior-based distance measure between sequences, we use it for a variety of tasks, including video indexing, temporal segmentation, and action-based video clustering. These tasks are performed without prior knowledge of the types of actions, their models, or their temporal extents.

    On-the-go machine vision sensing of cotton plant geometric parameters: first results

    Plant geometrical parameters such as internode length (i.e. the distance between successive branches on the main stem) indicate water stress in cotton. This paper describes a machine vision system that has been designed to measure internode length for the purpose of determining real-time cotton plant irrigation requirement. The imaging system features an enclosure which continuously traverses the crop canopy and forces the flexible upper main stem of individual plants against a glass panel at the front of the enclosure, hence allowing images of the plant to be captured in a fixed object plane. Subsequent image processing of selected video sequences enabled detection of the main stem in 88% of frames. However, node detection was subject to a high false detection rate due to leaf edges present in the images. Manual identification of nodes in the acquired imagery enabled measurement of internode lengths with 3% standard error

    Appearance-Based Tracking and Face Identification in Video Sequences.

    We present a technique for face recognition in videos. We are able to recognise a face in a video sequence, given a single gallery image. By assuming that the face is in an approximately frontal position, we jointly model changes in facial appearance caused by identity and illumination. The identity of a face is described by a vector of appearance parameters. We use an angular distance to measure the similarity of faces and a probabilistic procedure to accumulate evidence for recognition along the sequence. We achieve 93.8% recognition success in a set of 65 sequences of 6 subjects from the La Cascia and Sclaroff database
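
    The angular distance mentioned in this abstract can be illustrated with a minimal sketch, assuming the common definition as the angle between two vectors. The abstract does not give the exact formula or the evidence-accumulation procedure, so the function name `angular_distance` and the sample vectors are hypothetical.

```python
import math


def angular_distance(u, v):
    """Angle in radians between two non-zero vectors of equal length.

    Assumption: 'angular distance' is taken to mean arccos of the cosine
    similarity; the paper's exact definition is not stated in the abstract.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("angular distance is undefined for a zero vector")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


# Hypothetical appearance-parameter vectors for a gallery face and a probe face.
gallery = (1.0, 0.0, 0.0)
probe = (0.0, 1.0, 0.0)
distance = angular_distance(gallery, probe)  # orthogonal vectors: a right angle
```

    Because the measure depends only on direction, scaling a vector leaves the distance unchanged.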

    Gesture Recognition Aplication based on Dynamic Time Warping (DTW) FOR Omni-Wheel Mobile Robot

    This project presents an omni-wheel robot that moves along a trajectory obtained from a gesture recognition system based on Dynamic Time Warping (DTW). A single camera provides the input to the system and also serves as the reference for the movement of the omni-wheel robot. Gesture recognition systems have been developed using various methods and approaches. The method used here, DTW, has the advantage of being able to calculate the distance between two data vectors of different lengths, so it can measure the similarity between two sequences that vary in time and speed. DTW is widely applied to video, audio, graphics, and other data that can be turned into a linear sequence for analysis. In short, it finds the best match by minimizing the difference between two multidimensional signals that have been compressed. The DTW method is expected to let the gesture recognition system work optimally, with sufficiently high accuracy and real-time processing.
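
    The ability of DTW to compare sequences of different lengths can be shown with a minimal sketch of the textbook dynamic-programming formulation. It assumes gestures are sequences of points and uses Euclidean distance as the local cost; the function name `dtw_distance` and the sample trajectories are hypothetical and not taken from the project.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Cumulative DTW cost between two non-empty sequences of points.

    Textbook O(len(seq_a) * len(seq_b)) dynamic programme with Euclidean
    local cost; the sequences may have different lengths.
    """
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")
    inf = float("inf")
    # cost[i][j] is the cheapest alignment of seq_a[:i] with seq_b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # seq_b[j-1] is matched to several points of seq_a
                cost[i][j - 1],      # seq_a[i-1] is matched to several points of seq_b
                cost[i - 1][j - 1],  # one-to-one step
            )
    return cost[n][m]


# The same diagonal stroke performed slowly and quickly (hypothetical data).
slow = [(0, 0), (0, 0), (1, 1), (2, 2), (2, 2)]
fast = [(0, 0), (1, 1), (2, 2)]
same_gesture_cost = dtw_distance(slow, fast)
```

    A recognizer built on this would compare an observed trajectory against stored templates and pick the template with the lowest cost.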

    UCMCTrack: Multi-Object Tracking with Uniform Camera Motion Compensation

    Multi-object tracking (MOT) in video sequences remains a challenging task, especially in scenarios with significant camera movements. This is because targets can drift considerably on the image plane, leading to erroneous tracking outcomes. Addressing such challenges typically requires supplementary appearance cues or Camera Motion Compensation (CMC). While these strategies are effective, they also introduce a considerable computational burden, posing challenges for real-time MOT. In response to this, we introduce UCMCTrack, a novel motion model-based tracker robust to camera movements. Unlike conventional CMC that computes compensation parameters frame-by-frame, UCMCTrack consistently applies the same compensation parameters throughout a video sequence. It employs a Kalman filter on the ground plane and introduces the Mapped Mahalanobis Distance (MMD) as an alternative to the traditional Intersection over Union (IoU) distance measure. By leveraging projected probability distributions on the ground plane, our approach efficiently captures motion patterns and adeptly manages uncertainties introduced by homography projections. Remarkably, UCMCTrack, relying solely on motion cues, achieves state-of-the-art performance across a variety of challenging datasets, including MOT17, MOT20, DanceTrack and KITTI. More details and code are available at https://github.com/corfyi/UCMCTrack. Comment: Accepted to AAAI 202
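
    For orientation, the following is a minimal sketch of the standard Mahalanobis distance between a 2-D ground-plane point and a Gaussian estimate. It is not the paper's Mapped Mahalanobis Distance, whose exact form the abstract does not give; the function name `mahalanobis_2d` and the sample numbers are hypothetical.

```python
import math


def mahalanobis_2d(point, mean, cov):
    """Standard Mahalanobis distance sqrt((x - mu)^T S^-1 (x - mu)) in 2-D.

    `cov` is a 2x2 covariance matrix given as ((a, b), (c, d)).
    """
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance must be positive definite")
    # The inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
    quad = (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det
    return math.sqrt(quad)


# Hypothetical track estimate that is more uncertain along x than along y.
track_mean = (0.0, 0.0)
track_cov = ((4.0, 0.0), (0.0, 1.0))
along_x = mahalanobis_2d((2.0, 0.0), track_mean, track_cov)
along_y = mahalanobis_2d((0.0, 2.0), track_mean, track_cov)
```

    The same 2-unit offset counts as smaller along the uncertain axis, which is the property that lets a covariance-aware distance tolerate projection uncertainty better than a fixed geometric overlap score.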

    Face detection and clustering for video indexing applications

    This paper describes a method for automatically detecting human faces in generic video sequences. We employ an iterative algorithm in order to give a confidence measure for the presence or absence of faces within video shots. Skin colour filtering is carried out on a selected number of frames per video shot, followed by the application of shape and size heuristics. Finally, the remaining candidate regions are normalized and projected into an eigenspace, the reconstruction error being the measure of confidence for presence/absence of face. Following this, the confidence score for the entire video shot is calculated. In order to cluster extracted faces into a set of face classes, we employ an incremental procedure using a PCA-based dissimilarity measure in conjunction with spatio-temporal correlation. Experiments were carried out on a representative broadcast news test corpus.

    Automated Top View Registration of Broadcast Football Videos

    In this paper, we propose a novel method to register football broadcast video frames on the static top view model of the playing surface. The proposed method is fully automatic in contrast to the current state of the art which requires manual initialization of point correspondences between the image and the static model. Automatic registration using existing approaches has been difficult due to the lack of sufficient point correspondences. We investigate an alternate approach exploiting the edge information from the line markings on the field. We formulate the registration problem as a nearest neighbour search over a synthetically generated dictionary of edge map and homography pairs. The synthetic dictionary generation allows us to exhaustively cover a wide variety of camera angles and positions and reduce this problem to a minimal per-frame edge map matching procedure. We show that the per-frame results can be improved in videos using an optimization framework for temporal camera stabilization. We demonstrate the efficacy of our approach by presenting extensive results on a dataset collected from matches of football World Cup 2014