12,261 research outputs found

AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions

Authors: Gu Chunhui, Li Yeqing, Malik Jitendra, Pantofaru Caroline, Ricco Susanna, Ross David A., Schmid Cordelia, Sukthankar Rahul, Sun Chen, Toderici George, Vijayanarasimhan Sudheendra, Vondrick Carl
Publication date: 30/04/2018

This paper introduces a video dataset of spatio-temporally localized Atomic Visual Actions (AVA). The AVA dataset densely annotates 80 atomic visual actions in 430 15-minute video clips, where actions are localized in space and time, resulting in 1.58M action labels with multiple labels per person occurring frequently. The key characteristics of our dataset are: (1) the definition of atomic visual actions, rather than composite actions; (2) precise spatio-temporal annotations with possibly multiple annotations for each person; (3) exhaustive annotation of these atomic actions over 15-minute video clips; (4) people temporally linked across consecutive segments; and (5) using movies to gather a varied set of action representations. This departs from existing datasets for spatio-temporal action recognition, which typically provide sparse annotations for composite actions in short video clips. We will release the dataset publicly. AVA, with its realistic scene and action complexity, exposes the intrinsic difficulty of action recognition. To benchmark this, we present a novel approach for action localization that builds upon the current state-of-the-art methods, and demonstrates better performance on JHMDB and UCF101-24 categories. While setting a new state of the art on existing datasets, the overall results on AVA are low at 15.6% mAP, underscoring the need for developing new approaches for video understanding.

Comment: To appear in CVPR 2018. Check the dataset page https://research.google.com/ava/ for details.

arXiv.org e-Print Archive

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Object classification methods for application in FPGA based vehicle video detector

Author: Wiesław PAMUŁA
Publication venue: Silesian University of Technology
Publication date: 01/01/2009

The paper discusses the properties of object classification methods used in processing video streams from a camera. Methods based on feature extraction, model fitting and invariant determination are evaluated. Petri nets are used for modelling the processing flow. Data objects and transitions suitable for efficient implementation in FPGA circuits are defined. Processing characteristics and problems of the implementations are shown. An invariant-based method is assessed as the most suitable for application in a vehicle video detector.

Directory of Open Access Journals

DRSP : Dimension Reduction For Similarity Matching And Pruning Of Time Series Data Streams

Authors: Patnaik L M, Samartha T V, Srikantaiah K C, Venugopal K R, Vishwanath R H
Publication venue: Academy and Industry Research Collaboration Center (AIRCC)
Publication date: 01/01/2013

Similarity matching and joining of time series data streams have gained considerable relevance in today's world of large streaming data. This process finds wide-scale application in areas such as location tracking, sensor networks, and object positioning and monitoring. However, as the size of the data stream increases, so does the cost of retaining all the data needed for similarity matching. We develop a novel framework to address the following objectives. First, dimension reduction is performed in the preprocessing stage, where large stream data is segmented and reduced to a compact representation that retains all the crucial information, using a technique called Multi-level Segment Means (MSM). This reduces the space complexity associated with storing large time-series data streams. Second, the framework incorporates an effective similarity matching technique to analyze whether new data objects are symmetric to the existing data stream. Finally, a pruning technique filters out the pseudo data object pairs and joins only the relevant pairs. The computational cost of MSM is O(l*ni) and the cost of pruning is O(DRF*wsize*d), where DRF is the Dimension Reduction Factor. We have performed exhaustive experimental trials to show that the proposed framework is both efficient and competent in comparison with earlier works.

Comment: 20 pages, 8 figures, 6 tables
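
The segment-means idea in the abstract above can be illustrated with a minimal sketch. It is not taken from the paper: it assumes that level j splits a window into 2^(j-1) equal-length segments and that pairs are compared by Euclidean distance, and the function names (`segment_means`, `lower_bound_distance`, `can_prune`) are hypothetical. The pruning step relies on the standard fact that the distance between segment means, scaled by the square root of the segment length, never exceeds the true Euclidean distance.

```python
from math import sqrt


def segment_means(window, level):
    """Reduce `window` to the means of 2**(level - 1) equal-length segments.

    Assumption: the level-to-segment-count mapping is illustrative only.
    """
    n_segments = 2 ** (level - 1)
    if len(window) % n_segments != 0:
        raise ValueError("window length must be divisible by the segment count")
    size = len(window) // n_segments
    return [sum(window[i:i + size]) / size for i in range(0, len(window), size)]


def lower_bound_distance(a, b, level):
    """Lower bound on the Euclidean distance between equal-length windows a and b."""
    size = len(a) // (2 ** (level - 1))
    means_a = segment_means(a, level)
    means_b = segment_means(b, level)
    return sqrt(size * sum((x - y) ** 2 for x, y in zip(means_a, means_b)))


def can_prune(a, b, level, threshold):
    """A pair can be discarded safely when even the lower bound exceeds the threshold."""
    return lower_bound_distance(a, b, level) > threshold
```

For example, `segment_means([1, 2, 3, 4, 5, 6, 7, 8], 2)` reduces eight samples to the two values `[2.5, 6.5]`. Because the bound never overestimates the true distance, pruning on it cannot discard a pair that actually lies within the threshold.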

arXiv.org e-Print Archive

ePrints@Bangalore University