Search CORE

9,531 research outputs found

VSCAN: An Enhanced Video Summarization using Density-based Spatial Clustering

Author: A. Girgensohn
B.T. Truong
F.J. Aherne
H.M. Blanken
M. Furini
M. Parimala
M. Singha
M.J. Swain
P. Mundur
R.S. Stanković
S. Pfeiffer
S.E.F. Avila de
T. Kailath
T. Liu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013
Field of study

In this paper, we present VSCAN, a novel approach for generating static video summaries. This approach is based on a modified DBSCAN clustering algorithm to summarize the video content utilizing both color and texture features of the video frames. The paper also introduces an enhanced evaluation method that depends on color and texture features. Video Summaries generated by VSCAN are compared with summaries generated by other approaches found in the literature and those created by users. Experimental results indicate that the video summaries generated by VSCAN have a higher quality than those generated by other approaches.Comment: arXiv admin note: substantial text overlap with arXiv:1401.3590 by other authors without attributio

arXiv.org e-Print Archive

Crossref

A generic framework for video understanding applied to group behavior recognition

Author: Boulay Bernard
Bremond François
Zaidenberg Sofia
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 22/06/2012
Field of study

This paper presents an approach to detect and track groups of people in video-surveillance applications, and to automatically recognize their behavior. This method keeps track of individuals moving together by maintaining a spacial and temporal group coherence. First, people are individually detected and tracked. Second, their trajectories are analyzed over a temporal window and clustered using the Mean-Shift algorithm. A coherence value describes how well a set of people can be described as a group. Furthermore, we propose a formal event description language. The group events recognition approach is successfully validated on 4 camera views from 3 datasets: an airport, a subway, a shopping center corridor and an entrance hall.Comment: (20/03/2012

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Safe, Remote-Access Swarm Robotics Research on the Robotarium

Author: Ames Aaron
Diaz-Mercado Yancy
Egerstedt Magnus
Feron Eric
Glotfelter Paul
Mote Mark
Pickem Daniel
Wang Li
Publication venue
Publication date: 03/04/2016
Field of study

This paper describes the development of the Robotarium -- a remotely accessible, multi-robot research facility. The impetus behind the Robotarium is that multi-robot testbeds constitute an integral and essential part of the multi-agent research cycle, yet they are expensive, complex, and time-consuming to develop, operate, and maintain. These resource constraints, in turn, limit access for large groups of researchers and students, which is what the Robotarium is remedying by providing users with remote access to a state-of-the-art multi-robot test facility. This paper details the design and operation of the Robotarium as well as connects these to the particular considerations one must take when making complex hardware remotely accessible. In particular, safety must be built in already at the design phase without overly constraining which coordinated control programs the users can upload and execute, which calls for minimally invasive safety routines with provable performance guarantees.Comment: 13 pages, 7 figures, 3 code samples, 72 reference

arXiv.org e-Print Archive

Caltech Authors

Statistical framework for video decoding complexity modeling and prediction

Author: Andreopoulos Y
Kontorinis N
Van Der Schaar M
Publication venue
Publication date: 08/04/2009
Field of study

Video decoding complexity modeling and prediction is an increasingly important issue for efficient resource utilization in a variety of applications, including task scheduling, receiver-driven complexity shaping, and adaptive dynamic voltage scaling. In this paper we present a novel view of this problem based on a statistical framework perspective. We explore the statistical structure (clustering) of the execution time required by each video decoder module (entropy decoding, motion compensation, etc.) in conjunction with complexity features that are easily extractable at encoding time (representing the properties of each module's input source data). For this purpose, we employ Gaussian mixture models (GMMs) and an expectation-maximization algorithm to estimate the joint execution-time - feature probability density function (PDF). A training set of typical video sequences is used for this purpose in an offline estimation process. The obtained GMM representation is used in conjunction with the complexity features of new video sequences to predict the execution time required for the decoding of these sequences. Several prediction approaches are discussed and compared. The potential mismatch between the training set and new video content is addressed by adaptive online joint-PDF re-estimation. An experimental comparison is performed to evaluate the different approaches and compare the proposed prediction scheme with related resource prediction schemes from the literature. The usefulness of the proposed complexity-prediction approaches is demonstrated in an application of rate-distortion-complexity optimized decoding

UCL Discovery

Rejection-Cascade of Gaussians: Real-time adaptive background subtraction framework

Author: B Valentine
BR Kiran
J Zuo
T Bouwmans
Publication venue
Publication date: 16/11/2019
Field of study

Background-Foreground classification is a well-studied problem in computer vision. Due to the pixel-wise nature of modeling and processing in the algorithm, it is usually difficult to satisfy real-time constraints. There is a trade-off between the speed (because of model complexity) and accuracy. Inspired by the rejection cascade of Viola-Jones classifier, we decompose the Gaussian Mixture Model (GMM) into an adaptive cascade of Gaussians(CoG). We achieve a good improvement in speed without compromising the accuracy with respect to the baseline GMM model. We demonstrate a speed-up factor of 4-5x and 17 percent average improvement in accuracy over Wallflowers surveillance datasets. The CoG is then demonstrated to over the latent space representation of images of a convolutional variational autoencoder(VAE). We provide initial results over CDW-2014 dataset, which could speed up background subtraction for deep architectures.Comment: Accepted for National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG 2019

arXiv.org e-Print Archive

Crossref

Image mining: trends and developments

Author: Hsu Wynne
Lee Mong Li
Zhang Ji
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2002
Field of study

[Abstract]: Advances in image acquisition and storage technology have led to tremendous growth in very large and detailed image databases. These images, if analyzed, can reveal useful information to the human users. Image mining deals with the extraction of implicit knowledge, image data relationship, or other patterns not explicitly stored in the images. Image mining is more than just an extension of data mining to image domain. It is an interdisciplinary endeavor that draws upon expertise in computer vision, image processing, image retrieval, data mining, machine learning, database, and artificial intelligence. In this paper, we will examine the research issues in image mining, current developments in image mining, particularly, image mining frameworks, state-of-the-art techniques and systems. We will also identify some future research directions for image mining

University of Southern Queensland ePrints