356 research outputs found

Fusion for Audio-Visual Laughter Detection

Author: Reuderink B.
Publication venue: Centre for Telematics and Information Technology, University of Twente
Publication date: 01/01/2007

Laughter is a highly variable signal and can express a spectrum of emotions, which makes its automatic detection a challenging but interesting task. We perform automatic laughter detection using audio-visual data from the AMI Meeting Corpus. Audio-visual laughter detection is performed by combining (fusing) the results of separate audio and video classifiers at the decision level. The video classifier uses features based on the principal components of 20 tracked facial points; for audio we use the commonly used PLP and RASTA-PLP features. Our results indicate that RASTA-PLP features outperform PLP features for laughter detection in audio. We compared classifiers based on hidden Markov models (HMMs), Gaussian mixture models (GMMs) and support vector machines (SVMs), and found that RASTA-PLP combined with a GMM resulted in the best performance for the audio modality. The video features classified using an SVM resulted in the best single-modality performance. Fusion at the decision level resulted in laughter detection with significantly better performance than single-modality classification.

University of Twente Research Information
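
The decision-level fusion described in the abstract above can be illustrated with a minimal sketch. The abstract does not state which fusion rule was used, so the weighted average below, the function name `fuse_decisions`, the weight, the threshold, and the example probabilities are all assumptions made for illustration.

```python
def fuse_decisions(audio_probs, video_probs, audio_weight=0.5, threshold=0.5):
    """Late (decision-level) fusion of two classifiers.

    Each input holds one laughter probability per segment. The weighted
    average is an assumed fusion rule, not the one from the paper.
    """
    if len(audio_probs) != len(video_probs):
        raise ValueError("both classifiers must score the same segments")
    video_weight = 1.0 - audio_weight
    fused = [audio_weight * a + video_weight * v
             for a, v in zip(audio_probs, video_probs)]
    # A segment is labelled as laughter when the fused score reaches the threshold.
    labels = [p >= threshold for p in fused]
    return fused, labels


# Hypothetical per-segment outputs of an audio and a video classifier.
audio = [0.9, 0.2, 0.6]
video = [0.7, 0.1, 0.3]
fused, labels = fuse_decisions(audio, video)
```

Because each classifier is trained and scored separately, either modality can be swapped out without retraining the other, which is the usual appeal of fusing at the decision level.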

Learning Deep Representations of Appearance and Motion for Anomalous Event Detection

Authors: Ricci Elisa; Sebe Nicu; Song Jingkuan; Xu Dan; Yan Yan
Publication date: 01/01/2015

We present a novel unsupervised deep learning framework for anomalous event detection in complex video scenes. While most existing works merely use hand-crafted appearance and motion features, we propose Appearance and Motion DeepNet (AMDN), which utilizes deep neural networks to automatically learn feature representations. To exploit the complementary information of both appearance and motion patterns, we introduce a novel double fusion framework, combining the benefits of traditional early fusion and late fusion strategies. Specifically, stacked denoising autoencoders are proposed to separately learn both appearance and motion features as well as a joint representation (early fusion). Based on the learned representations, multiple one-class SVM models are used to predict the anomaly scores of each input, which are then integrated with a late fusion strategy for final anomaly detection. We evaluate the proposed method on two publicly available video surveillance datasets, showing competitive performance with respect to state-of-the-art approaches. Comment: Oral paper in BMVC 201

arXiv.org e-Print Archive

Crossref
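
The difference between the early and late fusion stages mentioned in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: in the paper the joint representation is learned by stacked denoising autoencoders, whereas here early fusion is reduced to plain feature concatenation, and the function names, weights, and scores are hypothetical.

```python
def early_fusion(appearance_features, motion_features):
    """Early fusion, simplified: join the two feature vectors into one input."""
    return list(appearance_features) + list(motion_features)


def late_fusion(anomaly_scores, weights):
    """Late fusion: weighted mean of the scores from several detectors."""
    if len(anomaly_scores) != len(weights):
        raise ValueError("one weight per score is required")
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, anomaly_scores)) / total


# Hypothetical feature vectors for one video patch.
joint_input = early_fusion([0.1, 0.2], [0.3])

# Hypothetical scores from appearance, motion and joint one-class models.
final_score = late_fusion([1.0, 0.0, 0.5], weights=[1.0, 1.0, 2.0])
```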

Insights from Classifying Visual Concepts with Multiple Kernel Learning

Authors: Binder Alexander; Brefeld Ulf; Kawanabe Motoaki; Kloft Marius; Müller Christina; Müller Klaus-Robert; Nakajima Shinichi; Samek Wojciech
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2011

Combining information from various image features has become a standard technique in concept recognition tasks. However, the optimal way of fusing the resulting kernel functions is usually unknown in practical applications. Multiple kernel learning (MKL) techniques make it possible to determine an optimal linear combination of such similarity matrices. Classical approaches to MKL promote sparse mixtures. Unfortunately, so-called 1-norm MKL variants are often observed to be outperformed by an unweighted sum kernel. The contribution of this paper is twofold: we apply a recently developed non-sparse MKL variant to state-of-the-art concept recognition tasks within computer vision, and we provide insights on the benefits and limits of non-sparse MKL and compare it against its direct competitors, the sum kernel SVM and the sparse MKL. We report empirical results for the PASCAL VOC 2009 Classification and ImageCLEF2010 Photo Annotation challenge data sets. About to be submitted to PLoS ONE. Comment: 18 pages, 8 tables, 4 figures, format deviating from PLoS ONE submission format requirements for aesthetic reasons

arXiv.org e-Print Archive

TUbiblio

Directory of Open Access Journals

Fraunhofer-ePrints

PubMed Central
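
The linear kernel combination discussed in the abstract above can be sketched as follows. The sketch only applies given weights to precomputed kernel matrices; it does not learn them, which is the actual task of MKL. The helper names, the example matrices, and the choice of norm are assumptions for illustration.

```python
def lp_normalise(weights, p):
    """Scale kernel weights to unit l_p norm (p = 1 favours sparse mixtures)."""
    norm = sum(abs(w) ** p for w in weights) ** (1.0 / p)
    return [w / norm for w in weights]


def combine_kernels(kernels, weights):
    """Weighted sum of square kernel (similarity) matrices of equal size."""
    n = len(kernels[0])
    combined = [[0.0] * n for _ in range(n)]
    for kernel, weight in zip(kernels, weights):
        for i in range(n):
            for j in range(n):
                combined[i][j] += weight * kernel[i][j]
    return combined


# Two hypothetical 2x2 kernel matrices from different image features.
k_colour = [[1.0, 0.5], [0.5, 1.0]]
k_texture = [[1.0, 0.0], [0.0, 1.0]]

# The unweighted sum kernel is the special case where every weight is 1.
sum_kernel = combine_kernels([k_colour, k_texture], [1.0, 1.0])

# A non-sparse mixture, here with weights scaled to unit 2-norm.
weights = lp_normalise([3.0, 4.0], p=2)
mixed_kernel = combine_kernels([k_colour, k_texture], weights)
```

The combined matrix can then be passed to any SVM implementation that accepts a precomputed kernel.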

Artifact removal from electroencephalograms using a hybrid BSS-SVM algorithm

Authors: Jonathon Chambers; Leor Shoker; Saeid Sanei
Publication date: 01/01/2005

Artifacts such as eye blinks and heart rhythm (ECG) are the main interfering signals within electroencephalogram (EEG) measurements. Therefore, we propose a method for artifact removal based on exploitation of certain carefully chosen statistical features of independent components extracted from the EEGs, by fusing support vector machines (SVMs) and blind source separation (BSS). We use the second-order blind identification (SOBI) algorithm to separate the EEG into statistically independent sources, and SVMs to identify the artifact components so that such signals can be removed. The remaining independent components are remixed to reproduce the artifact-free EEGs. Objective and subjective assessment of the simulation results shows that the algorithm is successful in mitigating the interference within EEGs.

Loughborough University Institutional Repository

Online Research @ Cardiff

Surrey Research Insight
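
The remixing step described in the abstract above can be sketched as follows. The sketch assumes the mixing matrix and the separated sources are already available (for example from SOBI) and that a classifier has already flagged the artifact components; the function names and the toy numbers are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def remove_artifacts(mixing, sources, artifact_indices):
    """Zero the flagged components, then remix the rest into channel space.

    mixing:  channels x components matrix estimated by the BSS step
    sources: components x samples matrix of separated sources
    """
    cleaned = [[0.0] * len(row) if i in artifact_indices else list(row)
               for i, row in enumerate(sources)]
    return matmul(mixing, cleaned)


# Toy example: two channels, two components, three samples.
mixing = [[1.0, 1.0], [1.0, -1.0]]
sources = [[1.0, 2.0, 3.0],      # component 0: assumed brain activity
           [10.0, 10.0, 10.0]]   # component 1: assumed eye-blink artifact
clean_eeg = remove_artifacts(mixing, sources, artifact_indices={1})
```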

GOGMA: Globally-Optimal Gaussian Mixture Alignment

Authors: Campbell Dylan; Petersson Lars
Publication date: 01/03/2016

Gaussian mixture alignment is a family of approaches that are frequently used for robustly solving the point-set registration problem. However, since they use local optimisation, they are susceptible to local minima and can only guarantee local optimality. Consequently, their accuracy is strongly dependent on the quality of the initialisation. This paper presents the first globally-optimal solution to the 3D rigid Gaussian mixture alignment problem under the L2 distance between mixtures. The algorithm, named GOGMA, employs a branch-and-bound approach to search the space of 3D rigid motions SE(3), guaranteeing global optimality regardless of the initialisation. The geometry of SE(3) was used to find novel upper and lower bounds for the objective function, and local optimisation was integrated into the scheme to accelerate convergence without voiding the optimality guarantee. The evaluation empirically supported the optimality proof and showed that the method performed much more robustly on two challenging datasets than an existing globally-optimal registration solution. Comment: Manuscript in press 2016 IEEE Conference on Computer Vision and Pattern Recognition

arXiv.org e-Print Archive

Crossref
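
The L2 distance between two Gaussian mixtures mentioned in the abstract above has a closed form, because the integral of a product of two Gaussians is itself a Gaussian evaluated at the difference of the means. The sketch below assumes isotropic components, represented as hypothetical `(weight, mean, variance)` tuples, and only evaluates the distance; it does not perform the branch-and-bound search over SE(3), and the paper's exact objective may differ.

```python
import math


def gauss_overlap(mean_a, var_a, mean_b, var_b):
    """Integral of the product of two isotropic Gaussians of equal dimension."""
    summed_var = var_a + var_b
    sq_dist = sum((a - b) ** 2 for a, b in zip(mean_a, mean_b))
    dim = len(mean_a)
    return ((2.0 * math.pi * summed_var) ** (-dim / 2.0)
            * math.exp(-sq_dist / (2.0 * summed_var)))


def cross_term(f, g):
    """Integral of f(x) * g(x) for mixtures of (weight, mean, variance) tuples."""
    return sum(wf * wg * gauss_overlap(mf, vf, mg, vg)
               for wf, mf, vf in f
               for wg, mg, vg in g)


def l2_distance(f, g):
    """Integral of (f(x) - g(x))^2, expanded into three closed-form terms."""
    return cross_term(f, f) - 2.0 * cross_term(f, g) + cross_term(g, g)


# Two hypothetical 3D mixtures; the second is the first shifted along x.
mixture_a = [(0.5, (0.0, 0.0, 0.0), 1.0), (0.5, (2.0, 0.0, 0.0), 1.0)]
mixture_b = [(0.5, (1.0, 0.0, 0.0), 1.0), (0.5, (3.0, 0.0, 0.0), 1.0)]
shifted_distance = l2_distance(mixture_a, mixture_b)
```

Under a rigid motion of isotropic components only the means move, so the two self-terms stay constant and an alignment method only has to optimise the cross term.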

Comparison of different classification algorithms for underwater target discrimination

Authors: Azimi-Sadjadi Mahmood R.; Li Donghui; Robinson Marc
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2004

Includes bibliographical references. Classification of underwater targets from the acoustic backscattered signals is considered here. Several different classification algorithms are tested and benchmarked, not only for their performance but also to gain insight into the properties of the feature space. Results on a wideband 80-kHz acoustic backscattered data set collected for six different objects are presented in terms of the receiver operating characteristic (ROC) and the robustness of the classifiers with respect to reverberation. This work was supported by the Office of Naval Research, Biosonar Program, under Grant N00014-99-1-0166 and Grant N00014-01-1-0307. Data and technical support were provided by the NSWC, Coastal Systems Station, Panama City, FL.

Mountain Scholar (Digital Collections of Colorado and Wyoming)
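
The receiver operating characteristic (ROC) used for benchmarking in the abstract above can be computed from classifier scores as in the following sketch. The function names, scores, and labels are hypothetical, and the sketch assumes that no two samples share the same score.

```python
def roc_points(scores, labels):
    """ROC curve as (false positive rate, true positive rate) points.

    labels holds 1 for target and 0 for non-target. The decision threshold
    is swept from the highest score to the lowest.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    for _, label in sorted(zip(scores, labels), key=lambda pair: -pair[0]):
        if label:
            true_pos += 1
        else:
            false_pos += 1
        points.append((false_pos / negatives, true_pos / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical classifier scores and ground-truth labels.
scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]
curve = roc_points(scores, labels)
auc = area_under_curve(curve)
```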