Search CORE

570 research outputs found

Classification of Overlapped Audio Events Based on AT, PLSA, and the Combination of Them

Author: Cheng C. F.
Fang J.
Leng Y.
Li D. W.
Li S.
Sun C. L.
Wan H. L.
Xu X. Y.
Publication venue: 'Brno University of Technology'
Publication date: 01/06/2015
Field of study

Audio event classification, as an important part of Computational Auditory Scene Analysis, has attracted much attention. Currently, the classification technology is mature enough to classify isolated audio events accurately, but for overlapped audio events, it performs much worse. While in real life, most audio documents would have certain percentage of overlaps, and so the overlap classification problem is an important part of audio classification. Nowadays, the work on overlapped audio event classification is still scarce, and most existing overlap classification systems can only recognize one audio event for an overlap. In this paper, in order to deal with overlaps, we innovatively introduce the author-topic (AT) model which was first proposed for text analysis into audio classification, and innovatively combine it with PLSA (Probabilistic Latent Semantic Analysis). We propose 4 systems, i.e. AT, PLSA, AT-PLSA and PLSA-AT, to classify overlaps. The 4 proposed systems have the ability to recognize two or more audio events for an overlap. The experimental results show that the 4 systems perform well in classifying overlapped audio events, whether it is the overlap in training set or the overlap out of training set. Also they perform well in classifying isolated audio events

Directory of Open Access Journals

Digital library of Brno University of Technology

Language modeling and transcription of the TED corpus lectures

Author: Cettolo M.
Federico M.
Leeuwis E.
Publication venue: IEEE
Publication date: 01/01/2003
Field of study

Transcribing lectures is a challenging task, both in acoustic and in language modeling. In this work, we present our first results on the automatic transcription of lectures from the TED corpus, recently released by ELRA and LDC. In particular, we concentrated our effort on language modeling. Baseline acoustic and language models were developed using respectively 8 hours of TED transcripts and various types of texts: conference proceedings, lecture transcripts, and conversational speech transcripts. Then, adaptation of the language model to single speakers was investigated by exploiting different kinds of information: automatic transcripts of the talk, the title of the talk, the abstract and, finally, the paper. In the last case, a 39.2% WER was achieved

Archivio della ricerca - Fondazione Bruno Kessler

University of Twente Research Information

Context based multimedia information retrieval

Author: Mølgaard Lasse Lohilahti
Publication venue: Technical University of Denmark
Publication date: 01/12/2009
Field of study

Online Research Database In Technology

Smartphone picture organization: a hierarchical approach

Author: Dimiccoli Mariella
Lonn Stefan
Radeva Petia
Publication venue: 'Elsevier BV'
Publication date: 01/01/2019
Field of study

We live in a society where the large majority of the population has a camera-equipped smartphone. In addition, hard drives and cloud storage are getting cheaper and cheaper, leading to a tremendous growth in stored personal photos. Unlike photo collections captured by a digital camera, which typically are pre-processed by the user who organizes them into event-related folders, smartphone pictures are automatically stored in the cloud. As a consequence, photo collections captured by a smartphone are highly unstructured and because smartphones are ubiquitous, they present a larger variability compared to pictures captured by a digital camera. To solve the need of organizing large smartphone photo collections automatically, we propose here a new methodology for hierarchical photo organization into topics and topic-related categories. Our approach successfully estimates latent topics in the pictures by applying probabilistic Latent Semantic Analysis, and automatically assigns a name to each topic by relying on a lexical database. Topic-related categories are then estimated by using a set of topic-specific Convolutional Neuronal Networks. To validate our approach, we ensemble and make public a large dataset of more than 8,000 smartphone pictures from 40 persons. Experimental results demonstrate major user satisfaction with respect to state of the art solutions in terms of organization.Peer ReviewedPreprin

arXiv.org e-Print Archive

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Digital.CSIC

Recommended from our members

Audio-Based Semantic Concept Classification for Consumer Video

Author: Ellis Daniel P. W.
Lee Keansub
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2010
Field of study

This paper presents a novel method for automatically classifying consumer video clips based on their soundtracks. We use a set of 25 overlapping semantic classes, chosen for their usefulness to users, viability of automatic detection and of annotator labeling, and sufficiency of representation in available video collections. A set of 1873 videos from real users has been annotated with these concepts. Starting with a basic representation of each video clip as a sequence of mel-frequency cepstral coefficient (MFCC) frames, we experiment with three clip-level representations: single Gaussian modeling, Gaussian mixture modeling, and probabilistic latent semantic analysis of a Gaussian component histogram. Using such summary features, we produce support vector machine (SVM) classifiers based on the Kullback-Leibler, Bhattacharyya, or Mahalanobis distance measures. Quantitative evaluation shows that our approaches are effective for detecting interesting concepts in a large collection of real-world consumer video clips

Columbia University Academic Commons

Automatic Term Identification for Bibliometric Mapping

Author: Buter R.K. (Reindert)
Eck N.J.P. (Nees Jan) van
Noyons E.C.M. (Ed)
Waltman L. (Ludo)
Publication venue: Eck, N.J.P. (Nees Jan) van
Publication date: 03/12/2008
Field of study

A term map is a map that visualizes the structure of a scientific field by showing the relations between important terms in the field. The terms shown in a term map are usually selected manually with the help of domain experts. Manual term selection has the disadvantages of being subjective and labor-intensive. To overcome these disadvantages, we propose a methodology for automatic term identification and we use this methodology to select the terms to be included in a term map. To evaluate the proposed methodology, we use it to construct a term map of the field of operations research. The quality of the map is assessed by a number of operations research experts. It turns out that in general the proposed methodology performs quite well

Erasmus University Digital Repository

Biomedical text mining: State-of-the-art, open problems and future challenges

Author: Holzinger Andreas
Schantl Johannes
Schroettner Miriam
Seifert Christin
Verspoor Karin
Publication venue: Springer
Publication date: 01/01/2014
Field of study

University of Twente Research Information