939 research outputs found

Multi-label Ferns for Efficient Recognition of Musical Instruments in Recordings

Author: A.A. Wieczorkowska
D. Niewiadomy
E. Kubera
J.G.A. Barbedo
K. Kashino
L. Breiman
S. Essid
T. Kitahara
W. Jiang
Publication date: 01/01/2014

In this paper we introduce multi-label ferns and apply the technique to the automatic classification of musical instruments in audio recordings. We compare the performance of the proposed method with that of a set of binary random ferns, using jazz recordings as input data. Our main result is much faster classification and a higher F-score. We also achieve a substantial reduction in model size.

arXiv.org e-Print Archive

Crossref

Adaptive Multi-Class Audio Classification in Noisy In-Vehicle Environment

Author: Alsaadan Haitham
Eun Yongsoon
Won Myounggyu
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 21/03/2017

With the ever-increasing number and complexity of car-mounted electric devices, audio classification is increasingly important for the automotive industry as a fundamental tool for human-device interaction. Existing approaches to audio classification, however, fall short because they do not properly account for the unique and dynamic audio characteristics of in-vehicle environments. In this paper, we develop an audio classification system that classifies an audio stream into music, speech, speech+music, and noise, adapting to driving environments including highway, local road, crowded city, and stopped vehicle. More than 420 minutes of audio data, covering various genres of music, speech, speech+music, and noise, were collected from diverse driving environments. The results demonstrate that the proposed approach improves the average classification accuracy by up to 166% for speech and 64% for speech+music, compared with a non-adaptive approach in our experimental settings.

arXiv.org e-Print Archive

University of Memphis Digital Commons

Effect of nano black rice husk ash on the chemical and physical properties of porous concrete pavement

Author: Ali Mohamad Idris
Arshad Mohd Fadzil
Awang Haryati
Hainin Mohd Rosli
Mohd Yusak Mohd Ibrahim
Putra Jaya Ramadhansyah
Wan Ibrahim Mohd Haziman
Publication venue: 'Southwest Jiaotong University'
Publication date: 01/01/2018

Black rice husk is a waste product of the rice industry. The main inorganic constituent of rice husk has been found to be silica. In this study, the effect of nano-sized black rice husk ash (BRHA) on the chemical and physical properties of concrete pavement was investigated. BRHA produced by uncontrolled burning at a rice factory was collected and then ground in a laboratory mill with steel balls and steel rods. Four different grinding grades of BRHA were examined. A rice husk ash dosage of 10% by weight of binder was used throughout the experiments. The chemical and physical properties of the nano BRHA mixtures were evaluated using a fineness test, X-ray fluorescence spectrometry (XRF), and X-ray diffraction (XRD). In addition, a compressive strength test was used to evaluate the performance of the porous concrete pavement. In general, the results show that the optimum grinding time was 63 hours. The results also indicate that nano black rice husk ash ground for 63 hours produced concrete with good strength

UTHM Institutional Repository

Indexing of fictional video content for event detection and summarisation

Author: Lee Hyowon
Lehane Bart
O'Connor Noel E.
Smeaton Alan F.
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2007

This paper presents an approach to movie video indexing that uses audiovisual analysis to detect important and meaningful temporal video segments, which we term events. We consider three event classes, corresponding to dialogues, action sequences, and montages, where the latter also includes musical sequences. These three event classes are intuitive for a viewer to understand and recognise, and they account for over 90% of the content of most movies. To detect events we draw on traditional filmmaking principles and map them to a set of computable low-level audiovisual features. Finite state machines (FSMs) are used to detect when temporal sequences of specific features occur. A set of heuristics, again inspired by filmmaking conventions, is then applied to the output of multiple FSMs to detect the required events. A movie search system, named MovieBrowser, built upon this approach is also described. The overall approach is evaluated against a ground truth of over twenty-three hours of movie content drawn from various genres and consistently obtains high precision and recall for all event classes. A user experiment designed to evaluate the usefulness of an event-based structure for both searching and browsing movie archives is also described, and its results indicate the usefulness of the proposed approach.

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

Irish Universities

DCU Online Research Access Service

Data-Driven Audio Feature Space Clustering for Automatic Sound Recognition in Radio Broadcast News

Author: Alexandros Lazaridis
Casey M.
Dempster A. P.
Eyben F.
Iosif Mporas
Nikos Fakotakis
Perperis T.
Theodoros Theodorou
Wollmer M.
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 23/12/2016

This is an Open Access article published by World Scientific Publishing Company. It is distributed under the terms of the Creative Commons Attribution 4.0 (CC-BY) License. Further distribution of this work is permitted, provided the original work is properly cited. T. Theodorou, I. Mporas, A. Lazaridis, N. Fakotakis, 'Data-Driven Audio Feature Space Clustering for Automatic Sound Recognition in Radio Broadcast News', International Journal on Artificial Intelligence Tools, Vol. 26 (2), April 2017, 1750005 (13 pages), DOI: 10.1142/S021821301750005. © The Author(s). In this paper we describe an automatic sound recognition scheme for radio broadcast news based on principal component clustering with respect to the discrimination ability of the principal components. Specifically, streams of broadcast news transmissions, labeled according to the audio event, are decomposed using a large set of audio descriptors and projected into the principal component space. A data-driven algorithm clusters the components by relevance. The component subspaces are then used by the sound type classifier. With this methodology, the k-nearest neighbor and the artificial intelligent network classifiers provide good results. The methodology also showed that discarding unnecessary dimensions works in favor of the outcome, as it hardly deteriorates the effectiveness of the algorithms. Peer reviewed.

Crossref

University of Hertfordshire Research Archive

Dual shots detection

Author: Juhár Jozef
Vozáriková Eva
Čižmár Anton
Publication venue: Vysoká škola báňská - Technická univerzita Ostrava
Publication date: 01/01/2012

This work describes the identification of a special kind of acoustic event, namely dual gunshots and single gunshots against a traffic background. Recognising dangerous sounds may help to prevent abnormal or criminal activities near public transport stations. In this paper, therefore, a methodology for dual shot detection in a noisy background was developed and evaluated. For this purpose, we investigated various feature extraction methods and combinations of different feature sets. These approaches were evaluated with a widely used classification technique based on Hidden Markov Models.

Directory of Open Access Journals

DSpace at VSB Technical University of Ostrava

Analysis and recognition of similar environmental sounds

Author: Rodeia José Pedro dos Santos
Publication venue: FCT - UNL
Publication date: 01/01/2009

Dissertation presented at the Faculdade de Ciências e Tecnologia of the Universidade Nova de Lisboa to obtain the degree of Master in Computer Engineering (Engenharia Informática). Humans have the ability to identify sound sources just by hearing a sound. Adapting the same problem to computers is called (automatic) sound recognition. Several sound recognizers have been developed over the years. The accuracy of these recognizers is influenced by the features they use and the classification method they implement. While there are many approaches to sound feature extraction and sound classification, most have been used to classify sounds with very different characteristics. Here, we implemented a recognizer for similar sounds. This recognizer works with sounds that have very similar properties, which makes the recognition process harder. Therefore, we use both temporal and spectral properties of the sound. These properties are extracted using the Intrinsic Structures Analysis (ISA) method, which uses Independent Component Analysis and Principal Component Analysis. The classification method is based on the k-Nearest Neighbor algorithm. We show that the features extracted in this way are powerful for sound recognition. We tested our recognizer with several of the feature sets that the ISA method retrieves and achieved great results. Finally, we conducted a user study to compare human performance in distinguishing similar sounds against that of our recognizer. The study allowed us to conclude that the sounds are in fact very similar and difficult to distinguish, and that our recognizer is much better than humans at identifying them.

Repositório da Universidade Nova de Lisboa

Speaker segmentation and clustering

Author: Ajmera
Almpanidis
Barras
Bimbot
Campbell
Cettolo
Constantine Kotropoulos
Delacourt
Deller
Fiscus
Gales
Garofolo
Godfrey
Graff
Hansen
Harb
Hess
Huang
Jain
Kim
Know
Lapidot
Lu
Manjunath
Margarita Kotti
Meignier
Oppenheim
Pellom
Reynolds
Sondhi
Tranter
Vassiliki Moschou
Ververidis
Wang
Wu
Zhou
Zhu
Publication venue: 'Elsevier BV'
Publication date: 01/01/2008

This survey focuses on two challenging speech processing topics: speaker segmentation and speaker clustering. Speaker segmentation aims at finding speaker change points in an audio stream, whereas speaker clustering aims at grouping speech segments based on speaker characteristics. Model-based, metric-based, and hybrid speaker segmentation algorithms are reviewed. For speaker clustering, deterministic and probabilistic algorithms are examined. A comparative assessment of the reviewed algorithms is undertaken, their advantages and disadvantages are indicated, insight into the algorithms is offered, and deductions as well as recommendations are given. Rich transcription and movie analysis are candidate applications that benefit from combined speaker segmentation and clustering. © 2007 Elsevier B.V. All rights reserved.

CiteSeerX

Crossref

Spiral - Imperial College Digital Repository