216 research outputs found

    A survey on artificial intelligence-based acoustic source identification

    The concept of Acoustic Source Identification (ASI), which refers to the process of identifying noise sources, has attracted increasing attention in recent years. ASI technology can be used for surveillance, monitoring, and maintenance applications in a wide range of sectors, such as defence, manufacturing, healthcare, and agriculture. Acoustic signature analysis and pattern recognition remain the core technologies for noise source identification. Manual identification of acoustic signatures, however, has become increasingly challenging as dataset sizes grow. As a result, the use of Artificial Intelligence (AI) techniques for identifying noise sources has become increasingly relevant and useful. In this paper, we provide a comprehensive review of AI-based acoustic source identification techniques. We analyze the strengths and weaknesses of AI-based ASI processes and the associated methods proposed in the literature. Additionally, we conduct a detailed survey of ASI applications in machinery, underwater settings, environment/event source recognition, healthcare, and other fields. We also highlight relevant research directions.

    Text-Independent Speaker Identification Using the Histogram Transform Model

    Learning sound representations using trainable COPE feature extractors

    Sound analysis research has mainly focused on speech and music processing. The methodologies deployed there are not suitable for the analysis of sounds with varying background noise, in many cases with a very low signal-to-noise ratio (SNR). In this paper, we present a method for the detection of patterns of interest in audio signals. We propose novel trainable feature extractors, which we call COPE (Combination of Peaks of Energy). The structure of a COPE feature extractor is determined from a single prototype sound pattern in an automatic configuration process, which is a form of representation learning. We construct a set of COPE feature extractors, configured on a number of training patterns, and use their responses to build feature vectors that, in combination with a classifier, detect and classify patterns of interest in audio signals. We carried out experiments on four public data sets: MIVIA audio events, MIVIA road events, ESC-10, and TU Dortmund. The results we achieved (recognition rate equal to 91.71% on MIVIA audio events, 94% on MIVIA road events, 81.25% on ESC-10, and 94.27% on TU Dortmund) demonstrate the effectiveness of the proposed method and are higher than those obtained by other existing approaches. The COPE feature extractors are highly robust to variations in SNR, and real-time performance is achieved even when a large number of feature values is computed. Comment: accepted for publication in Pattern Recognition.
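
    The configuration-and-response idea behind COPE can be illustrated with a short sketch. This is a heavily simplified assumption, not the authors' implementation: energy peaks are taken from a plain SciPy spectrogram (the paper's time-frequency representation may differ), the extractor stores peak offsets relative to a reference peak, and the response is a Gaussian-weighted soft match of those offsets against a test sound. The function names, the 5x5 peak neighbourhood, and all parameter values are hypothetical.

    import numpy as np
    from scipy.signal import spectrogram
    from scipy.ndimage import maximum_filter

    def energy_peaks(x, fs, nperseg=256, threshold=0.1):
        """Return (freq_bin, time_bin) positions of local spectrogram energy maxima."""
        _, _, S = spectrogram(x, fs, nperseg=nperseg)
        S = S / (S.max() + 1e-12)                      # normalize energy to [0, 1]
        local_max = (S == maximum_filter(S, size=5))   # 5x5 local maxima
        return np.argwhere(local_max & (S > threshold))

    def configure_cope(prototype, fs):
        """Configuration step: store peak positions relative to the first peak."""
        peaks = energy_peaks(prototype, fs)
        return peaks - peaks[0]                        # translation-invariant offsets

    def cope_response(extractor, x, fs, sigma=2.0):
        """Soft-match the stored peak offsets against the peaks of a test sound."""
        peaks = energy_peaks(x, fs)
        if len(peaks) == 0:
            return 0.0
        offsets = peaks - peaks[0]
        scores = [np.exp(-((offsets - p) ** 2).sum(axis=1).min() / (2 * sigma**2))
                  for p in extractor]                  # best match per stored peak
        return float(np.mean(scores))                  # response in [0, 1]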

    Enhanced Forensic Speaker Verification Using a Combination of DWT and MFCC Feature Warping in the Presence of Noise and Reverberation Conditions

    Environmental noise and reverberation severely degrade the performance of forensic speaker verification, and robust feature extraction plays an important role in recovering it. This paper investigates the effectiveness of combining mel-frequency cepstral coefficients (MFCCs) with MFCCs extracted from the discrete wavelet transform (DWT) of the speech, with and without feature warping, for improving modern identity-vector (i-vector)-based speaker verification in the presence of noise and reverberation. The performance of i-vector speaker verification was evaluated using different feature extraction techniques: MFCC, feature-warped MFCC, DWT-MFCC, feature-warped DWT-MFCC, a fusion of DWT-MFCC and MFCC features, and a fusion of feature-warped DWT-MFCC and feature-warped MFCC features. We evaluated performance using the Australian Forensic Voice Comparison and QUT-NOISE databases under noisy, reverberant, and combined noisy-reverberant conditions. Our results indicate that the fusion of feature-warped DWT-MFCC and feature-warped MFCC is superior to the other feature extraction techniques across the majority of signal-to-noise ratios (SNRs) in all three conditions. At 0 dB SNR, this fusion achieves reductions in average equal error rate of 21.33%, 20.00%, and 13.28% over feature-warped MFCC in the presence of environmental noise only, reverberation only, and combined noisy-reverberant environments, respectively. The approach can be used to improve the performance of forensic speaker verification and may be utilized for preparing legal evidence in court.
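
    A minimal sketch of the fused front end described above, under stated assumptions: MFCCs are computed with librosa on the raw waveform and on the first-level DWT approximation band from PyWavelets, and feature warping is simplified from the usual sliding-window form to whole-utterance rank warping onto a standard normal. The function names and parameters (n_mfcc, the db4 wavelet) are illustrative choices, not the paper's exact configuration.

    import numpy as np
    import librosa                 # MFCC extraction
    import pywt                    # discrete wavelet transform
    from scipy.stats import norm

    def feature_warp(F):
        """Warp each coefficient track (row of F) onto a standard normal via its ranks."""
        ranks = F.argsort(axis=1).argsort(axis=1)      # per-row rank of each frame
        u = (ranks + 0.5) / F.shape[1]                 # uniform scores in (0, 1)
        return norm.ppf(u)                             # Gaussian quantiles

    def dwt_mfcc_fusion(y, sr, n_mfcc=13, wavelet="db4"):
        """Stack warped MFCCs of a float waveform y with warped MFCCs of its DWT band."""
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
        cA, _ = pywt.dwt(y, wavelet)                   # level-1 approximation band
        dwt_mfcc = librosa.feature.mfcc(y=cA, sr=sr // 2, n_mfcc=n_mfcc)
        # Warp each stream, truncate to a common frame count, and stack.
        T = min(mfcc.shape[1], dwt_mfcc.shape[1])
        return np.vstack([feature_warp(mfcc)[:, :T],
                          feature_warp(dwt_mfcc)[:, :T]])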

    Phonetically transparent technique for the automatic transcription of speech

    A Fast Parts-based Approach to Speaker Verification using Boosted Slice Classifiers

    Speaker verification on portable devices such as smartphones is becoming increasingly popular. In this context, two issues need to be considered: 1) such devices have relatively limited computational resources, and 2) they are liable to be used everywhere, possibly in very noisy, uncontrolled environments. This work addresses both issues by proposing a computationally efficient yet robust speaker verification system. This novel parts-based system draws inspiration from face and object detection systems in the computer vision domain. It involves boosted ensembles of simple threshold-based classifiers and uses a novel set of features extracted from speech spectra, called “slice features”. The performance of the proposed system was evaluated through extensive studies covering a wide range of experimental conditions on the TIMIT, HTIMIT, and MOBIO corpora, against standard cepstral features and Gaussian Mixture Model-based speaker verification systems.
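
    The boosted-stump idea can be sketched as follows, with clearly labeled assumptions: the paper's “slice features” are approximated here by differences between pairs of log-spectrum bins (a stand-in, not the authors' definition), and the boosting of simple threshold-based classifiers is realized with scikit-learn's AdaBoost over depth-1 decision trees, i.e. one threshold on one feature per weak learner.

    import numpy as np
    from scipy.signal import spectrogram
    from sklearn.ensemble import AdaBoostClassifier    # default weak learner: depth-1 tree

    def slice_like_features(x, fs, n_pairs=200, seed=0):
        """Differences between fixed random pairs of time-averaged log-spectrum bins."""
        f, _, S = spectrogram(x, fs, nperseg=256)
        log_spec = np.log(S.mean(axis=1) + 1e-12)      # time-averaged log spectrum
        rng = np.random.default_rng(seed)              # same bin pairs for every utterance
        i, j = rng.integers(0, len(f), size=(2, n_pairs))
        return log_spec[i] - log_spec[j]

    # Client-vs-impostor verification with boosted one-threshold classifiers:
    # X = np.stack([slice_like_features(x, fs) for x in utterances])
    # clf = AdaBoostClassifier(n_estimators=200).fit(X, labels)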