Feature Learning with Matrix Factorization Applied to Acoustic Scene Classification
In this paper, we study the usefulness of various matrix factorization methods for learning features for the Acoustic Scene Classification (ASC) problem. A common way of addressing ASC has been to engineer features capable of capturing the specificities of acoustic environments. Instead, we show that better representations of the scenes can be learned automatically from time-frequency representations using matrix factorization techniques. We mainly focus on extensions of Principal Component Analysis and Nonnegative Matrix Factorization, including sparse, kernel-based, convolutive and novel supervised dictionary-learning variants. An experimental evaluation is performed on two of the largest available ASC datasets in order to compare and discuss the usefulness of these methods for the task. We show that the unsupervised learning methods provide better representations of acoustic scenes than the best conventional hand-crafted features on both datasets. Furthermore, the introduction of a novel nonnegative supervised matrix factorization model and of deep neural networks trained on spectrograms allows us to reach further improvements.
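As a minimal sketch of the unsupervised feature-learning idea described above — factorizing a time-frequency representation with NMF so that the activations serve as learned features — the following uses scikit-learn on a toy random spectrogram (the data and all dimensions are hypothetical stand-ins, not the paper's setup):

```python
import numpy as np
from sklearn.decomposition import NMF

# Hypothetical magnitude spectrogram: 128 frequency bins x 200 time frames.
rng = np.random.default_rng(0)
spectrogram = np.abs(rng.normal(size=(128, 200)))

# Factorize S ~ W @ H: W holds nonnegative spectral basis vectors
# (the learned dictionary), H holds per-frame activations.
model = NMF(n_components=16, init="nndsvda", max_iter=300, random_state=0)
W = model.fit_transform(spectrogram)   # (128, 16) spectral atoms
H = model.components_                  # (16, 200) activations

# Pool activations over time to get a fixed-length clip-level feature
# vector that a scene classifier could consume.
features = H.mean(axis=1)
print(features.shape)  # (16,)
```

Supervised and convolutive variants change how W and H are estimated, but the feature-extraction pattern (factorize, then pool activations) stays the same.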
Detection of Coherent Vorticity Structures using Time-Scale Resolved Acoustic Spectroscopy
We describe here an experimental technique based on the acoustic scattering phenomenon that allows direct probing of the vorticity field in a turbulent flow. Using time-frequency distributions, recently introduced in signal analysis theory, for the analysis of the scattered acoustic signals, we show how the legibility of these signals is significantly improved (time-resolved spectroscopy). The method is illustrated on data extracted from a highly turbulent jet flow: discrete vorticity events are clearly evidenced. We claim that the recourse to time-frequency distributions leads to an operational definition of coherent structures associated with phase stationarity in the time-frequency plane.
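To illustrate the general principle — a discrete event becomes visible as a localized energy patch in the time-frequency plane — here is a sketch using a plain STFT spectrogram on a synthetic signal (the signal, frequencies, and burst timing are all invented for illustration; the paper uses more refined time-frequency distributions on real scattered-signal data):

```python
import numpy as np
from scipy.signal import spectrogram

# Synthetic stand-in for a scattered signal: a carrier plus a brief
# frequency burst mimicking a discrete vorticity event.
fs = 10_000
t = np.arange(0, 1.0, 1 / fs)
signal = np.sin(2 * np.pi * 1000 * t)
burst = (t > 0.40) & (t < 0.45)
signal[burst] += np.sin(2 * np.pi * 2500 * t[burst])

# Time-frequency distribution: the event appears as a localized patch.
f, tau, Sxx = spectrogram(signal, fs=fs, nperseg=256, noverlap=192)

# Energy in a band around 2.5 kHz peaks inside the burst window.
band = (f > 2300) & (f < 2700)
event_energy = Sxx[band].sum(axis=0)
print(tau[event_energy.argmax()])  # lies inside the 0.40-0.45 s window
```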
Monitoring Mixing Processes Using Ultrasonic Sensors and Machine Learning
Mixing is one of the most common processes across food, chemical, and pharmaceutical manufacturing. Real-time, in-line sensors are required for monitoring, and subsequently optimising, essential processes such as mixing. Ultrasonic sensors are low-cost, real-time, in-line, and applicable to characterise opaque systems. In this study, a non-invasive, reflection-mode ultrasonic measurement technique was used to monitor two model mixing systems. The two systems studied were honey-water blending and flour-water batter mixing. Classification machine learning models were developed to predict if materials were mixed or not mixed. Regression machine learning models were developed to predict the time remaining until mixing completion. Artificial neural networks, support vector machines, long short-term memory neural networks, and convolutional neural networks were tested, along with different methods for engineering features from ultrasonic waveforms in both the time and frequency domains. Comparisons between using a single sensor and performing multisensor data fusion between two sensors were made. Classification accuracies of up to 96.3% for honey-water blending and 92.5% for flour-water batter mixing were achieved, along with R² values for the regression models of up to 0.977 for honey-water blending and 0.968 for flour-water batter mixing. Each prediction task produced optimal performance with different algorithms and feature engineering methods, vindicating the extensive comparison between different machine learning approaches.
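The classification pipeline described — engineer time- and frequency-domain features from ultrasonic waveforms, then train a classifier to predict mixed vs. not mixed — can be sketched as follows. Everything here is a synthetic stand-in (the waveform model, the "weaker echo when mixed" assumption, and the feature set are all invented for illustration, not the study's data or features):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(1)

def waveform_features(w):
    """Simple time- and frequency-domain features from one waveform."""
    spectrum = np.abs(np.fft.rfft(w))
    return np.array([w.std(),                 # RMS-like energy proxy
                     np.abs(w).max(),         # peak echo amplitude
                     float(spectrum.argmax()),# dominant-frequency bin
                     spectrum.mean()])        # mean spectral magnitude

def make_waveform(mixed):
    # Hypothetical model: a fully mixed sample returns a weaker echo.
    amp = 0.5 if mixed else 1.0
    t = np.linspace(0.0, 1e-4, 1024)
    return amp * np.sin(2 * np.pi * 2e6 * t) + 0.05 * rng.normal(size=1024)

X = np.array([waveform_features(make_waveform(m))
              for m in [True] * 100 + [False] * 100])
y = np.array([1] * 100 + [0] * 100)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)
clf = make_pipeline(StandardScaler(), SVC()).fit(X_tr, y_tr)
accuracy = clf.score(X_te, y_te)
print(f"held-out accuracy: {accuracy:.2f}")
```

Regressing time-to-completion follows the same feature pipeline with a regressor in place of the classifier.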
A Comprehensive Survey on Heart Sound Analysis in the Deep Learning Era
Heart sound auscultation has been demonstrated to be beneficial in clinical
usage for early screening of cardiovascular diseases. Due to the high
requirement of well-trained professionals for auscultation, automatic
auscultation benefiting from signal processing and machine learning can help
auxiliary diagnosis and reduce the burdens of training professional clinicians.
Nevertheless, classic machine learning offers only limited performance improvements in
the era of big data. Deep learning has achieved better performance than classic
machine learning in many research fields, as it employs more complex model
architectures with stronger capability of extracting effective representations.
Deep learning has been successfully applied to heart sound analysis in the past
years. As most reviews of heart sound analysis predate 2017, the present survey is the first comprehensive overview of papers on deep learning for heart sound analysis over the six years 2017--2022. We introduce both classic machine learning and deep learning for comparison, and further offer insights into the advances and future research directions of deep learning for heart sound analysis.
Geometric deep learning: going beyond Euclidean data
Many scientific fields study data with an underlying structure that is a
non-Euclidean space. Some examples include social networks in computational
social sciences, sensor networks in communications, functional networks in
brain imaging, regulatory networks in genetics, and meshed surfaces in computer
graphics. In many applications, such geometric data are large and complex (in
the case of social networks, on the scale of billions), and are natural targets
for machine learning techniques. In particular, we would like to use deep
neural networks, which have recently proven to be powerful tools for a broad
range of problems from computer vision, natural language processing, and audio
analysis. However, these tools have been most successful on data with an
underlying Euclidean or grid-like structure, and in cases where the invariances
of these structures are built into networks used to model them. Geometric deep
learning is an umbrella term for emerging techniques attempting to generalize
(structured) deep neural models to non-Euclidean domains such as graphs and
manifolds. The purpose of this paper is to overview different examples of
geometric deep learning problems and present available solutions, key
difficulties, applications, and future research directions in this nascent
field.
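A minimal sketch of one building block these generalizations rely on — a single graph-convolution step, assuming the widely used propagation rule H' = ReLU(D^{-1/2}(A + I)D^{-1/2} H W) — with a tiny hypothetical graph:

```python
import numpy as np

rng = np.random.default_rng(0)

# 4-node undirected graph (adjacency matrix) with random node features.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
H = rng.normal(size=(4, 8))   # node features (4 nodes, 8 dims)
W = rng.normal(size=(8, 4))   # learnable weight matrix

# Add self-loops, symmetrically normalize, propagate, apply ReLU.
A_hat = A + np.eye(4)
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
H_next = np.maximum(0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W)
print(H_next.shape)  # (4, 4)
```

Each node's new representation aggregates its neighbours' features, which is the non-Euclidean analogue of a convolution on a regular grid.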
Acoustic Features for Environmental Sound Analysis
Most of the time it is nearly impossible to differentiate between particular types of sound events from a waveform only. Therefore, frequency-domain and time-frequency-domain representations have been used for years, providing representations of the sound signals that are more in line with human perception. However, these representations are usually too generic and often fail to describe specific content that is present in a sound recording. A lot of work has been devoted to designing features that allow extracting such specific information, leading to a wide variety of hand-crafted features. During the past years, owing to the increasing availability of medium-scale and large-scale sound datasets, an alternative approach to feature extraction has become popular, the so-called feature learning. Finally, processing the amount of data that is at hand nowadays can quickly become overwhelming. It is therefore of paramount importance to be able to reduce the size of the dataset in the feature space. The general processing chain to convert a sound signal to a feature vector that can be efficiently exploited by a classifier, and the relation to features used for speech and music processing, are described in this chapter.
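The general processing chain mentioned — frame the signal, window it, move to the frequency domain, compress, then pool to a fixed-length vector — can be sketched generically as follows (frame length, hop size, and the log compression are common defaults chosen here for illustration, not the chapter's specific choices):

```python
import numpy as np

def feature_vector(signal, frame_len=512, hop=256):
    """Generic chain: frame -> window -> |FFT| -> log -> time pooling."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Log-magnitude spectrogram: (n_frames, frame_len // 2 + 1)
    log_spec = np.log1p(np.abs(np.fft.rfft(frames, axis=1)))
    # Pool over time to a fixed-length vector a classifier can exploit.
    return log_spec.mean(axis=0)

# One second of a 440 Hz tone sampled at 16 kHz.
sig = np.sin(2 * np.pi * 440 * np.arange(16_000) / 16_000)
vec = feature_vector(sig)
print(vec.shape)  # (257,)
```

Hand-crafted features (e.g. mel filterbanks, cepstra) and learned features both slot into this chain after the time-frequency step.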
Hybrid Wavelet-Support Vector Classifiers
The Support Vector Machine (SVM) represents a new and very promising technique for machine learning tasks involving classification, regression or novelty detection. Improvements of its generalization ability can be achieved by incorporating prior knowledge of the task at hand. We propose a new hybrid algorithm consisting of signal-adapted wavelet decompositions and SVMs for waveform classification. The adaptation of the wavelet decompositions is tailor-made for SVMs with radial basis functions as kernels. It allows the optimization of the representation of the data before training the SVM and does not suffer from computationally expensive validation techniques. We assess the performance of our algorithm against the background of current concerns in medical diagnostics, namely the classification of endocardial electrograms and the detection of otoacoustic emissions. Here the performance of SVMs can be significantly improved by our adapted preprocessing step.
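The hybrid structure — a wavelet decomposition as the preprocessing step feeding an RBF-kernel SVM — can be sketched with a fixed (non-adapted) one-level Haar decomposition on synthetic waveforms; the two classes, transient amplitudes, and the Haar choice are all illustrative assumptions, not the paper's signal-adapted decomposition or its medical data:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def haar_features(x):
    """One-level Haar decomposition: approximation + detail coefficients."""
    even, odd = x[0::2], x[1::2]
    return np.concatenate([(even + odd) / np.sqrt(2.0),
                           (even - odd) / np.sqrt(2.0)])

rng = np.random.default_rng(2)

def make_waveform(cls):
    # Hypothetical classes: class 1 carries a stronger transient.
    x = 0.1 * rng.normal(size=64)
    x[20:24] += 2.0 if cls else 0.5
    return x

X = np.array([haar_features(make_waveform(c)) for c in [0] * 80 + [1] * 80])
y = np.array([0] * 80 + [1] * 80)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)   # RBF kernel, as in the paper
accuracy = clf.score(X_te, y_te)
print(f"held-out accuracy: {accuracy:.2f}")
```

The paper's contribution is adapting the decomposition itself to the RBF-SVM, rather than using a fixed basis as done here.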