Deep Learning for Audio Signal Processing
Given the recent surge in developments of deep learning, this article
provides a review of the state-of-the-art deep learning techniques for audio
signal processing. Speech, music, and environmental sound processing are
considered side-by-side, in order to point out similarities and differences
between the domains, highlighting general methods, problems, key references,
and potential for cross-fertilization between areas. The dominant feature
representations (in particular, log-mel spectra and raw waveform) and deep
learning models are reviewed, including convolutional neural networks, variants
of the long short-term memory architecture, as well as more audio-specific
neural network models. Subsequently, prominent deep learning application areas
are covered, i.e. audio recognition (automatic speech recognition, music
information retrieval, environmental sound detection, localization and
tracking) and synthesis and transformation (source separation, audio
enhancement, generative models for speech, sound, and music synthesis).
Finally, key issues and future questions regarding deep learning applied to
audio signal processing are identified. Comment: 15 pages, 2 pdf figures
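The log-mel representation named in the abstract above can be illustrated with a minimal sketch. It assumes the common HTK-style mel formula, m = 2595 log10(1 + f/700), which the abstract itself does not specify; the function names and the `eps` floor are hypothetical choices made for this example only.

```python
import math

def hz_to_mel(f_hz):
    """Convert Hz to mels (HTK-style formula; an assumed convention)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_band_edges(n_bands, f_min, f_max):
    """Return n_bands + 2 frequencies (Hz) equally spaced on the mel scale.

    Consecutive triples (lower edge, centre, upper edge) would define the
    triangular filters of a mel filterbank.
    """
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_bands + 1)
    return [mel_to_hz(m_min + i * step) for i in range(n_bands + 2)]

def log_mel(band_energies, eps=1e-10):
    """Log-compress mel-band energies; eps avoids log(0) on silent bands."""
    return [math.log(e + eps) for e in band_energies]

# Example: edges of a 40-band filterbank covering 0 Hz to 8 kHz.
edges = mel_band_edges(40, 0.0, 8000.0)
```

The edges are dense at low frequencies and sparse at high frequencies, which is the property that distinguishes a mel spectrum from a linear-frequency one.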
Modeling of complex nonlinear dynamic systems using temporal convolution neural networks
Nonaffine hybrid systems are an increasingly important class of nonlinear systems,
in particular those in which the underlying dynamics depend explicitly on a switching
signal. When the inherent complexity is tractable and the phenomena governing the
system dynamics are known, an implicit model can be derived to describe the system's behaviour
over time. Conversely, when these assumptions are not met, the system dynamics can still
be approximated by regression-based techniques, provided a dataset comprising inputs
and outputs collected from the system is available. One approach to data-driven
modelling relies on computational intelligence frameworks, in which artificial neural networks
stand out as a prominent class of universal-approximation black-box models. This
work explores the capability of 1D convolutional neural networks, whose inputs
are regressors and structural configuration parameters, to model
nonlinear hybrid dynamic systems. Moreover, in order to evaluate the intrinsic ability to
transparently approximate hybrid dynamics, this deep neural network architecture is
compared to a shallow multilayer perceptron framework, in which each structural
configuration is independently approximated.
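The idea of feeding a network with regressors plus a structural configuration parameter can be sketched as follows. This is only an illustration under stated assumptions: the regressors are taken to be lagged outputs and inputs (a NARX-style layout), the configuration parameter is taken to be the current value of the switching signal, and the name `build_regressors` and the lag orders `n_u` and `n_y` are hypothetical. The abstract does not state the layout actually used.

```python
def build_regressors(u, y, s, n_u=2, n_y=2):
    """Build (features, targets) from input u, output y and switching signal s.

    Each feature row is [y[k-1], ..., y[k-n_y], u[k-1], ..., u[k-n_u], s[k]]
    and the matching target is y[k]. The layout is an assumption made for
    illustration, not the one used in the paper.
    """
    start = max(n_u, n_y)  # first index with enough history available
    rows, targets = [], []
    for k in range(start, len(y)):
        past_y = [y[k - i] for i in range(1, n_y + 1)]
        past_u = [u[k - i] for i in range(1, n_u + 1)]
        rows.append(past_y + past_u + [s[k]])
        targets.append(y[k])
    return rows, targets

# Toy data: five samples, with the switching signal changing at k = 2.
u = [1.0, 0.0, 1.0, 0.0, 1.0]
y = [0.0, 0.5, 0.25, 0.6, 0.3]
s = [0, 0, 1, 1, 0]
rows, targets = build_regressors(u, y, s)
```

With a single model receiving `s[k]` as an input, one network can cover every configuration, in contrast to training one shallow model per configuration.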
A Review of Structural Health Monitoring Techniques as Applied to Composite Structures.
Structural Health Monitoring (SHM) is the process of collecting, interpreting, and analysing data from structures in order to determine their health status and remaining life span. Composite materials have been used extensively in recent years in several industries with the aim of reducing the total weight of structures while improving their mechanical properties. However, composite materials are prone to develop damage when subjected to low- to medium-velocity impacts (i.e. 1–10 m/s and 11–30 m/s, respectively). Hence, SHM techniques that detect damage in composite materials at its initiation are of high importance. Despite the availability of several SHM methods for damage identification in composite structures, no single technique has proven suitable for all circumstances. Therefore, this paper offers users of composites updated guidelines on some of the recent advances in SHM applied to composite structures. Most of the studies reported in the literature seem to have concentrated on flat composite plates reinforced with synthetic fibre. There are relatively few SHM studies on other structural configurations, such as singly or doubly curved structures and hybridised composites reinforced with natural and synthetic fibres.
Joint Phoneme Segmentation Inference and Classification using CRFs
State-of-the-art phoneme sequence recognition systems are based on the hybrid hidden Markov model/artificial neural network (HMM/ANN) framework. In this framework, the local classifier, the ANN, is typically trained using the Viterbi expectation-maximization algorithm, which involves two separate steps: phoneme sequence segmentation and training of the ANN. In this paper, we propose a CRF-based phoneme sequence recognition approach that simultaneously infers the phoneme segmentation and classifies the phoneme sequence. More specifically, the phoneme sequence recognition system consists of a local classifier ANN followed by a conditional random field (CRF), whose parameters are trained jointly using a cost function that discriminates the true phoneme sequence against all competing sequences. In order to train such a system efficiently, we introduce a novel CRF-based segmentation using an acyclic graph. We study the viability of the proposed approach on the TIMIT phoneme recognition task. Our studies show that the proposed approach is capable of achieving performance similar to standard hybrid HMM/ANN and ANN/CRF systems where the ANN is trained with manual segmentation.
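A cost function that discriminates the true sequence against all competing sequences can be illustrated with the negative conditional log-likelihood of a plain frame-level linear-chain CRF. The sketch below is a simplification under stated assumptions: it does not implement the paper's acyclic segmentation graph, the `emissions` scores stand in for ANN outputs, and all function names are hypothetical.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def path_score(emissions, transitions, labels):
    """Unnormalised log-score of one label sequence."""
    score = emissions[0][labels[0]]
    for t in range(1, len(labels)):
        score += transitions[labels[t - 1]][labels[t]] + emissions[t][labels[t]]
    return score

def log_partition(emissions, transitions):
    """Log-sum of scores over all label sequences (forward algorithm)."""
    n_labels = len(emissions[0])
    alpha = list(emissions[0])
    for t in range(1, len(emissions)):
        alpha = [
            emissions[t][j]
            + logsumexp([alpha[i] + transitions[i][j] for i in range(n_labels)])
            for j in range(n_labels)
        ]
    return logsumexp(alpha)

def crf_nll(emissions, transitions, labels):
    """Negative log-probability of the true sequence against all competitors."""
    return log_partition(emissions, transitions) - path_score(
        emissions, transitions, labels
    )

# Toy example: 3 frames, 2 labels; emission scores play the role of ANN outputs.
emissions = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.9]]
transitions = [[0.2, -0.1], [0.0, 0.4]]
loss = crf_nll(emissions, transitions, [0, 1, 1])
```

Because the loss is differentiable in both the emission and transition scores, its gradient can in principle be propagated into the local classifier, which is what makes joint training of the two components possible.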