7 research outputs found

    Feature Learning with Matrix Factorization Applied to Acoustic Scene Classification

    Get PDF
    In this paper, we study the usefulness of various matrix factorization methods for learning features for the specific Acoustic Scene Classification (ASC) problem. A common way of addressing ASC has been to engineer features capable of capturing the specificities of acoustic environments. Instead, we show that better representations of the scenes can be automatically learned from time-frequency representations using matrix factorization techniques. We mainly focus on extensions of Principal Component Analysis and Nonnegative Matrix Factorization, including sparse, kernel-based, and convolutive variants, as well as a novel supervised dictionary learning variant. An experimental evaluation is performed on two of the largest available ASC datasets in order to compare and discuss the usefulness of these methods for the task. We show that the unsupervised learning methods provide better representations of acoustic scenes than the best conventional hand-crafted features on both datasets. Furthermore, the introduction of a novel supervised nonnegative matrix factorization model, together with deep neural networks trained on spectrograms, allows us to reach further improvements.
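    The core technique named in this abstract can be sketched briefly. Below is a minimal, generic Nonnegative Matrix Factorization with multiplicative updates, factorizing a nonnegative "spectrogram" matrix V into a spectral dictionary W and activations H (the activations serving as learned features). This is an illustrative sketch under the standard Euclidean-cost formulation, not the paper's exact algorithm; all names and parameters here are assumptions.

    ```python
    import numpy as np

    def nmf(V, k, n_iter=200, eps=1e-9, seed=0):
        """Basic NMF via multiplicative updates (Frobenius cost). Illustrative only."""
        rng = np.random.default_rng(seed)
        F, T = V.shape
        W = rng.random((F, k)) + eps   # spectral dictionary, F x k
        H = rng.random((k, T)) + eps   # activations, k x T
        for _ in range(n_iter):
            # Standard multiplicative updates keep W and H nonnegative.
            H *= (W.T @ V) / (W.T @ W @ H + eps)
            W *= (V @ H.T) / (W @ H @ H.T + eps)
        return W, H

    # Toy nonnegative matrix standing in for a magnitude spectrogram (freq x time).
    V = np.abs(np.random.default_rng(1).normal(size=(64, 100)))
    W, H = nmf(V, k=8)
    err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
    ```

    In a feature-learning pipeline like the one described, the columns of H (or activations of new data against a fixed W) would be fed to a classifier.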

    Dataflow-Based Implementation of Deep Learning Application

    Get PDF
    The proliferation of research on high-performance deep learning has created growing interest in, and challenges around, integrating this technology into daily life. Although a large amount of work in machine learning has addressed accuracy, efficiency, network topology, and the algorithms used in training and recognition, deep learning implementations in highly resource-constrained contexts remain relatively unexplored, due to the large computational requirements involved in training large-scale networks. In light of this, we demonstrate a process centered on parameter extraction and on the dataflow design, implementation, and optimization of a deep learning application for vehicle classification on multicore platforms with a limited number of available processor cores. Using thousands of actors for computation and FIFOs for communication, we construct a large and complex dataflow graph, and then, using the resulting dataflow representations, we apply a wide range of design optimizations to explore efficient implementations on three different multicore platforms. Through the incorporation of dataflow techniques, we observe their effectiveness and efficiency in several flexible experiments on alternative platforms tailored to the resource constraints. In addition, we introduce three general, novel flow charts developed during this work: a deep learning model, a LIDE-C construction model, and a LIDE-C coding model. Finally, we use LIDE-C for the implementation and DICE for validation and verification. Both tools are developed by the DSPCAD group at the University of Maryland and will continue to be updated.
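    The actor/FIFO dataflow style described above can be illustrated with a tiny sketch: each actor fires when its input FIFO holds enough tokens, consumes them, and produces tokens on its output FIFO, while a simple scheduler fires enabled actors until none remain. The class and actor names below are illustrative placeholders, not the LIDE-C API.

    ```python
    from collections import deque

    class Fifo:
        """Token channel between actors (hypothetical minimal interface)."""
        def __init__(self):
            self.q = deque()
        def put(self, tok): self.q.append(tok)
        def get(self): return self.q.popleft()
        def tokens(self): return len(self.q)

    class ScaleActor:
        """Consumes one token, produces one scaled token (e.g. normalization)."""
        def __init__(self, fin, fout, factor):
            self.fin, self.fout, self.factor = fin, fout, factor
        def enabled(self):
            return self.fin.tokens() >= 1
        def fire(self):
            self.fout.put(self.fin.get() * self.factor)

    class SumActor:
        """Consumes a block of n tokens, produces their sum (e.g. pooling)."""
        def __init__(self, fin, fout, n):
            self.fin, self.fout, self.n = fin, fout, n
        def enabled(self):
            return self.fin.tokens() >= self.n
        def fire(self):
            self.fout.put(sum(self.fin.get() for _ in range(self.n)))

    # Build a two-actor graph (scale -> sum) and run a naive scheduler.
    src, mid, sink = Fifo(), Fifo(), Fifo()
    actors = [ScaleActor(src, mid, 2), SumActor(mid, sink, 4)]
    for x in range(4):
        src.put(x)
    progress = True
    while progress:
        progress = False
        for a in actors:
            if a.enabled():
                a.fire()
                progress = True
    result = sink.get()   # (0 + 1 + 2 + 3) * 2 = 12
    ```

    A real graph of thousands of actors, as in the abstract, would use a smarter schedule, but the fire-when-enabled semantics is the same.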

    Automatic sound classification using machine learning

    Get PDF
    Machine learning has been used intensively for sound recognition in recent years. Some sounds are easily distinguishable, like a laugh, but others can be very similar to each other, like a blender and a chainsaw. Furthermore, the inherent variability in these audio clips makes this problem quite difficult to solve using classical processing techniques, but it is an appropriate challenge for the high levels of abstraction that can be achieved with machine learning techniques. In this work, two convolutional neural network (CNN) models are presented to solve an environmental sound classification problem with seven different labels. The audio excerpts used are those provided by the UrbanSound8K database. The performance of both models reaches 90% accuracy in the classification of these sounds. Universidad de Sevilla. Degree in Telecommunication Technologies Engineering.
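    The building block of the CNN models mentioned above is a 2-D convolution over a time-frequency input followed by a nonlinearity. The sketch below implements a single valid-mode convolution plus ReLU in plain NumPy on a toy patch; a real classifier like those in the abstract stacks many such layers with learned kernels. The kernel and input here are placeholders for illustration.

    ```python
    import numpy as np

    def conv2d_valid(x, kernel):
        """Single-channel 2-D convolution (valid mode, no padding/stride)."""
        H, W = x.shape
        kh, kw = kernel.shape
        out = np.empty((H - kh + 1, W - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
        return out

    def relu(x):
        return np.maximum(x, 0.0)

    x = np.arange(25, dtype=float).reshape(5, 5)      # toy "spectrogram" patch
    k = np.array([[-1.0, 1.0], [-1.0, 1.0]])          # horizontal-gradient kernel
    y = relu(conv2d_valid(x, k))                      # feature map, shape (4, 4)
    ```

    On this ramp input, each 2x2 window differs by a constant horizontal step, so every output activation equals 2.0; with a learned kernel the map would highlight whichever spectro-temporal pattern the filter encodes.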

    Deep neural networks for audio scene recognition

    No full text
    In recent years, artificial neural networks (ANN) have seen renewed interest since efficient training procedures have emerged to learn so-called deep neural networks (DNN), i.e. ANNs with at least two hidden layers. At the same time, the computational auditory scene recognition (CASR) problem, which consists in estimating the environment around a device from the received audio signal, has been investigated. Most works that deal with the CASR problem have tried to find well-adapted features for this problem. However, these features are generally combined with a classical classifier. In this paper, we introduce DNNs to the CASR field and show that such networks can provide promising results and perform better than standard classifiers when the same features are used.
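    A DNN in the sense used in this abstract is a multilayer perceptron with at least two hidden layers mapping an audio feature vector to class probabilities. The forward pass can be sketched as follows; the layer sizes, random weights, and feature dimension (13, suggesting MFCC-like features) are illustrative assumptions, and a real system would train the weights rather than leave them random.

    ```python
    import numpy as np

    def relu(z):
        return np.maximum(z, 0.0)

    def softmax(z):
        e = np.exp(z - z.max())          # shift for numerical stability
        return e / e.sum()

    rng = np.random.default_rng(0)
    layer_sizes = [13, 32, 32, 5]        # 13-dim features in, 5 scene classes out
    weights = [rng.normal(scale=0.1, size=(m, n))
               for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
    biases = [np.zeros(n) for n in layer_sizes[1:]]

    def forward(x):
        """Two ReLU hidden layers, then a softmax output over scene classes."""
        for Wl, bl in zip(weights[:-1], biases[:-1]):
            x = relu(x @ Wl + bl)
        return softmax(x @ weights[-1] + biases[-1])

    probs = forward(rng.normal(size=13))
    ```

    The predicted scene would be `probs.argmax()`; the abstract's point is that this learned mapping outperforms a classical classifier fed the same features.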