Search CORE

53 research outputs found

Deep autoencoders for acoustic anomaly detection: experiments with working machine and in-vehicle audio

Author: Coelho Gabriel José Dias
Cortez Paulo
Ferreira Andre
Matos Luis Miguel
Pereira Pedro Jose
Pilastri Andre
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 19/05/2022
Field of study

The growing usage of digital microphones has generated an increased interest in the topic of Acoustic Anomaly Detection (AAD). Indeed, there are several real-world AAD application domains, including working machines and in-vehicle intelligence (the main target of this research project). This paper introduces three deep AutoEncoders (AE) for unsupervised AAD tasks, namely a Dense AE, a Convolutional Neural Network (CNN) AE and Long Short-Term Memory Autoencoder (LSTM) AE. To tune the deep learning architectures, development data were adopted from public domain audio datasets related with working machines. A large set of computational experiments was held, showing that the three proposed deep autoencoders, when combined with a melspectrogram sound preprocessing, are quite competitive and outperform a recently proposed AE baseline. Next, on a second experimental stage, aiming to address the final in-vehicle passenger safety goal, the three AEs were adapted to learn from in-vehicle normal audio, assuming three realistic scenarios that were generated by a synthetic audio mixture tool. In general, a high quality AAD discrimination was obtained: working machine data - 72% to 91%; and in-vehicle audio - 78% to 81%. In conjunction with an automotive company, an in-vehicle AAD intelligent system prototype was further developed, aiming to test a selected model (LSTM AE) during a pilot demonstration event that targeted the cough anomaly. Interesting results were obtained, with the AAD system presenting a high cough classification accuracy (e.g., 100% for front seat locations).This work is supported by the European Structural and Investment Funds in the FEDER component, through the Operational Competitiveness and Internationalization Programme (COMPETE 2020) - Project no 039334; Funding Reference: POCI-01-0247-FEDER-039334

Universidade do Minho: RepositoriUM

Prediction and Planning in Dynamical Systems with Underlying Markov Decision Processes

Author: Banijamali Seyedershad
Publication venue: 'University of Waterloo'
Publication date: 13/08/2021
Field of study

Predicting the future state of a scene with moving objects is a task that humans handle with ease. This is due to our understanding about the dynamics of the objects in the scene and the way they interact. However, teaching machines such understanding has always been a challenging task in machine learning. In recent years, with the abundance of data and enormous growth in computational power, there have been an outstanding progress in filling the gap between humans and machines perception and prediction. Deep learning, specifically, has been the main framework to address this problem. Prediction models are not only crucial problems by themselves but also many downstream tasks in machine learning and robotics rely on the quality of output of these models. Model-based control and planning require an accurate modelling of the underlying dynamics of the systems. A common assumption about the underlying dynamics, which is also the main theme of this thesis, is that it can be expressed using Markov Decision Processes (MDPs). However, the major portion of the thesis is dedicated to the problems in which we do not have access to the actual underlying MDP and only observe some high-dimensional observations from the dynamical system. The objective is then to model the underlying dynamics from the data and built a model that can potentially be used for planning and control. We consider both single-agent and multi-agent systems and employ deep generative models for modelling the dynamics. For the single-agent problem we propose a model that maps the high-dimensional observations to a low-dimensional space in which the dynamics of the system is modelled by a locally-linear function. We find this mapping by a proper modelling of the variables using graphical models and show that the mapping is robust against dynamics noise and suitable for control. For the multi-agent problem we provide a formulation that describes the prediction problem in terms of the reaction of the environment to the action of one agent (ego-agent) and show that such formulation can improve the prediction accuracy as well as broaden the range of environment conditions. From a different perspective, we also consider the problem in which we have access to the MDP and would like to obtain the optimal policy. More specifically, given a set of base policies on the MDP, we want to find the best policy in their convex hull. We show that this problem is NP-hard in general and provide an approximating algorithm with linear complexity, which outputs a policy that performs close to the optimal policy. This policy can be found under the condition that base policies have overlap in the occupancy measure space

University of Waterloo's Institutional Repository

Recommended from our members

Network Structures, Concurrency, and Interpretability: Lessons from the Development of an AI Enabled Graph Database System

Author: Cooper Hal James
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2020
Field of study

This thesis describes the development of the SmartGraph, an AI enabled graph database. The need for such a system has been independently recognized in the isolated fields of graph databases, graph computing, and computational graph deep learning systems, such as TensorFlow. Though prior works have investigated some relationships between these fields, we believe that the SmartGraph is the first system designed from conception to incorporate the most significant and useful characteristics of each. Examples include the ability to store graph structured data, run analytics natively on this data, and run gradient descent algorithms. It is the synergistic aspects of combining these fields that provide the most novel results presented in this dissertation. Key among them is how the notion of “graph querying” as used in graph databases can be used to solve a problem that has plagued deep learning systems since their inception; rather than attempting to embed graph structured datasets into restrictive vector spaces, we instead allow the deep learning functionality of the system to natively perform graph querying in memory during optimization as a way of interpreting (and learning) the graph. This results in a concept of natural and interpretable processing of graph structured data. Graph computing systems have traditionally used distributed computing across multiple compute nodes (e.g. separate machines connected via Ethernet or internet) to deal with large-scale datasets whilst working sequentially on problems over entire datasets. In this dissertation, we outline a distributed graph computing methodology that facilitates all the above capabilities (even in an environment consisting of a single physical machine) while allowing for a workflow more typical of a graph database than a graph computing system; massive concurrent access allowing for arbitrarily asynchronous execution of queries and analytics across the entire system. Further, we demonstrate how this methodology is key to the artificial intelligence capabilities of the system

Columbia University Academic Commons