
    Sound Event Detection by Exploring Audio Sequence Modelling

    Everyday sounds in real-world environments are a powerful source of information by which humans can interact with their environments. Humans can infer what is happening around them by listening to everyday sounds. At the same time, it is a challenging task for a computer algorithm in a smart device to automatically recognise, understand, and interpret everyday sounds. Sound event detection (SED) is the process of transcribing an audio recording into sound event tags with onset and offset time values. This involves classification and segmentation of sound events in the given audio recording. SED has numerous applications in everyday life, including security and surveillance, automation, healthcare monitoring, multimedia information retrieval, and assisted living technologies. SED is to everyday sounds what automatic speech recognition (ASR) is to speech and automatic music transcription (AMT) is to music. The fundamental questions in designing a sound recognition system are which portion of a sound event the system should analyse, and what proportion of a sound event it should process, in order to claim a confident detection of that particular sound event. While the classification of sound events has improved considerably in recent years, the temporal segmentation of sound events has not improved to the same extent. The aim of this thesis is to propose and develop methods to improve the segmentation and classification of everyday sound events in SED models. In particular, this thesis explores the segmentation of sound events by investigating audio sequence encoding-based and audio sequence modelling-based methods, in an effort to improve overall sound event detection performance. In the first phase of this thesis, efforts are directed towards improving sound event detection by explicitly conditioning the audio sequence representations of an SED model using sound activity detection (SAD) and onset detection. To achieve this, we propose multi-task learning-based SED models in which SAD and onset detection are used as auxiliary tasks for the SED task. The next part of this thesis explores self-attention-based audio sequence modelling, which aggregates audio representations based on temporal relations within and between sound events, scored on the basis of the similarity of sound event portions in audio event sequences. We propose SED models that include memory-controlled, adaptive, dynamic, and source separation-induced self-attention variants, with the aim of improving overall sound recognition performance.
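
    A minimal sketch of the multi-task idea, assuming log-mel spectrogram input: a shared recurrent encoder with a self-attention layer feeds a frame-wise SED head and an auxiliary sound activity detection (SAD) head. Layer sizes, class count, and the loss weighting are illustrative placeholders, not the architectures proposed in the thesis.

        # Sketch of a multi-task SED model with an auxiliary SAD head (assumed shapes and sizes).
        import torch
        import torch.nn as nn

        class MultiTaskSED(nn.Module):
            def __init__(self, n_mels=64, n_classes=10, hidden=128):
                super().__init__()
                self.encoder = nn.GRU(n_mels, hidden, batch_first=True, bidirectional=True)
                self.attn = nn.MultiheadAttention(2 * hidden, num_heads=4, batch_first=True)
                self.sed_head = nn.Linear(2 * hidden, n_classes)  # frame-wise event presence
                self.sad_head = nn.Linear(2 * hidden, 1)          # frame-wise any-sound activity

            def forward(self, x):                      # x: (batch, frames, n_mels)
                h, _ = self.encoder(x)                 # shared sequence representation
                h, _ = self.attn(h, h, h)              # self-attention over audio frames
                return torch.sigmoid(self.sed_head(h)), torch.sigmoid(self.sad_head(h))

        model = MultiTaskSED()
        sed_out, sad_out = model(torch.randn(2, 500, 64))          # 2 clips, 500 frames
        targets_sed, targets_sad = torch.zeros_like(sed_out), torch.zeros_like(sad_out)
        loss = nn.functional.binary_cross_entropy(sed_out, targets_sed) \
             + 0.5 * nn.functional.binary_cross_entropy(sad_out, targets_sad)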

    Electrocardiogram Monitoring Wearable Devices and Artificial-Intelligence-Enabled Diagnostic Capabilities: A Review

    Worldwide, population aging and unhealthy lifestyles have increased the incidence of high-risk health conditions such as cardiovascular diseases, sleep apnea, and other conditions. Recently, to facilitate early identification and diagnosis, efforts have been made in the research and development of new wearable devices to make them smaller, more comfortable, more accurate, and increasingly compatible with artificial intelligence technologies. These efforts can pave the way for longer, continuous health monitoring of different biosignals, including the real-time detection of diseases, thus providing more timely and accurate predictions of health events that can drastically improve the healthcare management of patients. Most recent reviews focus on a specific category of disease, on the use of artificial intelligence in 12-lead electrocardiograms, or on wearable technology. In contrast, we present recent advances in the use of electrocardiogram signals acquired with wearable devices or from publicly available databases, and in the analysis of such signals with artificial intelligence methods to detect and predict diseases. As expected, most of the available research focuses on heart diseases, sleep apnea, and other emerging areas such as mental stress. From a methodological point of view, although traditional statistical methods and machine learning are still widely used, we observe an increasing use of more advanced deep learning methods, specifically architectures that can handle the complexity of biosignal data; these typically include convolutional and recurrent neural networks. Moreover, when proposing new artificial intelligence methods, we observe that the prevalent choice is to use publicly available databases rather than collecting new data.
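
    As a concrete illustration of the convolutional architectures the review groups together, the following is a minimal sketch of a 1D CNN for classifying fixed-length single-lead ECG segments; the segment length, sampling rate, and class count are assumptions made for the example and are not taken from any specific study covered by the review.

        # Illustrative 1D CNN for single-lead wearable ECG segments (assumed shapes and classes).
        import torch
        import torch.nn as nn

        class ECGConvNet(nn.Module):
            def __init__(self, n_classes=4):               # e.g. normal / AF / other / noisy
                super().__init__()
                self.features = nn.Sequential(
                    nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(4),
                    nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(4),
                    nn.AdaptiveAvgPool1d(1),               # global pooling over time
                )
                self.classifier = nn.Linear(32, n_classes)

            def forward(self, x):                          # x: (batch, 1, samples)
                return self.classifier(self.features(x).squeeze(-1))

        logits = ECGConvNet()(torch.randn(8, 1, 3000))     # 8 ten-second segments at 300 Hz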

    Explainable fault prediction using learning fuzzy cognitive maps

    IoT sensors capture different aspects of the environment and generate high-throughput data streams. Besides capturing these data streams and reporting the monitoring information, there is significant potential for adopting deep learning to identify valuable insights for predictive preventive maintenance. One specific class of applications involves using Long Short-Term Memory networks (LSTMs) to predict faults occurring in the near future. However, despite their remarkable performance, LSTMs can be very opaque. This paper deals with this issue by applying Learning Fuzzy Cognitive Maps (LFCMs) to develop simplified auxiliary models that provide greater transparency. To evaluate the idea, an LSTM model is developed for predicting faults in industrial bearings based on readings from vibration sensors. An LFCM is then used to imitate the performance of the baseline LSTM model. Through static and dynamic analyses, we demonstrate that the LFCM can highlight (i) which members in a sequence of readings contribute to the prediction result and (ii) which values could be controlled to prevent possible faults. Moreover, we compare the LFCM with state-of-the-art methods reported in the literature, including decision trees and SHAP values, and the experiments show that the LFCM offers some advantages over these methods. In addition, by conducting a what-if analysis, the LFCM can provide more information about the black-box model. To the best of our knowledge, this is the first time LFCMs have been used to simplify a deep learning model to offer greater explainability.
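
    To make the two modelling ingredients concrete, the sketch below pairs a minimal LSTM fault predictor over windows of vibration features with a generic fuzzy cognitive map (FCM) state update of the kind an LFCM learns; the feature count, window length, and weight values are placeholders rather than the paper's settings.

        # Sketch: LSTM fault predictor plus a generic FCM concept-state update (assumed sizes).
        import numpy as np
        import torch
        import torch.nn as nn

        class FaultLSTM(nn.Module):
            def __init__(self, n_features=4, hidden=32):
                super().__init__()
                self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
                self.head = nn.Linear(hidden, 1)            # probability of a fault in the near future

            def forward(self, x):                           # x: (batch, window, n_features)
                h, _ = self.lstm(x)
                return torch.sigmoid(self.head(h[:, -1]))   # predict from the last time step

        def fcm_step(state, W, lam=1.0):
            """One FCM iteration: concepts influence each other through the weight matrix W."""
            return 1.0 / (1.0 + np.exp(-lam * (W @ state)))

        prob = FaultLSTM()(torch.randn(16, 50, 4))          # 16 windows of 50 vibration readings
        state = fcm_step(np.array([0.2, 0.8, 0.1]), W=np.random.uniform(-1, 1, (3, 3)))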

    Tradition and Innovation in Construction Project Management

    This book is a reprint of the Special Issue 'Tradition and Innovation in Construction Project Management' that was published in the journal Buildings.

    Tensor-variate machine learning on graphs

    Traditional machine learning algorithms are facing significant challenges as the world enters the era of big data, with a dramatic expansion in the volume and range of applications and an increase in the variety of data sources. The large- and multi-dimensional nature of data often increases the computational costs associated with their processing and raises the risk of model over-fitting - a phenomenon known as the curse of dimensionality. To this end, tensors have become a subject of great interest in the data analytics community, owing to their remarkable ability to super-compress high-dimensional data into a low-rank format, while retaining the original data structure and interpretability. This leads to a significant reduction in computational costs, from exponential to linear complexity in the data dimensions. An additional challenge when processing modern big data is that they often reside on irregular domains and exhibit relational structures, which violates the regular-grid assumptions of traditional machine learning models. To this end, there has been an increasing amount of research in generalizing traditional learning algorithms to graph data. This allows for the processing of graph signals while accounting for the underlying relational structure, such as user interactions in social networks, vehicle flows in traffic networks, transactions in supply chains, chemical bonds in proteins, and trading data in financial networks, to name a few. Although promising results have been achieved in these fields, there is a void in the literature when it comes to the conjoint treatment of tensors and graphs for data analytics. Solutions in this area are increasingly urgent, as modern big data are both large-dimensional and irregular in structure. To this end, the goal of this thesis is to explore machine learning methods that can fully exploit the advantages of both tensors and graphs. In particular, the following approaches are introduced: (i) a graph-regularized tensor regression framework for modelling high-dimensional data while accounting for the underlying graph structure; (ii) a tensor-algebraic approach for computing efficient convolution on graphs; (iii) a graph tensor network framework for designing neural learning systems that is both general enough to describe most existing neural network architectures and flexible enough to model large-dimensional data on any and many irregular domains. The considered frameworks were employed in several real-world applications, including air quality forecasting, protein classification, and financial modelling. Experimental results validate the advantages of the proposed methods, which achieved better or comparable performance against state-of-the-art models. Additionally, these methods benefit from increased interpretability and reduced computational costs, which are crucial for tackling the challenges posed by the era of big data.
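
    As a small illustration of approach (i), the sketch below shows the plain matrix special case of graph-regularized regression, in which a Laplacian penalty encourages coefficients on neighbouring graph nodes to stay close; the chain graph, synthetic data, and regularization weight are assumptions for the example and do not reproduce the thesis's tensor formulation.

        # Sketch of graph regularization in regression: argmin_w ||y - Xw||^2 + lam * w^T L w.
        import numpy as np

        def graph_regularized_regression(X, y, L, lam=1.0):
            """Closed-form solution of the Laplacian-penalized least-squares problem."""
            return np.linalg.solve(X.T @ X + lam * L, X.T @ y)

        # Chain graph over 5 features: adjacency A, then Laplacian L = D - A.
        A = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)
        L = np.diag(A.sum(axis=1)) - A
        X = np.random.randn(100, 5)
        y = X @ np.array([1.0, 1.1, 0.9, 1.0, 1.05]) + 0.1 * np.random.randn(100)
        w = graph_regularized_regression(X, y, L, lam=5.0)   # smoothed coefficient estimates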

    Enhancing Image Quality: A Comparative Study of Spatial, Frequency Domain, and Deep Learning Methods

    Image restoration and noise reduction methods have been developed to restore deteriorated images and improve their quality. These methods have gained substantial importance in recent times, mainly due to the growing use of digital imaging across diverse domains, including medical imaging, surveillance, satellite imaging, and numerous others. In this paper, we conduct a comparative analysis of three distinct approaches to image restoration: the spatial method, the frequency domain method, and the deep learning method. The study was conducted on a dataset of 10,000 images, and the performance of each method was evaluated using accuracy and loss metrics. The results show that the deep learning method outperformed the other two, achieving a validation accuracy of 72.68% after 10 epochs. The spatial method achieved a validation accuracy of 69.98% after 10 epochs, while the FFT frequency domain method reached a validation accuracy of 52.87% after 10 epochs, significantly lower than the other two methods. The study demonstrates that deep learning is a promising approach for image classification tasks and outperforms traditional methods such as spatial and frequency domain techniques.
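
    For concreteness, the sketch below shows one typical frequency-domain operation of the kind compared in the paper: an ideal low-pass filter applied via the 2D FFT. The cutoff radius and the synthetic input are placeholders, not the paper's configuration.

        # Sketch of FFT-based frequency-domain denoising with an ideal low-pass filter.
        import numpy as np

        def fft_lowpass(image, cutoff=30):
            """Zero out frequency components farther than `cutoff` from the spectrum centre."""
            F = np.fft.fftshift(np.fft.fft2(image))
            h, w = image.shape
            yy, xx = np.ogrid[:h, :w]
            mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= cutoff ** 2
            return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))

        noisy = np.random.rand(128, 128)          # stand-in for a noisy grayscale image
        restored = fft_lowpass(noisy, cutoff=20)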

    An Analytical Performance Evaluation on Multiview Clustering Approaches

    The concept of machine learning encompasses a wide variety of approaches, one of which is clustering. In this approach, data points are grouped together: given a collection of data points, a clustering method assigns each data point to a specific group. In principle, data points in the same group should have similar attributes and characteristics, whereas data points in different groups should have markedly different properties. Recent developments in information-collection technologies have made it possible to generate multiview data, in which the data are collected from a variety of sources and described from a variety of perspectives. Conventional clustering algorithms operate on a single view. Real-world data, however, are complex and messy, and can be clustered in a variety of different ways depending on how the data are interpreted. In recent years, multiview clustering (MVC) has garnered increasing attention because it aims to exploit the complementary and consensus information derived from different views. On the other hand, the vast majority of currently available systems support only the single-clustering scenario, in which a single clustering is used to partition the data. In light of this, it is necessary to investigate the multiview data setting. This study is centred on multiview clustering approaches and an analytical evaluation of their performance.
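
    A simple baseline helps make the consensus idea concrete: the sketch below builds one affinity matrix per view, averages them, and runs spectral clustering on the combined affinity. The data, kernel width, and cluster count are illustrative assumptions, and this is only one of many possible MVC formulations.

        # Sketch of a simple multiview clustering baseline via averaged per-view affinities.
        import numpy as np
        from sklearn.cluster import SpectralClustering
        from sklearn.metrics.pairwise import rbf_kernel

        def multiview_spectral(views, n_clusters=3, gamma=0.5):
            affinity = np.mean([rbf_kernel(V, gamma=gamma) for V in views], axis=0)
            return SpectralClustering(n_clusters=n_clusters,
                                      affinity="precomputed").fit_predict(affinity)

        # Two views of the same 90 samples (e.g. two different feature extractors).
        view1 = np.random.randn(90, 10)
        view2 = np.random.randn(90, 25)
        labels = multiview_spectral([view1, view2], n_clusters=3)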

    A reduced order modeling methodology for the parametric estimation and optimization of aviation noise

    The successful mitigation of aviation noise is one of the key enablers of sustainable aviation growth. Technological improvements for noise reduction at the source have been countered by an increasing number of operations at most airports. Aviation noise has several consequences, including direct health effects, effects on human and non-human environments, and economic costs. Several mitigation strategies exist, including reduction of noise at the source, land-use planning and management, noise abatement operational procedures, and operating restrictions. Most noise management programs at airports use a combination of such mitigation measures. To assess the efficacy of noise mitigation measures, a robust modeling and simulation capability is required. Due to the large number of factors which can influence aviation noise metrics, current state-of-the-art tools rely on physics-based and semi-empirical models. These models help in accurately predicting noise metrics in a wide range of scenarios; however, they are computationally expensive to evaluate. Therefore, current noise mitigation studies are limited to singular applications such as annual-average-day noise quantification. Many-query applications such as parametric trade-off analyses and optimization remain elusive with the current generation of tools and methods. Several efforts documented in the literature attempt to speed up the process using surrogate models. Techniques include the use of pre-computed noise grids with calibration models for non-standard conditions. These techniques are typically predicated on simplifying assumptions which greatly limit the applicability of such models. Simplifying assumptions are needed to reduce the number of influencing factors to be modeled and make the problem tractable. Existing efforts also suffer from the inclusion of categorical variables for operational profiles, which are not conducive to surrogate modeling. In this research, a methodology is developed to address the inherent complexities of the noise quantification process, and thus enable rapid noise modeling capabilities which can facilitate parametric trade-off analysis and optimization efforts. To achieve this objective, a research plan is developed and executed to address two major gaps in the literature. First, a parametric representation of operational profiles is proposed to replace existing categorical descriptions. A technique is developed to allow real-world flight data to be efficiently mapped onto this parametric definition. A trajectory clustering method is used to group similar flights, and representative flights are parametrized using an inverse map of an aircraft performance model. Next, a field surrogate modeling method is developed based on Model Order Reduction techniques to reduce the high dimensionality of computed noise metric results. This greatly reduces the complexity of the data to be modeled and thus enables rapid noise quantification. With these two gaps addressed, the overall methodology is developed for rapid noise quantification and optimization. The methodology is demonstrated on a case study in which a large number of real-world flight trajectories are efficiently modeled for their noise results. As each such flight trajectory has a unique representation and typically lacks thrust information, such noise modeling is not computationally feasible with existing methods and tools. The developed parametric representations and field surrogate modeling capabilities enable such an application.
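
    One common model-order-reduction recipe for field surrogates, sketched below under assumed data shapes, is proper orthogonal decomposition (POD) of precomputed noise-metric fields followed by a regression from flight-profile parameters to the retained modal coefficients; this illustrates the general idea rather than the thesis's specific implementation.

        # Sketch of a POD-based field surrogate: SVD of noise-field snapshots + coefficient regression.
        import numpy as np
        from sklearn.linear_model import Ridge

        rng = np.random.default_rng(0)
        params = rng.uniform(size=(200, 6))            # e.g. parametrized departure profiles
        fields = rng.normal(size=(200, 5000))          # noise metric on a 5000-point grid

        mean = fields.mean(axis=0)
        U, s, Vt = np.linalg.svd(fields - mean, full_matrices=False)
        k = 10                                         # retained POD modes
        coeffs = U[:, :k] * s[:k]                      # modal coefficients per training case
        surrogate = Ridge(alpha=1e-3).fit(params, coeffs)

        # Rapid prediction of the full noise field for a new parametrized flight profile.
        new_coeffs = surrogate.predict(rng.uniform(size=(1, 6)))
        predicted_field = mean + new_coeffs @ Vt[:k]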