23 research outputs found

    Improved sparse autoencoder based artificial neural network approach for prediction of heart disease

    Get PDF
    In this paper, a two-stage method is proposed to effectively predict heart disease. The first stage trains an improved sparse autoencoder (SAE), an unsupervised neural network, to learn the best representation of the training data. The second stage uses an artificial neural network (ANN) to predict health status from the learned representations. The SAE was optimized so as to train an efficient model. Experimental results show that the proposed method improves the performance of the ANN classifier and is more robust than other methods and similar scholarly work.
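
    The two-stage pipeline can be sketched with a linear sparse autoencoder followed by a logistic-regression head. This is a minimal illustration, not the authors' implementation: the data is synthetic, and all dimensions and hyperparameters (lam, lr, iteration counts) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "patient records": 13 features (as in the Cleveland heart data),
# with labels depending on a low-dimensional latent structure.
n, d, k = 200, 13, 4
Z = rng.normal(size=(n, k))
X = Z @ rng.normal(size=(k, d)) + 0.1 * rng.normal(size=(n, d))
y = (Z[:, 0] + Z[:, 1] > 0).astype(float)

# Stage 1: a linear sparse autoencoder, trained by gradient descent on
# reconstruction error plus an L1 sparsity penalty on the code.
W1 = 0.1 * rng.normal(size=(d, k))
W2 = 0.1 * rng.normal(size=(k, d))
lam, lr = 1e-3, 1e-2

def reconstruction_error(X, W1, W2):
    H = X @ W1
    return float(np.mean((H @ W2 - X) ** 2))

err_before = reconstruction_error(X, W1, W2)
for _ in range(500):
    H = X @ W1                      # codes
    R = H @ W2 - X                  # reconstruction residual
    gW2 = 2 * H.T @ R / n
    gW1 = 2 * X.T @ (R @ W2.T) / n + lam * X.T @ np.sign(H) / n
    W1 -= lr * gW1
    W2 -= lr * gW2
err_after = reconstruction_error(X, W1, W2)

# Stage 2: a logistic-regression "ANN head" trained on the learned codes.
H = X @ W1
w, b = np.zeros(k), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(H @ w + b)))
    w -= 0.1 * H.T @ (p - y) / n
    b -= 0.1 * float(np.mean(p - y))
p = 1.0 / (1.0 + np.exp(-(H @ w + b)))
acc = float(np.mean((p > 0.5) == (y > 0.5)))
```

    Training the representation first and the classifier second mirrors the paper's structure; a real implementation would use a nonlinear encoder and a multi-layer ANN.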

    Modified Stacked Autoencoder Using Adaptive Morlet Wavelet for Intelligent Fault Diagnosis of Rotating Machinery

    Get PDF
    Intelligent fault diagnosis techniques play an important role in improving the abilities of automated monitoring, inference, and decision making for the repair and maintenance of machinery and processes. In this article, a modified stacked autoencoder (MSAE) that uses an adaptive Morlet wavelet is proposed to automatically diagnose various fault types and severities of rotating machinery. First, the Morlet wavelet activation function is utilized to construct an MSAE that establishes an accurate nonlinear mapping between the raw nonstationary vibration data and the different fault states. Then, a nonnegative constraint is applied to the cost function to improve sparsity performance and reconstruction quality. Finally, the fruit fly optimization algorithm is used to determine the adjustable parameters of the Morlet wavelet so that they flexibly match the characteristics of the analyzed data. The proposed method is used to analyze raw vibration data collected from a sun gear unit and a roller bearing unit. Experimental results show that the proposed method is superior to other state-of-the-art methods.
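
    A Morlet wavelet activation is a Gaussian-windowed cosine. A minimal sketch of one encoder layer using it follows; the parameters a and omega stand in for the quantities the paper tunes with the fruit fly optimization algorithm, and all shapes are made up.

```python
import numpy as np

def morlet(x, a=1.0, omega=5.0):
    """Morlet wavelet activation: a Gaussian-windowed cosine.
    a (scale) and omega (frequency) are the adjustable parameters."""
    return np.exp(-(x / a) ** 2 / 2.0) * np.cos(omega * x)

# One encoder layer of a stacked autoencoder with the wavelet activation.
rng = np.random.default_rng(1)
x = rng.normal(size=(8, 64))          # a batch of raw vibration frames
W = 0.1 * rng.normal(size=(64, 16))
b = np.zeros(16)
h = morlet(x @ W + b)                 # hidden codes, bounded in [-1, 1]
```

    Unlike ReLU or sigmoid, this activation is oscillatory and localized, which is the property the paper exploits for matching transient vibration signatures.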

    Fixed points of nonnegative neural networks

    Full text link
    We consider the existence of fixed points of nonnegative neural networks, i.e., neural networks that take nonnegative vectors as inputs and produce nonnegative vectors as outputs. We first show that nonnegative neural networks with nonnegative weights and biases can be recognized as monotonic and (weakly) scalable functions within the framework of nonlinear Perron-Frobenius theory. This enables us to provide conditions for the existence of fixed points of nonnegative neural networks that are weaker than those obtained recently using arguments from convex analysis. Furthermore, we prove that the fixed-point set of a nonnegative neural network with nonnegative weights and biases is an interval, which under mild conditions degenerates to a point. These results are then used to establish the existence of fixed points of more general types of nonnegative neural networks. The results of this paper contribute to the understanding of the behavior of autoencoders, and they provide insight into neural networks designed using the loop-unrolling technique, which can be seen as a fixed-point-searching algorithm. The chief theoretical results of this paper are verified in numerical simulations.
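
    For a ReLU network with nonnegative weights and biases, a fixed point can be found by the simple iteration x ← f(x) whenever the map is contractive. The sketch below uses that (stronger than necessary) contraction assumption with a made-up two-layer network; the paper's conditions are weaker.

```python
import numpy as np

rng = np.random.default_rng(2)

# Nonnegative weights scaled so each layer has spectral norm 0.4,
# making the composition a contraction (an assumption for this sketch).
W1 = rng.uniform(0, 1, size=(5, 5)); W1 *= 0.4 / np.linalg.norm(W1, 2)
W2 = rng.uniform(0, 1, size=(5, 5)); W2 *= 0.4 / np.linalg.norm(W2, 2)
b1 = rng.uniform(0, 0.5, size=5)
b2 = rng.uniform(0, 0.5, size=5)

def f(x):
    # Nonnegative weights/biases with ReLU: maps the nonnegative orthant
    # into itself and is monotone (x <= y implies f(x) <= f(y)).
    return np.maximum(W2 @ np.maximum(W1 @ x + b1, 0) + b2, 0)

# Fixed-point iteration, the mechanism behind loop-unrolled architectures.
x = np.zeros(5)
for _ in range(200):
    x = f(x)
residual = float(np.linalg.norm(f(x) - x))
```

    The iterate stays nonnegative throughout, and monotonicity guarantees the sequence started from zero is nondecreasing, which is the Perron-Frobenius-style structure the paper exploits.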

    Condition Monitoring Methods for Large, Low-speed Bearings

    Get PDF
    In all industrial production plants, well-functioning machines and systems are required for sustained and safe operation. However, asset performance degrades over time and may lead to reduced efficiency, poor product quality, secondary damage to other assets or even complete failure and unplanned downtime of critical systems. Besides the potential safety hazards from machine failure, the economic consequences are large, particularly in offshore applications where repairs are difficult. This thesis focuses on large, low-speed rolling element bearings, concretized by the main swivel bearing of an offshore drilling machine. Surveys have shown that bearing failure in drilling machines is a major cause of rig downtime. Bearings have a finite lifetime, which can be estimated using formulas supplied by the bearing manufacturer. Premature failure may still occur as a result of irregularities in operating conditions and use, lubrication, mounting, contamination, or external environmental factors. Conversely, a bearing may also exceed its expected lifetime. Compared to smaller bearings, historical failure data from large, low-speed machinery is rare. Due to the high cost of maintenance and repairs, the preferred maintenance arrangement is often condition based. Vibration measurement with accelerometers is the most common data acquisition technique. However, vibration-based condition monitoring of large, low-speed bearings is challenging, due to non-stationary operating conditions, low kinetic energy and increased distance from fault to transducer. On the sensor side, this project has also investigated the use of acoustic emission sensors for condition monitoring purposes. Roller end damage is identified as a failure mode of interest in tapered axial bearings. Early stage abrasive wear has been observed on bearings in drilling machines. The failure mode is currently only detectable upon visual inspection and potentially through wear debris in the bearing lubricant.
In this thesis, multiple machine learning algorithms are developed and applied to handle the challenges of fault detection in large, low-speed bearings with little or no historical data and unknown fault signatures. The feasibility of transfer learning is demonstrated, as an approach to speed up implementation of automated fault detection systems when historical failure data is available. Variational autoencoders are proposed as a method for unsupervised dimensionality reduction and feature extraction, being useful for obtaining a health indicator with a statistical anomaly detection threshold. Data is collected from numerous experiments throughout the project. Most notably, a test was performed on a real offshore drilling machine with roller end wear in the bearing. To replicate this failure mode and aid development of condition monitoring methods, an axial bearing test rig has been designed and built as a part of the project. An overview of all experiments, methods and results is given in the thesis, with details covered in the appended papers.
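
    The health-indicator idea can be illustrated with a linear stand-in for the variational autoencoder: fit a low-dimensional subspace to healthy data, use reconstruction error as the indicator, and set a mean-plus-three-sigma anomaly threshold. All data and dimensions below are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Healthy vibration features lying near a 3-dimensional subspace of R^20.
B = rng.normal(size=(3, 20))
healthy = rng.normal(size=(300, 3)) @ B + 0.05 * rng.normal(size=(300, 20))

# Linear "encoder": the top principal components of the healthy data
# (a stand-in for the variational autoencoder used in the thesis).
mu = healthy.mean(axis=0)
_, _, Vt = np.linalg.svd(healthy - mu, full_matrices=False)
P = Vt[:3]

def health_indicator(x):
    """Reconstruction error: low for healthy samples, high under faults."""
    z = (x - mu) @ P.T
    return float(np.linalg.norm((x - mu) - z @ P))

errs = np.array([health_indicator(x) for x in healthy])
threshold = float(errs.mean() + 3.0 * errs.std())  # statistical threshold

faulty = healthy[0] + 2.0 * rng.normal(size=20)    # off-subspace excursion
```

    The attraction of this scheme for low-speed bearings is that it needs only healthy baseline data, not labeled failure records.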

    AI-driven blind signature classification for IoT connectivity: a deep learning approach

    Get PDF
    Non-orthogonal multiple access (NOMA) promises to support the fast-growing connectivity of the future Internet of Things (IoT) using abundant multiple-access signatures. Since explicitly notifying the utilized NOMA signatures incurs a large signaling cost, blind signature classification is a natural low-cost option. To accomplish signature classification for NOMA, we study both likelihood- and feature-based methods. A likelihood-based method is first proposed and shown to be optimal in the asymptotic limit of the observations, despite its high computational complexity. While feature-based classification methods promise low complexity, efficient features are non-trivial to design manually. To this end, we resort to artificial intelligence (AI) for deep-learning-based automatic feature extraction. Specifically, our proposed deep neural network for signature classification, DeepClassifier, builds on the insights gained from the likelihood-based method and contains two stages that respectively process a single observation and aggregate the classification results over an observation sequence. The first stage uses an iterative structure in which each layer employs a memory-extended network to explicitly exploit knowledge of the signature pool. The second stage incorporates straight-through channels within a deep recurrent structure to avoid losing information from previous observations. Experiments show that DeepClassifier approaches the optimal likelihood-based method with a 90% reduction in complexity.
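
    The likelihood-based baseline has a simple form under an assumed additive-Gaussian observation model (a toy stand-in for the actual NOMA signal model): sum per-observation log-likelihoods across the sequence and pick the best-matching signature.

```python
import numpy as np

rng = np.random.default_rng(4)

# A toy signature pool: each candidate signature is a known vector in R^8.
pool = rng.normal(size=(4, 8))
true_idx, sigma = 2, 0.5

# A sequence of noisy observations generated under the true signature.
obs = pool[true_idx] + sigma * rng.normal(size=(10, 8))

def classify(observations, pool, sigma):
    """Sum Gaussian log-likelihoods over the sequence, pick the argmax."""
    ll = np.zeros(len(pool))
    for y in observations:
        ll += -np.sum((y - pool) ** 2, axis=1) / (2 * sigma ** 2)
    return int(np.argmax(ll))

pred = classify(obs, pool, sigma)
```

    The two-stage structure of DeepClassifier mirrors this decomposition: per-observation scoring, then aggregation across the sequence.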

    Acoustic Features for Environmental Sound Analysis

    Get PDF
    Most of the time it is nearly impossible to differentiate between particular types of sound events from a waveform alone. Therefore, frequency-domain and time-frequency-domain representations have been used for years, providing representations of sound signals that are more in line with human perception. However, these representations are usually too generic and often fail to describe specific content that is present in a sound recording. A great deal of work has been devoted to designing features that allow extracting such specific information, leading to a wide variety of hand-crafted features. In recent years, owing to the increasing availability of medium-scale and large-scale sound datasets, an alternative approach to feature extraction has become popular: so-called feature learning. Finally, processing the amount of data that is at hand nowadays can quickly become overwhelming, so it is of paramount importance to be able to reduce the size of the dataset in the feature space. This chapter describes the general processing chain that converts a sound signal into a feature vector that can be efficiently exploited by a classifier, and relates it to features used for speech and music processing.
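
    The generic processing chain described here (frame, window, transform, compress) can be sketched as a log-magnitude short-time Fourier transform; the frame length, hop size, and log1p compression are arbitrary illustrative choices.

```python
import numpy as np

def frame_features(signal, frame_len=256, hop=128):
    """Minimal time-frequency feature chain: frame, Hann-window, FFT,
    log-magnitude. A stand-in for the generic chain the chapter covers."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    feats = np.empty((n_frames, frame_len // 2 + 1))
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        feats[i] = np.log1p(np.abs(np.fft.rfft(frame)))
    return feats

# A 440 Hz tone sampled at 8 kHz concentrates energy in one frequency bin
# (bin width = 8000 / 256 = 31.25 Hz, so roughly bin 14).
sr = 8000
t = np.arange(sr) / sr
feats = frame_features(np.sin(2 * np.pi * 440 * t))
peak_bin = int(np.argmax(feats.mean(axis=0)))
```

    Hand-crafted features such as MFCCs add further stages (mel filterbank, DCT) on top of this chain, while feature learning replaces the later stages with learned transforms.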

    Deep learning methods for solving linear inverse problems: Research directions and paradigms

    Get PDF
    The linear inverse problem is fundamental to the development of various scientific areas. Innumerable attempts have been made to solve different variants of the linear inverse problem in different applications. Nowadays, the rapid development of deep learning provides a fresh perspective on the linear inverse problem, with various well-designed network architectures achieving state-of-the-art performance in many applications. In this paper, we present a comprehensive survey of recent progress in the development of deep learning for solving various linear inverse problems. We review how deep learning methods are used to solve different linear inverse problems, and explore the structured neural network architectures that incorporate knowledge used in traditional methods. Furthermore, we identify open challenges and potential future directions along this research line.
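
    A concrete bridge between the traditional methods and the structured architectures mentioned here is ISTA for sparse recovery: unrolling its iterations as network layers (with learned matrices) yields the LISTA-style networks such surveys cover. Below is a plain, non-learned sketch on a synthetic problem.

```python
import numpy as np

rng = np.random.default_rng(5)

# Sparse recovery: observe y = A @ x_true with a 3-sparse x_true in R^50.
m, d = 30, 50
A = rng.normal(size=(m, d)) / np.sqrt(m)
x_true = np.zeros(d)
x_true[[4, 17, 33]] = [1.5, -2.0, 1.0]
y = A @ x_true

def ista(y, A, lam=0.02, n_iter=300):
    """Iterative shrinkage-thresholding; unrolling these iterations as
    layers is the structured-architecture idea the survey reviews."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = x - A.T @ (A @ x - y) / L      # gradient step
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft-threshold
    return x

x_hat = ista(y, A)
err = float(np.linalg.norm(x_hat - x_true))
```

    In the learned (unrolled) variant, the fixed matrices A.T / L and threshold lam / L become trainable per-layer parameters, which is how traditional knowledge is incorporated into the network structure.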

    Gaussian Process Modelling for Audio Signals

    Get PDF
    Audio signals are characterised and perceived based on how their spectral make-up changes with time. Uncovering the behaviour of latent spectral components is at the heart of many real-world applications involving sound, but is a highly ill-posed task given the infinite number of ways any signal can be decomposed. This motivates the use of prior knowledge and a probabilistic modelling paradigm that can characterise uncertainty. This thesis studies the application of Gaussian processes to audio, which offer a principled non-parametric way to specify probability distributions over functions whilst also encoding prior knowledge. Along the way we consider what prior knowledge we have about sound, the way it behaves, and the way it is perceived, and write down these assumptions in the form of probabilistic models. We show how Bayesian time-frequency analysis can be reformulated as a spectral mixture Gaussian process, and utilise modern-day inference methods to carry out joint time-frequency analysis and nonnegative matrix factorisation. Our reformulation results in increased modelling flexibility, allowing more sophisticated prior knowledge to be encoded, which improves performance on a missing data synthesis task. We demonstrate the generality of this paradigm by showing how the joint model can additionally be applied to both denoising and source separation tasks without modification. We propose a hybrid statistical-physical model for audio spectrograms based on observations about the way amplitude envelopes decay over time, as well as a nonlinear model based on deep Gaussian processes. We examine the benefits of these methods, all of which are generative in the sense that novel signals can be sampled from the underlying models, allowing us to consider the extent to which they encode the important perceptual characteristics of sound.
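
    The spectral mixture kernel underlying such a reformulation has a closed form: a weighted sum of Gaussian-windowed cosines, one per spectral component. A sketch with made-up frequencies and scales follows, using the Wilson-Adams parameterisation, which may differ in detail from the thesis.

```python
import numpy as np

def spectral_mixture_kernel(tau, weights, means, scales):
    """k(tau) = sum_q w_q * exp(-2 pi^2 tau^2 v_q) * cos(2 pi mu_q tau):
    each component models a quasi-periodic process centred at mu_q Hz."""
    tau = np.asarray(tau, dtype=float)
    k = np.zeros_like(tau)
    for w, mu, v in zip(weights, means, scales):
        k += w * np.exp(-2 * np.pi ** 2 * tau ** 2 * v) * np.cos(2 * np.pi * mu * tau)
    return k

# Two spectral components, e.g. partials near 100 Hz and 150 Hz.
taus = np.linspace(0.0, 0.05, 200)
k = spectral_mixture_kernel(taus, [1.0, 0.5], [100.0, 150.0], [20.0, 20.0])

# The induced covariance matrix over a time grid is positive semidefinite,
# so signals can be sampled from the corresponding Gaussian process.
t = np.linspace(0.0, 0.05, 40)
K = spectral_mixture_kernel(t[:, None] - t[None, :],
                            [1.0, 0.5], [100.0, 150.0], [20.0, 20.0])
```

    The weights play the role of per-component amplitudes, which is what connects this kernel view to nonnegative matrix factorisation of the spectrogram.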

    DualApp: Tight Over-Approximation for Neural Network Robustness Verification via Under-Approximation

    Full text link
    The robustness of neural networks is fundamental to the reliability and security of the hosting system. Formal verification has been proven effective in providing provable robustness guarantees. To improve verification scalability, over-approximating the non-linear activation functions in neural networks by linear constraints is widely adopted, which transforms the verification problem into an efficiently solvable linear programming problem. As over-approximations inevitably introduce overestimation, many efforts have been dedicated to defining the tightest possible approximations. Recent studies have shown, however, that none of the existing so-called tightest approximations is superior to the others. In this paper we identify and report a crucial factor in defining tight approximations, namely the approximation domains of activation functions. We observe that existing approaches rely only on overestimated domains, and that the corresponding tight approximation may not be tight on its actual domain. We propose a novel under-approximation-guided approach, called dual-approximation, to define tight over-approximations, together with two complementary under-approximation algorithms based on sampling and gradient descent. The overestimated domain guarantees soundness while the underestimated one guides tightness. We implement our approach in a tool called DualApp and extensively evaluate it on a comprehensive benchmark of 84 collected and trained neural networks with different architectures. The experimental results show that DualApp outperforms the state-of-the-art approximation-based approaches, with up to 71.22% improvement in the verification result.
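
    The role of the approximation domain can be illustrated on the sigmoid: sound parallel linear bounds fitted on an overestimated input interval are much looser than the same construction on a narrower (actual) interval. This is a toy illustration of the domain effect, not the DualApp algorithm itself; the bound construction via dense sampling is an assumption made for simplicity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def linear_bounds(l, u, n=1001):
    """Sound parallel linear bounds for sigmoid on [l, u]: both lines use
    the chord slope; the offsets come from a dense sample of the interval."""
    xs = np.linspace(l, u, n)
    k = (sigmoid(u) - sigmoid(l)) / (u - l)
    return k, np.max(sigmoid(xs) - k * xs), np.min(sigmoid(xs) - k * xs)

# Bounds on an overestimated pre-activation domain vs. the smaller domain
# the inputs actually reach (found here by assumption; in DualApp via
# sampling or gradient descent under-approximation).
k1, up1, lo1 = linear_bounds(-5.0, 5.0)
k2, up2, lo2 = linear_bounds(-1.5, 2.0)

gap_over = up1 - lo1      # looseness on the overestimated domain
gap_actual = up2 - lo2    # tighter once the domain is narrowed

# Soundness check of the narrow-domain bounds on their own interval.
xs = np.linspace(-1.5, 2.0, 101)
sound = bool(np.all(k2 * xs + up2 >= sigmoid(xs) - 1e-9)
             and np.all(k2 * xs + lo2 <= sigmoid(xs) + 1e-9))
```

    Narrowing the domain shrinks the gap between the upper and lower lines, which is exactly why under-approximating the actual domain guides tighter over-approximations.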

    Discovering a Domain Knowledge Representation for Image Grouping: Multimodal Data Modeling, Fusion, and Interactive Learning

    Get PDF
    In visually-oriented specialized medical domains such as dermatology and radiology, physicians explore interesting image cases from medical image repositories for comparative case studies to aid clinical diagnoses, educate medical trainees, and support medical research. However, general image classification and retrieval approaches fail to group medical images from the physicians' viewpoint. This is because fully-automated learning techniques cannot yet bridge the gap between image features and domain-specific content, owing to the absence of expert knowledge. Understanding how experts get information from medical images is therefore an important research topic. As a prior study, we conducted data elicitation experiments, where physicians were instructed to inspect each medical image towards a diagnosis while describing the image content to a student seated nearby. Experts' eye movements and their verbal descriptions of the image content were recorded to capture various aspects of expert image understanding. This dissertation aims at an intuitive approach to extracting expert knowledge, which is to find patterns in expert data elicited from image-based diagnoses. These patterns are useful for understanding both the characteristics of the medical images and the experts' cognitive reasoning processes. The transformation from the viewed raw image features to interpretation as domain-specific concepts requires experts' domain knowledge and cognitive reasoning. This dissertation also approximates this transformation using a matrix factorization-based framework, which helps project multiple expert-derived data modalities to high-level abstractions. To combine additional expert interventions with computational processing capabilities, an interactive machine learning paradigm is developed to treat experts as an integral part of the learning process.
Specifically, experts refine medical image groups presented by the learned model locally, to incrementally re-learn the model globally. This paradigm avoids onerous expert annotation for model training, while aligning the learned model with experts' sense-making.
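
    The matrix-factorization step can be illustrated with plain nonnegative matrix factorization via the classical multiplicative updates, a generic stand-in for the dissertation's framework; the data here is a synthetic low-rank nonnegative matrix.

```python
import numpy as np

rng = np.random.default_rng(6)

# A nonnegative data matrix with hidden rank-3 structure: rows could be
# image cases, columns features pooled from several modalities.
V = rng.uniform(size=(30, 3)) @ rng.uniform(size=(3, 12))

r = 3
W = rng.uniform(0.1, 1.0, size=(30, r))   # case-to-group loadings
H = rng.uniform(0.1, 1.0, size=(r, 12))   # group-to-feature abstractions
eps = 1e-9

err0 = float(np.linalg.norm(V - W @ H))
for _ in range(200):
    H *= (W.T @ V) / (W.T @ W @ H + eps)   # multiplicative updates keep
    W *= (V @ H.T) / (W @ H @ H.T + eps)   # both factors nonnegative
err1 = float(np.linalg.norm(V - W @ H))
```

    In the interactive paradigm, an expert's local refinement of a group would amount to constraining rows of W before re-running the updates globally.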