
    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to perform integrative analysis of biomedical data acquired from diverse modalities effectively and efficiently. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: the curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues.

    Data analytics 2016: proceedings of the fifth international conference on data analytics

    Uncertainty Quantification in Machine Learning for Engineering Design and Health Prognostics: A Tutorial

    On top of machine learning models, uncertainty quantification (UQ) functions as an essential layer of safety assurance that could lead to more principled decision making by enabling sound risk assessment and management. The safety and reliability improvement of ML models empowered by UQ has the potential to significantly facilitate the broad adoption of ML solutions in high-stakes decision settings, such as healthcare, manufacturing, and aviation, to name a few. In this tutorial, we aim to provide a holistic lens on emerging UQ methods for ML models, with a particular focus on neural networks and the applications of these UQ methods in tackling engineering design as well as prognostics and health management problems. Toward this goal, we start with a comprehensive classification of uncertainty types, sources, and causes pertaining to UQ of ML models. Next, we provide a tutorial-style description of several state-of-the-art UQ methods: Gaussian process regression, Bayesian neural network, neural network ensemble, and deterministic UQ methods focusing on spectral-normalized neural Gaussian process. Building on the mathematical formulations, we then assess the soundness of these UQ methods quantitatively and qualitatively (using a toy regression example) to identify their strengths and shortcomings along different dimensions. Then, we review quantitative metrics commonly used to assess the quality of predictive uncertainty in classification and regression problems. Afterward, we discuss the increasingly important role of UQ of ML models in solving challenging problems in engineering design and health prognostics. Two case studies with source code available on GitHub are used to demonstrate these UQ methods and compare their performance in the early-stage life prediction of lithium-ion batteries and the remaining useful life prediction of turbofan engines.
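The ensemble idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the tutorial's code: it swaps the neural networks for bootstrap-resampled cubic polynomial fits so that it runs with NumPy alone, and the function names, the toy sine data, and all parameter values are assumptions made for illustration. The spread of the member predictions serves as a rough uncertainty estimate on a toy regression problem.

```python
import numpy as np

def fit_ensemble(x, y, n_members=20, degree=3, seed=0):
    """Fit polynomial regressors on bootstrap resamples of (x, y).

    Hypothetical helper: polynomials stand in for neural networks.
    """
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_members):
        idx = rng.integers(0, len(x), size=len(x))  # sample with replacement
        members.append(np.polyfit(x[idx], y[idx], degree))
    return members

def predict_with_uncertainty(members, x_new):
    """Return the ensemble mean and standard deviation at x_new."""
    preds = np.stack([np.polyval(coeffs, x_new) for coeffs in members])
    return preds.mean(axis=0), preds.std(axis=0)

# Toy regression data (assumed): a noisy sine observed on [-1, 1].
rng = np.random.default_rng(1)
x_train = np.linspace(-1.0, 1.0, 40)
y_train = np.sin(3.0 * x_train) + 0.1 * rng.normal(size=x_train.shape)

members = fit_ensemble(x_train, y_train)

# One query inside the training range, one far outside it.
x_query = np.array([0.0, 3.0])
mean, std = predict_with_uncertainty(members, x_query)
print("mean:", mean)
print("std: ", std)
```

Under these assumptions, the members agree closely inside the training range and diverge when extrapolating, so the reported standard deviation is much larger at the out-of-range query. Disagreement between members is the signal that ensemble-based UQ relies on.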

    Failure Prognosis of Wind Turbine Components

    Wind energy is playing an increasingly significant role in the world's energy supply mix. In North America, many utility-scale wind turbines are approaching, or are beyond, the halfway point of their originally anticipated lifespan. Accurate estimation of the times to failure of major turbine components can give wind farm owners insight into how to optimize the life and value of their farm assets. This dissertation deals with fault detection and failure prognosis of critical wind turbine sub-assemblies, including generators, blades, and bearings, based on data-driven approaches. The main aim of the data-driven methods is to use measurement data from the system to forecast the Remaining Useful Life (RUL) of faulty components accurately and efficiently. The main contributions of this dissertation are: the application of ALTA lifetime analysis to help illustrate a possible relationship between varying loads and generator reliability; a wavelet-based Probability Density Function (PDF) for effectively detecting incipient wind turbine blade failure; an adaptive Bayesian algorithm for modeling the uncertainty inherent in the bearing RUL prediction horizon; and a Hidden Markov Model (HMM) for characterizing bearing damage progression based on varying operating states. The HMM is intended to mimic the real conditions in which wind turbines operate and to recognize that damage progression is a function of the stress applied to each component, using data from historical failures across three different Canadian wind farms.

    Development of a Methodology for Condition-Based Maintenance in a Large-Scale Application Field

    This paper describes a methodology, developed by the authors, for condition monitoring and diagnostics of several critical components in large-scale machine applications. For industry, the main target of condition monitoring is to prevent machines from stopping suddenly and thus avoid economic losses due to lost production. Once this target is reached at a local level, usually through an R&D project, extension to a large-scale market gives rise to new goals, such as low computational cost for analysis, results that local technicians can easily interpret, collection of data from machine installations worldwide, and the development of historical datasets to improve the methodology. This paper details an approach to condition monitoring, developed together with a multinational corporation, that covers all the critical points mentioned above.