52 research outputs found

    DATA-DRIVEN TECHNIQUES FOR DIAGNOSING BEARING DEFECTS IN INDUCTION MOTORS

    Induction motors are frequently used in many automated systems as a major driving force, and their reliable performance is therefore of paramount concern. Induction motors are subject to different types of faults, and early fault detection can reduce maintenance costs and prevent unscheduled downtime. Motor faults are generally related to three components: the stator, the rotor and/or the bearings. This study focuses on the fault diagnosis of the bearings, which are the major cause of failures in induction motors. Data-driven fault diagnosis systems usually include a classification model supported by an efficient pre-processing unit. Classifiers that aim to diagnose multiple bearing defects (i.e., ball, inner-race and outer-race defects of different diameters) require well-processed data. The pre-processing stage plays a vital role in extracting informative features from the vibration signal, reducing the dimensionality of the features and selecting the best features from the feature pool. Once the vibration signal is properly analyzed and a suitable feature subset is created, fault classifiers can be trained. However, the classification task can be difficult if the training dataset is not balanced. Induction motors usually operate in a healthy condition rather than a faulty one, so the monitored vibration samples related to the normal state of the system are expected to outnumber those of the faulty states. In this work, this challenge is also considered, so that the classification model must deal with the class imbalance problem.
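
    A minimal sketch of the kind of pipeline this abstract describes, assuming scikit-learn and purely illustrative time-domain features (RMS, kurtosis, skewness, crest factor); the feature set, window length, synthetic data and class-weighted classifier are assumptions for illustration, not the study's actual design:

```python
import numpy as np
from scipy.stats import kurtosis, skew
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

def extract_features(window):
    """Time-domain features commonly used for bearing vibration signals."""
    rms = np.sqrt(np.mean(window ** 2))
    return np.array([
        rms,
        kurtosis(window),              # impulsiveness, sensitive to defects
        skew(window),
        np.max(np.abs(window)) / rms,  # crest factor
    ])

# Hypothetical data: `signals` holds (n_windows, window_len) vibration segments,
# `labels` encodes healthy / ball / inner-race / outer-race classes.
rng = np.random.default_rng(0)
signals = rng.standard_normal((500, 2048))
labels = rng.integers(0, 4, size=500)

X = np.vstack([extract_features(w) for w in signals])
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, stratify=labels, random_state=0)

# class_weight="balanced" is one common way to compensate for class imbalance.
clf = RandomForestClassifier(n_estimators=200, class_weight="balanced", random_state=0)
clf.fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
```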

    Credit Scoring Using Machine Learning

    For financial institutions and the economy at large, the role of credit scoring in lending decisions cannot be overemphasised. An accurate and well-performing credit scorecard allows lenders to control their risk exposure through the selective allocation of credit based on the statistical analysis of historical customer data. This thesis identifies and investigates a number of specific challenges that occur during the development of credit scorecards. Four main contributions are made in this thesis. First, we examine the performance of a number of supervised classification techniques on a collection of imbalanced credit scoring datasets. Class imbalance occurs when there are significantly fewer examples in one or more classes in a dataset compared to the remaining classes. We demonstrate that oversampling the minority class leads to no overall improvement to the best performing classifiers. We find that, in contrast, adjusting the threshold on classifier output yields, in many cases, an improvement in classification performance. Our second contribution investigates a particularly severe form of class imbalance, which, in credit scoring, is referred to as the low-default portfolio problem. To address this issue, we compare the performance of a number of semi-supervised classification algorithms with that of logistic regression. Based on the detailed comparison of classifier performance, we conclude that both approaches merit consideration when dealing with low-default portfolios. Third, we quantify the differences in classifier performance arising from various implementations of a real-world behavioural scoring dataset. Due to commercial sensitivities surrounding the use of behavioural scoring data, very few empirical studies which directly address this topic are published. This thesis describes the quantitative comparison of a range of dataset parameters impacting classification performance, including: (i) varying durations of historical customer behaviour for model training; (ii) different lengths of time over which a borrower's class label is defined; and (iii) alternative approaches to defining a customer's default status in behavioural scoring. Finally, this thesis demonstrates how artificial data may be used to overcome the difficulties associated with obtaining and using real-world data. The limitations of artificial data, in terms of its usefulness in evaluating classification performance, are also highlighted. In this work, we are interested in generating artificial data, for credit scoring, in the absence of any available real-world data.
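
    An illustrative sketch of the threshold-adjustment idea from the first contribution, assuming scikit-learn; the synthetic imbalanced dataset and the F1-based selection criterion are assumptions, not the thesis's experimental setup (in practice the threshold would be tuned on a validation set, not the test set):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

# Synthetic imbalanced "credit" data: ~5% defaulters (class 1).
X, y = make_classification(n_samples=5000, n_features=20,
                           weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
scores = model.predict_proba(X_te)[:, 1]

# Instead of the default 0.5 cut-off, sweep the decision threshold and
# pick the one that maximises F1 on the minority (default) class.
thresholds = np.linspace(0.01, 0.99, 99)
f1s = [f1_score(y_te, scores >= t, zero_division=0) for t in thresholds]
best = thresholds[int(np.argmax(f1s))]
print(f"best threshold {best:.2f}, F1 {max(f1s):.3f}")
```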

    Recognising realistic emotions and affect in speech: State of the art and lessons learnt from the first challenge

    More than a decade has passed since research on automatic recognition of emotion from speech became a field of research in its own right, in line with its 'big brothers' speech and speaker recognition. This article attempts to provide a short overview of where we are today, how we got there, and what this can tell us about where to go next and how we might get there. In the first part, we address the basic phenomenon, reflecting on the last fifteen years and commenting on databases, modelling and annotation, the unit of analysis, and prototypicality. We then shift to automatic processing, including discussions of features, classification, robustness, evaluation, and implementation and system integration. From there we move to the first comparative challenge on emotion recognition from speech, the INTERSPEECH 2009 Emotion Challenge, organised by (part of) the authors, including a description of the Challenge's database, Sub-Challenges, participants and their approaches, the winners, and the fusion of results, as well as the lessons actually learnt, before we finally address the everlasting problems and promising future attempts. (C) 2011 Elsevier B.V. All rights reserved.
    Schuller B., Batliner A., Steidl S., Seppi D., "Recognising realistic emotions and affect in speech: state of the art and lessons learnt from the first challenge", Speech Communication, vol. 53, no. 9-10, pp. 1062-1087, November 2011.

    Integrating Precipitation Nowcasting in a Deep Learning-Based Flash Flood Prediction Framework and Assessing the Impact of Rainfall Forecasts Uncertainties

    Flash floods are among the most immediate and destructive natural hazards. To issue warnings on time, various attempts have been made to extend the forecast horizon of flash flood prediction models. In particular, introducing rainfall forecasts into process-based hydrological models was found to be effective. However, integrating precipitation predictions into data-driven flash flood models had not yet been addressed. To that end, we propose a modeling framework that integrates rainfall nowcasts and assesses the impact of rainfall prediction uncertainties on a deep learning-based flash flood prediction model. Compared to the Persistence and ARIMA models, the LSTM model provided better rainfall nowcasting performance. Further, we propose an encoder-decoder LSTM-based model architecture for short-term flash flood prediction that incorporates rainfall forecasts. Computational experiments showed that future rainfall values improved flash floods' predictability for extended lead times. We also found that rainfall underestimation had a significantly more adverse effect on the model's performance than rainfall overestimation.
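
    A minimal PyTorch sketch of an encoder-decoder LSTM of the general kind described above, with hypothetical inputs (a window of past observations plus forecast rainfall for each lead time); layer sizes, feature choices and variable names are illustrative assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class EncoderDecoderLSTM(nn.Module):
    """Encode past observations, then decode one step per forecast lead time,
    feeding the rainfall forecast for each future step into the decoder."""
    def __init__(self, n_obs_features=2, hidden=64):
        super().__init__()
        self.encoder = nn.LSTM(n_obs_features, hidden, batch_first=True)
        self.decoder = nn.LSTM(1, hidden, batch_first=True)  # input: forecast rainfall
        self.head = nn.Linear(hidden, 1)                     # output: water level

    def forward(self, past_obs, future_rain):
        # past_obs: (batch, past_len, n_obs_features); future_rain: (batch, horizon, 1)
        _, state = self.encoder(past_obs)        # summarize the past into (h, c)
        dec_out, _ = self.decoder(future_rain, state)
        return self.head(dec_out).squeeze(-1)    # (batch, horizon)

model = EncoderDecoderLSTM()
past = torch.randn(8, 24, 2)       # 24 past steps of [rainfall, water level]
rain_fc = torch.randn(8, 6, 1)     # rainfall nowcast for 6 lead times
print(model(past, rain_fc).shape)  # torch.Size([8, 6])
```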

    Exploring variabilities through factor analysis in automatic acoustic language recognition

    Get PDF
    Language Recognition (LR) is the problem of discovering the language of a spoken utterance. This thesis addresses the task using short-term acoustic features within a GMM-UBM (Gaussian Mixture Model - Universal Background Model) approach. The main problem in many pattern recognition applications is the variability of the observed data. In the context of LR, this troublesome variability is due, among other causes, to speaker characteristics, the evolution of speech and voice, and the acquisition and transmission channels. In Speaker Recognition, the impact of variability can be substantially reduced by the Joint Factor Analysis (JFA) technique; in this work, we introduce this paradigm to Language Recognition. The success of JFA relies on several assumptions. The first is that the observed information can be decomposed into a universal part, a language-dependent part and a language-independent variability part. The second, more technical assumption is that the unwanted variability lives in a low-dimensional, globally defined subspace. In this work, we analyse how JFA behaves in the context of a GMM-UBM LR system, and we also introduce and analyse its combination with Support Vector Machines (SVMs). The first JFA publications grouped all information that is harmful to the task (i.e., the variability) into a single component, assumed to follow a Gaussian distribution. This approach handles the different kinds of variability in a uniform manner; in practice, however, we observe that this hypothesis is not always verified. We have, for example, the case where the data can be logically divided into two clearly distinct subsets, namely data from telephone and from broadcast sources. In this case, our detailed investigations show a clear benefit in handling the two kinds of data with two source-specific systems and then selecting, as the output score, that of the system corresponding to the source of the test utterance. Selecting the score of one or the other system requires a channel source detector, and we propose several novel designs for such automatic detectors. In this framework, we show that JFA's (subspace) variability factors can be used successfully to detect the source. This opens the interesting perspective of partitioning the data into automatically determined channel source categories; besides adapting to new source conditions, this property avoids the need for source-labelled training data, which is not always available. The JFA approach yields up to a 72% relative cost reduction compared to the GMM-UBM baseline system; using source-specific systems followed by a score selector, we achieve an 81% relative improvement.
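
    The decomposition the abstract refers to can be written compactly. This is the standard JFA formulation from the speaker recognition literature, given here for orientation since the abstract does not state the thesis's exact parameterisation:

```latex
% Standard JFA decomposition of a session-dependent GMM mean supervector:
%   M  : mean supervector of the observed utterance
%   m  : universal part (the UBM mean supervector)
%   Vy : language-dependent offset (y = language factors)
%   Ux : language-independent session/channel variability, with U a
%        low-rank matrix defining the globally shared variability subspace
\[
  M = m + V y + U x
\]
```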

    Hemodynamics of Native and Bioprosthetic Aortic Valves: Insights from a Reduced Degree-of-Freedom Model

    Heart disease is the leading cause of death in the US, with aortic valve (AV) diseases being major contributors. Valve replacement is the primary therapeutic indication for AV diseases, and transcatheter aortic valve replacement (TAVR) provides a safe and minimally invasive option. However, post-TAVR patient outcomes show considerable variability with deployment parameters. TAVR valves are also susceptible to failure mechanisms such as leaflet thrombosis, which increase the risk of serious thromboembolic events. Early detection and intervention can avert such outcomes, but symptoms often manifest only at advanced stages of valve failure. Continuous monitoring can facilitate early detection, but regulatory and technological challenges may hinder developing such technology through experimental or clinical means. Computer simulations enable unprecedented predictive capabilities, which can help gain insight into the pathophysiology of valvular diseases, conduct in silico trials to design novel monitoring technologies, and even guide surgeries for optimal valve deployment. However, accurate yet efficient numerical models are required. This study describes the implementation of a versatile, efficient AV dynamics model in a previously developed fluid-structure interaction solver, and its application to each of these tasks. The model accelerates simulations by simplifying the constitutive parameter space and the equations governing leaflet motion without compromising accuracy. It can simulate native and prosthetic valve dynamics exhibiting physiological and pathological function in idealized and personalized aorta anatomies. This computational framework is used to generate canonical and patient-specific simulation datasets describing hemodynamic differences secondary to healthy and pathological AVs. These differences help identify biomarkers that reliably predict the risk of valvular and vascular diseases. Changes in these biomarkers are used to assess whether TAVR can deter aortic disease progression. Next, statistical differences in such biomarkers, recorded by virtual wearable or embedded sensor systems between normal and abnormal AV function, are analyzed using data-driven methods to infer valve health. This lays the groundwork for inexpensive, at-home diagnostic technologies based on digital auscultation and in situ embedded-sensor platforms. Finally, a simulation describing the deployment of a commercially available TAVR valve in a patient-specific aorta anatomy, and the associated hemodynamics, is presented. Such simulations empower clinicians to optimize TAVR deployment and, consequently, patient outcomes.
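
    As a purely illustrative sketch of what a reduced degree-of-freedom leaflet model can look like (not the paper's actual formulation), one can drive a single leaflet-opening coordinate with a prescribed transvalvular pressure in a damped second-order ODE, integrated here with SciPy; all coefficients and the pressure waveform are assumptions:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Single-DOF leaflet model: opening coordinate theta in [0, 1]
# (0 = closed, 1 = fully open), driven by transvalvular pressure dp(t).
I, c, k = 1.0, 8.0, 40.0   # illustrative inertia, damping, stiffness
gain = 60.0                # assumed pressure-to-torque coupling

def dp(t):
    """Toy systolic pressure pulse repeating at 1 Hz."""
    phase = t % 1.0
    return np.sin(np.pi * phase / 0.35) if phase < 0.35 else -0.2

def rhs(t, y):
    theta, omega = y
    torque = gain * dp(t) - c * omega - k * theta
    return [omega, torque / I]

sol = solve_ivp(rhs, (0.0, 3.0), [0.0, 0.0], max_step=1e-3)
theta = np.clip(sol.y[0], 0.0, 1.0)  # enforce physical opening limits post hoc
print(f"peak opening: {theta.max():.2f}")
```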