620 research outputs found

    Online Tool Condition Monitoring Based on Parsimonious Ensemble+

    Full text link
    Accurate diagnosis of tool wear in metal turning process remains an open challenge for both scientists and industrial practitioners because of inhomogeneities in workpiece material, nonstationary machining settings to suit production requirements, and nonlinear relations between measured variables and tool wear. Common methodologies for tool condition monitoring still rely on batch approaches which cannot cope with a fast sampling rate of metal cutting process. Furthermore they require a retraining process to be completed from scratch when dealing with a new set of machining parameters. This paper presents an online tool condition monitoring approach based on Parsimonious Ensemble+, pENsemble+. The unique feature of pENsemble+ lies in its highly flexible principle where both ensemble structure and base-classifier structure can automatically grow and shrink on the fly based on the characteristics of data streams. Moreover, the online feature selection scenario is integrated to actively sample relevant input attributes. The paper presents advancement of a newly developed ensemble learning algorithm, pENsemble+, where online active learning scenario is incorporated to reduce operator labelling effort. The ensemble merging scenario is proposed which allows reduction of ensemble complexity while retaining its diversity. Experimental studies utilising real-world manufacturing data streams and comparisons with well known algorithms were carried out. Furthermore, the efficacy of pENsemble was examined using benchmark concept drift data streams. It has been found that pENsemble+ incurs low structural complexity and results in a significant reduction of operator labelling effort.Comment: this paper has been published by IEEE Transactions on Cybernetic

    Hybrid ACO and SVM algorithm for pattern classification

    Get PDF
    Ant Colony Optimization (ACO) is a metaheuristic algorithm that can be used to solve a variety of combinatorial optimization problems. A new direction for ACO is to optimize continuous and mixed (discrete and continuous) variables. Support Vector Machine (SVM) is a pattern classification approach originated from statistical approaches. However, SVM suffers two main problems which include feature subset selection and parameter tuning. Most approaches related to tuning SVM parameters discretize the continuous value of the parameters which will give a negative effect on the classification performance. This study presents four algorithms for tuning the SVM parameters and selecting feature subset which improved SVM classification accuracy with smaller size of feature subset. This is achieved by performing the SVM parameters’ tuning and feature subset selection processes simultaneously. Hybridization algorithms between ACO and SVM techniques were proposed. The first two algorithms, ACOR-SVM and IACOR-SVM, tune the SVM parameters while the second two algorithms, ACOMV-R-SVM and IACOMV-R-SVM, tune the SVM parameters and select the feature subset simultaneously. Ten benchmark datasets from University of California, Irvine, were used in the experiments to validate the performance of the proposed algorithms. Experimental results obtained from the proposed algorithms are better when compared with other approaches in terms of classification accuracy and size of the feature subset. The average classification accuracies for the ACOR-SVM, IACOR-SVM, ACOMV-R and IACOMV-R algorithms are 94.73%, 95.86%, 97.37% and 98.1% respectively. The average size of feature subset is eight for the ACOR-SVM and IACOR-SVM algorithms and four for the ACOMV-R and IACOMV-R algorithms. This study contributes to a new direction for ACO that can deal with continuous and mixed-variable ACO

    On the predictability of U.S. stock market using machine learning and deep learning techniques

    Get PDF
    Conventional market theories are considered to be inconsistent approach in modern financial analysis. This thesis focuses mainly on the application of sophisticated machine learning and deep learning techniques in stock market statistical predictability and economic significance over the benchmark conventional efficient market hypothesis and econometric models. Five chapters and three publishable papers were proposed altogether, and each chapter is developed to solve specific identifiable problem(s). Chapter one gives the general introduction of the thesis. It presents the statement of the research problems identified in the relevant literature, the objective of the study and the significance of the study. Chapter two applies a plethora of machine learning techniques to forecast the direction of the U.S. stock market. The notable sophisticated techniques such as regularization, discriminant analysis, classification trees, Bayesian and neural networks were employed. The empirical findings revealed that the discriminant analysis classifiers, classification trees, Bayesian classifiers and penalized binary probit models demonstrate significant outperformance over the binary probit models both statistically and economically, proving significant alternatives to portfolio managers. Chapter three focuses mainly on the application of regression training (RT) techniques to forecast the U.S. equity premium. The RT models demonstrate significant evidence of equity premium predictability both statistically and economically relative to the benchmark historical average, delivering significant utility gains. Chapter four investigates the statistical predictive power and economic significance of financial stock market data by deep learning techniques. Chapter five give the summary, conclusion and present area(s) of further research. The techniques are proven to be robust both statistically and economically when forecasting the equity premium out-of-sample using recursive window method. Overall, the deep learning techniques produced the best result in this thesis. They seek to provide meaningful economic information on mean-variance portfolio investment for investors who are timing the market to earn future gains at minimal risk

    Survey on Leveraging Uncertainty Estimation Towards Trustworthy Deep Neural Networks: The Case of Reject Option and Post-training Processing

    Full text link
    Although neural networks (especially deep neural networks) have achieved \textit{better-than-human} performance in many fields, their real-world deployment is still questionable due to the lack of awareness about the limitation in their knowledge. To incorporate such awareness in the machine learning model, prediction with reject option (also known as selective classification or classification with abstention) has been proposed in literature. In this paper, we present a systematic review of the prediction with the reject option in the context of various neural networks. To the best of our knowledge, this is the first study focusing on this aspect of neural networks. Moreover, we discuss different novel loss functions related to the reject option and post-training processing (if any) of network output for generating suitable measurements for knowledge awareness of the model. Finally, we address the application of the rejection option in reducing the prediction time for the real-time problems and present a comprehensive summary of the techniques related to the reject option in the context of extensive variety of neural networks. Our code is available on GitHub: \url{https://github.com/MehediHasanTutul/Reject_option

    Stratified Staged Trees: Modelling, Software and Applications

    Get PDF
    The thesis is focused on Probabilistic Graphical Models (PGMs), which are a rich framework for encoding probability distributions over complex domains. In particular, joint multivariate distributions over large numbers of random variables that interact with each other can be investigated through PGMs and conditional independence statements can be succinctly represented with graphical representations. These representations sit at the intersection of statistics and computer science, relying on concepts mainly from probability theory, graph algorithms and machine learning. They are applied in a wide variety of fields, such as medical diagnosis, image understanding, speech recognition, natural language processing, and many more. Over the years theory and methodology have developed and been extended in a multitude of directions. In particular, in this thesis different aspects of new classes of PGMs called Staged Trees and Chain Event Graphs (CEGs) are studied. In some sense, Staged Trees are a generalization of Bayesian Networks (BNs). Indeed, BNs provide a transparent graphical tool to define a complex process in terms of conditional independent structures. Despite their strengths in allowing for the reduction in the dimensionality of joint probability distributions of the statistical model and in providing a transparent framework for causal inference, BNs are not optimal GMs in all situations. The biggest problems with their usage mainly occur when the event space is not a simple product of the sample spaces of the random variables of interest, and when conditional independence statements are true only under certain values of variables. This happens when there are context-specific conditional independence structures. Some extensions to the BN framework have been proposed to handle these issues: context-specific BNs, Bayesian Multinets, or Similarity Networks citep{geiger1996knowledge}. These adopt a hypothesis variable to encode the context-specific statements over a particular set of random variables. For each value taken by the hypothesis variable the graphical modeller has to construct a particular BN model called local network. The collection of these local networks constitute a Bayesian Multinet, Probabilistic Decision Graphs, among others. It has been showed that Chain Event Graph (CEG) models encompass all discrete BN models and its discrete variants described above as a special subclass and they are also richer than Probabilistic Decision Graphs whose semantics is actually somewhat distinct. Unlike most of its competitors, CEGs can capture all (also context-specific) conditional independences in a unique graph, obtained by a coalescence over the vertices of an appropriately constructed probability tree, called Staged Tree. CEGs have been developed for categorical variables and have been used for cohort studies, causal analysis and case-control studies. The user\u2019s toolbox to efficiently and effectively perform uncertainty reasoning with CEGs further includes methods for inference and probability propagation, the exploration of equivalence classes and robustness studies. The main contributions of this thesis to the literature on Staged Trees are related to Stratified Staged Trees with a keen eye of application. Few observations are made on non-Stratified Staged Trees in the last part of the thesis. A core output of the thesis is an R software package which efficiently implements a host of functions for learning and estimating Staged Trees from data, relying on likelihood principles. Also structural learning algorithms based on distance or divergence between pair of categorical probability distributions and based on the clusterization of probability distributions in a fixed number of stages for each stratum of the tree are developed. Also a new class of Directed Acyclic Graph has been introduced, named Asymmetric-labeled DAG (ALDAG), which gives a BN representation of a given Staged Tree. The ALDAG is a minimal DAG such that the statistical model embedded in the Staged Tree is contained in the one associated to the ALDAG. This is possible thanks to the use of colored edges, so that each color indicates a different type of conditional dependence: total, context-specific, partial or local. Staged Trees are also adopted in this thesis as a statistical tool for classification purpose. Staged Tree Classifiers are introduced, which exhibit comparable predictive results based on accuracy with respect to algorithms from state of the art of machine learning such as neural networks and random forests. At last, algorithms to obtain an ordering of variables for the construction of the Staged Tree are designed

    Application of machine learning to agricultural soil data

    Get PDF
    Agriculture is a major sector in the Indian economy. One key advantage of classification and prediction of soil parameters is to save time of specialized technicians developing expensive chemical analysis. In this context, this PhD thesis has been developed in three stages: 1. Classification for soil data: we used chemical soil measurements to classify many relevant soil parameters: village-wise fertility indices; soil pH and type; soil nutrients, in order to recommend suitable amounts of fertilizers; and preferable crop. 2. Regression for generic data: we developed an experimental comparison of many regressors to a large collection of generic datasets selected from the University of California at Irving (UCI) machine learning repository. 3. Regression for soil data: We applied the regressors used in stage 2 to the soil datasets, developing a direct prediction of their numeric values. The accuracy of the prediction was evaluated for the ten soil problems, as an alternative to the prediction of the quantified values (classification) developed in stage 1

    Some Contributions In Complex Statistical Data Analysis for Classification, Feature Selection, and Human Fertility Models

    Get PDF
    The overall goal of this dissertation is twofold: (i) to develop methodologies in two related fields of machine learning, namely classification and variable selection, and (ii) to improve upon the available statistical methods for the study of human fertility. For the area of machine learning, we first present a new classification method, composite quantile-based classifiers. It was established in Hennig and Viroli (2016) that for univariate data and under some assumptions, the decision rule based upon the distances of an observation to the corresponding within-class quantiles for the optimal choice of quantile levels is the Bayes rule. Conceptually, our goal is to use these most powerful univariate classifiers as building blocks for a multivariate classifier. In brief, we propose aggregating the component-wise distances from the feature vector of the observation to the within-class quantiles corresponding to the one-at-a-time optimal choices of the quantile level. Aggregation is performed through an appropriately chosen linear combination of the component-wise distances. Our second contribution in machine learning is in the development of filtering and variable selection methods in the field of bioactive compound identification. Historically, compounds found in plant tissues have been fruitful bases for deriving compounds that are deleterious to certain pathogens and cancer cell lines. Our research constructs a pipeline for processing raw data obtained through liquid chromatography - mass spectrometry (LC-MS) technology and obtaining a ranking of potential bioactive compounds. For the area of human fertility studies, we consider data in the form of day-specific covariates, with pregnancy as the outcome variable. We present research aimed at overcoming several limitations in the literature, including the ability to account for continuous covariates and to allow for missing data. Furthermore, we incorporate the assumed true unimodal shape of the fertile window day probabilities into the model with the goal of reducing the variance of the posterior distribution.Doctor of Philosoph

    The risk of re-intervention after endovascular aortic aneurysm repair

    Get PDF
    This thesis studies survival analysis techniques dealing with censoring to produce predictive tools that predict the risk of endovascular aortic aneurysm repair (EVAR) re-intervention. Censoring indicates that some patients do not continue follow up, so their outcome class is unknown. Methods dealing with censoring have drawbacks and cannot handle the high censoring of the two EVAR datasets collected. Therefore, this thesis presents a new solution to high censoring by modifying an approach that was incapable of differentiating between risks groups of aortic complications. Feature selection (FS) becomes complicated with censoring. Most survival FS methods depends on Cox's model, however machine learning classifiers (MLC) are preferred. Few methods adopted MLC to perform survival FS, but they cannot be used with high censoring. This thesis proposes two FS methods which use MLC to evaluate features. The two FS methods use the new solution to deal with censoring. They combine factor analysis with greedy stepwise FS search which allows eliminated features to enter the FS process. The first FS method searches for the best neural networks' configuration and subset of features. The second approach combines support vector machines, neural networks, and K nearest neighbor classifiers using simple and weighted majority voting to construct a multiple classifier system (MCS) for improving the performance of individual classifiers. It presents a new hybrid FS process by using MCS as a wrapper method and merging it with the iterated feature ranking filter method to further reduce the features. The proposed techniques outperformed FS methods based on Cox's model such as; Akaike and Bayesian information criteria, and least absolute shrinkage and selector operator in the log-rank test's p-values, sensitivity, and concordance. This proves that the proposed techniques are more powerful in correctly predicting the risk of re-intervention. Consequently, they enable doctors to set patients’ appropriate future observation plan

    Fault analysis using state-of-the-art classifiers

    Get PDF
    Fault Analysis is the detection and diagnosis of malfunction in machine operation or process control. Early fault analysis techniques were reserved for high critical plants such as nuclear or chemical industries where abnormal event prevention is given utmost importance. The techniques developed were a result of decades of technical research and models based on extensive characterization of equipment behavior. This requires in-depth knowledge of the system and expert analysis to apply these methods for the application at hand. Since machine learning algorithms depend on past process data for creating a system model, a generic autonomous diagnostic system can be developed which can be used for application in common industrial setups. In this thesis, we look into some of the techniques used for fault detection and diagnosis multi-class and one-class classifiers. First we study Feature Selection techniques and the classifier performance is analyzed against the number of selected features. The aim of feature selection is to reduce the impact of irrelevant variables and to reduce computation burden on the learning algorithm. We introduce the feature selection algorithms as a literature survey. Only few algorithms are implemented to obtain the results. Fault data from a Radio Frequency (RF) generator is used to perform fault detection and diagnosis. Comparison between continuous and discrete fault data is conducted for the Support Vector Machines (SVM) and Radial Basis Function Network (RBF) classifiers. In the second part we look into one-class classification techniques and their application to fault detection. One-class techniques were primarily developed to identify one class of objects from all other possible objects. Since all fault occurrences in a system cannot be simulated or recorded, one-class techniques help in identifying abnormal events. We introduce four one-class classifiers and analyze them using Receiver-Operating Characteristic (ROC) curve. We also develop a feature extraction method for the RF generator data which is used to obtain results for one-class classifiers and Radial Basis Function Network two class classification. To apply these techniques for real-time verification, the RIT Fault Prediction software is built. LabView environment is used to build a basic data management and fault detection using Radial Basis Function Network. This software is stand alone and acts as foundation for future implementations
    • …
    corecore