
    A Diversity-Accuracy Measure for Homogenous Ensemble Selection

    Several selection methods in the literature are essentially based on an evaluation function that determines whether a model M contributes positively to the performance of the whole ensemble. In this paper, we propose a method called DIversity and ACcuracy for Ensemble Selection (DIACES), which uses an evaluation function based on both diversity and accuracy. The method is applied to homogeneous ensembles of C4.5 decision trees and relies on a hill-climbing strategy, which allows the selection of ensembles with the best compromise between maximum diversity and minimum error rate. Comparative studies show that, in most cases, the proposed method generates reduced-size ensembles with better performance than the usual ensemble simplification methods.
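    The abstract does not spell out the DIACES evaluation function, but the general shape of hill-climbing ensemble selection with a combined diversity-accuracy score can be sketched as follows. The scoring function (accuracy plus weighted pairwise disagreement) and the weight `lam` are assumptions for illustration, not the authors' actual criterion, and scikit-learn trees stand in for C4.5.

```python
# Hypothetical sketch of diversity-accuracy hill-climbing ensemble selection.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

def ensemble_score(preds, y, lam=0.5):
    """Majority-vote accuracy plus average pairwise disagreement (diversity)."""
    votes = np.apply_along_axis(lambda c: np.bincount(c, minlength=2).argmax(),
                                0, preds)
    acc = (votes == y).mean()
    m = len(preds)
    if m < 2:
        return acc
    dis = np.mean([(preds[i] != preds[j]).mean()
                   for i in range(m) for j in range(i + 1, m)])
    return acc + lam * dis

X, y = make_classification(n_samples=400, random_state=0)
Xtr, Xval, ytr, yval = train_test_split(X, y, random_state=0)
bag = BaggingClassifier(DecisionTreeClassifier(), n_estimators=20,
                        random_state=0).fit(Xtr, ytr)
all_preds = np.array([t.predict(Xval).astype(int) for t in bag.estimators_])

# Hill climbing: start from the most accurate single tree, then greedily add
# whichever tree most improves the combined diversity-accuracy score.
selected = [int(np.argmax([(p == yval).mean() for p in all_preds]))]
improved = True
while improved:
    improved = False
    base = ensemble_score(all_preds[selected], yval)
    best_gain, best_t = 0.0, None
    for t in range(len(all_preds)):
        if t in selected:
            continue
        gain = ensemble_score(all_preds[selected + [t]], yval) - base
        if gain > best_gain:
            best_gain, best_t = gain, t
    if best_t is not None:
        selected.append(best_t)
        improved = True

print(len(selected), "of", len(all_preds), "trees kept")
```

    The climb stops at the first local optimum, so the selected ensemble is typically much smaller than the full bag.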

    On pruning and feature engineering in Random Forests

    Random Forest (RF) is an ensemble classification technique that was developed by Leo Breiman over a decade ago. Compared with other ensemble techniques, it has proved its accuracy and superiority. Many researchers, however, believe that there is still room to optimize RF further by enhancing and improving its performance accuracy. This explains why there have been many extensions of RF, each employing a variety of techniques and strategies to improve certain aspects of RF. The main focus of this dissertation is to develop new extensions of RF using optimization techniques that, to the best of our knowledge, have never been used before to optimize RF. These techniques are clustering, the local outlier factor, diversified weighted subspaces, and replicator dynamics. Applying these techniques to RF produced four extensions, which we have termed CLUB-DRF, LOFB-DRF, DSB-RF, and RDB-DR respectively. Experimental studies on 15 real datasets showed favorable results, demonstrating the potential of the proposed methods. Performance-wise, CLUB-DRF ranked first in terms of accuracy and classification speed, making it ideal for real-time applications and for machines/devices with limited memory and processing power.
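    The clustering-based idea behind an extension like CLUB-DRF can be illustrated in a simplified form: cluster the forest's trees by their prediction vectors on a validation set and keep one representative tree per cluster. This is a hedged sketch of the general technique, not the authors' exact algorithm; the cluster count `k` and the representative-selection rule (best validation accuracy) are assumptions.

```python
# Illustrative clustering-based pruning of a Random Forest (not the exact
# CLUB-DRF procedure): similar trees end up in the same cluster, and one
# representative per cluster forms the pruned ensemble.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=1)
Xtr, Xval, ytr, yval = train_test_split(X, y, random_state=1)

rf = RandomForestClassifier(n_estimators=50, random_state=1).fit(Xtr, ytr)
preds = np.array([t.predict(Xval) for t in rf.estimators_])  # (trees, samples)

k = 10  # assumed cluster count; a tuning parameter in practice
labels = KMeans(n_clusters=k, n_init=10, random_state=1).fit_predict(preds)

# From each cluster keep the member tree with the best validation accuracy.
acc = (preds == yval).mean(axis=1)
keep = []
for c in range(k):
    members = np.where(labels == c)[0]
    keep.append(members[np.argmax(acc[members])])

pruned_votes = preds[keep].mean(axis=0) >= 0.5  # majority vote of kept trees
pruned_acc = (pruned_votes == yval).mean()
print("pruned ensemble size:", len(keep), "accuracy:", pruned_acc)
```

    The pruned forest is a fifth of the original size here, which is the source of the memory and speed benefits the abstract highlights.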

    A General Spatio-Temporal Clustering-Based Non-local Formulation for Multiscale Modeling of Compartmentalized Reservoirs

    Representing the reservoir as a network of discrete compartments with neighbor and non-neighbor connections is a fast, yet accurate method for analyzing oil and gas reservoirs. Automatic and rapid detection of coarse-scale compartments with distinct static and dynamic properties is an integral part of such high-level reservoir analysis. In this work, we present a novel hybrid framework specific to reservoir analysis that couples a physics-based non-local multiscale modeling approach with data-driven clustering techniques for the automatic detection of spatial clusters from spatial and temporal field data, providing fast and accurate multiscale modeling of compartmentalized reservoirs. This research also adds to the literature by presenting a comprehensive treatment of spatio-temporal clustering for reservoir-studies applications that carefully considers the complexities of clustering, the intrinsically sparse and noisy nature of the data, and the interpretability of the outcome. Keywords: Artificial Intelligence; Machine Learning; Spatio-Temporal Clustering; Physics-Based Data-Driven Formulation; Multiscale Modeling
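    The spatio-temporal clustering step alone (the coupled physics-based multiscale model is beyond a short sketch) can be illustrated as follows. Well locations and production-rate histories are stacked into one feature matrix and standardized so that neither the spatial nor the temporal part dominates; the synthetic data, the feature construction, and the choice of four compartments are all assumptions for illustration.

```python
# Hedged illustration of spatio-temporal clustering for compartment detection:
# group wells by jointly standardized spatial coordinates and rate histories.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_wells, n_steps = 60, 24
xy = rng.uniform(0, 10, size=(n_wells, 2))                      # well locations
rates = np.cumsum(rng.normal(size=(n_wells, n_steps)), axis=1)  # rate histories

# Stack spatial and temporal features, then standardize each column.
features = StandardScaler().fit_transform(np.hstack([xy, rates]))

labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(features)
print("compartment sizes:", np.bincount(labels))
```

    Each resulting cluster plays the role of one coarse-scale compartment whose connections the physics-based model would then describe.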

    Heat Transfer Mechanism In Particle-Laden Turbulent Shearless Flows

    Particle-laden turbulent flows are among the complex flow regimes involved in a wide range of environmental, industrial, biomedical and aeronautical applications. Recently, interest has also extended to the interaction between scalars and particles; the complex scenario arising from the interplay of finite particle inertia, temperature transport, and the momentum and heat feedback of particles on the flow leads to a multi-scale, multi-physics phenomenon that is not yet fully understood. The present work investigates the fluid-particle thermal interaction in turbulent mixing under one-way and two-way coupling regimes. A recent numerical framework has been used to investigate the impact of suspended sub-Kolmogorov inertial particles on heat transfer within the mixing layer that develops at the interface of two regions with different temperatures in an isotropic turbulent flow. Temperature is treated as a passive scalar, advected by the solenoidal velocity field and subject to the particle thermal feedback in the two-way regime. A self-similar stage always develops in which all single-point statistics of the carrier fluid and the suspended particles collapse when properly rescaled. We quantify the effect of particle inertia, parameterized through the Stokes and thermal Stokes numbers, on the heat transfer through the Nusselt number, defined as the ratio of the heat transfer to the thermal diffusion. A scale analysis is presented. We show how the modulation of fluid temperature gradients, due to the statistical alignment of the particle velocity with the local carrier-flow temperature gradient field, impacts the overall heat transfer in the two-way coupling regime.

    A Study on Comparison of Classification Algorithms for Pump Failure Prediction

    The reliability of pumps can be compromised by faults that impair their functionality. Detecting these faults is crucial, and many studies have used motor current signals for this purpose. However, as pumps are rotating equipment, vibrations also play a vital role in fault identification. Rising pump failures have led to increased maintenance costs and unavailability, emphasizing the need for cost-effective and dependable machinery operation. This study addresses the challenge of defect classification through predictive modeling. With a problem statement centered on accurate and efficient identification of defects, its objective is to evaluate the performance of five distinct algorithms: Fine Decision Tree, Medium Decision Tree, Bagged Trees (Ensemble), RUS-Boosted Trees, and Boosted Trees. Leveraging a comprehensive dataset, the study trained and tested each model, analyzing training accuracy, test accuracy, and Area Under the Curve (AUC) metrics. The results showcase the strength of the Fine Decision Tree (91.2% training accuracy, 74% test accuracy, AUC 0.80), the robustness of the Ensemble approach (Bagged Trees with 94.9% training accuracy, 99.9% test accuracy, and AUC 1.00), and the competitiveness of Boosted Trees (89.4% training accuracy, 72.2% test accuracy, AUC 0.79) in defect classification. Notably, Support Vector Machines (SVM), Artificial Neural Networks (ANN), and k-Nearest Neighbors (KNN) exhibited comparatively lower performance. The study contributes insights into the efficacy of these algorithms, guiding practitioners toward optimal model selection for defect classification scenarios, and lays a foundation for improved decision-making in quality control and predictive maintenance.
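    The comparison protocol described above (train each model, then report test accuracy and AUC) can be sketched with scikit-learn analogues of the model families named in the abstract: `BaggingClassifier` for Bagged Trees, `GradientBoostingClassifier` for Boosted Trees, and a size-limited `DecisionTreeClassifier` for a Fine Tree. The pump dataset is not public here, so a synthetic imbalanced stand-in is generated; all hyperparameters are illustrative assumptions.

```python
# Minimal model-comparison sketch: fit several tree-based classifiers and
# report test accuracy and ROC AUC for each, as in the study's protocol.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20,
                           weights=[0.8, 0.2], random_state=42)
Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=42)

models = {
    "Fine Tree": DecisionTreeClassifier(max_leaf_nodes=100, random_state=42),
    "Bagged Trees": BaggingClassifier(DecisionTreeClassifier(),
                                      n_estimators=30, random_state=42),
    "Boosted Trees": GradientBoostingClassifier(random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(Xtr, ytr)
    proba = model.predict_proba(Xte)[:, 1]  # score for the positive class
    results[name] = {"test_acc": model.score(Xte, yte),
                     "auc": roc_auc_score(yte, proba)}
    print(name, results[name])
```

    Ranking the models on held-out AUC rather than training accuracy, as the study does, guards against rewarding overfit trees.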

    Continual learning from stationary and non-stationary data

    Continual learning aims to develop models capable of working on constantly evolving problems over a long time horizon. In such environments, we can distinguish three essential aspects of training and maintaining machine learning models: incorporating new knowledge, retaining it, and reacting to changes. Each poses its own challenges, constituting a compound problem with multiple goals. Remembering previously incorporated concepts is the main property required of a model dealing with stationary distributions. In non-stationary environments, models should be capable of selectively forgetting outdated decision boundaries and adapting to new concepts. Finally, a significant difficulty lies in combining these two abilities within a single learning algorithm, since in such scenarios we must balance remembering and forgetting instead of focusing on only one aspect. The presented dissertation addresses these problems in an exploratory way. Its main goal was to grasp the continual learning paradigm as a whole, analyze its different branches, and tackle the identified issues covering various aspects of learning from sequentially incoming data. In doing so, this work not only fills several gaps in current continual learning research but also emphasizes the complexity and diversity of the challenges in this domain. Comprehensive experiments conducted for all of the presented contributions demonstrate their effectiveness and substantiate the validity of the stated claims.

    Ground state cooling of atoms in optical lattices

    We propose two schemes for cooling bosonic and fermionic atoms trapped in a deep optical lattice. The first scheme is a quantum algorithm based on particle-number filtering and state-dependent lattice shifts. The second protocol alternates filtering with a redistribution of particles by means of quantum tunnelling. We provide a complete theoretical analysis of both schemes and characterize the cooling efficiency in terms of the entropy. Our schemes do not require addressing of single lattice sites and use a novel method, based on coherent laser control, to perform very fast filtering. Comment: 12 pages, 7 figures