2,126 research outputs found

    Ensemble learning with dynamic weighting for response modeling in direct marketing

    Response modeling, a key to successful direct marketing, has become increasingly prevalent in recent years. In practice, however, it suffers from class imbalance: the number of responding (target) customers is often much smaller than the number of non-responding customers. This imbalance results in a response model that is biased toward the majority class, leading to low prediction accuracy on the responding customers. In this study, we develop an Ensemble Learning with Dynamic Weighting (ELDW) approach to address this problem. The proposed ELDW includes two stages. In the first stage, all the minority class instances are combined with different majority class instances to form a number of training subsets, and a base classifier is trained on each subset. In the second stage, the results of the base classifiers are dynamically integrated, taking two factors into account. The first factor is the cross entropy of neighbors in each subset, and the second is the feature similarity to the minority class instances. To evaluate the performance of ELDW, we conduct experimental studies on 10 imbalanced benchmark datasets. The results show that, compared with other state-of-the-art imbalance classification algorithms, ELDW achieves higher accuracy on the minority class. Finally, we apply ELDW to a direct marketing activity of an insurance company to identify the target customers under a limited budget.
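The first ELDW stage described above can be illustrated with a minimal sketch. It assumes the majority class is shuffled and cut into disjoint, minority-sized chunks; the abstract does not say how the majority instances are drawn, and the helper name `make_balanced_subsets` and the toy data are hypothetical. The dynamic weighting of the second stage is not reproduced here.

```python
import random

def make_balanced_subsets(minority, majority, seed=0):
    """Pair every minority instance with a different slice of the majority class.

    Hypothetical helper: the sampling scheme (shuffle, then disjoint chunks of
    minority size) is an assumption, not the method from the paper.
    """
    rng = random.Random(seed)
    shuffled = list(majority)
    rng.shuffle(shuffled)
    size = max(1, len(minority))
    subsets = []
    for start in range(0, len(shuffled), size):
        chunk = shuffled[start:start + size]
        # Label 1 = responder (minority), 0 = non-responder (majority).
        subset = [(x, 1) for x in minority] + [(x, 0) for x in chunk]
        subsets.append(subset)
    return subsets

# Toy one-feature data, made up for illustration.
minority = [[0.9], [0.8]]
majority = [[0.1], [0.2], [0.3], [0.15], [0.25], [0.05]]
subsets = make_balanced_subsets(minority, majority)
```

Each resulting subset would then be used to train one base classifier, so every classifier sees all responders but only a fraction of the non-responders.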

    Analytical study of computer vision-based pavement crack quantification using machine learning techniques

    Image-based techniques are a promising non-destructive approach for road pavement condition evaluation. The main objective of this study is to extract, quantify and evaluate important surface defects, such as cracks, using an automated computer vision-based system to provide a better understanding of the pavement deterioration process. To achieve this objective, automated crack-recognition software was developed, employing a series of image processing algorithms for crack extraction, crack grouping, and crack detection. The bottom-hat morphological technique was used to remove the random background of pavement images and to extract cracks selectively, based on their shapes, sizes, and intensities, using a relatively small number of user-defined parameters. A technical challenge with crack extraction algorithms, including the bottom-hat transform, is that extracted crack pixels are usually fragmented along crack paths. To de-fragment those crack pixels, a novel crack-grouping algorithm called MorphLink-C is proposed as an image segmentation method. Statistical validation of this method using flexible pavement images indicated that MorphLink-C not only improves crack-detection accuracy but also reduces crack-detection time. Crack characterization was performed by analysing image features of the extracted crack image components. A comprehensive statistical analysis was conducted using filter feature subset selection (FSS) methods, including Fisher score, Gini index, information gain, ReliefF, mRMR, and FCBF, to understand the statistical characteristics of cracks in different deterioration stages. Crack features were ranked by statistical significance based on their relevancy and redundancy. The statistical method used in this study can be employed to avoid subjective crack rating based on human visual inspection. Moreover, the statistical information can be used as fundamental data to justify rehabilitation policies in pavement maintenance.
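As a rough illustration of one of the filter scores listed above, the following sketch computes the Fisher score of a single feature as between-class scatter divided by within-class scatter. The feature names and values are made up, and this is the common textbook form of the score, not necessarily the exact variant used in the thesis.

```python
def fisher_score(values, labels):
    """Fisher score of one feature: between-class over within-class scatter."""
    overall_mean = sum(values) / len(values)
    between = 0.0
    within = 0.0
    for cls in set(labels):
        group = [v for v, l in zip(values, labels) if l == cls]
        mean = sum(group) / len(group)
        var = sum((v - mean) ** 2 for v in group) / len(group)
        between += len(group) * (mean - overall_mean) ** 2
        within += len(group) * var
    # A feature that is constant within every class has zero within-class scatter.
    return between / within if within > 0 else float("inf")

# Hypothetical data: label 1 = crack object, 0 = non-crack object.
labels = [1, 1, 1, 0, 0, 0]
crack_width = [5.0, 6.0, 5.5, 1.0, 1.5, 0.5]  # made-up, separates the classes
brightness = [3.0, 1.0, 2.0, 2.5, 1.5, 2.0]   # made-up, overlaps heavily
scores = {
    "crack_width": fisher_score(crack_width, labels),
    "brightness": fisher_score(brightness, labels),
}
```

A higher score marks a feature whose class means are far apart relative to the spread inside each class, which is the relevancy side of the ranking; redundancy between features needs a separate measure.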
Finally, the application of four classification algorithms, namely Artificial Neural Network (ANN), Decision Tree (DT), k-Nearest Neighbours (kNN) and Adaptive Neuro-Fuzzy Inference System (ANFIS), is investigated for the crack detection framework. The classifiers were evaluated on the following five criteria: 1) prediction performance, 2) computation time, 3) stability of results for highly imbalanced datasets, in which the number of crack objects is significantly smaller than the number of non-crack objects, 4) stability of the classifiers' performance for pavements in different deterioration stages, and 5) interpretability of results and clarity of the procedure. Comparison results indicate the advantages of white-box classification methods for computer vision-based pavement evaluation. Although black-box methods, such as ANN, provide superior classification performance, white-box methods, such as ANFIS, provide useful information about the logic of classification and the effect of feature values on detection results. Such information can provide further insight for the image-based pavement crack detection application.
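The bottom-hat step mentioned in this abstract can be sketched in a few lines: the transform is the morphological closing of the image minus the image itself, so dark structures narrower than the structuring element stand out against a flattened background. The sketch below assumes a flat square structuring element and a tiny made-up grayscale image; the thesis's actual parameters are not given in the abstract.

```python
def _window_op(img, radius, op):
    """Apply op (max or min) over a square window clipped at the image border."""
    rows, cols = len(img), len(img[0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            window = [img[i][j]
                      for i in range(max(0, r - radius), min(rows, r + radius + 1))
                      for j in range(max(0, c - radius), min(cols, c + radius + 1))]
            row.append(op(window))
        out.append(row)
    return out

def bottom_hat(img, radius=1):
    """Bottom-hat transform: closing (dilation, then erosion) minus the input."""
    closed = _window_op(_window_op(img, radius, max), radius, min)
    return [[closed[r][c] - img[r][c] for c in range(len(img[0]))]
            for r in range(len(img))]

# Made-up 5x5 image: bright pavement (200) with a thin dark crack (40).
image = [
    [200, 200, 200, 200, 200],
    [200, 200,  40, 200, 200],
    [200, 200,  40, 200, 200],
    [200, 200,  40, 200, 200],
    [200, 200, 200, 200, 200],
]
result = bottom_hat(image, radius=1)
```

In `result`, the three crack pixels carry a large response and the background is zero. Thresholding such a response is one plausible way to obtain candidate crack pixels, which may still be fragmented along the crack path as the abstract notes.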

    Characterization and Reduction of Noise in Manifold Representations of Hyperspectral Imagery

    A new workflow is proposed to produce dimensionality-reduced manifold coordinates, based on improvements to landmark Isometric Mapping (ISOMAP) algorithms using local spectral models. The manifold space obtained from nonlinear dimensionality reduction better addresses the nonlinearity of hyperspectral data and often performs better than linear methods such as Minimum Noise Fraction (MNF). The dissertation mainly focuses on using adaptive local spectral models to further improve the performance of ISOMAP algorithms by addressing local noise issues and by performing guided landmark selection and nearest-neighborhood construction in local spectral subsets. This work could benefit common hyperspectral image analysis tasks, such as classification and target detection, while keeping the computational burden low. It is based on, and improves in various ways, the previous ENH-ISOMAP algorithm. The workflow is built on a unified local spectral subsetting framework. Embedding spaces in local spectral subsets are first proposed as local noise models and used to perform noise estimation, MNF regression and guided landmark selection in a local sense. Passive and active methods are proposed and verified for selecting landmarks deliberately, to ensure local geometric structure coverage and local noise avoidance. Then, a novel local spectral adaptive method is used to construct the k-nearest neighbor graph. Finally, a global MNF transformation in the manifold space is introduced to further compress the signal dimensions. The workflow is implemented in C++ with multiple implementation optimizations, including the use of heterogeneous computing platforms that are available in personal computers. The results are presented and evaluated using the Jeffries-Matusita separability metric, as well as the classification accuracy of supervised classifiers.
The proposed workflow shows significant and stable improvements in dimensionality reduction performance over traditional MNF and ENH-ISOMAP on various hyperspectral datasets. The computational speed of the proposed implementation is also improved.
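For readers unfamiliar with the separability metric named in this abstract, the sketch below computes the Jeffries-Matusita distance between two classes under a simplifying assumption: each class is modeled as a univariate Gaussian in a single embedding coordinate. Real hyperspectral evaluations would normally use the multivariate form, and whether the dissertation uses the Gaussian form at all is not stated in the abstract.

```python
import math

def jeffries_matusita(mean1, var1, mean2, var2):
    """JM distance between two univariate Gaussian classes; ranges over [0, 2]."""
    # Bhattacharyya distance for two univariate normal distributions.
    b = (0.25 * (mean1 - mean2) ** 2 / (var1 + var2)
         + 0.5 * math.log((var1 + var2) / (2.0 * math.sqrt(var1 * var2))))
    return 2.0 * (1.0 - math.exp(-b))

# Made-up class statistics along one manifold coordinate.
same = jeffries_matusita(0.0, 1.0, 0.0, 1.0)   # identical classes
far = jeffries_matusita(0.0, 1.0, 10.0, 1.0)   # well-separated classes
```

The distance is 0 for identical class distributions and saturates toward 2 as the classes separate, which makes it convenient for comparing embeddings produced by different dimensionality reduction methods.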