5 research outputs found

    A Novel SMOTE-Based Classification Approach to Online Data Imbalance Problem

    In many practical engineering applications, data are collected online. However, if the classes of these data are severely imbalanced, classification performance is restricted. In this paper, a novel classification approach is proposed to solve the online data imbalance problem by integrating a fast and efficient learning algorithm, the Extreme Learning Machine (ELM), with a typical sampling strategy, the synthetic minority oversampling technique (SMOTE). To reduce the severe imbalance, the major-class samples are divided into granules according to their distribution characteristics, and the original samples are replaced by the resulting granule cores to prepare a balanced sample set. In the online stage, we first divide the minor-class samples into granules and then oversample with SMOTE in the region around the granule core and granule border. The training sample set is thereby gradually balanced, and the online ELM model is dynamically updated. We also introduce fuzzy information entropy to prove theoretically that the proposed approach has a lower bound on model reliability after undersampling. Numerical experiments are conducted on two different kinds of datasets, and the results demonstrate that the proposed approach outperforms some state-of-the-art methods in terms of generalization performance and numerical stability.
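For readers unfamiliar with SMOTE, the following is a minimal sketch of the generic interpolation step only, assuming minority samples are given as lists of floats. It does not reproduce the granulation division or the online ELM update described in the abstract, and the function name `smote` and its parameters are hypothetical choices made for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples (illustrative sketch).

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbours of `base` among the other minority samples
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Usage: four minority points on the unit square, six synthetic points requested.
minority_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority_points, n_new=6, k=2)
```

Because every synthetic point is a convex combination of two existing minority samples, the new points stay inside the bounding box of the minority class.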

    Slum mapping : a comparison of single class learning and expert system object-oriented classification for mapping slum settlements in Addis Ababa city, Ethiopia

    Dissertation submitted in partial fulfilment of the requirements for the degree of Master of Science in Geospatial Technologies. Updated spatial information on the dynamics of slums can help to measure and evaluate the progress of urban upgrading projects and policies. Earlier studies have shown that remote sensing techniques, with the help of very-high-resolution imagery, can play a significant role in detecting slums and providing timely spatial information. The main objective of this thesis is to develop a reliable object-oriented slum identification technique that enables the provision of timely spatial information about slum settlements in Addis Ababa city. It compares the one-class support vector machine algorithm with an expert-defined classification rule set in the discrimination of slums, using GeoEye-1 imagery. Two approaches, manual and automatic fine-tuning, were used to determine the best parameter values for the one-class support vector machine algorithm. Manual fine-tuning was done through extensive manual trials. Automatic tuning was done using a cross-validation grid search with overall accuracy as the performance metric. Two study regions with different landscape compositions were defined, providing different classification scenarios in which to compare the classification approaches. After image segmentation, twenty predictive variables were computed to characterize the objects in both study areas. An image analyst collected one hundred slum sample objects to be used as training data for the single-class learner. In parallel, an image analyst defined a hierarchical rule set to discriminate the class of interest. Results in both study areas indicate that the one-class support vector machine with manual tuning yields higher overall accuracy (97.7% in subset 1 and 92% in subset 2) and requires much less application effort and computing time than the expert system.

    Refining, Testing, and Applying Thermal Species Distribution Models to Enhance Ecological Assessments

    The temperature of streams and rivers is changing rapidly in response to a variety of human activities. This rapid change is concerning because the abundances and distributions of many aquatic species in streams and rivers are strongly associated with temperature. Linking observations of temperature effects on species distributions with observations of temperature effects on fitness is important for improving confidence that temperature (and not some other variable) is causing the distributions we observe. Furthermore, producing accurate models of temperature effects on species distributions may allow us to develop tools to diagnose whether or not thermal pollution has impaired aquatic life. Such a diagnostic tool could help us better target management efforts on the specific stressors impairing aquatic life. In chapter two, I describe several laboratory experiments designed to examine the link between the effects of temperature observed in the field and the effects of temperature observed in the laboratory. I found that the effects of temperature on survival were correlated with the thermal limits inferred from species distributions, which supports the hypothesis that temperature influences distributions by affecting the survival of species. In chapters three and four, I assessed two techniques that could potentially improve our ability to model relationships between temperature and distributions. In chapter three, I show that methods for dealing with imbalanced data broadly improved our ability to model the relationship between predictor variables (temperature and other variables) and species distributions. In chapter four, I evaluated a recently developed technique (deep artificial neural networks) for modeling large complex datasets. I found that deep artificial neural networks did not improve predictions over those of standard artificial neural networks and random forest models. In chapter five, I developed and evaluated a diagnostic biotic index for diagnosing the likelihood that temperature has affected macroinvertebrate species in streams and rivers. This index showed that 2.6% of streams across the continental United States had species with thermal tolerances higher than expected compared with thermally undisturbed conditions.

    Rails Quality Data Modelling via Machine Learning-Based Paradigms
