102 research outputs found

    Ensemble diversity for class imbalance learning

    This thesis studies the diversity issue of classification ensembles for class imbalance learning problems. Class imbalance learning refers to learning from imbalanced data sets, in which some classes of examples (minority) are highly under-represented compared to other classes (majority). The very skewed class distribution degrades the learning ability of many traditional machine learning methods, especially in the recognition of examples from the minority classes, which are often deemed to be more important and interesting. Although quite a few ensemble learning approaches have been proposed to handle the problem, no in-depth research exists to explain why and when they can be helpful. Our objectives are to understand how ensemble diversity affects the classification performance for a class imbalance problem according to single-class and overall performance measures, and to make the best use of diversity to improve the performance. As the first stage, we study the relationship between ensemble diversity and generalization performance for class imbalance problems. We investigate mathematical links between single-class performance and ensemble diversity. It is found that how the single-class measures change along with diversity falls into six different situations. These findings are then verified in class imbalance scenarios through empirical studies. The impact of diversity on overall performance is also investigated empirically. Strong correlations between diversity and the performance measures are found. Diversity shows a positive impact on the recognition of the minority class and benefits the overall performance of ensembles in class imbalance learning. Our results help to understand if and why ensemble diversity can help to deal with class imbalance problems. 
Encouraged by the positive role of diversity in class imbalance learning, we then focus on a specific ensemble learning technique, the negative correlation learning (NCL) algorithm, which considers diversity explicitly when creating ensembles and has achieved great empirical success. We propose a new learning algorithm based on the idea of NCL, named AdaBoost.NC, for classification problems. An "ambiguity" term decomposed from the 0-1 error function is introduced into the training framework of AdaBoost. It demonstrates superiority in both effectiveness and efficiency. Its good generalization performance is explained by theoretical and empirical evidence. It can be viewed as the first NCL algorithm specializing in classification problems. Most existing ensemble methods for class imbalance problems suffer from the problems of overfitting and over-generalization. To improve this situation, we address the class imbalance issue by making use of ensemble diversity. We investigate the generalization ability of NCL algorithms, including AdaBoost.NC, to tackle two-class imbalance problems. We find that NCL methods integrated with random oversampling are effective in recognizing minority class examples without losing the overall performance, especially the AdaBoost.NC tree ensemble. This is achieved by providing smoother and less overfitted classification boundaries for the minority class. The results here show the usefulness of diversity and open up a novel way to deal with class imbalance problems. Since the two-class imbalance is not the only scenario in real-world applications, multi-class imbalance problems deserve equal attention. To understand what problems multi-class can cause and how it affects the classification performance, we study the multi-class difficulty by analyzing the multi-minority and multi-majority cases respectively. Both lead to a significant performance reduction. The multi-majority case appears to be more harmful. 
The results reveal possible issues that a class imbalance learning technique could have when dealing with multi-class tasks. Following this part of the analysis and the promising results of AdaBoost.NC on two-class imbalance problems, we apply AdaBoost.NC to a set of multi-class imbalance domains with the aim of solving them effectively and directly. Our method shows good generalization in minority classes and balances the performance across different classes well without using any class decomposition schemes. Finally, we conclude this thesis with how the study has contributed to class imbalance learning and ensemble learning, and propose several possible directions for future research that may improve and extend this work.
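
The random oversampling that the abstract above combines with NCL methods can be illustrated with a minimal sketch. This is not the thesis's AdaBoost.NC implementation; the function name `random_oversample` and the toy data are assumptions made for illustration. The sketch only duplicates randomly chosen examples of the smaller classes until every class is as large as the largest one.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen examples of the smaller classes until
    every class has as many examples as the largest class.

    Hypothetical helper for illustration only, not code from the thesis.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest (majority) class

    out_samples = list(samples)
    out_labels = list(labels)
    for cls, n in counts.items():
        # All original examples belonging to this class.
        pool = [s for s, y in zip(samples, labels) if y == cls]
        # Draw with replacement to fill the gap up to the majority size.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_samples.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_samples, out_labels


# Toy two-class data: four majority examples (label 0), one minority (label 1).
toy_samples = [[0.1], [0.2], [0.3], [0.4], [0.9]]
toy_labels = [0, 0, 0, 0, 1]
balanced_samples, balanced_labels = random_oversample(toy_samples, toy_labels)
print(Counter(balanced_labels))
```

Because the added minority examples are exact copies, this kind of resampling is usually paired with a learner that resists overfitting to the duplicates, which is the role the abstract assigns to ensemble diversity.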

    Classroom assessment adjustments, academic achievement, academic wellbeing: a mixed methods study of Australian secondary school students with and without disabilities

    This mixed methods study examined the relationship between academic achievement and academic wellbeing for students with and without disabilities, and the effect of the provision of assessment adjustments on achievement and academic wellbeing for students with disabilities, in Australian mainstream secondary schooling. The study is framed through the biopsychosocial model of disability and social-cognitive theory, emphasising the interactional nature of disability with personal and environmental factors. Although correlational studies examining relationships between achievement and academic wellbeing have been undertaken elsewhere, this study provides evidence about the nature of these relationships for students in Australia. Further, a qualitative study was undertaken to provide new insights into how academic achievement and wellbeing are related for students with disabilities in inclusive education settings. In these settings, adjustments to enable students to demonstrate their achievement are expected in law and policy. A two-strand parallel mixed methods design was used with data collected from two independent groups of participants. In Strand 1 of the study, a correlational study was conducted with 42 students with disabilities and 80 students without disabilities in classrooms in mainstream schools in Australia. Students in the middle years of schooling (Years 7-10) are particularly at risk of not completing school. The students completed the Academic Wellbeing Questionnaire comprised of three research scales: (a) the Self Description Questionnaire II (SDQ-II); (b) the Intellectual Achievement Responsibility Scale (IAR); and (c) the subscale of School Satisfaction from The Multidimensional Student’s Life Satisfaction Scale (MSLSS; Huebner, 1994). Information recorded by schools for the Nationally Consistent Collection of Data (NCCD) was used to identify the level of implemented adjustments in the classroom for students with disabilities. 
Student achievement data in English and Mathematics based on classroom assessments were provided by schools. Strand 2 of the study consisted of two segments, individual qualitative case studies and cross-case analysis with four case study students. These students completed structured and semi-structured surveys from the Adjustments in Classroom Assessment Project (ACAP) study as well as the Academic Wellbeing Questionnaire. Classroom assessment tasks, adjustments and student assessment responses were collected for the case study students. The first segment of Strand 2 of the study explored how teachers adjusted teacher-designed classroom assessment tasks for four case study students with regard to impairments in access skills and target skills that were assessed by a task. The tasks were summative assessment tasks intended to contribute to reporting to parents but also to have a formative assessment role to contribute to improving student learning. The perceptions of the students, parents, and teachers were explored as to how the provided adjustments related to student outcomes in focus subject areas. The provided assessment adjustments enabled the case study students to demonstrate their knowledge, although not all students were satisfied with their outcomes. The second segment of Strand 2 of the study investigated the academic achievement of case study students in relation to their academic wellbeing under adjusted assessment conditions. The synthesised findings of this study indicated that students with disabilities in inclusive education in mainstream schools are not necessarily low achievers but can reach a level of achievement in some or even all subject areas similar to students without disabilities. The perception of students with and without disabilities about academic abilities, especially in mathematics, was related to their achievement level. Students with and without disabilities had a similar thinking style about academic responsibility. 
This meant that they were more likely to take internal responsibility for academic success than for failure. Findings indicated that students both with and without disabilities were predominantly satisfied with school, but the level of school satisfaction of students with disabilities was related to their academic achievement, especially in mathematics. The provision of classroom assessment adjustments brought the academic achievement and academic wellbeing of students with disabilities to a level comparable to that of their peers without disabilities, especially in mathematics. Overall, this research sheds light on how access to classroom assessment adjustments enables students with disabilities to undertake assessment tasks on the same basis as students without disabilities, which may, in turn, improve their academic achievement outcomes and academic wellbeing.

    Separable Inverse Problems, Blind Deconvolution, and Stray Light Correction for Extreme Ultraviolet Solar Images.

    The determination of the inputs to a system given noisy output data is known as an inverse problem. When the system is a linear transformation involving unknown side parameters, the problem is called separable. A quintessential separable inverse problem is blind deconvolution: given a blurry image one must determine the sharp image and point spread function (PSF) that were convolved together to form it. This thesis describes a novel optimization approach for general separable inverse problems, a new blind deconvolution method for images corrupted by camera shake, and the first stray light correction for extreme ultraviolet (EUV) solar images from the EUVI/STEREO instruments. We present a generalization of variable elimination methods for separable inverse problems beyond least squares. Existing variable elimination methods require an explicit formula for the optimal value of the linear variables, so they cannot be used in problems with Poisson likelihoods, bound constraints, or other important departures from least squares. To address this limitation, we propose a generalization of variable elimination in which standard optimization methods are modified to behave as though a variable has been eliminated. Computational experiments indicate that this approach can have significant speed and robustness advantages. A new incremental sparse approximation method is proposed for blind deconvolution of images corrupted by camera shake. Unlike current state-of-the-art variational Bayes methods, it is based on simple alternating projected gradient optimization. In experiments on a standard test set, our method is faster than the state-of-the-art and competitive in deblurring performance. Stray light PSFs are determined for the two EUVI instruments, EUVI-A and B, aboard the STEREO mission. The PSFs are modeled using semi-empirical parametric formulas, and their parameters are determined by semiblind deconvolution of EUVI images. 
The EUVI-B PSFs were determined from lunar transit data, exploiting the fact that the Moon is not a significant EUV source. The EUVI-A PSFs were determined by analysis of simultaneous A/B observations from December 2006, when the instruments had nearly identical lines of sight to the Sun. We provide the first estimates of systematic error in EUV deconvolved images. PhD, Applied and Interdisciplinary Mathematics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/99797/1/shearerp_1.pd

    Plasmodium falciparum: programmed cell death in the erythrocytic stages

    A thesis submitted to the Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Doctor of Philosophy. This thesis is presented as a series of publications and unpublished data. Johannesburg, 2015. Plasmodium falciparum is responsible for the majority of global malaria deaths. During the pathogenic blood stages of infection, a rapid increase in parasitaemia threatens the survival of the host before transmission of slow-maturing sexual parasites to the mosquito vector to continue the life cycle. Programmed cell death (PCD) may provide the parasite with the means to control its burden on the host and thereby ensure its own survival. PCD in P. falciparum remains a poorly understood and controversial topic. A gathering body of evidence suggests P. falciparum is capable of PCD, but there are conflicting results regarding the phenotype. This study represents a comprehensive phenotypical characterisation of cell death in intraerythrocytic P. falciparum after various physiologically relevant stress stimuli, including high parasitaemia, heat stress simulating febrile paroxysms, and exposure to natural sunlight. The latter is a novel stimulus for PCD studies in P. falciparum. Biochemical markers of cell death, including DNA fragmentation, mitochondrial dysregulation and phosphatidylserine externalisation on parasitized erythrocytes, were used to provide a holistic description of cell death. Data showed that the combination of cell death markers varied with different stress stimuli and with the developmental stage of the parasite. An apoptosis-like phenotype, characterised by mitochondrial depolarisation, DNA fragmentation and phosphatidylserine externalisation, was suggested after stress from high parasitaemia. Heat stress affected ring stage parasites more severely than previous data suggested and induced an apoptosis-like phenotype. 
In contrast, late stage parasites showed markers of an autophagic-like cell death, including slight DNA fragmentation, phosphatidylserine externalisation and cytoplasmic vacuolisation. Sunlight exposure induced markers of PCD that included DNA fragmentation preceding mitochondrial hyperpolarisation, but the phenotype was not clear. The paradigm of PCD in P. falciparum is a dynamic and ever-evolving one that will continue to challenge our thinking and understanding of how the world’s deadliest parasitic killer can induce its own death to limit damage to the host. Evidence indicates that P. falciparum undergoes PCD and that the phenotype(s) may be unique. PCD is an important feature of P. falciparum biology and the elucidation of parasite PCD pathway(s) that differ from host mechanisms may yield novel drug targets.

    Reductive coupling, and, transition metal calixarene complexes : metal-metal quadruple bonds and pockets

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Chemistry, 1995. Vita. Includes bibliographical references. By Jacqueline A. Acho. Ph.D.

    Tipping Scales in Galaxy Surveys: Star/Galaxy Separation and Scale-Dependent Bias

    In the first part of this thesis, we address the problem of separating stars from galaxies in future large photometric surveys. We derive the science requirements on star/galaxy separation, for measurement of the cosmological parameters with the Gravitational Weak Lensing and Large Scale Structure probes, in chapter 2. We formulate the requirements in terms of the completeness and purity provided by a given star/galaxy classifier. In order to achieve these requirements, we propose a new method for star/galaxy separation in chapter 3, combining Principal Component Analysis with an Artificial Neural Network. When tested on simulations of the Dark Energy Survey (DES), this multi-parameter approach improves upon purely morphometric classifiers (such as the classifier implemented in SExtractor), especially at faint magnitudes. Chapter 4 is dedicated to the testing of this tool on real data, namely the recent internal release of DES Science Verification data. In the second part and last chapter of this thesis, chapter 5, we develop a method to detect the modulation by Baryonic Acoustic Oscillations of the density ratio of baryon to dark matter across large regions of the Universe. Such a detection would provide a direct measurement of a difference in the large-scale clustering of mass and light and a confirmation of the standard cosmological paradigm from a different angle than any other measurement. We measure the number density correlation function and the luminosity weighted correlation function of the DR10 releases of the Baryon Oscillation Spectroscopic Survey (BOSS), and fit a model of scale dependent bias to our measurement. Although our measurement is compatible with previous theoretical predictions, more accurate data are needed to prove or disprove this effect.
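
The completeness and purity used in the abstract above to state classifier requirements can be made concrete with a small sketch. The abstract does not give the formulas, so the definitions below are the commonly used ones and are an assumption here: completeness is the fraction of true galaxies that the classifier labels as galaxies, and purity is the fraction of objects labelled as galaxies that really are galaxies. The function name and the toy labels are hypothetical.

```python
def completeness_and_purity(true_labels, predicted_labels, target="galaxy"):
    """Return (completeness, purity) of a classifier for the `target` class.

    Assumed definitions (not taken from the thesis):
      completeness = TP / (TP + FN)
      purity       = TP / (TP + FP)
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    fp = sum(1 for t, p in pairs if t != target and p == target)

    # Guard against empty denominators rather than raising ZeroDivisionError.
    completeness = tp / (tp + fn) if (tp + fn) else float("nan")
    purity = tp / (tp + fp) if (tp + fp) else float("nan")
    return completeness, purity


# Toy catalogue: three true galaxies and two true stars.
truth = ["galaxy", "galaxy", "galaxy", "star", "star"]
guess = ["galaxy", "galaxy", "star", "galaxy", "star"]
completeness, purity = completeness_and_purity(truth, guess)
print(f"completeness={completeness:.3f}, purity={purity:.3f}")
```

The two quantities trade off against each other: a classifier that labels everything as a galaxy is fully complete but only as pure as the underlying galaxy fraction.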

    A robust framework for medical image segmentation through adaptable class-specific representation

    Medical image segmentation is an increasingly important component in virtual pathology, diagnostic imaging and computer-assisted surgery. Better hardware for image acquisition and a variety of advanced visualisation methods have paved the way for the development of computer based tools for medical image analysis and interpretation. The routine use of medical imaging scans of multiple modalities has been growing over the last decades and data sets such as the Visible Human Project have introduced a new modality in the form of colour cryo section data. These developments have given rise to an increasing need for better automatic and semiautomatic segmentation methods. The work presented in this thesis concerns the development of a new framework for robust semi-automatic segmentation of medical imaging data of multiple modalities. Following the specification of a set of conceptual and technical requirements, the framework known as ACSR (Adaptable Class-Specific Representation) is developed in the first case for 2D colour cryo section segmentation. This is achieved through the development of a novel algorithm for adaptable class-specific sampling of point neighbourhoods, known as the PGA (Path Growing Algorithm), combined with Learning Vector Quantization. The framework is extended to accommodate 3D volume segmentation of cryo section data and subsequently segmentation of single and multi-channel greyscale MRI data. For the latter the issues of inhomogeneity and noise are specifically addressed. Evaluation is based on comparison with previously published results on standard simulated and real data sets, using visual presentation, ground truth comparison and human observer experiments. ACSR provides the user with a simple and intuitive visual initialisation process followed by a fully automatic segmentation. 
Results on both cryo section and MRI data compare favourably to existing methods, demonstrating robustness both to common artefacts and multiple user initialisations. Further developments into specific clinical applications are discussed in the future work section.