54 research outputs found

    A novel feature selection-based sequential ensemble learning method for class noise detection in high-dimensional data

    Get PDF
    © 2018, Springer Nature Switzerland AG. Most of the irrelevant or noise features in high-dimensional data present significant challenges to high-dimensional mislabeled instances detection methods based on feature selection. Traditional methods often perform the two dependent step: The first step, searching for the relevant subspace, and the second step, using the feature subspace which obtained in the previous step training model. However, Feature subspace that are not related to noise scores and influence detection performance. In this paper, we propose a novel sequential ensemble method SENF that aggregate the above two phases, our method learns the sequential ensembles to obtain refine feature subspace and improve detection accuracy by iterative sparse modeling with noise scores as the regression target attribute. Through extensive experiments on 8 real-world high-dimensional datasets from the UCI machine learning repository [3], we show that SENF performs significantly better or at least similar to the individual baselines as well as the existing state-of-the-art label noise detection method

    Hybrid case‑base maintenance approach for modeling large scale case‑based reasoning systems

    Get PDF
    YesCase-based reasoning (CBR) is a nature inspired paradigm of machine learning capable to continuously learn from the past experience. Each newly solved problem and its corresponding solution is retained in its central knowledge repository called case-base. Withρ the regular use of the CBR system, the case-base cardinality keeps on growing. It results into performance bottleneck as the number of comparisons of each new problem with the existing problems also increases with the case-base growth. To address this performance bottleneck, different case-base maintenance (CBM) strategies are used so that the growth of the case-base is controlled without compromising on the utility of knowledge maintained in the case-base. This research work presents a hybrid case-base maintenance approach which equally utilizes the benefits of case addition as well as case deletion strategies to maintain the case-base in online and offline modes respectively. The proposed maintenance method has been evaluated using a simulated model of autonomic forest fire application and its performance has been compared with the existing approaches on a large case-base of the simulated case study.Authors acknowledge the internal funding support received from Namal College Mianwali to complete the research work

    Classification of Caesarean Section and Normal Vaginal Deliveries Using Foetal Heart Rate Signals and Advanced Machine Learning Algorithms

    Get PDF
    ABSTRACT – Background: Visual inspection of Cardiotocography traces by obstetricians and midwives is the gold standard for monitoring the wellbeing of the foetus during antenatal care. However, inter- and intra-observer variability is high with only a 30% positive predictive value for the classification of pathological outcomes. This has a significant negative impact on the perinatal foetus and often results in cardio-pulmonary arrest, brain and vital organ damage, cerebral palsy, hearing, visual and cognitive defects and in severe cases, death. This paper shows that using machine learning and foetal heart rate signals provides direct information about the foetal state and helps to filter the subjective opinions of medical practitioners when used as a decision support tool. The primary aim is to provide a proof-of-concept that demonstrates how machine learning can be used to objectively determine when medical intervention, such as caesarean section, is required and help avoid preventable perinatal deaths. Methodology: This is evidenced using an open dataset that comprises 506 controls (normal virginal deliveries) and 46 cases (caesarean due to pH ≤7.05 and pathological risk). Several machine-learning algorithms are trained, and validated, using binary classifier performance measures. Results: The findings show that deep learning classification achieves Sensitivity = 94%, Specificity = 91%, Area under the Curve = 99%, F-Score = 100%, and Mean Square Error = 1%. Conclusions: The results demonstrate that machine learning significantly improves the efficiency for the detection of caesarean section and normal vaginal deliveries using foetal heart rate signals compared with obstetrician and midwife predictions and systems reported in previous studies

    A multiobjective module-order model for software quality enhancement

    No full text
    The knowledge, prior to system operations, of which program modules are problematic is valuable to a software quality assurance team, especially when there is a constraint on software quality enhancement resources. A cost-effective approach for allocating such resources is to obtain a prediction in the form of a quality-based ranking of program modules. Subsequently, a module-order model (MOM) is used to gauge the performance of the predicted rankings. From a practical software engineering point of view, multiple software quality objectives may be desired by a MOM for the system under consideration: e.g., the desired rankings may be such that 100% of the faults should be detected if the top 50% of modules with highest number of faults are subjected to quality improvements. Moreover, the management team for the same system may also desire that 80% of the faults should be accounted if the top 20% of the modules are targeted for improvement. Existing work related to MOM(s) use a quantitative prediction model to obtain the predicted rankings of-program modules, implying that only the fault prediction error measures such as the average, relative, or mean square errors are minimized. Such an approach does not provide a direct insight into the performance behavior of a MOM. For a given percentage of modules enhanced, the performance of a MOM is gauged by how many faults are accounted for by the predicted ranking as compared with the perfect ranking. We propose an approach for calibrating a multiobjective MOM using genetic programming. Other estimation techniques, e.g., multiple linear regression and neural networks cannot achieve multiobjective optimization for MOM(s). The proposed methodology facilitates the simultaneous optimization of multiple performance objectives for a MOM. Case studies of two industrial software systems are presented, the empirical results of which demonstrate a new promise for goal-oriented software quality modeling

    Predicting Fault-Prone Modules by Word Occurrence in Identifiers

    No full text
    corecore