36,033 research outputs found

    A survey of cost-sensitive decision tree induction algorithms

    Get PDF
    The past decade has seen a significant interest on the problem of inducing decision trees that take account of costs of misclassification and costs of acquiring the features used for decision making. This survey identifies over 50 algorithms including approaches that are direct adaptations of accuracy based methods, use genetic algorithms, use anytime methods and utilize boosting and bagging. The survey brings together these different studies and novel approaches to cost-sensitive decision tree learning, provides a useful taxonomy, a historical timeline of how the field has developed and should provide a useful reference point for future research in this field

    A Survey on Software Testing Techniques using Genetic Algorithm

    Full text link
    The overall aim of the software industry is to ensure delivery of high quality software to the end user. To ensure high quality software, it is required to test software. Testing ensures that software meets user specifications and requirements. However, the field of software testing has a number of underlying issues like effective generation of test cases, prioritisation of test cases etc which need to be tackled. These issues demand on effort, time and cost of the testing. Different techniques and methodologies have been proposed for taking care of these issues. Use of evolutionary algorithms for automatic test generation has been an area of interest for many researchers. Genetic Algorithm (GA) is one such form of evolutionary algorithms. In this research paper, we present a survey of GA approach for addressing the various issues encountered during software testing.Comment: 13 Page

    Reduction of the size of datasets by using evolutionary feature selection: the case of noise in a modern city

    Get PDF
    Smart city initiatives have emerged to mitigate the negative effects of a very fast growth of urban areas. Most of the population in our cities are exposed to high levels of noise that generate discomfort and different health problems. These issues may be mitigated by applying different smart cities solutions, some of them require high accurate noise information to provide the best quality of serve possible. In this study, we have designed a machine learning approach based on genetic algorithms to analyze noise data captured in the university campus. This method reduces the amount of data required to classify the noise by addressing a feature selection optimization problem. The experimental results have shown that our approach improved the accuracy in 20% (achieving an accuracy of 87% with a reduction of up to 85% on the original dataset).Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech. This research has been partially funded by the Spanish MINECO and FEDER projects TIN2016-81766-REDT (http://cirti.es), and TIN2017-88213-R (http://6city.lcc.uma.es)

    Comparative analysis of diagnostic performance, feasibility and cost of different test-methods for thyroid nodules with indeterminate cytology

    Get PDF
    Since it is impossible to recognize malignancy at fine needle aspiration (FNA) cytology in indeterminate thyroid nodules, surgery is recommended for all of them. However, cancer rate at final histology is < 30%. Many different test-methods have been proposed to increase diagnostic accuracy in such lesions, including Galectin-3-ICC (GAL-3-ICC), BRAF mutation analysis (BRAF), Gene Expression Classifier (GEC) alone and GEC+BRAF, mutation/fusion (M/F) panel, alone, M/F panel+miRNA GEC, and M/F panel by next generation sequencing (NGS), FDG-PET/CT, MIBI-Scan and TSHR mRNA blood assay. We performed systematic reviews and meta-analyses to compare their features, feasibility, diagnostic performance and cost. GEC, GEC+BRAF, M/F panel+miRNA GEC and M/F panel by NGS were the best in ruling-out malignancy (sensitivity = 90%, 89%, 89% and 90% respectively). BRAF and M/F panel alone and by NGS were the best in ruling-in malignancy (specificity = 100%, 93% and 93%). The M/F by NGS showed the highest accuracy (92%) and BRAF the highest diagnostic odds ratio (DOR) (247). GAL-3-ICC performed well as rule-out (sensitivity = 83%) and rule-in test (specificity = 85%), with good accuracy (84%) and high DOR (27) and is one of the cheapest (113 USD) and easiest one to be performed in different clinical settings. In conclusion, the more accurate molecular-based test-methods are still expensive and restricted to few, highly specialized and centralized laboratories. GAL-3-ICC, although limited by some false negatives, represents the most suitable screening test-method to be applied on a large-scale basis in the diagnostic algorithm of indeterminate thyroid lesions

    Is the Stack Distance Between Test Case and Method Correlated With Test Effectiveness?

    Full text link
    Mutation testing is a means to assess the effectiveness of a test suite and its outcome is considered more meaningful than code coverage metrics. However, despite several optimizations, mutation testing requires a significant computational effort and has not been widely adopted in industry. Therefore, we study in this paper whether test effectiveness can be approximated using a more light-weight approach. We hypothesize that a test case is more likely to detect faults in methods that are close to the test case on the call stack than in methods that the test case accesses indirectly through many other methods. Based on this hypothesis, we propose the minimal stack distance between test case and method as a new test measure, which expresses how close any test case comes to a given method, and study its correlation with test effectiveness. We conducted an empirical study with 21 open-source projects, which comprise in total 1.8 million LOC, and show that a correlation exists between stack distance and test effectiveness. The correlation reaches a strength up to 0.58. We further show that a classifier using the minimal stack distance along with additional easily computable measures can predict the mutation testing result of a method with 92.9% precision and 93.4% recall. Hence, such a classifier can be taken into consideration as a light-weight alternative to mutation testing or as a preceding, less costly step to that.Comment: EASE 201

    On the design of an ECOC-compliant genetic algorithm

    Get PDF
    Genetic Algorithms (GA) have been previously applied to Error-Correcting Output Codes (ECOC) in state-of-the-art works in order to find a suitable coding matrix. Nevertheless, none of the presented techniques directly take into account the properties of the ECOC matrix. As a result the considered search space is unnecessarily large. In this paper, a novel Genetic strategy to optimize the ECOC coding step is presented. This novel strategy redefines the usual crossover and mutation operators in order to take into account the theoretical properties of the ECOC framework. Thus, it reduces the search space and lets the algorithm to converge faster. In addition, a novel operator that is able to enlarge the code in a smart way is introduced. The novel methodology is tested on several UCI datasets and four challenging computer vision problems. Furthermore, the analysis of the results done in terms of performance, code length and number of Support Vectors shows that the optimization process is able to find very efficient codes, in terms of the trade-off between classification performance and the number of classifiers. Finally, classification performance per dichotomizer results shows that the novel proposal is able to obtain similar or even better results while defining a more compact number of dichotomies and SVs compared to state-of-the-art approaches
    corecore