A survey of cost-sensitive decision tree induction algorithms

Author: Bradford J. P.
Elkan C.
Esmeir S.
Estruch V.
Fan W.
Ferri C.
Freund Y.
Hart A. E.
Knoll U.
Li J.
Lin F. Y.
Liu X.
Mease D.
Murthy S.
Ni A.
Norton S. W.
Pazzani M.
Quinlan J. R.
Schapire R. E.
Sunil Vadera
Susan Lomax
Swets J.
Tan M.
Ting K.
Ting K. M.
von Neumann J.
Zadrozny B.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/02/2013

The past decade has seen significant interest in the problem of inducing decision trees that take account of the costs of misclassification and the costs of acquiring the features used for decision making. This survey identifies over 50 algorithms, including approaches that are direct adaptations of accuracy-based methods, use genetic algorithms, use anytime methods, and utilize boosting and bagging. The survey brings together these different studies and novel approaches to cost-sensitive decision tree learning, and provides a useful taxonomy, a historical timeline of how the field has developed, and a reference point for future research in this field.
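As a rough illustration of what "taking account of misclassification costs" can mean at a single tree leaf, the sketch below picks the label with the lowest expected cost instead of the most probable label. It is not one of the surveyed algorithms; the function name, the cost-matrix layout and the example numbers are assumptions made for illustration.

```python
def min_expected_cost_label(probs, cost):
    """Return (label, expected_costs) for one leaf.

    probs[j]   : estimated probability that the true class is j (assumed given)
    cost[i][j] : cost of predicting class i when the true class is j (assumed given)
    """
    n = len(probs)
    # Expected cost of predicting i = sum over true classes j of cost[i][j] * P(j).
    expected = [sum(cost[i][j] * probs[j] for j in range(n)) for i in range(n)]
    best = min(range(n), key=lambda i: expected[i])
    return best, expected


# Hypothetical leaf: class 0 is more likely, but missing class 1 is ten times
# as expensive as a false alarm, so the cost-sensitive choice is class 1.
probs = [0.8, 0.2]
cost = [[0, 10],   # predict 0: free if truth is 0, costs 10 if truth is 1
        [1, 0]]    # predict 1: costs 1 if truth is 0, free if truth is 1
label, expected = min_expected_cost_label(probs, cost)
print(label, expected)
```

An accuracy-based rule would label this leaf as class 0; the expected-cost rule prefers class 1 because the rare error is the expensive one.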

University of Salford Institutional Repository

Crossref

A Survey on Software Testing Techniques using Genetic Algorithm

Author: Sabharwal Sangeeta
Sharma Chayanika
Sibal Ritu
Publication date: 05/11/2014

The overall aim of the software industry is to ensure delivery of high-quality software to the end user. To ensure high quality, software must be tested. Testing ensures that software meets user specifications and requirements. However, the field of software testing has a number of underlying issues, such as effective generation of test cases and prioritisation of test cases, which need to be tackled. These issues add to the effort, time and cost of testing. Different techniques and methodologies have been proposed for taking care of these issues. The use of evolutionary algorithms for automatic test generation has been an area of interest for many researchers. The Genetic Algorithm (GA) is one such evolutionary algorithm. In this research paper, we present a survey of the GA approach for addressing the various issues encountered during software testing.

Comment: 13 Page

arXiv.org e-Print Archive

CiteSeerX

Reduction of the size of datasets by using evolutionary feature selection: the case of noise in a modern city

Author: B Xue
C Steele
DH Nguyen
DJ Sheskin
E Murphy
EE Kempen Van
FA Fortin
G Chandrashekar
J Segura-Garcia
J Yang
JC Pruessner
S García
S McClellan
S Nesmachnow
WM Spears
ZH Mir
Publication date: 10/12/2018

Smart city initiatives have emerged to mitigate the negative effects of the very fast growth of urban areas. Most of the population in our cities is exposed to high levels of noise that cause discomfort and various health problems. These issues may be mitigated by applying different smart city solutions, some of which require highly accurate noise information to provide the best possible quality of service. In this study, we have designed a machine learning approach based on genetic algorithms to analyze noise data captured on the university campus. This method reduces the amount of data required to classify the noise by addressing a feature selection optimization problem. The experimental results have shown that our approach improved the accuracy by 20% (achieving an accuracy of 87% with a reduction of up to 85% of the original dataset).

Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech. This research has been partially funded by the Spanish MINECO and FEDER projects TIN2016-81766-REDT (http://cirti.es), and TIN2017-88213-R (http://6city.lcc.uma.es)
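To make the idea of genetic-algorithm feature selection concrete, here is a minimal sketch in which each individual is a boolean mask over the features. It is not the method of the study above: the operators (tournament selection, one-point crossover, bit-flip mutation, elitism), all parameter values and the toy fitness function are assumptions. In a real setting the fitness would typically be a classifier's validation accuracy on the selected features.

```python
import random


def ga_feature_selection(n_features, fitness, pop_size=20, generations=30,
                         mutation_rate=0.05, seed=0):
    """Evolve a boolean feature mask that maximizes `fitness` (n_features >= 2)."""
    rng = random.Random(seed)
    pop = [[rng.random() < 0.5 for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        next_pop = ranked[:2]                      # elitism: keep the two best masks
        while len(next_pop) < pop_size:
            # Tournament selection: best of three random individuals, twice.
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n_features)     # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation: toggle each gene with a small probability.
            child = [(not g) if rng.random() < mutation_rate else g
                     for g in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)


# Toy fitness (hypothetical): features 0 and 3 are informative, and every
# selected feature carries a small penalty, so small masks are preferred.
INFORMATIVE = {0, 3}


def toy_fitness(mask):
    hits = sum(1 for i in INFORMATIVE if mask[i])
    return hits - 0.1 * sum(mask)


best = ga_feature_selection(8, toy_fitness)
print(best, toy_fitness(best))
```

The penalty term in the toy fitness plays the role of the dataset-size reduction goal: masks that keep fewer features score higher when they classify equally well.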

Crossref

Repositorio Institucional Universidad de Málaga

Comparative analysis of diagnostic performance, feasibility and cost of different test-methods for thyroid nodules with indeterminate cytology

Author: AIOM-AIRTUM Working Group
AIRTUM Working Group
Alexander
Alexander
Ali
Altman
Attia
Baldari
Baloch
Bartolazzi
Bartolazzi
Bartolazzi
Beaudenon-Huibregtse
Bevilacqua
Beyer
Biggerstaff
Birkeland
Bongiovanni
Burch
Cabatu
Carpi
Carty
Chhieng
Choi
Chung
Chung
Corsello
Cvejić
D'Armiento
da Silva Pinhal
de Geus-Oei
de Geus-Oei
degli Uberti
Eszlinger
Evans
Evans
Fadda
Fadda
Fox
Gharib
Girelli
Gupta
Gupta
Gupta
Heffron
Jo
Kebebew
Kennedy
Kennedy
Khanafshar
Kwak
Labourier
Ladenson
Larsson
Lotan
Mazzaferri
McGee
McGregor
Michael
Moher
Moley
Montori
Newbold
Niederle
Nikiforova
Nikiforova
Nikiforova
Nishino
Oh
Orlandi
Osman
Paik
Papotti
Paschke
Paschke
Paschke
Pastorino
Payne
Perros
Plotly Technologies Inc.
Puxeddu
R Core Team
Rabinovich
Rindi
Rosenthal
Rosenthal
Rugge
Sadler
Saxe
Schachter
Schlumberger
Schmid
Schreyögg
Sciacchitano
Sciacchitano
Shah
Shi
Shong
Siperstein
Sosa
Stephen
Steward
Teng
Tonacchera
van de Velde
Verburg
Veselic
Vierhapper
Vitti
Vitti
Vitti
Wasserman
Westra
Wiseman
Woeber
Xing
Yu
Zantut-Wittmann
Zatelli
Zeiger
Publication venue: 'Impact Journals, LLC'
Publication date: 01/01/2017

Since it is impossible to recognize malignancy at fine needle aspiration (FNA) cytology in indeterminate thyroid nodules, surgery is recommended for all of them. However, the cancer rate at final histology is < 30%. Many different test-methods have been proposed to increase diagnostic accuracy in such lesions, including Galectin-3-ICC (GAL-3-ICC), BRAF mutation analysis (BRAF), Gene Expression Classifier (GEC) alone and GEC+BRAF, mutation/fusion (M/F) panel alone, M/F panel+miRNA GEC, M/F panel by next generation sequencing (NGS), FDG-PET/CT, MIBI-Scan, and TSHR mRNA blood assay. We performed systematic reviews and meta-analyses to compare their features, feasibility, diagnostic performance and cost. GEC, GEC+BRAF, M/F panel+miRNA GEC and M/F panel by NGS were the best in ruling out malignancy (sensitivity = 90%, 89%, 89% and 90%, respectively). BRAF and M/F panel, alone and by NGS, were the best in ruling in malignancy (specificity = 100%, 93% and 93%, respectively). The M/F panel by NGS showed the highest accuracy (92%) and BRAF the highest diagnostic odds ratio (DOR) (247). GAL-3-ICC performed well as a rule-out test (sensitivity = 83%) and a rule-in test (specificity = 85%), with good accuracy (84%) and a high DOR (27), and is one of the cheapest (113 USD) and easiest to perform in different clinical settings. In conclusion, the more accurate molecular-based test-methods are still expensive and restricted to a few highly specialized and centralized laboratories. GAL-3-ICC, although limited by some false negatives, represents the most suitable screening test-method to be applied on a large-scale basis in the diagnostic algorithm of indeterminate thyroid lesions.
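For readers who want to see how the reported measures relate to one another, the sketch below computes sensitivity, specificity, accuracy and the diagnostic odds ratio from a 2x2 confusion table using the standard definitions. The counts are hypothetical, chosen only so that the percentages resemble those quoted for GAL-3-ICC; they are not the study's data, and the function name is an assumption.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard 2x2 diagnostic measures from raw counts.

    tp/fn: malignant nodules with a positive/negative test result
    tn/fp: benign nodules with a negative/positive test result
    """
    sensitivity = tp / (tp + fn)            # rule-out ability
    specificity = tn / (tn + fp)            # rule-in ability
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # DOR = odds of a positive test in disease / odds of a positive test in non-disease.
    dor = (tp * tn) / (fp * fn)             # undefined if fp or fn is zero
    return {"sensitivity": sensitivity, "specificity": specificity,
            "accuracy": accuracy, "dor": dor}


# Hypothetical counts: 100 malignant and 100 benign nodules.
m = diagnostic_metrics(tp=83, fn=17, tn=85, fp=15)
for name, value in m.items():
    print(f"{name}: {value:.2f}")
```

With these illustrative counts the sensitivity is 0.83, the specificity 0.85, the accuracy 0.84 and the DOR about 27.7. Accuracy depends on the malignant-to-benign ratio, so real study counts with a different prevalence would give a different value.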

Crossref

Archivio della ricerca- Università di Roma La Sapienza

Is the Stack Distance Between Test Case and Method Correlated With Test Effectiveness?

Author: Acree Allen Troy
Chawla Nitesh V
Jefferson Offutt A
Ji Changbin
Kohavi Ron
Marko Ivanković Goran Petrović
Niedermayr Rainer
Schuler David
Strug Joanna
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 13/03/2019

Mutation testing is a means to assess the effectiveness of a test suite and its outcome is considered more meaningful than code coverage metrics. However, despite several optimizations, mutation testing requires a significant computational effort and has not been widely adopted in industry. Therefore, we study in this paper whether test effectiveness can be approximated using a more light-weight approach. We hypothesize that a test case is more likely to detect faults in methods that are close to the test case on the call stack than in methods that the test case accesses indirectly through many other methods. Based on this hypothesis, we propose the minimal stack distance between test case and method as a new test measure, which expresses how close any test case comes to a given method, and study its correlation with test effectiveness. We conducted an empirical study with 21 open-source projects, which comprise in total 1.8 million LOC, and show that a correlation exists between stack distance and test effectiveness. The correlation reaches a strength up to 0.58. We further show that a classifier using the minimal stack distance along with additional easily computable measures can predict the mutation testing result of a method with 92.9% precision and 93.4% recall. Hence, such a classifier can be taken into consideration as a light-weight alternative to mutation testing or as a preceding, less costly step to that.

Comment: EASE 201
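The notion of a minimal stack distance can be sketched as a shortest-path computation. The example below assumes a call graph given as a dictionary from caller to callees and counts a method called directly by a test as distance 1. Both the static-graph representation and the distance convention are assumptions made for illustration; the paper may obtain its distances differently (for example from recorded call stacks).

```python
from collections import deque


def distances_from(call_graph, test_case):
    """Breadth-first search: fewest call hops from `test_case` to each reachable method."""
    dist = {test_case: 0}
    queue = deque([test_case])
    while queue:
        caller = queue.popleft()
        for callee in call_graph.get(caller, ()):
            if callee not in dist:          # first visit is the shortest path in BFS
                dist[callee] = dist[caller] + 1
                queue.append(callee)
    return dist


def minimal_stack_distance(call_graph, test_cases, method):
    """Smallest distance from any test case to `method`, or None if no test reaches it."""
    found = []
    for test in test_cases:
        dist = distances_from(call_graph, test)
        if method in dist:
            found.append(dist[method])
    return min(found) if found else None


# Hypothetical call graph: testA reaches Util.parse through two other methods,
# while testB calls it directly.
call_graph = {
    "testA": ["Service.run"],
    "Service.run": ["Repo.load"],
    "Repo.load": ["Util.parse"],
    "testB": ["Util.parse"],
}
tests = ["testA", "testB"]
print(minimal_stack_distance(call_graph, tests, "Util.parse"))
print(minimal_stack_distance(call_graph, tests, "Repo.load"))
```

Under the paper's hypothesis, a method with a small minimal distance, such as `Util.parse` here, would be more likely to have its faults detected than one that is only reached indirectly.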

arXiv.org e-Print Archive

Crossref

On the design of an ECOC-compliant genetic algorithm

Author: Baró Solé Xavier
Bautista Miguel Ángel
Escalera Guerrero Sergio
Pujol Vila Oriol
Publication venue: 'Elsevier BV'
Publication date: 04/06/2013

Genetic Algorithms (GA) have previously been applied to Error-Correcting Output Codes (ECOC) in state-of-the-art works in order to find a suitable coding matrix. Nevertheless, none of the presented techniques directly takes into account the properties of the ECOC matrix. As a result, the considered search space is unnecessarily large. In this paper, a novel genetic strategy to optimize the ECOC coding step is presented. This strategy redefines the usual crossover and mutation operators in order to take into account the theoretical properties of the ECOC framework. Thus, it reduces the search space and lets the algorithm converge faster. In addition, a novel operator that is able to enlarge the code in a smart way is introduced. The methodology is tested on several UCI datasets and four challenging computer vision problems. Furthermore, the analysis of the results in terms of performance, code length and number of Support Vectors (SVs) shows that the optimization process is able to find very efficient codes in terms of the trade-off between classification performance and the number of classifiers. Finally, classification performance per dichotomizer shows that the novel proposal is able to obtain similar or even better results while defining a more compact set of dichotomies and SVs compared to state-of-the-art approaches.
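As background on what an ECOC coding matrix does, the sketch below shows standard Hamming decoding: each row is a class codeword, each column corresponds to one binary classifier (dichotomizer), and a sample is assigned to the class whose codeword is closest to the classifier outputs. The matrix and the outputs are hypothetical, and the genetic optimization of the matrix described above is not reproduced.

```python
def hamming(a, b):
    """Number of positions in which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def ecoc_decode(coding_matrix, outputs):
    """Index of the class whose codeword is nearest to the dichotomizer outputs."""
    distances = [hamming(row, outputs) for row in coding_matrix]
    return min(range(len(coding_matrix)), key=lambda c: distances[c])


# Hypothetical 4-class, 5-dichotomizer matrix. Every pair of rows differs in
# at least 3 positions, so a single wrong dichotomizer can be corrected.
CODING_MATRIX = [
    [+1, +1, +1, +1, +1],   # class 0
    [-1, -1, -1, +1, +1],   # class 1
    [-1, +1, +1, -1, -1],   # class 2
    [+1, -1, -1, -1, -1],   # class 3
]

# Outputs for a class-2 sample in which the last dichotomizer is wrong.
outputs = [-1, +1, +1, -1, +1]
print(ecoc_decode(CODING_MATRIX, outputs))
```

Shorter codes need fewer classifiers but leave less room for error correction, which is the trade-off between code length and classification performance that the abstract refers to.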

The Oberta in open access