Feature weighting techniques for CBR in software effort estimation studies: A review and empirical evaluation

Author: Aha D. W.
Ashley K. D.
Bardsiri V. K.
Bareiss R.
Cain T.
Hedges L.
Higgins J.
Kirsopp C.
Mohri T.
Skalak D.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 17/09/2014

Context: Software effort estimation is one of the most important activities in the software development process. Unfortunately, estimates are often substantially wrong. Numerous estimation methods have been proposed, including Case-based Reasoning (CBR). In order to improve CBR estimation accuracy, many researchers have proposed feature weighting techniques (FWT). Objective: Our purpose is to systematically review the empirical evidence to determine whether FWT leads to improved predictions. In addition, we evaluate these techniques from the perspectives of (i) approach, (ii) strengths and weaknesses, (iii) performance, and (iv) experimental evaluation approach, including the data sets used. Method: We conducted a systematic literature review of published, refereed primary studies on FWT (2000-2014). Results: We identified 19 relevant primary studies. These reported a range of different techniques. Of the 19 studies, 17 make benchmark comparisons with standard CBR, and 16 of those 17 report improved accuracy. Using a one-sample sign test, this positive impact is significant (p = 0.0003). Conclusion: The actionable conclusion from this study is that our review of all relevant empirical evidence supports the use of FWTs, and we recommend that researchers and practitioners give serious consideration to their adoption.
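The reported significance level can be checked with a few lines of arithmetic. The sketch below computes an exact sign test p-value for 16 improvements out of 17 comparisons under the null hypothesis that improvement and non-improvement are equally likely. The abstract does not say whether the test was one- or two-sided; the two-sided form is assumed here because it reproduces the reported figure. The function name `sign_test_p_value` is illustrative only.

```python
from math import comb


def sign_test_p_value(successes: int, n: int) -> float:
    """Two-sided exact sign test p-value under H0: P(improvement) = 0.5."""
    # Use the larger of the two counts so the tail is always the extreme one.
    k = max(successes, n - successes)
    # Probability of observing k or more "successes" out of n fair coin flips.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # Double the tail for a two-sided test, capped at 1.
    return min(1.0, 2 * tail)


p = sign_test_p_value(16, 17)
print(round(p, 4))  # 0.0003
```

The one-sided tail alone is 18/131072, roughly 0.00014, so the doubled value of about 0.00027 is the one that rounds to 0.0003.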

Crossref

Brunel University Research Archive

FEATURE SELECTION APPLIED TO THE TIME-FREQUENCY REPRESENTATION OF MUSCLE NEAR-INFRARED SPECTROSCOPY (NIRS) SIGNALS: CHARACTERIZATION OF DIABETIC OXYGENATION PATTERNS

Author: Dash M.
Ferreira L. F.
FILIPPO MOLINARI
GABRIELLA BALESTRA
Hirata K.
Holand J.
Matsumoto Y.
Molinari F.
SAMANTA ROSATI
Somol P.
Wilks S. S.
Publication venue: WORLD SCIENTIFIC
Publication date: 01/01/2012

Diabetic patients might present peripheral microcirculation impairment and might benefit from physical training. Thirty-nine diabetic patients underwent monitoring of tibialis anterior muscle oxygenation by near-infrared spectroscopy (NIRS) during a series of voluntary ankle flexo-extensions. NIRS signals were acquired before and after training protocols. Sixteen control subjects were tested with the same protocol. Time-frequency distributions of Cohen's class were used to process the NIRS signals relative to the concentration changes of oxygenated and reduced hemoglobin. A total of 24 variables were measured for each subject, and the most discriminative were selected by using four feature selection algorithms: QuickReduct, Genetic Rough-Set Attribute Reduction, Ant Rough-Set Attribute Reduction, and traditional ANOVA. Artificial neural networks were used to validate the discriminative power of the selected features. Results showed that different algorithms extracted different sets of variables, but all the combinations were discriminative. The best classification accuracy was about 70%. The oxygenation variables were selected when comparing controls to diabetic patients or diabetic patients before and after training. This preliminary study showed the importance of feature selection techniques in NIRS assessment of diabetic peripheral vascular impairment.

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

A heuristic model of bounded route choice in urban areas

Author: Cheng T
Manley EJ
Orr SW
Publication date: 19/04/2015

There is substantial evidence to indicate that route choice in urban areas is a complex cognitive process, conducted under uncertainty and formed on partial perspectives. Yet conventional route choice models continue to make simplistic assumptions about the nature of human cognitive ability, memory and preference. In this paper, a novel framework for route choice in urban areas is introduced, aiming to reflect more accurately the uncertain, bounded nature of route choice decision making. Two main advances are introduced. The first involves the definition of a hierarchical model of space representing the relationship between urban features and human cognition, combining findings from both the extensive previous literature on spatial cognition and a large route choice dataset. The second advance involves the development of heuristic rules for route choice decisions, building upon the hierarchical model of urban space. The heuristics describe the process by which quick, 'good enough' decisions are made when individuals are faced with uncertainty. This element of the model is likewise constructed and parameterised according to findings from prior research and the trends identified within a large routing dataset. The paper outlines the implementation of the framework within a real-world context, validating the results against observed behaviours. Conclusions are offered as to the extension and improvement of this approach, outlining its potential as an alternative to other route choice modelling frameworks.

Elsevier - Publisher Connector

UCL Discovery

Scipedia

Nature inspired feature selection meta-heuristics

Author: Diao Ren
Shen Qiang
Publication date: 14/01/2015

Crossref

Aberystwyth Research Portal

Two new feature selection algorithms with rough sets theory

Author: Bello Rafael
Caballero Yailé
García Lorenzo María Matilde
Álvarez Delia
Publication date: 01/08/2006

Rough Sets Theory has opened new directions in the development of Incomplete Information Theory. Within it, the notion of a reduct is highly significant, but obtaining a reduct of a decision system is computationally expensive, although very important in data analysis and knowledge discovery. For this reason, different variants for calculating reducts have had to be developed. The present work examines the utility of the Rough Sets Model and Information Theory in feature selection, and presents a new method for calculating a good reduct. This method is a greedy algorithm that uses heuristics to work out a good reduct in acceptable time. We also propose another method for finding good reducts, which combines elements of Genetic Algorithms with Estimation of Distribution Algorithms. The new methods are compared with others implemented within Pattern Recognition and Ant Colony Optimization algorithms, and the results of the statistical tests are shown.

IFIP International Conference on Artificial Intelligence in Theory and Practice - Knowledge Acquisition and Data Mining

Red de Universidades con Carreras en Informática (RedUNCI)
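To make the idea of a greedy reduct search concrete, here is a minimal QuickReduct-style sketch. It is not the authors' algorithm, whose heuristics the abstract does not specify. The assumptions are that the decision table is a list of dictionaries, and that the heuristic is the rough-set dependency degree (the fraction of rows whose decision is fully determined by the selected attributes). The names `dependency` and `greedy_reduct` are hypothetical.

```python
from collections import defaultdict


def dependency(rows, attrs, decision):
    """Rough-set dependency degree: share of rows in the positive region."""
    if not rows:
        return 0.0
    # Group decision labels by the rows' values on the chosen attributes.
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[a] for a in attrs)].append(row[decision])
    # A group is consistent when all of its rows share one decision label.
    consistent = sum(len(labels) for labels in classes.values()
                     if len(set(labels)) == 1)
    return consistent / len(rows)


def greedy_reduct(rows, conditional_attrs, decision):
    """Add the attribute with the best dependency until the full-set value is reached."""
    target = dependency(rows, conditional_attrs, decision)
    selected = []
    current = dependency(rows, selected, decision)
    while current < target:
        remaining = [a for a in conditional_attrs if a not in selected]
        best = max(remaining,
                   key=lambda a: dependency(rows, selected + [a], decision))
        selected.append(best)
        current = dependency(rows, selected, decision)
    return selected


# Toy decision table (invented for illustration): "b" alone determines "d".
table = [
    {"a": 0, "b": 0, "c": 1, "d": "no"},
    {"a": 0, "b": 1, "c": 1, "d": "yes"},
    {"a": 1, "b": 0, "c": 0, "d": "no"},
    {"a": 1, "b": 1, "c": 0, "d": "yes"},
]
print(greedy_reduct(table, ["a", "b", "c"], "d"))  # ['b']
```

A greedy search of this kind is fast but may return a superset of a minimal reduct, which is one reason population-based alternatives such as the genetic and estimation-of-distribution methods mentioned above are of interest.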

A bi-objective feature selection algorithm for large omics datasets

Author: Almuallim
Boros
Cavique
Cavique
Chandrashekar
Chung
Chvatal
Collette
Crama
Joncour
Kira
Liu
Pawlak
Pawlak
Peters
Polkowski
Smet
Stephens
Talbi
The 1000 Genomes Project Consortium
Yao
Publication venue: Wiley
Publication date: 01/01/2018

Special Issue: Fourth special issue on knowledge discovery and business intelligence.

Feature selection is one of the most important concepts in data mining when dimensionality reduction is needed. The performance measures of feature selection encompass predictive accuracy and result comprehensibility. Consistency-based methods are a significant category of feature selection research that substantially improves the comprehensibility of the result using the parsimony principle. In this work, the bi-objective version of the algorithm Logical Analysis of Inconsistent Data is applied to large volumes of data. In order to deal with hundreds of thousands of attributes, heuristic decomposition uses parallel processing to solve a set covering problem and a cross-validation technique. The bi-objective solutions contain the number of reduced features and the accuracy. The algorithm is applied to omics datasets with genome-like characteristics of patients with rare diseases.

The authors would like to thank the FCT support UID/Multi/04046/2013. This work used the EGI, European Grid Infrastructure, with the support of the IBERGRID, Iberian Grid Infrastructure, and INCD (Portugal).
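The link between consistency-based feature selection and set covering can be illustrated with a small sketch. This is a generic greedy set-cover heuristic, not the paper's parallel decomposition or its bi-objective formulation. It assumes that each feature "covers" the pairs of differently labelled observations on which its values differ, and that the goal is a small feature set covering every such pair. The function names are hypothetical.

```python
from itertools import combinations


def pairs_to_cover(X, y):
    """All pairs of observations that carry different class labels."""
    return {(i, j) for i, j in combinations(range(len(X)), 2) if y[i] != y[j]}


def gain(X, uncovered, f):
    """Number of still-uncovered pairs that feature f can tell apart."""
    return sum(1 for i, j in uncovered if X[i][f] != X[j][f])


def greedy_cover_features(X, y):
    """Pick features greedily until every differently labelled pair is separated."""
    uncovered = pairs_to_cover(X, y)
    n_features = len(X[0]) if X else 0
    chosen = []
    while uncovered and n_features:
        best = max(range(n_features), key=lambda f: gain(X, uncovered, f))
        if gain(X, uncovered, best) == 0:
            break  # inconsistent data: no feature separates the remaining pairs
        chosen.append(best)
        # Keep only the pairs that the chosen feature fails to distinguish.
        uncovered = {(i, j) for i, j in uncovered if X[i][best] == X[j][best]}
    return chosen, uncovered


# Toy data (invented for illustration): feature 1 alone separates the classes.
X = [[0, 0, 1], [0, 1, 1], [1, 0, 0], [1, 1, 0]]
y = [0, 1, 0, 1]
print(greedy_cover_features(X, y))  # ([1], set())
```

Any pairs left in the second return value indicate inconsistent observations, that is, identical feature values with different labels.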

Crossref

Repositório Aberto da Universidade Aberta

Repositório Científico do Instituto Nacional de Saúde