Reduction of the size of datasets by using evolutionary feature selection: the case of noise in a modern city
Smart city initiatives have emerged to mitigate the negative effects of the very fast growth of urban areas. Most of the urban population is exposed to high noise levels that cause discomfort and various health problems. These issues may be mitigated by applying smart city solutions, some of which require highly accurate noise information to provide the best possible quality of service. In this study, we have designed a machine learning approach based on genetic algorithms to analyze noise data captured on the university campus. This method reduces the amount of data required to classify the noise by addressing a feature selection optimization problem. The experimental results show that our approach improved accuracy by 20% (reaching an accuracy of 87% while reducing the original dataset by up to 85%). Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech.
This research has been partially funded by the Spanish MINECO and FEDER projects TIN2016-81766-REDT (http://cirti.es), and TIN2017-88213-R (http://6city.lcc.uma.es)
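The abstract above does not give implementation details, so the following is only a minimal sketch of genetic-algorithm feature selection in general, not the authors' method. The synthetic data, the nearest-centroid classifier, the size penalty, and all function names (`make_data`, `centroid_accuracy`, `fitness`, `ga_select`) are assumptions made for illustration.

```python
import random

def make_data(rng, n=120, n_informative=3, n_noise=9):
    """Synthetic two-class data (an assumption): only the first columns carry signal."""
    rows, labels = [], []
    for i in range(n):
        y = i % 2
        informative = [rng.gauss(2.0 * y, 1.0) for _ in range(n_informative)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        rows.append(informative + noise)
        labels.append(y)
    return rows, labels

def centroid_accuracy(rows, labels, mask):
    """Hold-out accuracy of a nearest-centroid classifier on the masked features."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    half = len(rows) // 2
    train, test = range(half), range(half, len(rows))
    centroids = {}
    for c in (0, 1):
        members = [rows[i] for i in train if labels[i] == c]
        centroids[c] = [sum(r[j] for r in members) / len(members) for j in cols]
    correct = 0
    for i in test:
        dist = {c: sum((rows[i][j] - m) ** 2 for j, m in zip(cols, centroids[c]))
                for c in (0, 1)}
        correct += int(min(dist, key=dist.get) == labels[i])
    return correct / len(test)

def fitness(rows, labels, mask, penalty=0.01):
    """Accuracy minus a small cost per selected feature (hypothetical trade-off)."""
    return centroid_accuracy(rows, labels, mask) - penalty * sum(mask)

def ga_select(rows, labels, rng, pop_size=20, generations=30, p_mut=0.1):
    """Evolve bit masks; a 1 at position j means feature j is kept."""
    n = len(rows[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=lambda m: fitness(rows, labels, m), reverse=True)
        parents = scored[: pop_size // 2]  # truncation selection; the best masks survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(rows, labels, m))

if __name__ == "__main__":
    rows, labels = make_data(random.Random(0))
    best = ga_select(rows, labels, random.Random(1))
    print("selected features:", [j for j, bit in enumerate(best) if bit])
    print("hold-out accuracy:", centroid_accuracy(rows, labels, best))
```

Encoding each candidate subset as a bit mask keeps crossover and mutation trivial, and the per-feature penalty is one simple way to push the search toward smaller subsets.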
Feature selection using genetic algorithms and probabilistic neural networks
Selection of input variables is a key stage in building predictive models, and an important form of data mining. As exhaustive evaluation of potential input sets using full non-linear models is impractical, it is necessary to use simple fast-evaluating models and heuristic selection strategies. This paper discusses a fast, efficient, and powerful nonlinear input selection procedure using a combination of Probabilistic Neural Networks and repeated bitwise gradient descent. The algorithm is compared with forward elimination, backward elimination and genetic algorithms using a selection of real-world data sets. The algorithm has comparative performance and greatly reduced execution time with respect to these alternative approaches. It is demonstrated empirically that reliable results cannot be gained using any of these approaches without the use of resampling.
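The abstract does not define "repeated bitwise gradient descent", so the sketch below rests on one reading of it: greedy single-bit flips on a feature mask, restarted from several random masks, with the leave-one-out error of a small Gaussian-kernel Probabilistic Neural Network as the objective. That interpretation, the unnormalised class sums (which assume roughly balanced classes), and the names `pnn_loo_error` and `bit_flip_descent` are assumptions, not the paper's procedure.

```python
import math
import random

def pnn_loo_error(rows, labels, mask, sigma=1.0):
    """Leave-one-out error of a Gaussian-kernel PNN on the masked features."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 1.0  # no inputs selected: treat as worst case
    errors = 0
    for i, x in enumerate(rows):
        density = {}
        for k, z in enumerate(rows):
            if k == i:
                continue  # leave the query pattern out
            d2 = sum((x[j] - z[j]) ** 2 for j in cols)
            density[labels[k]] = density.get(labels[k], 0.0) + math.exp(-d2 / (2 * sigma ** 2))
        errors += int(max(density, key=density.get) != labels[i])
    return errors / len(rows)

def bit_flip_descent(error, n_bits, rng, restarts=5):
    """Flip one bit at a time, keep flips that lower the error, restart several times."""
    best_mask, best_err = None, float("inf")
    for _ in range(restarts):
        mask = [rng.randint(0, 1) for _ in range(n_bits)]
        current = error(mask)
        improved = True
        while improved:
            improved = False
            for j in range(n_bits):
                mask[j] ^= 1
                trial = error(mask)
                if trial < current:
                    current, improved = trial, True
                else:
                    mask[j] ^= 1  # undo a flip that did not help
        if current < best_err:
            best_mask, best_err = mask[:], current
    return best_mask, best_err

if __name__ == "__main__":
    # Tiny hypothetical data set: feature 0 separates the classes, feature 1 does not.
    rows = [[0.0, 5.0], [0.1, -5.0], [0.2, 5.0], [3.0, -5.0], [3.1, 5.0], [2.9, -5.0]]
    labels = [0, 0, 0, 1, 1, 1]
    mask, err = bit_flip_descent(lambda m: pnn_loo_error(rows, labels, m), 2, random.Random(0))
    print(mask, err)
```

Each restart stops at a mask that no single flip can improve, which is why several restarts are used. A single leave-one-out estimate is used here only for brevity; the abstract's own conclusion is that resampling is needed for reliable results.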
Assessing similarity of feature selection techniques in high-dimensional domains
Recent research efforts attempt to combine multiple feature selection techniques instead of using a single one. However, this combination is often made on an “ad hoc” basis, depending on the specific problem at hand, without considering the degree of diversity/similarity of the methods involved. Moreover, though it is recognized that different techniques may return quite dissimilar outputs, especially in high-dimensional/small-sample-size domains, few direct comparisons exist that quantify these differences and their implications for classification performance. This paper aims to contribute in this direction by proposing a general methodology for assessing the similarity between the outputs of different feature selection methods in high-dimensional classification problems. Using the genomics domain as a benchmark, an empirical study has been conducted to compare some of the most popular feature selection methods, and useful insight has been obtained into their pattern of agreement.
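The abstract does not say which similarity measure the methodology uses. As one common possibility, the sketch below compares the feature subsets returned by different selectors with the Jaccard index; the choice of measure, the selector names, and the feature indices are all hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard index of two feature subsets: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty selections are treated as identical
    return len(a & b) / len(a | b)

def pairwise_similarity(selections):
    """Map each pair of method names to the similarity of their selected subsets."""
    return {
        (m1, m2): jaccard(selections[m1], selections[m2])
        for m1, m2 in combinations(sorted(selections), 2)
    }

if __name__ == "__main__":
    # Hypothetical top-ranked feature indices from three selectors.
    selections = {
        "filter_a": [1, 2, 3],
        "filter_b": [2, 3, 4],
        "wrapper_c": [7, 8, 9],
    }
    for pair, value in pairwise_similarity(selections).items():
        print(pair, round(value, 2))
```

A table of such pairwise values is one simple way to see which methods tend to agree before deciding which of them to combine.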
Hybrid Metaheuristics for Classification Problems
Many real-world classification problems require solutions that are both highly accurate and fast to compute. Due to the practical importance of many classification problems (such as crime detection), many algorithms have been developed to tackle them. For years, metaheuristics (MHs) have been successfully used for solving classification problems. Recently, hybrid metaheuristics have been successfully used for many real-world optimization problems such as flight scheduling and load balancing in telecommunication networks. This chapter investigates the use of this new interdisciplinary field for classification problems. Moreover, it presents the forms of metaheuristic hybridization and the design of a new hybrid metaheuristic.
An incremental approach to MSE-based feature selection
Feature selection plays an important role in classification systems. Using classifier error rate as the evaluation function, feature selection is integrated with incremental training. A neural network classifier is implemented with an incremental training approach to detect and discard irrelevant features. By learning attributes one after another, our classifier can directly find the attributes that make no contribution to classification. These attributes are marked and considered for removal. Combined with a Minimum Squared Error (MSE) based feature ranking scheme, four batch removal methods based on classifier error rate have been developed to discard irrelevant features. These feature selection methods significantly reduce the computational complexity involved in searching among a large number of possible solutions. Experimental results show that our feature selection methods work well on several benchmark problems compared with other feature selection methods. The selected subsets are further validated by a Constructive Backpropagation (CBP) classifier, which confirms increased classification accuracy and reduced training cost.