6,099 research outputs found
A generic optimising feature extraction method using multiobjective genetic programming
In this paper, we present a generic, optimising feature extraction method using multiobjective genetic programming. We re-examine the feature extraction problem and show that effective feature extraction can significantly enhance the performance of pattern recognition systems with simple classifiers. A framework is presented to evolve optimised feature extractors that transform an input pattern space into a decision space in which maximal class separability is obtained. We have applied this method to real world datasets from the UCI Machine Learning and StatLog databases to verify our approach and compare our proposed method with other reported results. We conclude that our algorithm is able to produce classifiers of superior (or equivalent) performance to the conventional classifiers examined, suggesting removal of the need to exhaustively evaluate a large family of conventional classifiers on any new problem. (C) 2010 Elsevier B.V. All rights reserved
Feature analysis of functional MRI data for mapping epileptic networks
Issued as final reportUniversity of Pennsylvani
Evolutionary Computation and QSAR Research
[Abstract] The successful high throughput screening of molecule libraries for a specific biological property is one of the main improvements in drug discovery. The virtual molecular filtering and screening relies greatly on quantitative structure-activity relationship (QSAR) analysis, a mathematical model that correlates the activity of a molecule with molecular descriptors. QSAR models have the potential to reduce the costly failure of drug candidates in advanced (clinical) stages by filtering combinatorial libraries, eliminating candidates with a predicted toxic effect and poor pharmacokinetic profiles, and reducing the number of experiments. To obtain a predictive and reliable QSAR model, scientists use methods from various fields such as molecular modeling, pattern recognition, machine learning or artificial intelligence. QSAR modeling relies on three main steps: molecular structure codification into molecular descriptors, selection of relevant variables in the context of the analyzed activity, and search of the optimal mathematical model that correlates the molecular descriptors with a specific activity. Since a variety of techniques from statistics and artificial intelligence can aid variable selection and model building steps, this review focuses on the evolutionary computation methods supporting these tasks. Thus, this review explains the basic of the genetic algorithms and genetic programming as evolutionary computation approaches, the selection methods for high-dimensional data in QSAR, the methods to build QSAR models, the current evolutionary feature selection methods and applications in QSAR and the future trend on the joint or multi-task feature selection methods.Instituto de Salud Carlos III, PIO52048Instituto de Salud Carlos III, RD07/0067/0005Ministerio de Industria, Comercio y Turismo; TSI-020110-2009-53)Galicia. ConsellerÃa de EconomÃa e Industria; 10SIN105004P
Recommended from our members
State-of-the-art on research and applications of machine learning in the building life cycle
Fueled by big data, powerful and affordable computing resources, and advanced algorithms, machine learning has been explored and applied to buildings research for the past decades and has demonstrated its potential to enhance building performance. This study systematically surveyed how machine learning has been applied at different stages of building life cycle. By conducting a literature search on the Web of Knowledge platform, we found 9579 papers in this field and selected 153 papers for an in-depth review. The number of published papers is increasing year by year, with a focus on building design, operation, and control. However, no study was found using machine learning in building commissioning. There are successful pilot studies on fault detection and diagnosis of HVAC equipment and systems, load prediction, energy baseline estimate, load shape clustering, occupancy prediction, and learning occupant behaviors and energy use patterns. None of the existing studies were adopted broadly by the building industry, due to common challenges including (1) lack of large scale labeled data to train and validate the model, (2) lack of model transferability, which limits a model trained with one data-rich building to be used in another building with limited data, (3) lack of strong justification of costs and benefits of deploying machine learning, and (4) the performance might not be reliable and robust for the stated goals, as the method might work for some buildings but could not be generalized to others. Findings from the study can inform future machine learning research to improve occupant comfort, energy efficiency, demand flexibility, and resilience of buildings, as well as to inspire young researchers in the field to explore multidisciplinary approaches that integrate building science, computing science, data science, and social science
Self learning neuro-fuzzy modeling using hybrid genetic probabilistic approach for engine air/fuel ratio prediction
Machine Learning is concerned in constructing models which can learn and make predictions based on data. Rule extraction from real world data that are usually tainted with noise, ambiguity, and uncertainty, automatically requires feature selection. Neuro-Fuzzy system (NFS) which is known with its prediction performance has the difficulty in determining the proper number of rules and the number of membership functions for each rule. An enhanced hybrid Genetic Algorithm based Fuzzy Bayesian
classifier (GA-FBC) was proposed to help the NFS in the rule extraction. Feature selection was performed in the rule level overcoming the problems of the FBC which depends on the frequency of the features leading to ignore the patterns of small classes. As dealing with a real world problem such as the Air/Fuel Ratio (AFR) prediction, a multi-objective problem is adopted. The GA-FBC uses mutual information entropy, which considers the relevance between feature attributes and class attributes. A fitness function is proposed to deal with multi-objective problem without weight using a new composition method. The model was compared to other learning algorithms for NFS such as Fuzzy c-means (FCM) and grid partition algorithm. Predictive accuracy and the complexity of the Fuzzy Rule Base System (FRBS) including number of rules and number of terms in each rule were taken as terms of evaluation. It was also compared to the original GA-FBC depending on the
frequency not on Mutual Information (MI). Experimental results using Air/Fuel Ratio
(AFR) data sets show that the new model participates in decreasing the average number of attributes in the rule and sometimes in increasing the average performance compared to other models. This work facilitates in achieving a self-generating FRBS from real data. The GA-FBC can be used as a new direction in machine learning research. This research contributes in controlling automobile emissions in helping the
reduction of one of the most causes of pollution to produce greener environment
- …