A new multi-objective wrapper method for feature selection – Accuracy and stability analysis for BCI

Author: Damas Hermoso Miguel
Gan John Q.
González Peñalver Jesús
Martín Smith Pedro Jesús
Ortega Lopera Julio
Publication venue: 'Elsevier BV'
Publication date: 01/01/2019

Feature selection is an important step in building classifiers for high-dimensional data problems, such as EEG classification for BCI applications. This paper proposes a new wrapper method for feature selection, based on a multi-objective evolutionary algorithm, where the representation of the individuals or potential solutions, along with the breeding operators and objective functions, have been carefully designed to select a small subset of features that has good generalization capability, trying to avoid the over-fitting problems that wrapper methods usually suffer. A novel feature ranking procedure is also proposed in order to analyze the stability of the proposed wrapper method. Four different classification schemes have been applied within the proposed wrapper method in order to evaluate its accuracy and stability for feature selection on a real motor imagery dataset. Experimental results show that the wrapper method presented in this paper is able to obtain very small subsets of features, which are quite stable and also achieve high classification accuracy, regardless of the classifiers used. Project TIN2015-67020-P (Spanish “Ministerio de Economía y Competitividad”). European Regional Development Funds (ERDF).

University of Essex Research Repository

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio Institucional Universidad de Granada

Machine learning for automatic prediction of the quality of electrophysiological recordings

Author: AB Wiltschko
BT Priest
C Mathes
CG Galizia
Dominique Martinez
F Franke
H Lei
Jean-Pierre Rospars
Johannes Reisert
M Asmild
MS Lewicki
R Friedrich
R Kohavi
S Panzeri
S Takahashi
SB Wilson
Shereen Elbanna
Sylvia Anton
T Nowotny
Thomas Nowotny
Y Saeys
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2013

The quality of electrophysiological recordings varies a lot due to technical and biological variability and neuroscientists inevitably have to select “good” recordings for further analyses. This procedure is time-consuming and prone to selection biases. Here, we investigate replacing human decisions by a machine learning approach. We define 16 features, such as spike height and width, select the most informative ones using a wrapper method and train a classifier to reproduce the judgement of one of our expert electrophysiologists. Generalisation performance is then assessed on unseen data, classified by the same or by another expert. We observe that the learning machine can be equally, if not more, consistent in its judgements as individual experts amongst each other. Best performance is achieved for a limited number of informative features; the optimal feature set being different from one data set to another. With 80–90% of correct judgements, the performance of the system is very promising within the data sets of each expert but judgments are less reliable when it is used across sets of recordings from different experts. We conclude that the proposed approach is relevant to the selection of electrophysiological recordings, provided parameters are adjusted to different types of experiments and to individual experimenters

Public Library of Science (PLOS)

Crossref

INRIA a CCSD electronic archive server

Directory of Open Access Journals

Sussex Research Online

FigShare
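
The abstract above describes selecting informative features with a wrapper method and then training a classifier on them. As a rough illustration of the general wrapper idea only, the sketch below runs a greedy forward search that scores each candidate feature subset by the leave-one-out accuracy of a classifier. It is not the authors' procedure: the nearest-centroid classifier, the greedy search, the synthetic data, and all function names are assumptions chosen to keep the example small and dependency-free.

```python
import random


def nearest_centroid_predict(train_X, train_y, x, feats):
    """Predict the label whose class centroid (over `feats`) is closest to x."""
    sums, counts = {}, {}
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(feats))
        for j, f in enumerate(feats):
            acc[j] += row[f]
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        dist = sum((acc[j] / counts[label] - x[f]) ** 2
                   for j, f in enumerate(feats))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(X, y, feats):
    """Leave-one-out accuracy of the classifier restricted to `feats`."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i], feats) == y[i]:
            hits += 1
    return hits / len(X)


def forward_select(X, y, max_features):
    """Greedy wrapper: add the feature that most improves accuracy, stop when none does."""
    selected, best_score = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        step_feat, step_score = None, best_score
        for f in remaining:
            score = loo_accuracy(X, y, selected + [f])
            if score > step_score:  # strict: ties keep the earlier feature
                step_feat, step_score = f, score
        if step_feat is None:
            break
        selected.append(step_feat)
        remaining.remove(step_feat)
        best_score = step_score
    return selected, best_score


# Synthetic data (an assumption, not from the paper): feature 0 separates
# the two classes cleanly, features 1..4 are pure noise.
rng = random.Random(0)
X, y = [], []
for label in (0, 1):
    for _ in range(10):
        informative = 5.0 * label + rng.uniform(-0.1, 0.1)
        X.append([informative] + [rng.uniform(-1, 1) for _ in range(4)])
        y.append(label)

selected, score = forward_select(X, y, max_features=3)
print(selected, score)
```

Because the classifier is retrained for every candidate subset, the cost of a wrapper grows quickly with the number of features, which is one reason wrapper studies tend to report small selected subsets.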

Evolving Spatially Aggregated Features from Satellite Imagery for Regional Modeling

Author: AE Hoerl
D Buckingham
J Bongard
J Dong
J Dozier
J Martinec
J Rees
JR Koza
K Krawiec
M Hollander
M Schmidt
M Tedesco
MD Schmidt
R Tibshirani
TH Painter
WR Tobler
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 14/12/2017

Satellite imagery and remote sensing provide explanatory variables at relatively high resolutions for modeling geospatial phenomena, yet regional summaries are often desirable for analysis and actionable insight. In this paper, we propose a novel method of inducing spatial aggregations as a component of the machine learning process, yielding regional model features whose construction is driven by model prediction performance rather than prior assumptions. Our results demonstrate that Genetic Programming is particularly well suited to this type of feature construction because it can automatically synthesize appropriate aggregations, as well as better incorporate them into predictive models compared to other regression methods we tested. In our experiments we consider a specific problem instance and real-world dataset relevant to predicting snow properties in high-mountain Asia

arXiv.org e-Print Archive

Crossref

Machine Learning and Integrative Analysis of Biomedical Big Data.

Author: Choi Howard
Chung Neo Christopher
Mirza Bilal
Ping Peipei
Wang Jie
Wang Wei
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues

Multidisciplinary Digital Publishing Institute

Ezid

Directory of Open Access Journals

eScholarship - University of California

Prediction of progression in idiopathic pulmonary fibrosis using CT scans at baseline: A quantum particle swarm optimization - Random forest approach

Author: Brown Matthew S.
Goldin Jonathan G.
Kim Grace Hyun J.
Shi Yu
Wong Weng Kee
Publication venue: eScholarship, University of California
Publication date: 19/08/2019

Idiopathic pulmonary fibrosis (IPF) is a fatal lung disease characterized by an unpredictable progressive decline in lung function. Natural history of IPF is unknown and the prediction of disease progression at the time of diagnosis is notoriously difficult. High resolution computed tomography (HRCT) has been used for the diagnosis of IPF, but not generally for monitoring purpose. The objective of this work is to develop a novel predictive model for the radiological progression pattern at voxel-wise level using only baseline HRCT scans. Mainly, there are two challenges: (a) obtaining a data set of features for region of interest (ROI) on baseline HRCT scans and their follow-up status; and (b) simultaneously selecting important features from high-dimensional space, and optimizing the prediction performance. We resolved the first challenge by implementing a study design and having an expert radiologist contour ROIs at baseline scans, depending on its progression status in follow-up visits. For the second challenge, we integrated the feature selection with prediction by developing an algorithm using a wrapper method that combines quantum particle swarm optimization to select a small number of features with random forest to classify early patterns of progression. We applied our proposed algorithm to analyze anonymized HRCT images from 50 IPF subjects from a multi-center clinical trial. We showed that it yields a parsimonious model with 81.8% sensitivity, 82.2% specificity and an overall accuracy rate of 82.1% at the ROI level. These results are superior to other popular feature selections and classification methods, in that our method produces higher accuracy in prediction of progression and more balanced sensitivity and specificity with a smaller number of selected features. Our work is the first approach to show that it is possible to use only baseline HRCT scans to predict progressive ROIs at 6 months to 1 year follow-ups using artificial intelligence

eScholarship - University of California

A survey on utilization of data mining approaches for dermatological (skin) diseases prediction

Author: Adibi N
Ahmadzadeh MR
Barati E
Mohammadi A
Saraee MH
Publication venue: Cyber Journals
Publication date: 01/03/2011

Due to recent technology advances, large volumes of medical data are obtained. These data contain valuable information. Therefore data mining techniques can be used to extract useful patterns. This paper is intended to introduce data mining and its various techniques and a survey of the available literature on medical data mining. We focus mainly on the application of data mining to skin diseases. A categorization has been provided based on the different data mining techniques. The utility of the various data mining methodologies is highlighted. Generally association mining is suitable for extracting rules. It has been used especially in cancer diagnosis. Classification is a robust method in medical mining. In this paper, we have summarized the different uses of classification in dermatology. It is one of the most important methods for diagnosis of erythemato-squamous diseases. There are different methods like Neural Networks, Genetic Algorithms and fuzzy classification in this topic. Clustering is a useful method in medical images mining. The purpose of clustering techniques is to find a structure for the given data by finding similarities between data according to data characteristics. Clustering has some applications in dermatology. Besides introducing different mining methods, we have investigated some challenges which exist in mining skin data

University of Salford Institutional Repository