Wavelet feature extraction and genetic algorithm for biomarker detection in colorectal cancer data
Biomarkers that predict patient survival can play an important role in medical diagnosis and
treatment. Selecting the significant biomarkers from hundreds of protein markers is a key step in
survival analysis. In this paper, a novel method is proposed to detect prognostic biomarkers of survival in colorectal cancer patients using wavelet analysis, a genetic algorithm, and a Bayes classifier. The one-dimensional discrete wavelet transform (DWT) is normally used to reduce the dimensionality of biomedical data. In this study, the one-dimensional continuous wavelet transform (CWT) is proposed to extract features from colorectal cancer data. The one-dimensional CWT cannot reduce the
dimensionality of the data, but it captures features that the DWT misses and therefore complements the DWT. A genetic algorithm was run on the extracted wavelet coefficients to select optimized features, with a Bayes classifier used to build its fitness function. The corresponding protein markers were
located from the positions of the optimized features. Kaplan-Meier curves and the Cox regression model were used to evaluate the performance of the selected biomarkers. Experiments were conducted on a colorectal cancer dataset, and several significant biomarkers were detected. A new protein biomarker, CD46, was found to be significantly associated with survival time.
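To illustrate the dimensionality point made in the abstract above, the following minimal Python sketch contrasts a one-level Haar DWT with a naive CWT. It is not the authors' implementation: the Haar and (unnormalized) Mexican-hat wavelets, the toy signal, the scale set, and the function names `haar_dwt`, `mexican_hat`, and `cwt` are all assumptions chosen for illustration.

```python
import math

def haar_dwt(signal):
    """One level of the Haar DWT: (approximation, detail), each half the input length."""
    s = 1 / math.sqrt(2)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail

def mexican_hat(t):
    """Unnormalized Mexican-hat (Ricker) wavelet; an illustrative choice."""
    return (1 - t * t) * math.exp(-t * t / 2)

def cwt(signal, scales):
    """Naive CWT: one row of len(signal) coefficients for every scale."""
    n = len(signal)
    rows = []
    for a in scales:
        row = []
        for b in range(n):
            # Correlate the signal with the wavelet shifted to b and dilated by a.
            total = sum(signal[k] * mexican_hat((k - b) / a) for k in range(n))
            row.append(total / math.sqrt(a))
        rows.append(row)
    return rows

signal = [0.0, 1.0, 4.0, 1.0, 0.0, 0.0, 2.0, 0.0]  # toy data, not from the paper
approx, detail = haar_dwt(signal)
coeffs = cwt(signal, scales=[1, 2, 4])

print(len(signal), len(approx), len(detail))  # 8 4 4
print(len(coeffs), len(coeffs[0]))            # 3 8
```

The DWT approximation has half as many values as the input, while the CWT returns a full-length row per scale, so a separate selection step (a genetic algorithm in the abstract) is needed to pick a small set of coefficients.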
Application of artificial neural network in market segmentation: A review on recent trends
Despite the significance of the Artificial Neural Network (ANN) algorithm to
market segmentation, there is a need for a comprehensive literature review and a
classification system to identify future trends in market
segmentation research. The present work is the first identifiable academic
literature review of the application of neural network based techniques to
segmentation. Our study provides an academic database of literature for the
period 2000-2010 and proposes a classification scheme for the articles.
One thousand (1000) articles have been identified, and around 100 relevant
selected articles have been subsequently reviewed and classified based on the
major focus of each paper. The findings of this study indicate that ANN based
applications are receiving the most research attention and that self
organizing map based applications are the second most used in
segmentation. The commonly used models for market segmentation are data mining,
intelligent systems, etc. Our analysis furnishes a roadmap to guide future
research and aid knowledge accretion and establishment pertaining to the
application of ANN based techniques in market segmentation. Thus the present
work will significantly contribute to both the industry and academic research
in business and marketing as a sustainable valuable knowledge source of market
segmentation with the future trend of ANN application in segmentation.
Comment: 24 pages, 7 figures, 3 tables
Feature selection using genetic algorithms and probabilistic neural networks
Selection of input variables is a key stage in building predictive models, and an important form of data mining. As exhaustive evaluation of potential input sets using full non-linear models is impractical, it is necessary to use simple fast-evaluating models and heuristic selection strategies. This paper discusses a fast, efficient, and powerful nonlinear input selection procedure using a combination of Probabilistic Neural Networks and repeated bitwise gradient descent. The algorithm is compared with forward elimination, backward elimination and genetic algorithms using a selection of real-world data sets. The algorithm has comparable performance and greatly reduced execution time with respect to these alternative approaches. It is demonstrated empirically that reliable results cannot be gained using any of these approaches without the use of resampling.
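The abstract above gives no algorithmic detail, so the sketch below is only one plausible reading of "bitwise gradient descent": a greedy search that flips one bit of a feature mask at a time and keeps a flip only if it strictly improves a leave-one-out accuracy estimate from a Parzen-window classifier (the core of a PNN). The toy data, the kernel width `sigma`, and all function names are hypothetical.

```python
import math

def pnn_predict(train, labels, x, mask, sigma=0.3):
    """Parzen-window (PNN-style) prediction using only the features kept by mask."""
    sums, counts = {}, {}
    for row, label in zip(train, labels):
        d2 = sum((a - b) ** 2 for a, b, keep in zip(row, x, mask) if keep)
        sums[label] = sums.get(label, 0.0) + math.exp(-d2 / (2 * sigma ** 2))
        counts[label] = counts.get(label, 0) + 1
    # Class with the largest average kernel response; ties go to the smallest label.
    return max(sorted(sums), key=lambda c: sums[c] / counts[c])

def loo_accuracy(data, labels, mask):
    """Leave-one-out accuracy, a simple form of resampling."""
    hits = 0
    for i, (x, y) in enumerate(zip(data, labels)):
        rest = data[:i] + data[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        hits += pnn_predict(rest, rest_labels, x, mask) == y
    return hits / len(data)

def bit_flip_search(data, labels, n_features):
    """Flip one mask bit at a time; keep a flip only if the score strictly improves."""
    mask = [False] * n_features
    best = loo_accuracy(data, labels, mask)
    improved = True
    while improved:
        improved = False
        for j in range(n_features):
            mask[j] = not mask[j]
            score = loo_accuracy(data, labels, mask)
            if score > best:
                best, improved = score, True
            else:
                mask[j] = not mask[j]  # revert the flip
    return mask, best

# Toy data (hypothetical): feature 0 separates the classes, features 1 and 2 do not.
data = [[0.0, 0.5, 0.1], [0.1, 0.1, 0.9], [0.2, 0.9, 0.5], [0.3, 0.3, 0.3],
        [1.0, 0.5, 0.1], [1.1, 0.1, 0.9], [1.2, 0.9, 0.5], [1.3, 0.3, 0.3]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

mask, best = bit_flip_search(data, labels, n_features=3)
print(mask, best)  # [True, False, False] 1.0
```

Scoring every candidate mask on held-out points reflects the abstract's remark that selection results are unreliable without resampling; a real study would likely use repeated cross-validation or bootstrapping rather than a single leave-one-out pass.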
Comparison Between Supervised and Unsupervised Classifications of Neuronal Cell Types: A Case Study
In the study of neural circuits, it becomes essential to discern the different neuronal cell types that build the circuit. Traditionally, neuronal cell types have been classified using qualitative descriptors. More recently, several attempts have been made to classify neurons quantitatively, using unsupervised clustering methods. While useful, these algorithms do not take advantage of previous information known to the investigator, which could improve the classification task. For neocortical GABAergic interneurons, the problem of discerning among different cell types is particularly difficult and better methods are needed to perform objective classifications. Here we explore the use of supervised classification algorithms to classify neurons based on their morphological features, using a database of 128 pyramidal cells and 199 interneurons from mouse neocortex. To evaluate the performance of different algorithms we used, as a “benchmark,” the task of automatically distinguishing between pyramidal cells and interneurons, defining “ground truth” by the presence or absence of an apical dendrite. We compared hierarchical clustering with a battery of different supervised classification algorithms, finding that supervised classifications outperformed hierarchical clustering. In addition, the selection of subsets of distinguishing features enhanced the classification accuracy for both sets of algorithms. The analysis of selected variables indicates that dendritic features were most useful to distinguish pyramidal cells from interneurons when compared with somatic and axonal morphological variables. We conclude that supervised classification algorithms are better matched to the general problem of distinguishing neuronal cell types when some information on these cell groups, in our case being pyramidal or interneuron, is known a priori.
As a spin-off of this methodological study, we provide several methods to automatically distinguish neocortical pyramidal cells from interneurons, based on their morphologies.
Feature selection when there are many influential features
Recent discussion of the success of feature selection methods has argued that
focusing on a relatively small number of features has been counterproductive.
Instead, it is suggested, the number of significant features can be in the
thousands or tens of thousands, rather than (as is commonly supposed at
present) approximately in the range from five to fifty. This change, in orders
of magnitude, in the number of influential features, necessitates alterations
to the way in which we choose features and to the manner in which the success
of feature selection is assessed. In this paper, we suggest a general approach
that is suited to cases where the number of relevant features is very large,
and we consider particular versions of the approach in detail. We propose ways
of measuring performance, and we study both theoretical and numerical
properties of the proposed methodology.
Comment: Published in Bernoulli (http://isi.cbs.nl/bernoulli/) at http://dx.doi.org/10.3150/13-BEJ536 by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)