Feature selection using Haar wavelet power spectrum

Author: A Blum
A Califano
AO Cavazzana
C Bhattacharyya
CK Chui
D Michie
DW Aha
E Xiang
F Abramovich
FL Ramsey
G Forman
G Strang
I Daubechies
I Guyon
I Kononenko
J Doak
J Khan
J Khan
J Li
K Kira
KE Lee
L Li
LD Miller
M Smith
MA Shipp
Mallat Stephanie
NK Kasabov
OM El-Badry
P Lio
P Yau
Prabakaran Subramani
R Caruana
R Kohavi
Rajendra Sahu
RO Duda
S Kim
S Kim
Shekhar Verma
T Li
T Nagano
TA Kestin
TR Golub
VL Savchenko
X Zhou
X Zhou
X Zhou
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Feature selection is an approach to overcoming the 'curse of dimensionality' in complex research problems such as disease classification using microarrays. Statistical methods dominate this domain, but most of them do not suit a wide range of datasets. Transform-oriented signal processing techniques have been little explored here, although fields such as image and video processing use them well. Wavelets, one such technique, have the potential to be used in feature selection. The aim of this paper is to assess the capability of the Haar wavelet power spectrum for clustering and gene selection based on expression data in the context of disease classification, and to propose a method based on it. RESULTS: The Haar wavelet power spectra of genes were analysed and observed to differ between diagnostic categories. This difference in the trend and magnitude of the spectrum may be used in gene selection. Most of the genes selected by earlier, more complex methods were also selected by the very simple present method. Earlier works showed that only a few genes are enough to approach the classification problem [1]. Hence the present method may be tried in conjunction with other classification methods. The technique was applied without removing noise from the data, to validate the robustness of the method against noise and outliers. No special software or complex implementation is needed. The quality of the genes selected by the present method was analysed through their gene expression data. Most of them were observed to be relevant to the classification problem, since they were dominant in the diagnostic category of the dataset for which they were selected as features. CONCLUSION: In the present paper, the problem of feature selection for microarray gene expression data was considered. We analyzed the wavelet power spectrum of genes and proposed a clustering and feature selection method, based on the Haar wavelet power spectrum, that is useful for classification. Application of this technique in this area is novel, simple, faster than other methods, and fit for a wide range of data types. The results are encouraging and shed light on the possibility of using this technique in problem domains such as disease classification, gene network identification and personalized drug design.
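As a rough illustration of the quantity this abstract refers to, the sketch below computes a Haar wavelet power spectrum for one expression profile. The abstract does not give the authors' exact definition, so the choices here are assumptions: power at each level is taken as the sum of squared orthonormal Haar detail coefficients, the input length is required to be a power of two, and the function name `haar_power_spectrum` is hypothetical.

```python
import math


def haar_power_spectrum(signal):
    """Return the energy of the Haar detail coefficients per level, finest level first.

    Assumption of this sketch: len(signal) is a power of two and at least 2.
    """
    n = len(signal)
    if n < 2 or n & (n - 1):
        raise ValueError("length must be a power of two and at least 2")

    approx = [float(x) for x in signal]
    spectrum = []
    while len(approx) > 1:
        # Pair up neighbouring samples: (x0, x1), (x2, x3), ...
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Orthonormal Haar step: scaled differences are the details,
        # scaled sums are the approximation passed to the next level.
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        spectrum.append(sum(d * d for d in detail))
    return spectrum


# Hypothetical expression values for one gene across four samples.
print(haar_power_spectrum([4, 2, 5, 5]))
```

Comparing such spectra for the same gene across diagnostic categories is one way to read the "difference in trend and magnitude" described in the results; how the authors rank or threshold genes is not stated in the abstract.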

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Elephant Search with Deep Learning for Microarray Data Analysis

Author: Panda Mrutyunjaya
Publication date: 12/07/2017

Even though there is a plethora of research on microarray gene expression data analysis, it still poses challenges for researchers seeking to analyze the large yet complex expression of genes effectively and efficiently. Feature (gene) selection is of paramount importance for understanding the differences between biological and non-biological variation across samples. To address this problem, a novel elephant search (ES) based optimization is proposed to select the best gene expressions from the large volume of microarray data. Further, a promising machine learning method is envisioned to leverage such high-dimensional and complex microarray datasets, extracting hidden patterns to make meaningful predictions and accurate classifications. In particular, stochastic gradient descent based deep learning (DL) with a softmax activation function is used on the reduced features (genes) for better classification of different samples according to their gene expression levels. The experiments are carried out on nine of the most popular cancer microarray gene selection datasets, obtained from the UCI machine learning repository. The empirical results obtained by the proposed elephant search based deep learning (ESDL) approach are compared with the most recent published article to assess its suitability for future bioinformatics research. Comment: 12 pages, 5 Tabl
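For readers unfamiliar with the softmax activation mentioned in this abstract, a minimal sketch follows. It shows only the standard softmax function that turns a vector of class scores into probabilities; it is not the ESDL network itself, and the function name and the example scores are hypothetical.

```python
import math


def softmax(scores):
    """Map raw class scores to probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing on large scores
    # and does not change the result.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three sample classes.
probabilities = softmax([2.0, 1.0, 0.1])
print(probabilities)
```

The class with the largest score receives the largest probability, which is how a softmax output layer is typically read as a classification decision.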

arXiv.org e-Print Archive

A novel neural network approach to cDNA microarray image segmentation

Author: Adams
Bachar Zineddin
Bajcsy
Bishop
Blekas
Blekas
Bozinov
Bozinov
Buckley
Burt
Demirkaya
Demuth
DeRisi
DeRisi
Eisen
Fausett
Fraser
Fraser
Ham
Haykin
Hebb
Jain
Jie Cao
Jinling Liang
Katzer
Lawrence
Lehmussola
Li
Liao
Lukac
Lukac
Mata
MathWorks
McCulloch
Min Du
Moore
Morris
Nianyin Zeng
Noda
Orengo
Otsu
Schena
Srinark
Tran
Wang
Wang
Whitchurch
Wit
Xiaohui Liu
Yang
Yurong Li
Zidong Wang
Zineddin
Publication venue: Elsevier BV
Publication date: 01/07/2013

This is the post-print version of the article. Copyright © 2013 Elsevier. Microarray technology has become a great source of information for biologists seeking to understand the workings of DNA, which is one of the most complex codes in nature. Microarray images typically contain several thousand small spots, each of which represents a different gene in the experiment. One of the key steps in extracting information from a microarray image is segmentation, whose aim is to identify which pixels within an image represent which gene. This task is greatly complicated by noise within the image and by wide variation in the values of the pixels belonging to a typical spot. Many methods have been proposed in the past for the segmentation of microarray images. In this paper, a new method is proposed that uses a series of artificial neural networks based on multi-layer perceptron (MLP) and Kohonen networks. The proposed method is applied to a set of real-world cDNA images. Quantitative comparisons between the proposed method and the commercial software GenePix® are carried out in terms of the peak signal-to-noise ratio (PSNR). The method is shown not only to deliver results comparable and even superior to existing techniques but also to have a faster run time. This work was funded in part by the National Natural Science Foundation of China under Grants 61174136 and 61104041, the Natural Science Foundation of Jiangsu Province of China under Grant BK2011598, the International Science and Technology Cooperation Project of China under Grant No. 2011DFA12910, the Engineering and Physical Sciences Research Council (EPSRC) of the U.K. under Grant GR/S27658/01, the Royal Society of the U.K., and the Alexander von Humboldt Foundation of Germany.
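The PSNR metric named in this abstract has a standard definition, PSNR = 10 log10(MAX² / MSE), which the sketch below implements for two images given as flat lists of pixel values. The abstract does not say which peak value or reference image the authors used, so the default peak of 255 (8-bit pixels) and the function name `psnr` are assumptions; scanned microarray images are often 16-bit, in which case `max_value` would be 65535.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if not reference or len(reference) != len(test):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 image, flattened, and a slightly perturbed copy.
print(psnr([10, 20, 30, 40], [11, 21, 31, 41]))
```

A higher PSNR means the test image is closer to the reference, which is the sense in which the segmentation results are compared.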

Crossref

Brunel University Research Archive

Periodicity, Change Detection and Prediction in Microarrays

Author: Islam Mohammad Shahidul
Publication venue: Scholarship@Western
Publication date: 01/01/2008

Three topics in the analysis of microarray genomic data are discussed, and improved statistical methods are developed in each case. First, a statistical test with higher power is developed for detecting periodicity in microarray time series data. Periodicity in short series with non-Fourier frequencies is detected through a Pearson curve calibrated to the null distribution obtained by computer simulation. Unlike traditional methods, this approach is applicable even in the presence of missing values or unequal time intervals. The usefulness of the new method is demonstrated on simulated series as well as actual microarray time series. The second topic develops a new method for detecting changes in DNA or gene copy number. Regions of DNA copy number aberration in chromosomal material are detected using the maximal overlap discrete wavelet transform (MODWT). It is shown how repeated application of the MODWT to a series can be used to confirm the presence of change points. Application to simulated as well as array CGH (Comparative Genomic Hybridization) data confirms the excellent performance of this method. In the third topic, an improved class predictor for tissue samples in microarray experiments is developed by incorporating nearest neighbour covariates (NNC). It is demonstrated that this method reduces misclassification errors in both simulated and actual microarray data.

Scholarship@Western

Biomarker discovery and redundancy reduction towards classification using a multi-factorial MALDI-TOF MS T2DM mouse model dataset

Author: A Chadt
A Colorni
A Gamez-Pozo
A Rasche
A Tiss
A Tiss
AC Sauve
AL Oberg
Alexandra Chadt
Ali Tiss
B Wu
C Bauer
C Mercier
C Yang
Celia J Smith
Chris Bauer
D Kwon
D Mantini
DB West
Dieter Beule
E Lange
EP Xing
Frank Kleinjung
G Ge
GK Smyth
H Ressom
Hadi Al-Hasani
HS Jurgens
HS Jürgens
I Guyon
J Hua
J McGuire
J Norris
J Voortman
JE Shaw
JF Timms
JL Rodgers
Johannes Schuchhardt
Johnson RAaBGK
JR Ortlepp
K Coombes
Knut Reinert
L Breiman
M Dorigo
M Kirchner
M Palmblad
M Sturm
Mark W Towers
ME de Noo
MJ Crawley
MP van der Werff
N Tiffin
O Kohlbacher
P Du
P Pratapa
P Zhang
PV Rao
Q Liu
R Aebersold
R Cramer
Rainer Cramer
RC Gentleman
Robert Gentleman and Vince Carey and Wolfgang Huber and Rafael Irizarry and Sandrine Dudoit (Ed)
SM Carlson
T Alexandrov
T Dreja
T Hastie
Tanja Dreja
W Yu
X Liu
X Zhang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2011

Diabetes, like many diseases and biological processes, is not mono-causal. On the one hand, multifactorial studies with complex experimental designs are required for its comprehensive analysis. On the other hand, the data from these studies often include a substantial amount of redundancy, such as proteins that are typically represented by a multitude of peptides. Coping simultaneously with both complexities (experimental and technological) makes data analysis a challenge for bioinformatics.

Central Archive at the University of Reading

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central