33,484 research outputs found
Of mice and men: Sparse statistical modeling in cardiovascular genomics
In high-throughput genomics, large-scale designed experiments are becoming
common, and analysis approaches based on highly multivariate regression and
anova concepts are key tools. Shrinkage models of one form or another can
provide comprehensive approaches to the problems of simultaneous inference that
involve implicit multiple comparisons over the many, many parameters
representing effects of design factors and covariates. We use such approaches
here in a study of cardiovascular genomics. The primary experimental context
concerns a carefully designed, and rich, gene expression study focused on
gene-environment interactions, with the goals of identifying genes implicated
in connection with disease states and known risk factors, and in generating
expression signatures as proxies for such risk factors. A coupled exploratory
analysis investigates cross-species extrapolation of gene expression
signatures--how these mouse-model signatures translate to humans. The latter
involves exploration of sparse latent factor analysis of human observational
data and of how it relates to projected risk signatures derived in the animal
models. The study also highlights a range of applied statistical and genomic
data analysis issues, including model specification, computational questions
and model-based correction of experimental artifacts in DNA microarray data.Comment: Published at http://dx.doi.org/10.1214/07-AOAS110 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
A cDNA Microarray Gene Expression Data Classifier for Clinical Diagnostics Based on Graph Theory
Despite great advances in discovering cancer molecular profiles, the proper application of microarray technology to routine clinical diagnostics is still a challenge. Current practices in the classification of microarrays' data show two main limitations: the reliability of the training data sets used to build the classifiers, and the classifiers' performances, especially when the sample to be classified does not belong to any of the available classes. In this case, state-of-the-art algorithms usually produce a high rate of false positives that, in real diagnostic applications, are unacceptable. To address this problem, this paper presents a new cDNA microarray data classification algorithm based on graph theory and is able to overcome most of the limitations of known classification methodologies. The classifier works by analyzing gene expression data organized in an innovative data structure based on graphs, where vertices correspond to genes and edges to gene expression relationships. To demonstrate the novelty of the proposed approach, the authors present an experimental performance comparison between the proposed classifier and several state-of-the-art classification algorithm
Differential gene expression graphs: A data structure for classification in DNA microarrays
This paper proposes an innovative data structure to be used as a backbone in designing microarray phenotype sample classifiers. The data structure is based on graphs and it is built from a differential analysis of the expression levels of healthy and diseased tissue samples in a microarray dataset. The proposed data structure is built in such a way that, by construction, it shows a number of properties that are perfectly suited to address several problems like feature extraction, clustering, and classificatio
Application of new probabilistic graphical models in the genetic regulatory networks studies
This paper introduces two new probabilistic graphical models for
reconstruction of genetic regulatory networks using DNA microarray data. One is
an Independence Graph (IG) model with either a forward or a backward search
algorithm and the other one is a Gaussian Network (GN) model with a novel
greedy search method. The performances of both models were evaluated on four
MAPK pathways in yeast and three simulated data sets. Generally, an IG model
provides a sparse graph but a GN model produces a dense graph where more
information about gene-gene interactions is preserved. Additionally, we found
two key limitations in the prediction of genetic regulatory networks using DNA
microarray data, the first is the sufficiency of sample size and the second is
the complexity of network structures may not be captured without additional
data at the protein level. Those limitations are present in all prediction
methods which used only DNA microarray data.Comment: 38 pages, 3 figure
Viral pathogen discovery.
Viral pathogen discovery is of critical importance to clinical microbiology, infectious diseases, and public health. Genomic approaches for pathogen discovery, including consensus polymerase chain reaction (PCR), microarrays, and unbiased next-generation sequencing (NGS), have the capacity to comprehensively identify novel microbes present in clinical samples. Although numerous challenges remain to be addressed, including the bioinformatics analysis and interpretation of large datasets, these technologies have been successful in rapidly identifying emerging outbreak threats, screening vaccines and other biological products for microbial contamination, and discovering novel viruses associated with both acute and chronic illnesses. Downstream studies such as genome assembly, epidemiologic screening, and a culture system or animal model of infection are necessary to establish an association of a candidate pathogen with disease
A novel neural network approach to cDNA microarray image segmentation
This is the post-print version of the Article. The official published version can be accessed from the link below. Copyright @ 2013 Elsevier.Microarray technology has become a great source of information for biologists to understand the workings of DNA which is one of the most complex codes in nature. Microarray images typically contain several thousands of small spots, each of which represents a different gene in the experiment. One of the key steps in extracting information from a microarray image is the segmentation whose aim is to identify which pixels within an image represent which gene. This task is greatly complicated by noise within the image and a wide degree of variation in the values of the pixels belonging to a typical spot. In the past there have been many methods proposed for the segmentation of microarray image. In this paper, a new method utilizing a series of artificial neural networks, which are based on multi-layer perceptron (MLP) and Kohonen networks, is proposed. The proposed method is applied to a set of real-world cDNA images. Quantitative comparisons between the proposed method and commercial software GenePix(®) are carried out in terms of the peak signal-to-noise ratio (PSNR). This method is shown to not only deliver results comparable and even superior to existing techniques but also have a faster run time.This work was funded in part by the National Natural Science Foundation of China under Grants 61174136 and 61104041, the Natural Science Foundation of Jiangsu Province of China under Grant BK2011598, the International Science and Technology Cooperation Project of China under Grant No. 2011DFA12910, the Engineering and Physical Sciences Research Council (EPSRC) of the U.K. under Grant GR/S27658/01, the Royal Society of the U.K., and the Alexander von Humboldt Foundation of Germany
- …