67 research outputs found

Automatic Gridding for DNA Microarray Image Using Image Projection Profile

Author: Siswantoro Joko
Publication date: 01/01/2010

DNA microarrays are a powerful tool, widely used in many areas. A DNA microarray is produced from control and test tissue sample cDNAs, which are labeled with two different fluorescent dyes. After hybridization, microarray images are obtained using a laser scanner. Image analysis plays an important role in extracting fluorescence intensity from the microarray image. The first step in microarray image analysis is addressing, that is, using grid lines to find the areas of the image that each contain one spot. This step can be done either manually or automatically. In this paper we propose an efficient and simple automatic gridding method for microarray image analysis using the image projection profile, based on the fact that a microarray image has local intensity minima in background areas and local maxima in foreground areas. Grid lines are obtained by finding the local minima of the vertical and horizontal projection profiles. The algorithm has been implemented in MATLAB and tested with several microarray images
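
The projection-profile idea in this abstract can be illustrated with a minimal sketch. It is not the authors' MATLAB implementation: the function names `projection_profiles` and `local_minima` are hypothetical, the image is assumed to be a plain 2-D list of intensities, and a real scan would likely need smoothing before its minima are usable as grid lines.

```python
def projection_profiles(image):
    """Return (row_profile, col_profile) for a 2-D list of pixel intensities.

    row_profile[i] is the sum of row i (horizontal projection);
    col_profile[j] is the sum of column j (vertical projection).
    """
    row_profile = [sum(row) for row in image]
    col_profile = [sum(col) for col in zip(*image)]
    return row_profile, col_profile


def local_minima(profile):
    """Return indices of local minima of a 1-D profile.

    A flat run of equal values that is lower than both neighbours
    (or touches the border) yields one index: the middle of the run.
    """
    minima = []
    i, n = 0, len(profile)
    while i < n:
        j = i
        # Extend j to the end of the run of values equal to profile[i].
        while j + 1 < n and profile[j + 1] == profile[i]:
            j += 1
        left_ok = i == 0 or profile[i - 1] > profile[i]
        right_ok = j == n - 1 or profile[j + 1] > profile[i]
        if left_ok and right_ok:
            minima.append((i + j) // 2)
        i = j + 1
    return minima


# Toy 7x7 "microarray": four bright 2x2 spots on a dark background.
spot_row = [0, 9, 9, 0, 9, 9, 0]
blank = [0] * 7
image = [blank, spot_row, spot_row, blank, spot_row, spot_row, blank]

rows, cols = projection_profiles(image)
horizontal_lines = local_minima(rows)  # candidate grid lines between spot rows
vertical_lines = local_minima(cols)    # candidate grid lines between spot columns
```

On this toy image both profiles dip to zero exactly at the background rows and columns, so the minima fall on the gaps between spots.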

University of Surabaya Institutional Repository

M3G: Maximum Margin Microarray Gridding

Author: Biodiscovery Inc
C Cortes
CC Chang
CJC Burges
D Juric
DG Bariamis
Dimitris Bariamis
Dimitris K Iakovidis
Dimitris Maroulis
E Zacharia
G Antoniol
HY Jung
J Angulo
J Platt
Jr Hirata R
K Blekas
K Hartelius
L Rueda
M Ceccarelli
M Katzer
M Katzer
MB Eisen
N Brändle
N Giannakeas
N Otsu
ND Lawrence
P Bajcsy
P Hegde
RC Gonzalez
RE Fan
S Theodoridis
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Complementary DNA (cDNA) microarrays are a well established technology for studying gene expression. A microarray image is obtained by laser scanning a hybridized cDNA microarray, which consists of thousands of spots representing chains of cDNA sequences, arranged in a two-dimensional array. The separation of the spots into distinct cells is widely known as microarray image gridding. Methods: In this paper we propose M3G, a novel method for automatic gridding of cDNA microarray images based on the maximization of the margin between the rows and the columns of the spots. Initially the microarray image rotation is estimated and then a pre-processing algorithm is applied for a rough spot detection. In order to diminish the effect of artefacts, only a subset of the detected spots is selected by matching the distribution of the spot sizes to the normal distribution. Then, a set of grid lines is placed on the image in order to separate each pair of consecutive rows and columns of the selected spots. The optimal positioning of the lines is determined by maximizing the margin between these rows and columns by using a maximum margin linear classifier, effectively facilitating the localization of the spots. Results: The experimental evaluation was based on a reference set of microarray images containing more than two million spots in total. The results show that M3G outperforms state of the art methods, demonstrating robustness in the presence of noise and artefacts. More than 98% of the spots reside completely inside their respective grid cells, whereas the mean distance between the spot center and the grid cell center is 1.2 pixels. Conclusions: The proposed method performs highly accurate gridding in the presence of noise and artefacts, while taking into account the input image rotation. Thus, it provides the potential of achieving perfect gridding for the vast majority of the spots.
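
To make the margin idea concrete, here is a deliberately simplified sketch, not the M3G algorithm itself. It assumes the image is already rotation-corrected, so each grid line is axis-aligned, and it represents each spot as a hypothetical `(center, radius)` pair along the axis perpendicular to the line. In this one-dimensional special case the maximum-margin separator is simply the midpoint of the gap between the two rows; the abstract's linear classifier handles the general case.

```python
def max_margin_line(upper_row_spots, lower_row_spots):
    """Place an axis-aligned grid line between two consecutive rows of spots.

    Each spot is a (center, radius) pair measured along the axis
    perpendicular to the grid line. Returns (position, margin), where
    margin is the distance from the line to the nearest spot boundary.
    """
    # Farthest-reaching boundary of the upper row and nearest boundary
    # of the lower row: these are the "support" spots.
    upper_edge = max(center + radius for center, radius in upper_row_spots)
    lower_edge = min(center - radius for center, radius in lower_row_spots)
    if upper_edge >= lower_edge:
        raise ValueError("rows overlap; no axis-aligned separating line exists")
    position = (upper_edge + lower_edge) / 2
    margin = (lower_edge - upper_edge) / 2
    return position, margin


# Two rows of spots, given by their vertical centers and radii in pixels.
upper = [(10, 3), (11, 4)]
lower = [(30, 4), (29, 2)]
position, margin = max_margin_line(upper, lower)
```

Only the spots closest to the gap determine the line, which is why a few large artefacts in either row could shift it; the abstract's spot-selection step is aimed at that problem.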

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Studying the Functional Genomics of Stress Responses in Loblolly Pine With the Expresso Microarray Experiment Management System

Author: Aharoni
Alexandre
Alscher
Bard
Boris I. Chevone
Brachat
Brown
Callis
Chang
Chen
Cho
Chu
Claverie
Costa
Costa
Craig A. Struble
Cushman
Daniels
Dawei Chen
Degenhardt
Donahue
Dong
Dzeroski
Eisen
Epstein
Flach
Fraley
Gallant
Gang
Garofalakis
Gasch
Geisler
Gilchrest
Golub
Gracey
Greller
Hilsenbeck
Hong
Jain
Jelinsky
Jordan
Kannan
Kawasaki
Khan
Lavrac
Lazzeroni
Lee
Lenwood S. Heath
Leonel van Zyl
Lev-Yadun
May
Monni
Muggleton
Muggleton
Mullineaux
Naren Ramakrishnan
Perou
Reymond
Rial
Ronald R. Sederoff
Ross W. Whetten
Ruan
Ruth Grene
Scandalios
Schaffer
Schnaider
Seki
Sherlock
Shinozaki
Shinozaki
Smyth
Somerville
Srinivasan
Sullivan
Uno
Vapnik
Vincent Y. Jouenne
Wang
Wang
White
Wu
Yang
Zhu
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2002

Conception, design, and implementation of cDNA microarray experiments present a variety of bioinformatics challenges for biologists and computational scientists. The multiple stages of data acquisition and analysis have motivated the design of Expresso, a system for microarray experiment management. Salient aspects of Expresso include support for clone replication and randomized placement; automatic gridding, extraction of expression data from each spot, and quality monitoring; flexible methods of combining data from individual spots into information about clones and functional categories; and the use of inductive logic programming for higher-level data analysis and mining. The development of Expresso is occurring in parallel with several generations of microarray experiments aimed at elucidating genomic responses to drought stress in loblolly pine seedlings. The current experimental design incorporates 384 pine cDNAs replicated and randomly placed in two specific microarray layouts. We describe the design of Expresso as well as results of analysis with Expresso that suggest the importance of molecular chaperones and membrane transport proteins in mechanisms conferring successful adaptation to long-term drought stress

epublications@Marquette

Crossref

Directory of Open Access Journals

PubMed Central

Methods to improve gene signal : Application to cDNA microarrays

Author: Gupta Rashi
Publication venue: University of Helsinki Libraries
Publication date: 17/04/2009

Microarrays are high-throughput biological assays that allow the screening of thousands of genes for their expression. The main idea behind microarrays is to compute for each gene a unique signal that is directly proportional to the quantity of mRNA that was hybridized on the chip. A large number of steps, and the errors associated with each step, make the generated expression signal noisy. As a result, microarray data need to be carefully pre-processed before their analysis can be assumed to lead to reliable and biologically relevant conclusions. This thesis focuses on developing methods for improving gene signal and further utilizing this improved signal for higher-level analysis. To achieve this, first, approaches for designing microarray experiments using various optimality criteria, considering both biological and technical replicates, are described. A carefully designed experiment leads to signal with low noise, as the effect of unwanted variations is minimized and the precision of the estimates of the parameters of interest is maximized. Second, a system for improving the gene signal by using three scans at varying scanner sensitivities is developed. A novel Bayesian latent intensity model is then applied to these three sets of expression values, corresponding to the three scans, to estimate the suitably calibrated true signal of genes. Third, a novel image segmentation approach that segregates the fluorescent signal from the undesired noise is developed using an additional dye, SYBR green RNA II. This technique helped in identifying signal only with respect to the hybridized DNA, while avoiding signal corresponding to dust, scratches, dye spills, and other noise. Fourth, an integrated statistical model is developed, where signal correction, systematic array effects, dye effects, and differential expression are modelled jointly, as opposed to a sequential application of several methods of analysis.
The methods described here have been tested only for cDNA microarrays, but can also, with some modifications, be applied to other high-throughput technologies. Keywords: High-throughput technology, microarray, cDNA, multiple scans, Bayesian hierarchical models, image analysis, experimental design, MCMC, WinBUGS. We examine methods that can be used to improve genetic signals and to make use of the enhanced signal in subsequent analyses

Helsingin yliopiston digitaalinen arkisto

Finding spot shape in cdna microarray by using a deformable grid and a Markov segmentation

Author: Gouinaud Christophe
Hill David R.C.
Peyret Pierre
Yon Loïc
Publication venue: HAL CCSD
Publication date: 30/04/2009

The value of cDNA microarrays for genetics no longer needs to be demonstrated [EISE-99]. This complex technology is now reaching a certain maturity, and its use is expanding, notably into modeling the relationships between gene, expression, and individual. The current challenge is therefore to improve the precision of the measurements so as to raise the quality of the estimated expression levels and hence of the functional results. Indeed, the answers sought until now were most often binary, whereas research is now moving toward less clear-cut measurements, where the aim is to measure a quantum of expression

HAL Clermont Université

Changes in Gene Expression Foreshadow Diet-Induced Obesity in Genetically Identical Mice

Author: Christopher Faulk
Gregory Barsh
Jessica Hogan
Jihad Skaf
Jong-Seop Rim
Larissa Nikonova
Leslie P Kozak
Robert A Koza
Tamra Mendoza
Publication venue: Public Library of Science
Publication date: 01/05/2006

High phenotypic variation in diet-induced obesity in male C57BL/6J inbred mice suggests a molecular model to investigate non-genetic mechanisms of obesity. Feeding mice a high-fat diet beginning at 8 wk of age resulted in a 4-fold difference in adiposity. The phenotypes of mice characteristic of high or low gainers were evident by 6 wk of age, when mice were still on a low-fat diet; they were amplified after being switched to the high-fat diet and persisted even after the obesogenic protocol was interrupted with a calorically restricted, low-fat chow diet. Accordingly, susceptibility to diet-induced obesity in genetically identical mice is a stable phenotype that can be detected in mice shortly after weaning. Chronologically, differences in adiposity preceded those of feeding efficiency and food intake, suggesting that observed difference in leptin secretion is a factor in determining phenotypes related to food intake. Gene expression analyses of adipose tissue and hypothalamus from mice with low and high weight gain, by microarray and qRT-PCR, showed major changes in the expression of genes of Wnt signaling and tissue re-modeling in adipose tissue. In particular, elevated expression of SFRP5, an inhibitor of Wnt signaling, the imprinted gene MEST and BMP3 may be causally linked to fat mass expansion, since differences in gene expression observed in biopsies of epididymal fat at 7 wk of age (before the high-fat diet) correlated with adiposity after 8 wk on a high-fat diet. We propose that C57BL/6J mice have the phenotypic characteristics suitable for a model to investigate epigenetic mechanisms within adipose tissue that underlie diet-induced obesity

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Multivariate, Heteroscedastic Empirical Bayes via Nonparametric Maximum Likelihood

Author: Guntuboyina Adityanand
Sen Bodhisattva
Soloff Jake A.
Publication date: 29/12/2023

Multivariate, heteroscedastic errors complicate statistical inference in many large-scale denoising problems. Empirical Bayes is attractive in such settings, but standard parametric approaches rest on assumptions about the form of the prior distribution which can be hard to justify and which introduce unnecessary tuning parameters. We extend the nonparametric maximum likelihood estimator (NPMLE) for Gaussian location mixture densities to allow for multivariate, heteroscedastic errors. NPMLEs estimate an arbitrary prior by solving an infinite-dimensional, convex optimization problem; we show that this convex optimization problem can be tractably approximated by a finite-dimensional version. The empirical Bayes posterior means based on an NPMLE have low regret, meaning they closely target the oracle posterior means one would compute with the true prior in hand. We prove an oracle inequality implying that the empirical Bayes estimator performs at nearly the optimal level (up to logarithmic factors) for denoising without prior knowledge. We provide finite-sample bounds on the average Hellinger accuracy of an NPMLE for estimating the marginal densities of the observations. We also demonstrate the adaptive and nearly-optimal properties of NPMLEs for deconvolution. We apply our method to two denoising problems in astronomy, constructing a fully data-driven color-magnitude diagram of 1.4 million stars in the Milky Way and investigating the distribution of 19 chemical abundance ratios for 27 thousand stars in the red clump. We also apply our method to hierarchical linear models, illustrating the advantages of nonparametric shrinkage of regression coefficients on an education data set and on a microarray data set
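
The finite-dimensional approximation mentioned in this abstract can be sketched in one dimension. The following is an illustrative simplification, not the paper's method: it restricts the prior to a fixed grid of support points, handles only scalar observations with known per-observation standard deviations, and uses plain EM updates as a stand-in solver (the abstract does not say which solver the authors use). The names `npmle_grid_em` and `posterior_means` are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def npmle_grid_em(xs, sds, grid, iterations=200):
    """Estimate prior weights on a fixed grid by EM.

    xs[i] is observed with Gaussian noise of standard deviation sds[i]
    (heteroscedastic). Returns (weights, likelihood_matrix).
    """
    # lik[i][k] = density of observation i if its true mean were grid[k].
    lik = [[normal_pdf(x, g, s) for g in grid] for x, s in zip(xs, sds)]
    weights = [1.0 / len(grid)] * len(grid)
    for _ in range(iterations):
        totals = [0.0] * len(grid)
        for row in lik:
            denom = sum(w * l for w, l in zip(weights, row))
            for k, l in enumerate(row):
                totals[k] += weights[k] * l / denom  # posterior responsibility
        weights = [t / len(xs) for t in totals]
    return weights, lik


def posterior_means(weights, lik, grid):
    """Empirical Bayes posterior mean of each observation's true value."""
    means = []
    for row in lik:
        denom = sum(w * l for w, l in zip(weights, row))
        means.append(sum(w * l * g for w, l, g in zip(weights, row, grid)) / denom)
    return means


xs = [-2.1, -1.9, 2.0, 2.2]
sds = [0.5, 1.0, 0.5, 1.0]               # a different noise level per observation
grid = [-3 + 0.5 * k for k in range(13)]  # support points from -3 to 3

weights, lik = npmle_grid_em(xs, sds, grid)
denoised = posterior_means(weights, lik, grid)
```

Each posterior mean is a weighted average of grid points, so the denoised values always lie inside the grid's range; observations far outside the grid would need a wider grid to avoid numerical underflow.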

arXiv.org e-Print Archive

Combining heterogeneous sources of data for the reverse-engineering of gene regulatory networks

Author: Steele Emma
Publication venue: Brunel University, School of Information Systems, Computing and Mathematics
Publication date: 01/01/2010

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Gene Regulatory Networks (GRNs) represent how genes interact in various cellular processes by describing how the expression level, or activity, of genes can affect the expression of the other genes. Reverse-engineering GRN models can help biologists understand and gain insight into genetic conditions and diseases. Recently, the increasingly widespread use of DNA microarrays, a high-throughput technology that allows the expression of thousands of genes to be measured simultaneously in biological experiments, has led to many datasets of gene expression measurements becoming publicly available and a subsequent explosion of research in the reverse-engineering of GRN models. However, microarray technology has a number of limitations as a data source for the modelling of GRNs, due to concerns over its reliability and the reproducibility of experimental results. The underlying theme of the research presented in this thesis is the incorporation of multiple sources and different types of data into techniques for reverse-engineering or learning GRNs from data. By drawing on many data sources, the resulting network models should be more robust, accurate and reliable than models that have been learnt using a single data source. This is achieved by focusing on two main strands of research. First, the thesis presents some of the earliest work in the incorporation of prior knowledge that has been generated from a large body of scientific papers, for Bayesian network based GRN models. Second, novel methods for the use of multiple microarray datasets to produce Bayesian network based GRN models are introduced.
Empirical evaluations are used to show that the incorporation of literature-based prior knowledge and combining multiple microarray datasets can provide an improvement, when compared to the use of a single microarray dataset, for the reverse-engineering of Bayesian network based GRN models

Brunel University Research Archive

Bayesian nonparametric clusterings in relational and high-dimensional settings with applications in bioinformatics.

Author: Li Dazhuo
Publication venue: ThinkIR: The University of Louisville's Institutional Repository
Publication date: 01/05/2012

Recent advances in high throughput methodologies offer researchers the ability to understand complex systems via high dimensional and multi-relational data. One example is the realm of molecular biology where disparate data (such as gene sequence, gene expression, and interaction information) are available for various snapshots of biological systems. This type of high dimensional and multirelational data allows for unprecedented detailed analysis, but also presents challenges in accounting for all the variability. High dimensional data often has a multitude of underlying relationships, each represented by a separate clustering structure, where the number of structures is typically unknown a priori. To address the challenges faced by traditional clustering methods on high dimensional and multirelational data, we developed three feature selection and cross-clustering methods: 1) infinite relational model with feature selection (FIRM) which incorporates the rich information of multirelational data; 2) Bayesian Hierarchical Cross-Clustering (BHCC), a deterministic approximation to Cross Dirichlet Process mixture (CDPM) and to cross-clustering; and 3) randomized approximation (RBHCC), based on a truncated hierarchy. An extension of BHCC, Bayesian Congruence Measuring (BCM), is proposed to measure incongruence between genes and to identify sets of congruent loci with identical evolutionary histories. We adapt our BHCC algorithm to the inference of BCM, where the intended structure of each view (congruent loci) represents consistent evolutionary processes. We consider an application of FIRM on categorizing mRNA and microRNA. The model uses latent structures to encode the expression pattern and the gene ontology annotations. 
We also apply FIRM to recover the categories of ligands and proteins, and to predict unknown drug-target interactions, where latent categorization structure encodes drug-target interaction, chemical compound similarity, and amino acid sequence similarity. BHCC and RBHCC are shown to have improved predictive performance (both in terms of cluster membership and missing value prediction) compared to traditional clustering methods. Our results suggest that these novel approaches to integrating multi-relational information have a promising future in the biological sciences where incorporating data related to varying features is often regarded as a daunting task

University of Louisville