Bioinformatics: A challenge for statisticians
Bioinformatics is a subject that requires the skills of biologists, computer scientists, mathematicians and statisticians. This paper introduces the reader to one small aspect of the subject: the study of microarrays. It describes some of the complexities of the enormous amounts of data that are available and shows how simple statistical techniques can be used to highlight deficiencies in that data
Physico-chemical foundations underpinning microarray and next-generation sequencing experiments
Hybridization of nucleic acids on solid surfaces is a key process involved in high-throughput technologies such as microarrays and, in some cases, next-generation sequencing (NGS). A physical understanding of the hybridization process helps to determine the accuracy of these technologies. The goal of a widespread research program is to develop reliable transformations between the raw signals reported by the technologies and individual molecular concentrations from an ensemble of nucleic acids. This research has inputs from many areas, from bioinformatics and biostatistics, to theoretical and experimental biochemistry and biophysics, to computer simulations. A group of leading researchers met in Ploen Germany in 2011 to discuss present knowledge and limitations of our physico-chemical understanding of high-throughput nucleic acid technologies. This meeting inspired us to write this summary, which provides an overview of the state-of-the-art approaches based on physico-chemical foundation to modeling of the nucleic acids hybridization process on solid surfaces. In addition, practical application of current knowledge is emphasized
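The transformation from raw signal to molecular concentration that this abstract describes is usually grounded in an adsorption isotherm. As a minimal sketch (the function name, temperature, and parameter values below are illustrative assumptions, not taken from the paper), the Langmuir model relates surface hybridization to target concentration and duplex free energy:

```python
import math

R = 1.987e-3  # gas constant, kcal/(mol*K)

def langmuir_signal(conc, delta_g, temp=338.0, a=1.0):
    """Fraction of surface probes hybridized under a Langmuir isotherm.

    conc    : target concentration (arbitrary molar units)
    delta_g : hybridization free energy in kcal/mol (negative = stable duplex)
    temp    : hybridization temperature in kelvin (assumed value)
    a       : saturation intensity scaling the reported signal
    """
    k = math.exp(-delta_g / (R * temp))  # equilibrium binding constant
    return a * k * conc / (1.0 + k * conc)

# More stable duplexes (more negative delta_g) give a higher signal at the
# same concentration, and the signal saturates at `a` for large conc.
```

Inverting such a fitted model is one route from raw intensities back to concentrations, which is the "reliable transformation" the research program aims for.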
Nonequilibrium effects in DNA microarrays: a multiplatform study
It has recently been shown that in some DNA microarrays the time needed to reach thermal equilibrium may largely exceed the typical experimental time, which is about 15 h in standard protocols (Hooyberghs et al., Phys. Rev. E 81, 012901 (2010)). In this paper we discuss how this breakdown of thermodynamic equilibrium could be detected in microarray experiments without resorting to real-time hybridization data, which are difficult to obtain under standard experimental conditions. The method is based on the analysis of the distribution of fluorescence intensities I from different spots for probes carrying base mismatches. In thermal equilibrium and at sufficiently low concentrations, log I is expected to be linearly related to the hybridization free energy ΔG with a slope equal to 1/RT, where T is the experimental temperature and R is the gas constant. The breakdown of equilibrium results in deviations from this law. A model for hybridization kinetics explaining the observed experimental behavior is discussed, the so-called 3-state model. It predicts that deviations from equilibrium yield a proportionality of log I to ΔG/RT_eff. Here, T_eff is an effective temperature, higher than the experimental one. This behavior is indeed observed in some experiments on Agilent arrays. We analyze experimental data from two other microarray platforms and discuss, on the basis of the results, the attainment of equilibrium in these cases. Interestingly, the same 3-state model predicts a (dynamical) saturation of the signal at values below the one expected at equilibrium.
Comment: 27 pages, 9 figures, 1 table
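The diagnostic described above reduces to a regression: fit log I against ΔG across mismatch probes and read an effective temperature off the slope. A minimal sketch (synthetic data; the sign convention log I = -ΔG/(R·T_eff) + const and all numeric values are assumptions for illustration):

```python
import numpy as np

R = 1.987e-3  # gas constant, kcal/(mol*K)

def effective_temperature(delta_g, log_i):
    """Estimate T_eff from the slope of log-intensity vs. free energy.

    Assumes the low-concentration relation log I = -dG/(R*T_eff) + const,
    so a fitted slope s gives T_eff = -1/(R*s).  Sign conventions for dG
    differ between papers; adjust to the definition in use.
    """
    slope, _intercept = np.polyfit(delta_g, log_i, 1)
    return -1.0 / (R * slope)

# Synthetic mismatch probes: equilibrium at T = 338 K would give slope
# -1/(R*338); here the data are generated at a hotter effective temperature,
# mimicking the out-of-equilibrium regime.
rng = np.random.default_rng(0)
dg = rng.uniform(-25.0, -10.0, size=200)            # kcal/mol
t_eff_true = 700.0                                   # K, out of equilibrium
log_i = -dg / (R * t_eff_true) + 5.0 + rng.normal(0, 0.05, size=200)

t_eff_hat = effective_temperature(dg, log_i)  # recovers a value near 700 K
```

An estimate of T_eff well above the experimental temperature would flag the breakdown of equilibrium without any real-time measurement.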
Surface free energy and microarray deposition technology
Microarray techniques use a combinatorial approach to assess complex biochemical interactions. The fundamental goal is simultaneous, large-scale experimentation analogous to the automation achieved in the semiconductor industry. However, microarray deposition inherently involves liquids contacting solid substrates. Liquid droplet shapes are determined by surface and interfacial tension forces, and by flows during drying. This article looks at how surface free energy and wetting considerations may influence the accuracy and reliability of spotted microarray experiments
Identifying the impact of G-quadruplexes on Affymetrix 3' arrays using cloud computing.
A tetramer quadruplex structure is formed by four parallel strands of DNA/RNA containing runs of guanine. These quadruplexes are able to form because guanine can Hoogsteen hydrogen bond to other guanines, and a tetrad of guanines can form a stable arrangement. Recently we have discovered that probes on Affymetrix GeneChips that contain runs of guanine do not measure gene expression reliably. We associate this finding with the likelihood that quadruplexes are forming on the surface of GeneChips. In order to cope with the rapidly expanding size of GeneChip array datasets in the public domain, we are exploring the use of cloud computing to replicate our experiments on 3' arrays to look at the effect of the location of G-spots (runs of guanines). Cloud computing is a recently introduced high-performance solution that takes advantage of the computational infrastructure of large organisations such as Amazon and Google. We expect that cloud computing will become widely adopted because it enables bioinformaticians to avoid capital expenditure on expensive computing resources and to pay a cloud computing provider only for what is used. Moreover, as well as being financially efficient, cloud computing is an ecologically friendly technology; it enables efficient data-sharing, and we expect it to be faster for development purposes. Here we propose the advantageous use of cloud computing to perform a large data-mining analysis of public domain 3' arrays
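Screening probe sequences for the G-spots discussed above is a simple pattern match. A minimal sketch (the minimum run length of four and the probe sequences are illustrative assumptions):

```python
import re

def has_g_spot(probe_seq, min_run=4):
    """Flag probes whose sequence contains a run of >= min_run guanines,
    a candidate for G-quadruplex formation on the array surface."""
    return re.search("G{%d,}" % min_run, probe_seq.upper()) is not None

# Toy 25-mer probes in the style of Affymetrix 3' array designs
# (hypothetical sequences, not real probe IDs).
probes = {
    "p1": "ACGTACGTACGTACGTACGTACGTA",   # no guanine run
    "p2": "ACGTGGGGTACGTACGTACGTACGT",   # contains GGGG
}
flagged = [name for name, seq in probes.items() if has_g_spot(seq)]
print(flagged)  # ['p2']
```

Run across a full probe set, such a filter identifies the probes whose intensities should be treated with caution, or excluded, in downstream expression summaries.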
A Revised Design for Microarray Experiments to Account for Experimental Noise and Uncertainty of Probe Response
Background
Although microarrays are analysis tools in biomedical research, they are known to yield noisy output that usually requires experimental confirmation. To tackle this problem, many studies have developed rules for optimizing probe design and devised complex statistical tools to analyze the output. However, less emphasis has been placed on systematically identifying the noise component as part of the experimental procedure. One source of noise is the variance in probe binding, which can be assessed by replicating array probes. The second source is poor probe performance, which can be assessed by calibrating the array based on a dilution series of target molecules. Using model experiments for copy number variation and gene expression measurements, we investigate here a revised design for microarray experiments that addresses both of these sources of variance.
Results
Two custom arrays were used to evaluate the revised design: one based on 25 mer probes from an Affymetrix design and the other based on 60 mer probes from an Agilent design. To assess experimental variance in probe binding, all probes were replicated ten times. To assess probe performance, the probes were calibrated using a dilution series of target molecules and the signal response was fitted to an adsorption model. We found that significant variance of the signal could be controlled by averaging across probes and removing probes that are nonresponsive or poorly responsive in the calibration experiment. Taking this into account, one can obtain a more reliable signal with the added option of obtaining absolute rather than relative measurements.
Conclusion
The assessment of technical variance within the experiments, combined with the calibration of probes, allows the removal of poorly responding probes and yields more reliable signals for the remaining ones. Once an array is properly calibrated, absolute quantification of signals becomes straightforward, alleviating the need for normalization and reference hybridizations
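The calibration step described above can be sketched as fitting each probe's dilution series to an adsorption model and discarding probes that barely respond. This is an illustrative implementation under stated assumptions (the Langmuir form I = A·c/(c+K), the 2-fold responsiveness cut-off, and all numeric data are hypothetical, not the paper's actual procedure or values):

```python
import numpy as np
from scipy.optimize import curve_fit

def langmuir(c, a, k):
    """Adsorption model: saturating signal A*c/(c + K)."""
    return a * c / (c + k)

def calibrate(conc, signals, min_range=2.0):
    """Fit each probe's dilution series to the Langmuir model, keeping
    only probes whose signal spans at least `min_range`-fold.

    conc    : 1-D array of target concentrations in the dilution series
    signals : dict probe_name -> 1-D array of intensities
    Returns (fits, kept) where fits maps name -> (A, K).
    """
    fits, kept = {}, []
    for name, y in signals.items():
        if y.max() < min_range * max(y.min(), 1e-12):
            continue  # non- or poorly responsive: flat across dilutions
        (a, k), _cov = curve_fit(langmuir, conc, y,
                                 p0=[y.max(), float(np.median(conc))])
        fits[name] = (a, k)
        kept.append(name)
    return fits, kept

# Hypothetical dilution series: one responsive probe, one dead probe.
conc = np.array([0.1, 0.3, 1.0, 3.0, 10.0])
signals = {
    "good": langmuir(conc, 100.0, 1.0) + np.array([0.5, -0.3, 0.2, -0.4, 0.1]),
    "dead": np.full(5, 2.0),   # no response to target concentration
}
fits, kept = calibrate(conc, signals)
```

With fitted (A, K) in hand, a measured intensity I can be inverted to an absolute concentration via c = K·I/(A − I), which is the route to absolute rather than relative measurements mentioned in the conclusion.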
Application of Volcano Plots in Analyses of mRNA Differential Expressions with Microarrays
A volcano plot displays an unstandardized signal (e.g. log-fold-change) against a noise-adjusted/standardized signal (e.g. the t-statistic or -log10(p-value) from the t-test). We review the basic and interactive uses of the volcano plot, and its crucial role in understanding the regularized t-statistic. The joint filtering gene selection criterion based on regularized statistics has a curved discriminant line in the volcano plot, as compared to the two perpendicular lines for the "double filtering" criterion. This review attempts to provide a unifying framework for discussions on alternative measures of differential expression, improved methods for estimating variance, and visual display of a microarray analysis result. We also discuss the possibility of applying volcano plots to fields beyond microarrays.
Comment: 8 figures
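The contrast drawn above between "double filtering" and a regularized statistic can be made concrete on synthetic data. In this sketch (all data, thresholds, and the pseudo-variance s0 are illustrative assumptions; the regularized t here simply adds a constant to the standard error, in the spirit of, not identical to, the methods the review covers):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_genes, n_rep = 1000, 4
# Two-condition expression data; the first 50 genes are truly up-regulated.
a = rng.normal(0.0, 1.0, (n_genes, n_rep))
b = rng.normal(0.0, 1.0, (n_genes, n_rep))
b[:50] += 2.0

lfc = b.mean(axis=1) - a.mean(axis=1)        # x-axis: log fold change
t, p = stats.ttest_ind(b, a, axis=1)         # per-gene two-sample t-test
neglogp = -np.log10(p)                       # y-axis of the volcano plot

# "Double filtering": two perpendicular cut-offs in the volcano plane.
double = (np.abs(lfc) > 1.0) & (neglogp > 2.0)

# A regularized t adds a pseudo standard error s0 to each gene, which
# bends the implied decision boundary in the volcano plot into a curve.
s0 = 0.5
se = np.sqrt(a.var(axis=1, ddof=1) / n_rep + b.var(axis=1, ddof=1) / n_rep)
t_reg = lfc / (se + s0)
regularized = np.abs(t_reg) > 2.0
```

Plotting `lfc` against `neglogp` and overlaying the two selection regions reproduces the perpendicular-lines versus curved-boundary picture the review describes.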
Consensus and meta-analysis regulatory networks for combining multiple microarray gene expression datasets
Microarray data is a key source of experimental data for modelling gene regulatory interactions from expression levels. With the rapid increase of publicly available microarray data comes the opportunity to produce regulatory network models based on multiple datasets. Such models are potentially more robust with greater confidence, and place less reliance on a single dataset. However, combining datasets directly can be difficult as experiments are often conducted on different microarray platforms and in different laboratories, leading to inherent biases in the data that are not always removed through pre-processing such as normalisation. In this paper we compare two frameworks for combining microarray datasets to model regulatory networks: pre- and post-learning aggregation. In pre-learning approaches, such as using simple scale-normalisation prior to the concatenation of datasets, a model is learnt from a combined dataset, whilst in post-learning aggregation individual models are learnt from each dataset and the models are combined. We present two novel approaches for post-learning aggregation, each based on aggregating high-level features of Bayesian network models that have been generated from different microarray expression datasets. Meta-analysis Bayesian networks are based on combining statistical confidences attached to network edges whilst Consensus Bayesian networks identify consistent network features across all datasets. We apply both approaches to multiple datasets from synthetic and real (Escherichia coli and yeast) networks and demonstrate that both methods can improve on networks learnt from a single dataset or an aggregated dataset formed using a standard scale-normalisation
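The two post-learning aggregation ideas above can be sketched at the edge level. In this toy version (the 0.5 confidence threshold, the edge representation, and the example networks are illustrative assumptions, not the paper's actual algorithms):

```python
def meta_analysis_edges(confidence_maps, threshold=0.5):
    """Meta-analysis-style combination: average per-dataset edge
    confidences and keep edges whose mean clears a threshold."""
    all_edges = set().union(*confidence_maps)
    merged = {}
    for edge in all_edges:
        scores = [m.get(edge, 0.0) for m in confidence_maps]  # absent = 0
        merged[edge] = sum(scores) / len(scores)
    return {e for e, s in merged.items() if s >= threshold}

def consensus_edges(edge_sets):
    """Consensus-style combination: keep only edges present in every
    per-dataset network."""
    return set.intersection(*map(set, edge_sets))

# Toy example: edge-confidence maps from three hypothetical Bayesian
# networks, each learnt from a different microarray dataset.
nets = [
    {("A", "B"): 0.90, ("B", "C"): 0.6, ("C", "D"): 0.2},
    {("A", "B"): 0.80, ("B", "C"): 0.4},
    {("A", "B"): 0.95, ("C", "D"): 0.7},
]
meta = meta_analysis_edges(nets)   # ("A","B") survives: mean 0.883
cons = consensus_edges(nets)       # ("A","B") is the only shared edge
```

The meta-analysis route keeps an edge that is strong on average even if missing from one dataset, while the consensus route demands agreement across all datasets, trading recall for confidence.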
Thermodynamics of RNA/DNA hybridization in high density oligonucleotide microarrays
We analyze a series of publicly available controlled experiments (Latin square) on Affymetrix high density oligonucleotide microarrays using a simple physical model of the hybridization process. We plot for each gene the signal intensity versus the hybridization free energy of RNA/DNA duplexes in solution, for perfect matching and mismatching probes. Both values tend to align on a single master curve in good agreement with Langmuir adsorption theory, provided one takes into account the decrease of the effective target concentration due to target-target hybridization in solution. We give an example of a deviation from the expected thermodynamic behavior for the probe set 1091_at due to annotation problems, i.e. the surface-bound probe is not the exact complement of the target RNA sequence, because of errors present in public databases at the time when the array was designed. We show that the parametrization of the experimental data with RNA/DNA free energy improves the quality of the fits and enhances the stability of the fitting parameters compared to previous studies.
Comment: 11 pages, 16 figures; final version as published
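The master-curve collapse described above follows because, in the Langmuir picture, the signal depends on concentration and free energy only through the combination x = c·exp(-ΔG/RT). A minimal numerical check (temperature, saturation level, and all probe values are assumed for illustration):

```python
import math

R = 1.987e-3   # gas constant, kcal/(mol*K)
T = 318.0      # hybridization temperature in kelvin (assumed)

def master_signal(c, dg, a=1e4):
    """Langmuir master curve: intensity depends on target concentration c
    and duplex free energy dG only through x = c * exp(-dG / (R*T))."""
    x = c * math.exp(-dg / (R * T))
    return a * x / (1.0 + x)

# Two probe/target combinations with different c and dG, but constructed
# to share the same x, should give the same intensity, i.e. they collapse
# onto a single curve when plotted against x.
c1, dg1 = 1e-11, -18.0
c2 = 1e-12
dg2 = dg1 + R * T * math.log(c2 / c1)   # chosen so x is identical
i1 = master_signal(c1, dg1)
i2 = master_signal(c2, dg2)
```

Correcting c for target-target duplex formation in solution, as the abstract notes, amounts to replacing c with a reduced effective concentration before computing x.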