Effect of various normalization methods on Applied Biosystems expression array system data

Authors: A Hartemink, BM Bolstad, CA Heid, Catalin C Barbacioru, David N Keys, EF Petricoin 3rd, Frances Chan, GK Smyth, JL Hackett, Karen A Poulter, L Guo, R Canales, Raymond R Samaha, Roger D Canales, S Dudoit, T Patterson, UE Gibson, V Tusher, W Huber, WS Cleveland, Y Benjamini, Y Wang, YH Yang, Yongming A Sun, Yulei Wang
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: DNA microarray technology provides a powerful tool for characterizing gene expression on a genome scale. While the technology has been widely used in discovery-based medical and basic biological research, its direct application in clinical practice and regulatory decision-making has been questioned. A few key issues, including the reproducibility, reliability, compatibility and standardization of microarray analysis and results, must be critically addressed before any routine usage of microarrays in clinical laboratories and regulated areas can occur. In this study we investigate some of these issues for the Applied Biosystems Human Genome Survey Microarrays. RESULTS: We analyzed the gene expression profiles of two samples, brain and universal human reference (UHR, a mixture of RNAs from 10 cancer cell lines), using the Applied Biosystems Human Genome Survey Microarrays. Five technical replicates at three different sites were performed on the same total RNA samples according to the manufacturer's standard protocols. Five different methods (quantile, median, scale, VSN and cyclic loess) were used to normalize AB microarray data within each site. 1,000 genes spanning a wide dynamic range in gene expression levels were selected for real-time PCR validation. Using the TaqMan® assays data set as the reference set, the performance of the five normalization methods was evaluated on the following criteria: (1) sensitivity and reproducibility in detection of expression; (2) fold change correlation with real-time PCR data; (3) sensitivity and specificity in detection of differential expression; (4) reproducibility of differentially expressed gene lists. CONCLUSION: Our results showed a high level of concordance between these normalization methods. This held regardless of whether signal, detection, variation, fold change measurements or reproducibility were interrogated. Furthermore, we used TaqMan® assays as a reference to generate TPR and FDR plots for the various normalization methods across the assay range. Little impact was observed on the TP and FP rates in detection of differentially expressed genes. Additionally, the various normalization methods had little effect on the statistical approaches analyzed, which indicates a certain robustness of the analysis methods currently in use in the field, particularly when used in conjunction with the Applied Biosystems Gene Expression System.
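
Quantile normalization is one of the five methods named in the abstract above. As a rough illustration only, the following is a minimal pure-Python sketch of the textbook procedure, not the authors' pipeline. The function name `quantile_normalize` and the column-per-array layout are assumptions made for this example, and ties are broken by position rather than averaged.

```python
def quantile_normalize(columns):
    """Quantile-normalize equal-length columns (one column per array).

    Hypothetical helper for illustration; ties are broken by position.
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all columns must have the same length")

    # Sort each column, then average across columns at each rank.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [
        sum(col[i] for col in sorted_cols) / len(columns) for i in range(n)
    ]

    normalized = []
    for col in columns:
        # Indices of this column's values in ascending order.
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            # Replace each value by the mean of all values sharing its rank.
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Two toy "arrays" with three probes each.
result = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```

After normalization every column contains the same set of values, so the arrays share one empirical distribution while each probe keeps its within-array rank.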

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Constructing non-stationary Dynamic Bayesian Networks with a flexible lag choosing mechanism

Authors: A Bernard, A Hall, A Nobile, A Para, AJ Hartemink, AV Werhli, CA Benedict, D Heckerman, D Husmeier, F Guo, H Duan, H Yu, HH McAdams, J Yu, Jr JD, Jun Huan, JW Robinson, K Honda, K Murphy, M Grzegorczy, M Zou, MF Covington, MN Arbeitman, N Friedman, N Nariai, P Mas, PA Salome, PJ Green, RM Cripps, S Chib, S Imoto, S Raza, SY Kim, T Mizuno, T Sandmann, W Zhao, Yi Jia
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Dynamic Bayesian Networks (DBNs) are widely used in regulatory network structure inference with gene expression data. Current methods assume that the underlying stochastic processes that generate the gene expression data are stationary. This assumption is not realistic in certain applications where the intrinsic regulatory networks change in order to adapt to internal or external stimuli. Results: In this paper we investigate a novel non-stationary DBN method with a potential regulator detection technique and a flexible lag choosing mechanism. We apply the approach to gene regulatory network inference on three non-stationary time series data sets. For the Macrophages and Arabidopsis data sets, which have reference networks, our method shows better network structure prediction accuracy. For the Drosophila data set, our approach converges faster and shows better prediction accuracy on transition times. In addition, our reconstructed regulatory networks on the Drosophila data not only share many similarities with the predictions of other researchers but also provide much new structural information for further investigation. Conclusions: Compared with recently proposed non-stationary DBN methods, our approach has better structure prediction accuracy. By detecting potential regulators, our method reduces the size of the search space and hence may speed up the convergence of MCMC sampling.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

KU ScholarWorks

PubMed Central

A classification-based framework for predicting and analyzing gene regulatory response

Authors: AJ Hartemink, Anshul Kundaje, AP Gasch, Chris H Wiggins, Christina Leslie, CI Holmberg, D Pe'er, D Pollard, DC Raitt, E Ramil, E Segal, ER Gansner, HJ Bussemaker, I Ota, I Pedruzzi, J Ihmels, JD Hughes, JT Lin, M Middendorf, MA Beer, Manuel Middendorf, Mihir Shah, P Zarzov, RE Schapire, TI Lee, VK Vyas, W Hoeffding, Y Pilpel, Yoav Freund
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: We have recently introduced a predictive framework for studying gene transcriptional regulation in simpler organisms using a novel supervised learning algorithm called GeneClass. GeneClass is motivated by the hypothesis that in model organisms such as Saccharomyces cerevisiae, we can learn a decision rule for predicting whether a gene is up- or down-regulated in a particular microarray experiment based on the presence of binding site subsequences ("motifs") in the gene's regulatory region and the expression levels of regulators such as transcription factors in the experiment ("parents"). GeneClass formulates the learning task as a classification problem: predicting +1 and -1 labels corresponding to up- and down-regulation beyond the levels of biological and measurement noise in microarray measurements. Using the Adaboost algorithm, GeneClass learns a prediction function in the form of an alternating decision tree, a margin-based generalization of a decision tree. METHODS: In the current work, we introduce a new, robust version of the GeneClass algorithm that increases stability and computational efficiency, yielding a more scalable and reliable predictive model. The improved stability of the prediction tree enables us to introduce a detailed post-processing framework for biological interpretation, including individual and group target gene analysis to reveal condition-specific regulation programs and to suggest signaling pathways. Robust GeneClass uses a novel stabilized variant of boosting that allows a set of correlated features, rather than single features, to be included at nodes of the tree; in this way, biologically important features that are correlated with the single best feature are retained rather than decorrelated and lost in the next round of boosting. Other computational developments include fast matrix computation of the loss function for all features, allowing scalability to large datasets, and the use of abstaining weak rules, which results in a shallower and more interpretable tree. We also show how to incorporate genome-wide protein-DNA binding data from ChIP-chip experiments into the GeneClass algorithm, and we use an improved noise model for gene expression data. RESULTS: Using the improved scalability of Robust GeneClass, we present larger scale experiments on a yeast environmental stress dataset, training and testing on all genes and using a comprehensive set of potential regulators. We demonstrate the improved stability of the features in the learned prediction tree, and we show the utility of the post-processing framework by analyzing two groups of genes in yeast (the protein chaperones and a set of putative targets of the Nrg1 and Nrg2 transcription factors) and suggesting novel hypotheses about their transcriptional and post-transcriptional regulation. Detailed results and Robust GeneClass source code are available for download from
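
The abstract above mentions boosting with +1/-1 labels and abstaining weak rules. As a hedged illustration of that general idea, the sketch below shows one round of standard confidence-rated AdaBoost in which a weak rule may output 0 to abstain. It is not the Robust GeneClass implementation; the function name `boosting_round` and the smoothing constant `eps` are assumptions introduced for this example.

```python
import math


def boosting_round(labels, predictions, weights, eps=1e-12):
    """One round of AdaBoost with a weak rule that may abstain.

    labels:      +1 / -1 per example (e.g. up- or down-regulated)
    predictions: +1 / -1 / 0 per example, where 0 means the rule abstains
    weights:     current example weights, assumed to sum to 1
    eps:         hypothetical smoothing constant to avoid division by zero
    """
    triples = list(zip(labels, predictions, weights))
    # Total weight of examples the rule predicts correctly / incorrectly.
    w_correct = sum(w for y, h, w in triples if h != 0 and h == y)
    w_wrong = sum(w for y, h, w in triples if h != 0 and h != y)

    # Vote strength of the rule; abstentions do not contribute.
    alpha = 0.5 * math.log((w_correct + eps) / (w_wrong + eps))

    # Misclassified examples gain weight, correct ones lose weight,
    # and abstained examples (h == 0) keep their weight before renormalizing.
    updated = [w * math.exp(-alpha * y * h) for y, h, w in triples]
    total = sum(updated)
    return alpha, [w / total for w in updated]


alpha, new_weights = boosting_round(
    labels=[1, 1, -1, -1],
    predictions=[1, 0, -1, 1],
    weights=[0.25, 0.25, 0.25, 0.25],
)
```

Because an abstaining rule leaves some examples untouched, each rule only has to be reliable on the examples it fires on, which is one reason such rules can keep a learned tree small.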

Crossref

Springer - Publisher Connector

Columbia University Academic Commons

PubMed Central

High prevalence of hyperglycaemia and the impact of high household income in transforming Rural China

Authors: AR Omran, BM Popkin, C Albala, Chaowei Fu, E Ferrannini, EE Agardh, F de Vegt, Fadi Wang, G Danaei, GK Dowse, H Harati, H Tian, H Zuo, HC Gerstein, JF Grant, Jiangen Song, JJ Wang, JM Robbins, K Sen, KT Chen, LM Li, LN Borrell, M Tang, MM Gabir, N Hartemink, National Diabetes Research Group, P Amuna, PA Metcalf, Qingwu Jiang, S Du, S Genuth, S Wild, SC Maty, SS Rasmussen, V Kosulwat, W Yang, XR Pan, Xuecai Wang, Y Liu, Yue Chen
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: The prevalence of hyperglycaemia and its association with socioeconomic factors have been well studied in developed countries; however, little is known about them in transforming rural China. Methods: A cross-sectional study was carried out in 4 rural communities of Deqing County, located in East China, in 2006-07, including 4,506 subjects aged 18 to 64 years. Fasting plasma glucose (FPG) was measured. Subjects were considered to have impaired fasting glucose (IFG) if FPG was in the range from 5.6 to 6.9 mmol/L and to have diabetes mellitus (DM) if FPG was 7.0 mmol/L or above. Results: The crude prevalences of IFG and DM were 5.4% and 2.2%, respectively. The average ratio of IFG/DM was 2.5, and tended to be higher for those under the age of 35 years than for older subjects. After adjustment for covariates including age (continuous), sex, BMI (continuous), smoking, alcohol drinking, and regular leisure physical activity, subjects in the high household income group had a significantly higher risk of IFG compared with the medium household income group (OR: 1.74, 95% CI: 1.11-2.72), and no significant difference in IFG was observed between the low and medium household income groups. Education and farmer occupation were not significantly associated with IFG. Conclusions: High household income was significantly associated with an increased risk of IFG. A high ratio of IFG/DM suggests a high risk of diabetes in the foreseeable future in the transforming rural communities of China.
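
The FPG cut-offs stated in the abstract above can be written as a small classification rule. The sketch below is illustrative only: the function names, the "normal" label for values below 5.6 mmol/L, and the treatment of values between 6.9 and 7.0 mmol/L as IFG are assumptions, not details taken from the study.

```python
def classify_fpg(fpg_mmol_per_l):
    """Classify a fasting plasma glucose value (mmol/L).

    Cut-offs follow the abstract: DM at 7.0 or above, IFG from 5.6 to 6.9.
    Values below 5.6 are labelled "normal" here (an assumed label).
    """
    if fpg_mmol_per_l >= 7.0:
        return "DM"
    if fpg_mmol_per_l >= 5.6:
        return "IFG"
    return "normal"


def crude_prevalence(labels, category):
    """Percentage of subjects whose label equals the given category."""
    return 100.0 * sum(1 for label in labels if label == category) / len(labels)


# Toy readings, not study data.
readings = [5.1, 5.6, 6.9, 7.0]
labels = [classify_fpg(value) for value in readings]
ifg_percent = crude_prevalence(labels, "IFG")
```

Checking the DM threshold first keeps the two ranges mutually exclusive, so each subject receives exactly one label.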

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Environmental Correlation and Spatial Autocorrelation of Soil Properties in Keller Peninsula, Maritime Antarctica

Authors: André Geraldo de Lima Moraes, André Thomazini, Beyer L, Carlos Ernesto Gonçalves Reynaud Schaefer, Carvalho Junior W, Chagas CS, Ciampalini R, Claessen MEC, Francelino MR, Goodman JM, Hartemink AE, Kvålseth TO, Lagacherie P, Malone BP, Marcio Rocha Francelino, Marcos Gervasio Pereira, McBratney AB, Mendes Junior CW, Moura PA, Pahlavan-Rad MR, Pereira AB, Simas FNB, Souza JJLL, Sulaeman Y, Thomazini A, Vaysse K, Victoria FC, Waldir de Carvalho Junior, Yeomans JC
Publication venue: FapUNIFESP (SciELO)

Crossref

Factors affecting patterns of tick parasitism on forest rodents in tick-borne encephalitis risk areas, Germany

Identifying factors affecting individual vector burdens is essential for understanding infectious disease systems. Drawing upon data from a rodent monitoring programme conducted in nine different forest patches in southern Hesse, Germany, we developed models which predict tick (Ixodes spp. and Dermacentor spp.) burdens on two rodent species, Apodemus flavicollis and Myodes glareolus. Models for the two rodent species were broadly similar but differed in some aspects. Patterns of Ixodes spp. burdens were influenced by extrinsic factors such as season, unexplained spatial variation (both species), relative humidity and vegetation cover (A. flavicollis). We found support for the ‘body mass’ hypothesis (tick burdens increase with body mass/age) and for the ‘dilution’ hypothesis (tick burdens decline with increasing rodent densities) and little support for the ‘sex-bias’ hypothesis (both species). Surprisingly, roe deer densities were not correlated with larvae counts on rodents. Factors influencing the mean burden did not significantly explain the observed dispersion of tick counts. Co-feeding aggregations, which are essential for tick-borne disease transmission, were mainly found in A. flavicollis of high body mass trapped in areas with a fast increase in spring temperatures. Locally, Dermacentor spp. appears to be an important parasite on A. flavicollis and M. glareolus. Dermacentor spp. was largely confined to areas with higher average temperatures during the vegetation period. Nymphs of Dermacentor spp. mainly fed on M. glareolus and were seldom found on A. flavicollis. Whereas Ixodes spp. is the dominant tick genus in woodlands of our study area, the distribution and epidemiological role of Dermacentor spp. should be monitored closely.

Crossref

Springer - Publisher Connector

PubMed Central

Publikationsserver des Robert Koch-Instituts

Gallstone ileus: correlation between computed tomography, double-balloon enteroscopy and intra-operative findings

Authors: AA Ayantunde, DN Lobo, F Lassandro, G Muthukumarasamy, H Lübbers, H Schwacha, Indra C. Pieters-van den Bos, JC Rodriguez-Sanjuan, Koen J. Hartemink, LE Browning, MD Zielinski, PA Vagefi, RM Reisner, Stijn J. B. van Weyenberg, Suzanne M. Hepp, W Kirchmayr
Publication date: 01/01/2010

Crossref

Magnetic susceptibility in the prediction of soil attributes in two sugarcane harvesting management systems

Crossref

Towards an integrated approach in surveillance of vector-borne diseases in Europe

Vector-borne disease (VBD) emergence is a complex and dynamic process. Interactions between multiple disciplines and the responsible health and environmental authorities are often needed for effective early warning, surveillance and control of vectors and the diseases they transmit. To fully appreciate this complexity, integrated knowledge about the human and the vector population is desirable. In the current paper, important parameters and terms of both public health and medical entomology are defined in order to establish a common language that facilitates collaboration between the two disciplines. Special focus is put on the different VBD contexts with respect to the current presence or absence of the disease, the pathogen and the vector in a given location. Depending on the context, that is, whether a VBD is endemic or not, surveillance activities are required to assess disease burden or threat, respectively. Following a decision for action, surveillance activities continue to assess trends

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central