
46 research outputs found

Borrowing information across genes and experiments for improved error variance estimation in microarray data analysis and statistical inferences for gene expression heterosis

Author: Ji Tieming
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2012

The advancement in microarray technology enables the simultaneous measurement of expression levels of thousands of genes. However, due to the relatively high cost of making a replicate in a microarray experiment, the number of replicates in a single experiment is typically small. This results in the small n, large p problem for statistical inferences, where there are gene expression measurements for many genes, but only a few biological replicates (or observations) for each gene. In this dissertation, we develop statistical models and methods for microarray data to borrow information across genes and/or even across experiments to improve statistical inferences for specific biological questions. In Chapter 2, we develop statistical methods to improve the estimation of gene expression error variances. Good estimation of error variances is crucial for detecting differentially expressed genes (genes that differ in mean expression level across treatments or conditions of interest). Since the sample size available for each gene is often low, the usual unbiased estimator of the error variance can be unreliable. Shrinkage methods, including empirical Bayes approaches that borrow information across genes to produce more stable estimates, have been developed in recent years. Because the same microarray platform is often used for at least several experiments to study similar biological systems, there is an opportunity to improve variance estimation further by borrowing information not only across genes but also across experiments. We propose a lognormal model for error variances that involves random gene effects and random experiment effects. Based on the model, we develop an empirical Bayes estimator of the error variance for each combination of gene and experiment and call this estimator BAGE because information is Borrowed Across Genes and Experiments. A permutation strategy is used to make inference about the differential expression status of each gene. 
Simulation studies with data generated from different probability models and real microarray data show that our method outperforms existing approaches. In Chapter 3, we develop statistical methods to improve the estimation and testing of gene expression heterosis. Heterosis, also known as hybrid vigor, refers to the superior phenotype of the hybrid offspring relative to its two inbred parents. Though the heterosis phenomenon has been extensively utilized in agriculture for over a century, the molecular basis is still unknown. In an effort to understand the basic mechanisms responsible for phenotypic heterosis at the molecular level, researchers have begun to compare expression levels of thousands of genes in the parental inbred lines and their offspring to find genes that exhibit gene expression heterosis. In our study, we focus on three types of gene expression heterosis: high-parent heterosis, low-parent heterosis and mid-parent heterosis. Currently, the sample average method is the most commonly used method for estimation and testing of gene expression heterosis. However, the sample average estimators underestimate high-parent heterosis and low-parent heterosis, which consequently leads to loss of power in hypothesis testing. Though the sample average estimator for mid-parent heterosis is unbiased, estimation is highly variable with only a few replicates in a typical microarray experiment. To improve the estimation and testing of all three types of gene expression heterosis, we develop a hierarchical model, which permits information sharing across genes. Based on the model, we derive empirical Bayes estimators, and test gene expression heterosis using posterior probabilities. The effectiveness of our approach is demonstrated through simulations based on two real heterosis microarray experiments as well as hypothetical probability models that violate our model assumptions. 
Chapter 4 presents statistical analysis of a soil-based carbon sequestration experiment. Driven by global climate change due to the increasing level of atmospheric carbon dioxide, researchers have proposed a soil-based carbon sequestration approach. This approach reduces carbon dioxide emission from crop residues after harvesting and sequesters more carbon into the land as a soil nutrient. Previous research has reported significant differences across species in their rates of residue decomposition and the amount of carbon dioxide emission. Because the biomass composition varies across maize genotypes, we hypothesize that there are also differences among genotypes within the maize species in their rates of biomass decomposition and capacity for carbon sequestration. We designed and performed a longitudinal experiment to measure the amount of carbon dioxide flux from crop stover samples of 14 maize varieties. Flux observations for more than 150 days were collected. We modeled the logarithm of carbon dioxide flux as a linear function of genotype, day, and genotype-by-day interaction effects as well as several other important fixed and random factors. The analysis results show significant differences among maize varieties with respect to the accumulated carbon dioxide flux from crop residues as well as flux pattern over time. We also investigate relationships between carbon dioxide emission and several potentially influential chemical compounds in the maize residue biomass composition. These results suggest the potential for development of "carbon capturing crops" through bioengineering or hybrid methods.

Digital Repository @ Iowa State University (ISU)

Estimation and Testing of Gene Expression Heterosis

Author: Ji Tieming
Liu Peng
Nettleton Dan
Publication venue: Iowa State University Digital Repository
Publication date: 01/09/2014

Heterosis, also known as hybrid vigor, occurs when the mean phenotype of hybrid offspring is superior to that of its two inbred parents. The heterosis phenomenon is extensively utilized in agriculture though the molecular basis is still unknown. In an effort to understand phenotypic heterosis at the molecular level, researchers have begun to compare expression levels of thousands of genes between parental inbred lines and their hybrid offspring to search for evidence of gene expression heterosis. Standard statistical approaches for separately analyzing expression data for each gene can produce biased and highly variable estimates and unreliable tests of heterosis. To address these shortcomings, we develop a hierarchical model to borrow information across genes. Using our modeling framework, we derive empirical Bayes estimators and an inference strategy to identify gene expression heterosis. Simulation results show that our proposed method outperforms the more traditional strategy used to detect gene expression heterosis. This article has supplementary material online.
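For readers unfamiliar with the "traditional strategy" mentioned above, the following is a minimal sketch of per-gene sample-average heterosis estimates. It is not the authors' code or their empirical Bayes method. The function name, the input layout, and the exact sign conventions (high-parent as hybrid minus the larger parental mean, low-parent as the smaller parental mean minus hybrid, mid-parent as hybrid minus the parental average) are assumptions made for illustration.

```python
from statistics import mean


def sample_average_heterosis(parent1, parent2, hybrid):
    """Per-gene sample-average heterosis estimates (illustrative only).

    Each argument is a list of replicate expression values for one gene,
    for example on a log scale. The definitions below are assumptions
    for this sketch, not taken verbatim from the listed article.
    """
    p1, p2, h = mean(parent1), mean(parent2), mean(hybrid)
    return {
        # hybrid mean minus the average of the two parental means
        "mid_parent": h - (p1 + p2) / 2,
        # hybrid mean minus the larger parental mean
        "high_parent": h - max(p1, p2),
        # smaller parental mean minus the hybrid mean
        "low_parent": min(p1, p2) - h,
    }


if __name__ == "__main__":
    # Hypothetical replicate values for a single gene.
    estimates = sample_average_heterosis(
        parent1=[8.0, 8.4, 8.2],
        parent2=[9.0, 9.2, 9.4],
        hybrid=[9.8, 10.0, 10.2],
    )
    for name, value in estimates.items():
        print(f"{name}: {value:.2f}")
```

Under these assumed definitions, the maximum of two noisy sample means tends to exceed the larger true mean, and the minimum tends to fall below the smaller true mean. That is one way to see why plug-in estimates of high-parent and low-parent heterosis tend to be too small when there are few replicates, while the mid-parent estimate, being linear in the means, stays unbiased but noisy.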

Digital Repository @ Iowa State University (ISU)

Springer - Publisher Connector

PubMed Central

Differential gene expression in response to eCry3.1Ab ingestion in an unselected and eCry3.1Ab-selected western corn rootworm (Diabrotica virgifera virgifera LeConte) population

Author: Elsik Christine G.
Hibbard Bruce E.
Ji Tieming
Meihls Lisa N.
Shelby Kent S.
Zhao Zixiao
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 20/03/2019

Diabrotica virgifera virgifera LeConte, the western corn rootworm (WCR), is one of the most destructive pests in the U.S. Corn Belt. Transgenic maize lines expressing various Cry toxins from Bacillus thuringiensis (Bt) have been adopted as a management strategy. However, resistance to many Bt toxins has occurred. To investigate the mechanisms of Bt resistance, we carried out RNA-seq using Illumina sequencing technology on resistant (eCry3.1Ab-selected) and susceptible (unselected) whole WCR neonates that fed on seedling maize with and without eCry3.1Ab for 12 and 24 hours. In a parallel experiment, RNA-seq was conducted on only the midgut of neonate WCR from the same treatments. After de novo transcriptome assembly, we identified differentially expressed genes (DEGs). Results from the assemblies and annotation indicate that WCR neonates from the eCry3.1Ab-selected resistant colony expressed a small number of up- and down-regulated genes following Bt intoxication. In contrast, unselected susceptible WCR neonates expressed a large number of up- and down-regulated transcripts in response to intoxication. Annotation and pathway analysis of DEGs between susceptible and resistant whole WCR and their midgut tissue revealed genes associated with cell membrane, immune response, detoxification, and potential Bt receptors, which are likely related to eCry3.1Ab resistance. This research provides a framework to study the toxicology of Bt toxins and the mechanism of resistance in WCR, an economically important coleopteran pest species.

Assessment of a Novel VEGF Targeted Agent Using Patient-Derived Tumor Tissue Xenograft Models of Colon Carcinoma with Lymphatic and Hepatic Metastases

Author: A Akcakanat
Alana L. Welm
B Rubio-Viqueira
Binbin Cui
Bojian Xie
C Bozzetti
C Tapia
CL Morton
D Gancberg
E Sasatomi
EA Sausville
F Baffert
F Loupakis
F Molinari
Feilin Cao
Guangliang Li
H Huynh
Haohao Wang
Huanrong Lan
I Fichtner
J Huang
JI Johnson
Jing Zhang
JM Wu
K Jin
Ketao Jin
Kuifeng He
Lisong Teng
LS Teng
M Scartozzi
M Tanner
M Zhang
MR Mancuso
Na Han
P Baluk
P Regitnig
R Perez-Soler
S Morikawa
S Ogino
S Park
S Schneider
SE Baldus
SE Monaco
T Voskoglou-Nomikos
Tieming Zhu
UE Gibson
Y Gong
Y Wang
Z Li
Zhenzhen Xu
Publication venue: Public Library of Science
Publication date: 02/12/2011

The lack of appropriate tumor models of primary tumors and corresponding metastases that can reliably predict response to anticancer agents remains a major deficiency in the clinical practice of cancer therapy. The aim of our study was to establish patient-derived tumor tissue (PDTT) xenograft models of colon carcinoma with lymphatic and hepatic metastases useful for testing novel molecularly targeted agents. PDTT of primary colon carcinoma, lymphatic and hepatic metastases were used to create xenograft models. Hematoxylin and eosin staining, immunohistochemical staining, genome-wide gene expression analysis, pyrosequencing, qRT-PCR, and western blotting were used to determine the biological stability of the xenografts during serial transplantation compared with the original tumor tissues. Early passages of the PDTT xenograft models of primary colon carcinoma, lymphatic and hepatic metastases revealed a high degree of similarity with the original clinical tumor samples with regard to histology, immunohistochemistry, gene expression, and mutation status as well as mRNA expression. After ascertaining that these xenograft models retained histopathological features and molecular signatures similar to those of the original tumors, we evaluated the drug sensitivities of the xenografts to a novel VEGF-targeted agent, FP3. In this study, PDTT xenograft models of colon carcinoma with lymphatic and hepatic metastases have been successfully established. They provide appropriate models for testing novel molecularly targeted agents.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Mu Transposon Insertion Sites and Meiotic Recombination Events Co-Localize with Epigenetic Marks for Open Chromatin across the Maize Genome

Author: A Kong
A Miyao
AC Spradling
AD Cresse
AM Settles
BP May
C Feschotte
C Soderlund
C Yan
Cheng-Ting Yeh
CJ Wang
CR Dietrich
CY Kramer
D Lisch
D Mester
Dan Nettleton
DI Mester
DR McCarty
DS Robertson
E Coe
ED Akhunov
F Qiu
GC Liao
GK Wong
GT Marth
H Candela
H Santos-Rosa
Haiyan Wu
Harmit S. Malik
HK Dooner
Ho Man Tang
J Fernandes
J Li
JL Bennetzen
JL Gerton
JM Kolkman
K Fengler
K Ohtsu
Kai Ying
KC Cone
KJ Hardeman
L Das
LE Palmer
M Alleman
M Falque
M Yamazaki
MN Raizada
MP Ball
N Jiang
P SanMiguel
Patrick S. Schnable
PS Schnable
R Lister
S Hake
S Hanley
S Liu
Sanzhen Liu
SJ Cokus
SJ Emrich
SM Fullerton
SN Wood
TD Wu
Tieming Ji
TK Wolfgruber
TP Brutnell
V Borde
V Walbot
WS Cleveland
X Li
X Wang
X Zhang
Y Fu
Yan Fu
Publication venue: Public Library of Science
Publication date: 01/11/2009

The Mu transposon system of maize is highly active, with each of the ∼50–100 copies transposing on average once each generation. The approximately one dozen distinct Mu transposons contain highly similar ∼215 bp terminal inverted repeats (TIRs) and generate 9-bp target site duplications (TSDs) upon insertion. Using a novel genome walking strategy that uses these conserved TIRs as primer binding sites, Mu insertion sites were amplified from Mu stocks and sequenced via 454 technology. 94% of ∼965,000 reads carried Mu TIRs, demonstrating the specificity of this strategy. Among these TIRs, 21 novel Mu TIRs were discovered, revealing additional complexity of the Mu transposon system. The distribution of >40,000 non-redundant Mu insertion sites was strikingly non-uniform, such that rates increased in proportion to distance from the centromere. An identified putative Mu transposase binding consensus site does not explain this non-uniformity. An integrated genetic map containing more than 10,000 genetic markers was constructed and aligned to the sequence of the maize reference genome. Recombination rates (cM/Mb) are also strikingly non-uniform, with rates increasing in proportion to distance from the centromere. Mu insertion site frequencies are strongly correlated with recombination rates. Gene density does not fully explain the chromosomal distribution of Mu insertion and recombination sites, because pronounced preferences for the distal portion of chromosome are still observed even after accounting for gene density. The similarity of the distributions of Mu insertions and meiotic recombination sites suggests that common features, such as chromatin structure, are involved in site selection for both Mu insertion and meiotic recombination. 
The finding that Mu insertions and meiotic recombination sites both concentrate in genomic regions marked with epigenetic marks of open chromatin provides support for the hypothesis that open chromatin enhances rates of both Mu insertion and meiotic recombination.

Digital Repository @ Iowa State University (ISU)

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Maize Inbreds Exhibit High Levels of Copy Number Variation (CNV) and Presence/Absence Variation (PAV) in Genome Content

Author: A Ching
A Kato
A Ronchi
A. Leonardo Iniguez
AB Olshen
AJ Sharp
AP Dempster
AP Hsia
AS Lee
B McClintock
BS Everitt
C Workman
CB Della Vedova
Cheng-Ting Yeh
DA Laurie
Dan Nettleton
E Buckler
EL Walker
ES Venkatraman
F Tian
GH Perry
GK Smyth
GM Cooper
H Fu
H Yao
Heidi Rosenbaum
I Vroh Bi
J Doebley
J Lai
J Messing
J Sebat
Jacob Kitzman
JD Storey
Jeffrey A. Jeddeloh
JF Doebley
JM Kidd
Joseph R. Ecker
JS Beckmann
K Ohtsu
KA Frazer
KA Palaisa
Kai Ying
L Feuk
M Golubovsky
M Morgante
M Stam
M Yamasaki
MD Yandeau-Nelson
ME Hurles
MI Tenaillon
Nathan M. Springer
NM Springer
Patrick S. Schnable
PS Schnable
Q Wang
R Pilu
R Redon
R Song
RA Swanson-Wagner
RA Welch
RM Stupar
RR Selzer
S Brunner
S Liu
SA Flint-Garcia
SB Cannon
SI Wright
SJ Emrich
SM Adawy
SM Smith
SP Moose
SW Scherer
TA Graubert
Tieming Ji
TJ Albert
TK Wolfgruber
Todd Richmond
V Guryev
W. Brad Barbazuk
WB Barbazuk
Wei Wu
WK Chen
WL Brown
Y Fu
Yan Fu
Yi Jia
Publication venue: Public Library of Science
Publication date: 01/01/2009

Following the domestication of maize over the past ∼10,000 years, breeders have exploited the extensive genetic diversity of this species to mold its phenotype to meet human needs. The extent of structural variation, including copy number variation (CNV) and presence/absence variation (PAV), which are thought to contribute to the extraordinary phenotypic diversity and plasticity of this important crop, has not been elucidated. Whole-genome, array-based, comparative genomic hybridization (CGH) revealed a level of structural diversity between the inbred lines B73 and Mo17 that is unprecedented among higher eukaryotes. A detailed analysis of altered segments of DNA conservatively estimates that there are several hundred CNV sequences between the two genotypes, as well as several thousand PAV sequences that are present in B73 but not Mo17. Haplotype-specific PAVs contain hundreds of single-copy, expressed genes that may contribute to heterosis and to the extraordinary phenotypic diversity of this important crop.

Digital Repository @ Iowa State University (ISU)

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Maternal hyperleptinemia is associated with male offspring’s altered vascular function and structure in mice

Author: A Aceti
A Ericsson
A Mehlem
A Perez-Perez
A Singhal
AL Schenewerk
AM Samuelsson
AR Day
ARS Day
AU Momin
AV Mousiolis
B Erdos
C Aoqui
C Remacle
C Stocker
C Torrens
Christopher A. Foote
CJ Stocker
Constantino C. Reyes-Aldasoro
CS Mantzoros
CY Ooi
D Islami
DA Lawlor
DW Wilde
FE Ramos-Alves
FI Ramirez-Perez
FM Souza-Smith
Francisco I. Ramirez-Perez
G Bifulco
H Yamashita
HD Intengan
HJ Jang
Ho-Hsiang Wu
HY Park
I Floris
I Khan
IY Khan
J Gui
J Udagawa
JA Castorena-Gonzalez
JA Kim
JF Caro
JM Ategbo
JS Harrod
K Holemans
K Linnemann
KA McLachlan
KA Pennington
Kathleen A. Pennington
KE Pollock
Kelly E. Pollock
L Garcia-Vargas
L Maple-Brown
LA Martinez-Lemus
Laura C. Schulz
LC Schulz
Luis A. Martinez-Lemus
M Gil-Ortega
M Groenink
M Kawamura
MC Staiculescu
MJ Durand
MJ Mulvany
MP Magarinos
N Jansson
O Yilmaz
Omonseigho O. Talton
OO Talton
P Cameo
R Nadif
R Wolk
RC Kaufmann
RM Weisbrod
SB Bender
SL Kirk
SM Wells
SO Rocha
Tieming Ji
TL Kleinschmidt
UD Dincer
X Cheng
Yunchao Su
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2016

Children of mothers with gestational diabetes have greater risk of developing hypertension, but little is known about the mechanisms by which this occurs. The objective of this study was to test the hypothesis that high maternal concentrations of leptin during pregnancy, which are present in mothers with gestational diabetes and/or obesity, alter blood pressure, vascular structure and vascular function in offspring. Wildtype (WT) offspring of hyperleptinemic, normoglycemic, Lepr db/+ dams were compared to genotype-matched offspring of WT-control dams. Vascular function was assessed in male offspring at 6 weeks of age, and at 31 weeks of age after half the offspring had been fed a high fat, high sucrose diet (HFD) for 6 weeks. Blood pressure was increased by HFD but not affected by maternal hyperleptinemia. On a standard diet, offspring of hyperleptinemic dams had outwardly remodeled mesenteric arteries and an enhanced vasodilatory response to insulin. In offspring of WT but not Lepr db/+ dams, HFD induced vessel hypertrophy and enhanced vasodilatory responses to acetylcholine, while HFD reduced insulin responsiveness in offspring of hyperleptinemic dams. Offspring of hyperleptinemic dams had stiffer arteries regardless of diet. Therefore, while maternal hyperleptinemia was largely beneficial to offspring vascular health under a standard diet, it had detrimental effects in offspring fed HFD. These results suggest that circulating maternal leptin concentrations may interact with other factors in the pre- and post-natal environments to contribute to altered vascular function in offspring of diabetic pregnancies.

Public Library of Science (PLOS)

City Research Online

Crossref

Directory of Open Access Journals

PubMed Central

Detecting differentially expressed genes for syndromes by considering change in mean and dispersion simultaneously

Author: Chenchen Ma
Tieming Ji
Publication venue: Springer Science and Business Media LLC
Publication date: 01/09/2018

Background: When next-generation sequencing technology is used to measure gene expression, an empirically intriguing question concerns the identification of differentially expressed genes across treatment groups. Existing methods aim to identify genes whose mean expressions differ among treatment groups by assuming equal dispersion across all groups. For syndromes, however, various combinations of gene expression alterations can result in the same disease, leading to greater heteroscedasticity in the biological replicates in the disease group compared to the normal group. Traditional methods that only consider changes in the mean will fail to fully analyze gene expression in such a scenario. In addition, sequencing technology is relatively expensive; most labs can only afford a few replicates per treatment group, which poses further challenges to reliably estimating the mean and dispersion under each treatment condition. Results: We designed an empirical Bayes method and a pooled permutation test to simultaneously consider the change in mean and dispersion across treatment groups. We further computed confidence intervals based on Bayes estimates to identify differentially expressed genes that are unique to each disease sample as well as those that are common across all disease samples. We illustrated our method by applying it to gene expression data from a large offspring syndrome experiment, which motivated this study. We compared our method to competing approaches through simulation studies that mimicked the real datasets to demonstrate the effectiveness of our proposed method. Conclusions: We will show that, compared to popular methods that only aim to find the difference in the mean, our method can capture greater variation in the disease group to effectively identify differentially expressed genes for syndromes.

Directory of Open Access Journals