Clustering and Differential Alignment Algorithm: Identification of Early Stage Regulators in the <i>Arabidopsis thaliana</i> Iron Deficiency Response
<div><p>Time course transcriptome datasets are commonly used to predict key gene regulators associated with stress responses and to explore gene functionality. Techniques developed to extract causal relationships between genes from high throughput time course expression data are limited by low signal levels coupled with noise and sparseness in time points. We deal with these limitations by proposing the Cluster and Differential Alignment Algorithm (CDAA). This algorithm was designed to process transcriptome data by first grouping genes based on stages of activity and then using similarities in gene expression to predict influential connections between individual genes. Regulatory relationships are assigned based on pairwise alignment scores generated using the expression patterns of two genes and some inferred delay between the regulator and the observed activity of the target. We applied the CDAA to an iron deficiency time course microarray dataset to identify regulators that influence 7 target transcription factors known to participate in the <i>Arabidopsis thaliana</i> iron deficiency response. The algorithm predicted that 7 regulators previously unlinked to iron homeostasis influence the expression of these known transcription factors. We validated over half of predicted influential relationships using qRT-PCR expression analysis in mutant backgrounds. One predicted regulator-target relationship was shown to be a direct binding interaction according to yeast one-hybrid (Y1H) analysis. These results serve as a proof of concept emphasizing the utility of the CDAA for identifying unknown or missing nodes in regulatory cascades, providing the fundamental knowledge needed for constructing predictive gene regulatory networks. We propose that this tool can be used successfully for similar time course datasets to extract additional information and infer reliable regulatory connections for individual genes.</p></div>
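<p>The abstract describes scoring a regulator-target pair from the two expression patterns and an inferred delay, but it does not give the scoring function. The sketch below is only an illustration of that general idea: it assumes Pearson correlation as a stand-in for the alignment score and scans a small range of delays. The function names, the example profiles, and the <code>max_delay</code> parameter are hypothetical and are not taken from the CDAA itself.</p>

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def best_delay(regulator, target, max_delay):
    """Return (delay, score) with the largest |correlation| between
    regulator[t] and target[t + delay]. Assumed stand-in for an alignment score."""
    best = (0, 0.0)
    for d in range(max_delay + 1):
        r = regulator[: len(regulator) - d]  # regulator, truncated at the end
        t = target[d:]                       # target, shifted by d time points
        if len(r) < 3:                       # too few points to correlate
            break
        score = pearson(r, t)
        if abs(score) > abs(best[1]):
            best = (d, score)
    return best

# Hypothetical profiles: the target repeats the regulator two time points later.
regulator = [0.0, 1.0, 3.0, 2.0, 1.0, 0.5, 0.0]
target = [0.0, 0.0, 0.0, 1.0, 3.0, 2.0, 1.0]
delay, score = best_delay(regulator, target, max_delay=3)
print(delay, round(score, 3))
```

<p>Under this assumed scoring, the sign of the score could be read as a hint of positive versus negative regulation, but that interpretation is part of the sketch and not a statement about how the CDAA assigns edge signs.</p>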
Regulatory interactions inference algorithms.
<p>Regulatory interactions inference algorithms.</p>
Regulatory relationships predicted by the CDAA.
<p>Predicted regulations between 7 early stage transcription factors and 7 known iron homeostasis transcription factors. Edges indicating positive regulations are green and edges indicating negative regulations are red.</p>
Expression validation of predicted targets in mutant regulator backgrounds.
<p>Root tissue was collected from seedlings grown for 4 days on iron-sufficient media and then transferred to iron-deficient media for 3 days. Expression values are normalized to <i>β-tubulin</i> and to WT (Col-0) expression for each gene. Error bars indicate ±SEM (n = 4). Mutant backgrounds are (A) <i>obp4-1</i>, (B) <i>wrky57-3</i>, (C) <i>etf9-1</i>, (D) <i>col4-1</i>, (E) <i>asil2-1</i>, and (F) <i>myb55-1</i>. An asterisk indicates a significant difference from WT (Student’s t-test, <i>p</i> < 0.05).</p>
Gene to Stage Assignment.
<p>Genes active before the Initiation-Response (I-R) boundary are assigned to the INITIATION STAGE. Genes that start their activity after the I-R boundary are assigned to the RESPONSE STAGE. Primary response genes become active right after the I-R boundary, and secondary response genes become active later.</p>
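<p>The assignment rule in this caption can be sketched as a simple comparison of each gene's activity onset against the I-R boundary. The caption does not say how onset is measured or where the primary/secondary split lies, so <code>onset</code>, <code>ir_boundary</code>, and <code>primary_window</code> below are hypothetical parameters, and treating an onset exactly at the boundary as a response gene is an assumption.</p>

```python
def assign_stage(onset, ir_boundary, primary_window):
    """Classify a gene by the time its activity starts.

    onset, ir_boundary and primary_window share the same time unit.
    primary_window is an assumed width for "right after the boundary".
    """
    if onset < ir_boundary:
        return "INITIATION"
    if onset < ir_boundary + primary_window:
        return "RESPONSE (primary)"
    return "RESPONSE (secondary)"

# Hypothetical onset times (hours) for three placeholder genes.
onsets = {"geneA": 1.0, "geneB": 7.0, "geneC": 30.0}
stages = {
    gene: assign_stage(t, ir_boundary=6.0, primary_window=6.0)
    for gene, t in onsets.items()
}
print(stages)
```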
Average number of changes above the threshold per gene.
<p>Changes in expression (<i>s</i><sub><i>n</i></sub>(<i>g</i><sub><i>i</i></sub>, <i>k</i>), <i>k</i> = 1,…,4) for Initiation stage genes were thresholded with a range of cutoff values. The graph shows the average number of changes that exceed the threshold per gene out of 4 possible changes.</p>
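<p>The quantity plotted in this figure can be computed as follows: for each cutoff, count how many of a gene's 4 changes exceed it, then average the counts over genes. This is a minimal sketch with made-up values; comparing the absolute value of each change against the cutoff is an assumption, since the caption does not state whether the sign is taken into account.</p>

```python
def mean_changes_above(changes_by_gene, cutoff):
    """Average, over genes, of the number of changes whose magnitude exceeds cutoff."""
    counts = [
        sum(1 for s in changes if abs(s) > cutoff)
        for changes in changes_by_gene.values()
    ]
    return sum(counts) / len(counts)

# Hypothetical changes s(g, k), k = 1..4, for three placeholder genes.
changes = {
    "g1": [0.2, 1.5, -0.1, 0.8],
    "g2": [-2.0, 0.3, 0.4, 0.1],
    "g3": [0.05, 0.0, -0.6, 1.1],
}

# One point of the curve per cutoff value.
curve = {c: mean_changes_above(changes, c) for c in (0.0, 0.5, 1.0)}
print(curve)
```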
Number of genes in each gene set (cardinality).
<p>Gene set <sub><i>n</i></sub>, <i>n</i> = 1,…,6, comprises the genes whose maximum change occurs over the interval (<i>t</i><sub><i>n</i></sub>, <i>t</i><sub><i>n</i>+1</sub>).</p>
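<p>The grouping described here amounts to finding, for each gene, the interval between consecutive time points with the largest change, and then counting genes per interval. The sketch below assumes 7 time points (hence 6 intervals) and uses the absolute difference between consecutive values as the "change"; both the example data and that choice of change measure are assumptions for illustration.</p>

```python
def max_change_interval(expr):
    """Return the 1-based index n of the interval (t_n, t_n+1) with the largest change."""
    diffs = [abs(b - a) for a, b in zip(expr, expr[1:])]
    return diffs.index(max(diffs)) + 1

def build_gene_sets(expression):
    """Group gene names by the interval in which their maximum change occurs."""
    n_intervals = len(next(iter(expression.values()))) - 1
    sets = {n: [] for n in range(1, n_intervals + 1)}
    for gene, expr in expression.items():
        sets[max_change_interval(expr)].append(gene)
    return sets

# Hypothetical expression values at t_1..t_7 for three placeholder genes.
expression = {
    "g1": [0.0, 2.0, 2.1, 2.0, 2.2, 2.1, 2.0],  # largest jump over (t_1, t_2)
    "g2": [1.0, 1.1, 1.0, 0.9, 3.0, 3.1, 3.0],  # largest jump over (t_4, t_5)
    "g3": [0.5, 0.4, 0.5, 0.6, 0.5, 0.4, 2.5],  # largest jump over (t_6, t_7)
}
gene_sets = build_gene_sets(expression)
cardinality = {n: len(genes) for n, genes in gene_sets.items()}
print(cardinality)
```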
5 points based subclustering.
<p>Clustering based on centered expression values. <i>n</i><sub><i>genes</i></sub> is the number of genes in each cluster and <i>n</i><sub><i>TF</i></sub> is the number of transcription factors in each cluster.</p>