20 research outputs found

    Heartbeat of the Sun from Principal Component Analysis and prediction of solar activity on a millenium timescale

    We derive two principal components (PCs) of temporal magnetic field variations over the solar cycles 21–24 from full disk magnetograms covering about 39% of data variance, with σ = 0.67. These PCs are attributed to two main magnetic waves travelling from the opposite hemispheres with close frequencies and increasing phase shift. Using symbolic regression analysis we also derive mathematical formulae for these waves and calculate their summary curve, which we show is linked to the solar activity index. Extrapolation of the PCs backward for 800 years reveals two 350-year grand cycles superimposed on 22-year cycles, with features showing a remarkable resemblance to sunspot activity reported in the past, including the Maunder and Dalton minima. The summary curve calculated for the next millennium predicts three further grand cycles, with the closest grand minimum occurring in the forthcoming cycles 26–27, when the two magnetic field waves separate into the opposite hemispheres, leading to strongly reduced solar activity. These grand cycle variations are probed by an α − Ω dynamo model with meridional circulation. Dynamo waves are found to be generated with close frequencies, and their interaction leads to beating effects responsible for the grand cycles (350–400 years) superimposed on a standard 22-year cycle. This approach opens a new era in investigation and confident prediction of solar activity on a millennium timescale.
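
The beating effect mentioned in this abstract can be illustrated with a minimal sketch. It assumes two unit-amplitude cosine waves as stand-ins for the two principal components; the periods 22.0 and 23.5 years are illustrative values chosen here, not the formulae derived in the paper, and the function names are hypothetical.

```python
import math


def beat_period(period_a: float, period_b: float) -> float:
    """Period of the slow envelope produced by two waves with close periods."""
    return 1.0 / abs(1.0 / period_a - 1.0 / period_b)


def summary_curve(t: float, period_a: float, period_b: float) -> float:
    """Sum of two unit-amplitude cosine waves (assumed stand-in for the two PCs)."""
    return math.cos(2 * math.pi * t / period_a) + math.cos(2 * math.pi * t / period_b)


def envelope_form(t: float, period_a: float, period_b: float) -> float:
    """The same sum written as slow envelope times fast carrier (sum-to-product identity)."""
    w_a = 2 * math.pi / period_a
    w_b = 2 * math.pi / period_b
    return 2 * math.cos((w_a - w_b) * t / 2) * math.cos((w_a + w_b) * t / 2)


# Illustrative periods only: two waves near the 22-year cycle.
grand_cycle = beat_period(22.0, 23.5)
```

With these assumed periods the envelope repeats roughly every 345 years, which shows how two waves with close frequencies can produce a grand cycle of several centuries on top of a cycle of about 22 years.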

    Multivariate curve resolution of time course microarray data

    BACKGROUND: Modeling of gene expression data from time course experiments often involves the use of linear models such as those obtained from principal component analysis (PCA), independent component analysis (ICA), or other methods. Such methods do not generally yield factors with a clear biological interpretation. Moreover, implicit assumptions about the measurement errors often limit the application of these methods to log-transformed data, destroying linear structure in the untransformed expression data. RESULTS: In this work, a method for the linear decomposition of gene expression data by multivariate curve resolution (MCR) is introduced. The MCR method is based on an alternating least-squares (ALS) algorithm implemented with a weighted least squares approach. The new method, MCR-WALS, extracts a small number of basis functions from untransformed microarray data using only non-negativity constraints. Measurement error information can be incorporated into the modeling process and missing data can be imputed. The utility of the method is demonstrated through its application to yeast cell cycle data. CONCLUSION: Profiles extracted by MCR-WALS exhibit a strong correlation with cell cycle-associated genes, but also suggest new insights into the regulation of those genes. The unique features of the MCR-WALS algorithm are its freedom from assumptions about the underlying linear model other than the non-negativity of gene expression, its ability to analyze non-log-transformed data, and its use of measurement error information to obtain a weighted model and accommodate missing measurements
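
The idea of weighted alternating least squares with non-negativity constraints can be sketched in a deliberately simplified form. The example below assumes a single component (rank 1), per-entry weights, and missing values encoded as weight 0; it is not the published MCR-WALS algorithm, and the function name and toy data are hypothetical.

```python
def weighted_nn_als_rank1(data, weights, n_iter=100):
    """Fit data[i][j] ~ amounts[i] * profile[j] by weighted, non-negative ALS.

    A weight of 0 marks a missing measurement; it is ignored in the fit
    and can be imputed afterwards as amounts[i] * profile[j].
    """
    n_rows, n_cols = len(data), len(data[0])
    profile = [1.0] * n_cols
    amounts = [0.0] * n_rows
    for _ in range(n_iter):
        # Update row amounts with the profile held fixed.
        for i in range(n_rows):
            num = sum(weights[i][j] * data[i][j] * profile[j] for j in range(n_cols))
            den = sum(weights[i][j] * profile[j] ** 2 for j in range(n_cols))
            amounts[i] = max(0.0, num / den) if den > 0 else 0.0
        # Update the profile with the row amounts held fixed.
        for j in range(n_cols):
            num = sum(weights[i][j] * data[i][j] * amounts[i] for i in range(n_rows))
            den = sum(weights[i][j] * amounts[i] ** 2 for i in range(n_rows))
            profile[j] = max(0.0, num / den) if den > 0 else 0.0
    return amounts, profile


# Toy data: outer product of [1, 2, 3] and [2, 1, 0.5], with entry (0, 0) missing.
data = [[0.0, 1.0, 0.5], [4.0, 2.0, 1.0], [6.0, 3.0, 1.5]]
weights = [[0.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
amounts, profile = weighted_nn_als_rank1(data, weights)
imputed = amounts[0] * profile[0]  # estimate of the missing entry
```

For a single component, clipping each closed-form update at zero gives the exact non-negative least-squares solution of that sub-problem. With several components the updates would need a proper non-negative least-squares solver instead.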

    Magnitude and Timing of Leaf Damage Affect Seed Production in a Natural Population of Arabidopsis thaliana (Brassicaceae)

    Background: The effect of herbivory on plant fitness varies widely. Understanding the causes of this variation is of considerable interest because of its implications for plant population dynamics and trait evolution. We experimentally defoliated the annual herb Arabidopsis thaliana in a natural population in Sweden to test the hypotheses that (a) plant fitness decreases with increasing damage, (b) tolerance to defoliation is lower before flowering than during flowering, and (c) defoliation before flowering reduces number of seeds more strongly than defoliation during flowering, but the opposite is true for effects on seed size. Methodology/Principal Findings: In a first experiment, between 0 and 75% of the leaf area was removed in May from plants that flowered or were about to start flowering. In a second experiment, 0, 25%, or 50% of the leaf area was removed from plants on one of two occasions, in mid April when plants were either in the vegetative rosette or bolting stage, or in mid May when plants were flowering. In the first experiment, seed production was negatively related to leaf area removed, and at the highest damage level, also mean seed size was reduced. In the second experiment, removal of 50% of the leaf area reduced seed production by 60% among plants defoliated early in the season at the vegetative rosettes, and by 22% among plants defoliated early in the season at the bolting stage, but did not reduce seed output of plants defoliated one month later. No seasonal shift in the effect of defoliation on seed size was detected. Conclusions/Significance: The results show that leaf damage may reduce the fitness of A. thaliana, and suggest that in this population leaf herbivores feeding on plants before flowering should exert stronger selection on defence traits than those feeding on plants during flowering, given similar damage levels

    Genomic Analysis of QTLs and Genes Altering Natural Variation in Stochastic Noise

    Quantitative genetic analysis has long been used to study how natural variation of genotype can influence an organism's phenotype. While most studies have focused on genetic determinants of phenotypic average, it is rapidly becoming understood that stochastic noise is genetically determined. However, it is not known how many traits display genetic control of stochastic noise nor how broadly these stochastic loci are distributed within the genome. Understanding these questions is critical to our understanding of quantitative traits and how they relate to the underlying causal loci, especially since stochastic noise may be directly influenced by underlying changes in the wiring of regulatory networks. We identified QTLs controlling natural variation in stochastic noise of glucosinolates, plant defense metabolites, as well as QTLs for stochastic noise of related transcripts. These loci included stochastic noise QTLs unique for either transcript or metabolite variation. Validation of these loci showed that genetic polymorphism within the regulatory network alters stochastic noise independent of effects on corresponding average levels. We examined this phenomenon more globally, using transcriptomic datasets, and found that the Arabidopsis transcriptome exhibits significant, heritable differences in stochastic noise. Further analysis allowed us to identify QTLs that control genomic stochastic noise. Some genomic QTL were in common with those altering average transcript abundance, while others were unique to stochastic noise. Using a single isogenic population, we confirmed that natural variation at ELF3 alters stochastic noise in the circadian clock and metabolism. Since polymorphisms controlling stochastic noise in genomic phenotypes exist within wild germplasm for naturally selected phenotypes, this suggests that analysis of Arabidopsis evolution should account for genetic control of stochastic variance and average phenotypes. 
    It remains to be determined if natural genetic variation controlling stochasticity is equally distributed across the genomes of other multi-cellular eukaryotes.

    A Systems Biology Approach Identifies a R2R3 MYB Gene Subfamily with Distinct and Overlapping Functions in Regulation of Aliphatic Glucosinolates

    BACKGROUND: Glucosinolates are natural metabolites in the order Brassicales that defend plants against both herbivores and pathogens and can attract specialized insects. Knowledge about the genes controlling glucosinolate regulation is limited. Here, we identify three R2R3 MYB transcription factors regulating aliphatic glucosinolate biosynthesis in Arabidopsis by combining several systems biology tools. METHODOLOGY/PRINCIPAL FINDINGS: MYB28 was identified as a candidate regulator of aliphatic glucosinolates based on its co-localization within a genomic region controlling variation both in aliphatic glucosinolate content (metabolite QTL) and in transcript level for genes involved in the biosynthesis of aliphatic glucosinolates (expression QTL), as well as its co-expression with genes in aliphatic glucosinolate biosynthesis. A phylogenetic analysis with the R2R3 motif of MYB28 showed that it and two homologues, MYB29 and MYB76, were members of an Arabidopsis-specific clade that included three characterized regulators of indole glucosinolates. Over-expression of the individual MYB genes showed that they all had the capacity to increase the production of aliphatic glucosinolates in leaves and seeds and induce gene expression of aliphatic biosynthetic genes within leaves. Analysis of leaves and seeds of single knockout mutants showed that mutants of MYB29 and MYB76 have reductions in only short-chained aliphatic glucosinolates whereas a mutant in MYB28 has reductions in both short- and long-chained aliphatic glucosinolates. Furthermore, analysis of a double knockout in MYB28 and MYB29 identified an emergent property of the system since the absence of aliphatic glucosinolates in these plants could not be predicted by the chemotype of the single knockouts. CONCLUSIONS/SIGNIFICANCE: It seems that these cruciferous-specific MYB regulatory genes have evolved both overlapping and specific regulatory capacities. 
    This provides a unique system within which to study the evolution of MYB regulatory factors and their downstream targets.

    Post hoc pattern matching: assigning significance to statistically defined expression patterns in single channel microarray data

    Background: Researchers using RNA expression microarrays in experimental designs with more than two treatment groups often identify statistically significant genes with ANOVA approaches. However, the ANOVA test does not discriminate which of the multiple treatment groups differ from one another. Thus, post hoc tests, such as linear contrasts, template correlations, and pairwise comparisons, are used. Linear contrasts and template correlations work extremely well, especially when the researcher has a priori information pointing to a particular pattern/template among the different treatment groups. Further, all pairwise comparisons can be used to identify particular, treatment group-dependent patterns of gene expression. However, these approaches are biased by the researcher's assumptions, and some treatment-based patterns may fail to be detected using these approaches. Finally, different patterns may have different probabilities of occurring by chance, importantly influencing researchers' conclusions about a pattern and its constituent genes. Results: We developed a four-step post hoc pattern matching (PPM) algorithm to automate single channel gene expression pattern identification/significance. First, 1-Way Analysis of Variance (ANOVA), coupled with post hoc 'all pairwise' comparisons, is calculated for all genes. Second, for each ANOVA-significant gene, all pairwise contrast results are encoded to create unique pattern ID numbers. The number of genes found in each pattern in the data is identified as that pattern's 'actual' frequency. Third, using Monte Carlo simulations, those patterns' frequencies are estimated in random data ('random' gene pattern frequency). Fourth, a Z-score for overrepresentation of the pattern is calculated ('actual' against 'random' gene pattern frequencies). We wrote a Visual Basic program (StatiGen) that automates the PPM procedure, constructs an Excel workbook with standardized graphs of overrepresented patterns, and lists of the genes comprising each pattern. The Visual Basic code, installation files for StatiGen, and sample data are available as supplementary material. Conclusion: The PPM procedure is designed to augment current microarray analysis procedures by allowing researchers to incorporate all of the information from post hoc tests to establish unique, overarching gene expression patterns in which there is no overlap in gene membership. In our hands, PPM works well for studies using from three to six treatment groups in which the researcher is interested in treatment-related patterns of gene expression. Hardware/software limitations and the extreme number of theoretical expression patterns limit utility for larger numbers of treatment groups. Applied to a published microarray experiment, the StatiGen program successfully flagged patterns that had been manually assigned in prior work, and further identified other gene expression patterns that may be of interest. Thus, over a moderate range of treatment groups, PPM appears to work well. It allows researchers to assign statistical probabilities to patterns of gene expression that fit a priori expectations/hypotheses, it preserves the data's ability to show the researcher interesting, yet unanticipated gene expression patterns, and assigns the majority of ANOVA-significant genes to non-overlapping patterns.
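
The second and fourth PPM steps described in this abstract (encoding pairwise results as a pattern ID, then scoring overrepresentation) can be sketched as follows. The abstract does not specify the encoding or the exact Z-score formula, so the base-3 encoding, the use of the sample standard deviation, and all function names here are assumptions made for illustration; this is not the StatiGen implementation.

```python
import statistics
from collections import Counter


def pattern_id(pairwise_results):
    """Encode pairwise outcomes as one integer (assumed base-3 scheme).

    Each outcome is -1 (first group lower), 0 (no significant difference),
    or +1 (first group higher).
    """
    code = 0
    for outcome in pairwise_results:
        if outcome not in (-1, 0, 1):
            raise ValueError("each pairwise outcome must be -1, 0, or 1")
        code = code * 3 + (outcome + 1)
    return code


def count_patterns(genes):
    """Tally the 'actual' frequency of each pattern ID across genes."""
    return Counter(pattern_id(results) for results in genes)


def overrepresentation_z(actual_count, random_counts):
    """Z-score of the actual frequency against frequencies seen in random data."""
    mean = statistics.mean(random_counts)
    sd = statistics.stdev(random_counts)
    if sd == 0:
        raise ValueError("random frequencies have zero spread")
    return (actual_count - mean) / sd


# Three treatment groups give three pairwise comparisons per gene.
actual = count_patterns([[1, 0, -1], [1, 0, -1], [0, 0, 0]])
z = overrepresentation_z(30, [10, 12, 8, 10])  # made-up counts for illustration
```

With three outcomes per comparison, k treatment groups yield 3 to the power k(k-1)/2 possible codes (not all of them logically consistent), which is in line with the abstract's remark that the number of theoretical patterns limits the approach for larger numbers of groups.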