160 research outputs found

    Combining classifiers to predict gene function in Arabidopsis thaliana using large-scale gene expression measurements

    Background: Arabidopsis thaliana is the model species of current plant genomic research, with a genome size of 125 Mb and approximately 28,000 genes. The function of half of these genes is currently unknown. The purpose of this study is to infer gene function in Arabidopsis using machine-learning algorithms applied to large-scale gene expression data sets, with the goal of identifying genes that are potentially involved in plant response to abiotic stress. Results: Using in-house and publicly available data, we assembled a large set of gene expression measurements for A. thaliana. Using those genes of known function, we first evaluated and compared the ability of basic machine-learning algorithms to predict which genes respond to stress. Predictive accuracy was measured using ROC50 and precision curves derived through cross-validation. To improve accuracy, we developed a method for combining these classifiers using a weighted-voting scheme. The combined classifier was then trained on genes of known function and applied to genes of unknown function, identifying genes that potentially respond to stress. Visual evidence corroborating the predictions was obtained using electronic Northern analysis. Three of the predicted genes were chosen for biological validation. Gene knockout experiments confirmed that all three are involved in a variety of stress responses. The biological analysis of one of these genes (At1g16850) is presented here, where it is shown to be necessary for the normal response to temperature and NaCl. Conclusion: Supervised learning methods applied to large-scale gene expression measurements can be used to predict gene function. However, the ability of basic learning methods to predict stress response varies widely and depends heavily on how much dimensionality reduction is used. Our method of combining classifiers can improve the accuracy of such predictions (in this case, predictions of genes involved in stress response in plants), and it effectively chooses the appropriate amount of dimensionality reduction automatically. The method provides a useful means of identifying genes in A. thaliana that potentially respond to stress, and we expect it would be useful in other organisms and for other gene functions.
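The weighted-voting idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so the weights here are simply assumed to come from each classifier's cross-validated performance; the names `weighted_vote` and `rank_genes` are hypothetical and are not the authors' implementation.

```python
from typing import Dict, List, Sequence


def weighted_vote(scores: Dict[str, List[float]],
                  weights: Dict[str, float]) -> List[float]:
    """Combine per-classifier scores into one score per gene.

    scores[name][i] is classifier `name`'s score for gene i.
    weights[name] is a non-negative weight, assumed here to reflect
    cross-validated accuracy (an assumption, not the paper's formula).
    """
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("at least one classifier needs a positive weight")
    n_genes = len(next(iter(scores.values())))
    combined = [0.0] * n_genes
    for name, gene_scores in scores.items():
        if len(gene_scores) != n_genes:
            raise ValueError("all classifiers must score the same genes")
        for i, score in enumerate(gene_scores):
            # Each classifier contributes in proportion to its weight.
            combined[i] += weights[name] * score / total
    return combined


def rank_genes(gene_ids: Sequence[str], combined: Sequence[float]) -> List[str]:
    """Return gene IDs ordered from highest to lowest combined score."""
    pairs = sorted(zip(gene_ids, combined), key=lambda p: p[1], reverse=True)
    return [gene for gene, _ in pairs]
```

Under this scheme, a classifier that performs poorly in cross-validation receives a small weight, so it has little influence on the final ranking.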

    C-terminal motif prediction in eukaryotic proteomes using comparative genomics and statistical over-representation across protein families

    Background: The carboxy termini of proteins are a frequent site of activity for a variety of biologically important functions, ranging from post-translational modification to protein targeting. Several short peptide motifs involved in protein sorting roles and dependent upon their proximity to the C-terminus for proper function have already been characterized. As a limited number of such motifs have been identified, the potential exists for genome-wide statistical analysis and comparative genomics to reveal novel peptide signatures functioning in a C-terminal dependent manner. We have applied a novel methodology to the prediction of C-terminal-anchored peptide motifs involving a simple z-statistic and several techniques for improving the signal-to-noise ratio. Results: We examined the statistical over-representation of position-specific C-terminal tripeptides in seven eukaryotic proteomes. Sequence randomization models and simple-sequence masking were applied to reduce background noise. Similarly, because C-terminal homology among members of large protein families may artificially inflate tripeptide counts, gene-family clustering was performed prior to the analysis in order to assess tripeptide over-representation across protein families as opposed to across all proteins. Finally, comparative genomics was used to identify tripeptides that occur significantly in multiple species. This approach has been able to predict, to our knowledge, all C-terminally anchored targeting motifs present in the literature. These include the PTS1 peroxisomal targeting signal (SKL*), the ER-retention signal (K/HDEL*), the ER-retrieval signal for membrane-bound proteins (KKxx*), the prenylation signal (CC*) and the CaaX box prenylation motif. In addition to a high statistical over-representation of these known motifs, a collection of significant tripeptides with a high propensity for biological function exists between species, among kingdoms and across eukaryotes. Motifs of note include a serine-acidic peptide (DSD*) as well as several lysine-enriched motifs found in nearly all eukaryotic genomes examined. Conclusion: We have successfully generated a high-confidence representation of eukaryotic motifs anchored at the C-terminus. A high incidence of true positives in our results suggests that several previously unidentified tripeptide patterns are strong candidates for novel peptide motifs widely employed in the C-terminal biology of eukaryotes. Our application of comparative genomics, statistical over-representation and the adjustment for protein family homology has generated several hypotheses concerning the C-terminal topology as it pertains to sorting and potential protein interaction signals. This approach to background reduction could be expanded for application to protein motif prediction in the protein interior. A parallel N-terminal analysis is presented as supplementary data.
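As an illustration of scoring C-terminal tripeptides with a simple z-statistic, the sketch below compares observed tripeptide counts with a binomial expectation. The background model (independent residues drawn at the proteome's overall composition) is an assumption made for brevity; the abstract describes sequence randomization models, masking, and gene-family clustering, none of which are reproduced here. The function name `cterm_tripeptide_z` is hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Iterable


def cterm_tripeptide_z(sequences: Iterable[str]) -> Dict[str, float]:
    """Z-score for each tripeptide observed at the C-terminus.

    Expected counts assume independent residues at the overall amino
    acid composition of the input (a simplifying assumption).
    """
    seqs = [s.upper() for s in sequences if len(s) >= 3]
    n = len(seqs)
    observed = Counter(s[-3:] for s in seqs)

    # Background residue frequencies over all input sequences.
    residues = Counter("".join(seqs))
    total = sum(residues.values())
    freq = {aa: count / total for aa, count in residues.items()}

    z_scores = {}
    for tri, obs in observed.items():
        p = freq[tri[0]] * freq[tri[1]] * freq[tri[2]]
        sd = math.sqrt(n * p * (1.0 - p))  # binomial standard deviation
        z_scores[tri] = (obs - n * p) / sd if sd > 0 else 0.0
    return z_scores
```

With a real proteome, one would rank tripeptides by z-score and then ask which high-scoring tripeptides recur across species, as the abstract describes.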

    The role of the Arabidopsis FUSCA3 transcription factor during inhibition of seed germination at high temperature

    Background: Imbibed seeds integrate environmental and endogenous signals to break dormancy and initiate growth under optimal conditions. Seed maturation plays an important role in determining the survival of germinating seeds; for example, one of the roles of dormancy is to stagger germination to prevent mass growth under suboptimal conditions. The B3-domain transcription factor FUSCA3 (FUS3) is a master regulator of seed development and an important node in hormonal interaction networks in Arabidopsis thaliana. Its function has been mainly characterized during embryonic development, where FUS3 is highly expressed to promote seed maturation and dormancy by regulating ABA/GA levels. Results: In this study, we present evidence for a role of FUS3 in delaying seed germination at supraoptimal temperatures that would be lethal for the developing seedlings. During seed imbibition at supraoptimal temperature, the FUS3 promoter is reactivated and induces de novo synthesis of FUS3 mRNA, followed by FUS3 protein accumulation. Genetic analysis shows that FUS3 contributes to the delay of seed germination at high temperature. Unlike WT, seeds overexpressing FUS3 (ML1:FUS3-GFP) during imbibition are hypersensitive to high temperature and do not germinate; however, they can fully germinate after recovery at control temperature, reaching 90% seedling survival. ML1:FUS3-GFP hypersensitivity to high temperature can be partly recovered in the presence of fluridone, an inhibitor of ABA biosynthesis, suggesting this hypersensitivity is due in part to a higher ABA level in this line. Transcriptomic analysis shows that WT seeds imbibed at supraoptimal temperature activate seed-specific genes and ABA biosynthetic and signaling genes, while inhibiting genes that promote germination and growth, such as GA biosynthetic and signaling genes. Conclusion: In this study, we have uncovered a novel function for the master regulator of seed maturation, FUS3, in delaying germination at supraoptimal temperature. Physiologically, this is important since delaying germination has a protective role at high temperature. Transcriptomic analysis of seeds imbibed at supraoptimal temperature reveals that a complex program is in place, which involves not only the regulation of heat and dehydration response genes to adjust cellular functions, but also the activation of seed-specific programs and the inhibition of germination-promoting programs to delay germination.

    Current status of the multinational Arabidopsis community

    Publisher Copyright: © 2020 The Authors. Plant Direct published by American Society of Plant Biologists and the Society for Experimental Biology and John Wiley & Sons Ltd. The multinational Arabidopsis research community is highly collaborative and over the past thirty years these activities have been documented by the Multinational Arabidopsis Steering Committee (MASC). Here, we (a) highlight recent research advances made with the reference plant Arabidopsis thaliana; (b) provide summaries from recent reports submitted by MASC subcommittees, projects and resources associated with MASC and from MASC country representatives; and (c) initiate a call for ideas and foci for the “fourth decadal roadmap,” which will advise and coordinate the global activities of the Arabidopsis research community. Peer reviewed.

    An extensive (co-)expression analysis tool for the cytochrome P450 superfamily in Arabidopsis thaliana

    Background: Sequencing of the first plant genomes has revealed that cytochromes P450 have evolved to become the largest family of enzymes in secondary metabolism. The proportion of P450 enzymes with characterized biochemical function(s) is, however, very small. If P450 diversification mirrors evolution of chemical diversity, this points to an unexpectedly poor understanding of plant metabolism. We assumed that extensive analysis of gene expression might guide towards the function of P450 enzymes, and highlight overlooked aspects of plant metabolism. Results: We have created a comprehensive database, 'CYPedia', describing P450 gene expression in four data sets: organs and tissues, stress response, hormone response, and mutants of Arabidopsis thaliana, based on public Affymetrix ATH1 microarray expression data. P450 expression was then combined with the expression of 4,130 re-annotated genes, predicted to act in plant metabolism, for co-expression analyses. Based on the annotation of co-expressed genes from diverse pathway annotation databases, co-expressed pathways were identified. Predictions were validated for most P450s with known functions. As examples, co-expression results for P450s related to plastidial functions/photosynthesis, and to phenylpropanoid, triterpenoid and jasmonate metabolism are highlighted here. Conclusion: The large-scale hypothesis generation tools presented here provide leads to new pathways, unexpected functions, and regulatory networks for many P450s in plant metabolism. These can now be exploited by the community to validate the proposed functions experimentally using reverse genetics, biochemistry, and metabolic profiling.

    Project of dimmable lighting system

    This thesis covers the theory and design of artificial lighting, emergency lighting, and lighting control in the auditorium of the Silesian University in Karviná. It also includes measurements of the current state of the lighting system, followed by an energy evaluation and a calculation of electricity savings. The work also includes a design for a new luminaire. The thesis aims to design a lighting system that complies with lighting standards and reduces the energy demand of the new lighting. (410 - Department of Electrical Power Engineering; grade: excellent)

    The embryonic leaf identity gene FUSCA3 regulates vegetative phase transitions by negatively modulating ethylene-regulated gene expression in Arabidopsis

    Background: The embryonic temporal regulator FUSCA3 (FUS3) plays major roles in the establishment of embryonic leaf identity and the regulation of developmental timing. Loss-of-function mutations of this B3 domain transcription factor result in replacement of cotyledons with leaves and precocious germination, whereas constitutive misexpression causes the conversion of leaves into cotyledon-like organs and delays vegetative and reproductive phase transitions. Results: Herein we show that activation of FUS3 after germination dampens the expression of genes involved in the biosynthesis of and response to the plant hormone ethylene, whereas a loss-of-function fus3 mutant shows many phenotypes consistent with increased ethylene signaling. This FUS3-dependent regulation of ethylene signaling also impinges on timing functions outside embryogenesis. Loss of FUS3 function results in accelerated vegetative phase change, and this is again partially dependent on functional ethylene signaling. This alteration in vegetative phase transition is dependent on both embryonic and vegetative FUS3 function, suggesting that this important transcriptional regulator controls both embryonic and vegetative developmental timing. Conclusion: The results of this study indicate that the embryonic regulator FUS3 not only controls the embryonic-to-vegetative phase transition through hormonal (ABA/GA) regulation but also functions postembryonically to delay vegetative phase transitions by negatively modulating ethylene-regulated gene expression.

    CapsID: a web-based tool for developing parsimonious sets of CAPS molecular markers for genotyping

    Background: Genotyping may be carried out by a number of different methods, including direct sequencing and polymorphism analysis. PCR-based polymorphism analysis may be desirable because only small amounts of genetic material are required and the costs are low. One popular and cheap method for detecting polymorphisms uses cleaved amplified polymorphic sequence (CAPS) molecular markers, also known as PCR-RFLP markers. Results: We have developed a program, called CapsID, that identifies snip-SNPs (single nucleotide polymorphisms that alter restriction endonuclease cut sites) within a set or sets of reference sequences, designs PCR primers around these, and then suggests the most parsimonious combination of markers for genotyping any individual who is not a member of the reference set. The output page includes biologist-friendly features, such as images of virtual gels to assist in genotyping efforts. CapsID is freely available at . Conclusion: CapsID is a tool that can rapidly provide minimal sets of CAPS markers for molecular identification purposes for any biologist working in genetics, community genetics, plant and animal breeding, forensics and other fields.
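The snip-SNP concept described here (a SNP that creates or abolishes a restriction site) can be sketched in a few lines. This is not CapsID's code: the function names are hypothetical, the two alleles are assumed to be pre-aligned and of equal length, and only the given strand is searched, which suffices for palindromic sites such as EcoRI (GAATTC) and BamHI (GGATCC).

```python
from typing import Dict, List, Tuple


def site_positions(seq: str, site: str) -> List[int]:
    """Return 0-based start positions of a recognition site in seq."""
    seq, site = seq.upper(), site.upper()
    return [i for i in range(len(seq) - len(site) + 1)
            if seq[i:i + len(site)] == site]


def snip_snp_enzymes(allele_a: str, allele_b: str,
                     enzymes: Dict[str, str]
                     ) -> Dict[str, Tuple[List[int], List[int]]]:
    """Find enzymes whose site positions differ between two alleles.

    Assumes the alleles are aligned and of equal length. Non-palindromic
    sites would also need a reverse-complement search (not done here).
    """
    if len(allele_a) != len(allele_b):
        raise ValueError("alleles must be aligned to the same length")
    informative = {}
    for name, site in enzymes.items():
        pos_a = site_positions(allele_a, site)
        pos_b = site_positions(allele_b, site)
        if pos_a != pos_b:
            # The enzyme cuts the two alleles differently: a CAPS candidate.
            informative[name] = (pos_a, pos_b)
    return informative
```

An enzyme returned by `snip_snp_enzymes` would give different fragment patterns for the two alleles after digestion of the PCR product, which is what makes it usable as a CAPS marker.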