353 research outputs found

    Analysis and visualization of Arabidopsis thaliana GWAS using web 2.0 technologies

    With large-scale genomic data becoming the norm in biological studies, the storing, integrating, viewing and searching of such data have become a major challenge. In this article, we describe the development of an Arabidopsis thaliana database that hosts the geographic information and genetic polymorphism data for over 6000 accessions and genome-wide association study (GWAS) results for 107 phenotypes, representing the largest collection of Arabidopsis polymorphism data and GWAS results to date. Taking advantage of a series of the latest web 2.0 technologies, such as Ajax (Asynchronous JavaScript and XML), GWT (Google Web Toolkit), an MVC (Model-View-Controller) web framework and an Object-Relational Mapper, we have created a web-based application (web app) for the database that offers an integrated and dynamic view of geographic information, genetic polymorphism and GWAS results. Essential search functionalities are incorporated into the web app to aid reverse genetics research. The database and its web app have proven to be a valuable resource to the Arabidopsis community. The whole framework serves as an example of how biological data, especially GWAS, can be presented and accessed through the web. Finally, we illustrate the potential to gain new insights through the web app with two examples, showcasing how it can be used to facilitate forward and reverse genetics research. Database URL: http://arabidopsis.usc.edu

    From Classical to Modern Computational Approaches to Identify Key Genetic Regulatory Components in Plant Biology

    The selection of plant genotypes with improved productivity and tolerance to environmental constraints has always been a major concern in plant breeding. Classical approaches based on the generation of variability and selection of better phenotypes from large variant collections have improved their efficacy and throughput through the implementation of molecular biology techniques, particularly genomics, Next Generation Sequencing and other omics such as proteomics and metabolomics. In this regard, the use of molecular markers to identify interesting variants before they develop the phenotypic trait of interest has advanced the breeding of new varieties. Moreover, the correlation of phenotypic or biochemical traits with gene expression or protein abundance has boosted the identification of potential new regulators of the traits of interest, using a relatively low number of variants. These important breakthrough technologies, built on top of classical approaches, will be improved in the future by including the spatial variable, allowing the identification of gene(s) involved in key processes at the tissue and cell levels.

    Association Mapping in Plant Genomes


    Genetic Control of the Response to Sulfur, Nitrogen, and Phosphorus Supply in Arabidopsis thaliana

    Sulfur deficiency is a relatively new problem in Europe and the studies on sulfur use efficiency are still lagging behind those on the other major nutrients such as nitrogen or phosphorus. Therefore, the main aim of this work was to improve the understanding of the sulfate assimilation pathway, its regulation and interaction with other elements. In the course of this project, natural variation was used to further characterise the regulation of the pathway and to identify new regulatory components. This analysis revealed that the first two enzymes involved in sulfate reduction, ATP sulfurylase and APS reductase, are nearly equally involved in its control but through different mechanisms. Moreover, a Genome-Wide Association Study was conducted on the accumulation of nitrate, phosphate, and sulfate in more than 200 Arabidopsis accessions. This analysis resulted in the identification of new functions of already known genes which were not previously related to plant nutrition. Additionally, previously undescribed genes were identified, the disruption of which results in changes in the anion accumulation phenotype. To characterise the Arabidopsis response to sulfate and/or nitrate deficiency, a collection of genetically divergent accessions grown under different nutrition regimes was examined for a number of morphological and metabolic traits. This analysis resulted in the dissection of four different patterns of plant response to sulfate availability. Individual accessions were characterised as best adapted to nutrient deficiency. Traits such as biomass allocation or root architecture were suggested as potential targets in the process of developing new crop varieties. This analysis is unique since, to my knowledge, it is the first one which provides the characterisation of the Arabidopsis response to nutrient availability based on the analysis of such a large number (25) of natural accessions.
The results described here provide new insight into sulfate metabolism and can be used to develop new breeding strategies and improve crop yield and quality.

    Network and multi-scale signal analysis for the integration of large omic datasets: applications in Populus trichocarpa

    Poplar species are promising sources of cellulosic biomass for biofuels because of their fast growth rate, high cellulose content and moderate lignin content. There is an increasing movement toward integrating multiple layers of omics data in a systems biology approach to understand gene-phenotype relationships and assist in plant breeding programs. This dissertation involves the use of network and signal processing techniques for the combined analysis of these various data types, with the goals of (1) increasing fundamental knowledge of P. trichocarpa and (2) facilitating the generation of hypotheses about target genes and phenotypes of interest. A data integration “Lines of Evidence” method is presented for the identification and prioritization of target genes involved in functions of interest. A new post-GWAS method, Pleiotropy Decomposition, is presented, which extracts pleiotropic relationships between genes and phenotypes from GWAS results, allowing for identification of genes with signatures favorable to genome editing. Continuous wavelet transform signal processing analysis is applied in the characterization of genome distributions of various features (including variant density, gene density, and methylation profiles) in order to identify chromosome structures such as the centromere. This yielded approximate centromere locations on all P. trichocarpa chromosomes, which had previously not been adequately reported in the scientific literature. Discrete wavelet transform signal processing was applied to genomic features from various data types including transposable element density, methylation density, SNP density, gene density, centromere position and putative ancestral centromere position. Subsequent correlation analysis of the resulting wavelet coefficients identified scale-specific relationships between these genomic features, and provided insights into the evolution of the genome structure of P. trichocarpa.
These methods have provided strategies both to increase fundamental knowledge about the P. trichocarpa system and to identify new target genes related to biofuel targets. We intend that these approaches will ultimately be used to design better plants for more efficient and sustainable production of bioenergy.

    Transcriptome-based Gene Networks for Systems-level Analysis of Plant Gene Functions

    Present day genomic technologies are evolving at an unprecedented rate, allowing interrogation of cellular activities with increasing breadth and depth. However, we know very little about how the genome functions and what the identified genes do. The lack of functional annotations of genes greatly limits the post-analytical interpretation of new high throughput genomic datasets. For plant biologists, the problem is much more severe. Less than 50% of all the identified genes in the model plant Arabidopsis thaliana, and only about 20% of all genes in the crop model Oryza sativa, have some aspects of their functions assigned. Therefore, there is an urgent need to develop innovative methods to predict and expand on the currently available functional annotations of plant genes. With open-access catching the ‘pulse’ of modern day molecular research, an integration of the copious amount of transcriptome datasets allows rapid prediction of gene functions in specific biological contexts, which provides added evidence over traditional homology-based functional inference. The main goal of this dissertation was to develop data analysis strategies and tools broadly applicable in systems biology research. Two user-friendly interactive web applications are presented. The Rice Regulatory Network (RRN) captures an abiotic-stress conditioned gene regulatory network designed to facilitate the identification of transcription factor targets during induction of various environmental stresses. The Arabidopsis Seed Active Network (SANe) is a transcriptional regulatory network that encapsulates various aspects of seed formation, including embryogenesis, endosperm development and seed-coat formation. Further, an edge-set enrichment analysis algorithm is proposed that uses network density as a parameter to estimate the gain or loss in correlation of pathways between two conditionally independent coexpression networks.

    Experimental demonstration and pan-structurome prediction of climate-associated riboSNitches in Arabidopsis

    BACKGROUND: Genome-wide association studies (GWAS) aim to correlate phenotypic changes with genotypic variation. Upon transcription, single nucleotide variants (SNVs) may alter mRNA structure, with potential impacts on transcript stability, macromolecular interactions, and translation. However, plant genomes have not been assessed for the presence of these structure-altering polymorphisms or "riboSNitches." RESULTS: We experimentally demonstrate the presence of riboSNitches in transcripts of two Arabidopsis genes, ZINC RIBBON 3 (ZR3) and COTTON GOLGI-RELATED 3 (CGR3), which are associated with continentality and temperature variation in the natural environment. These riboSNitches are also associated with differences in the abundance of their respective transcripts, implying a role in regulating gene expression in adaptation to local climate conditions. We then computationally predict riboSNitches transcriptome-wide in mRNAs of 879 naturally inbred Arabidopsis accessions. We characterize correlations between SNPs/riboSNitches in these accessions and 434 climate descriptors of their local environments, suggesting a role of these variants in local adaptation. We integrate this information in CLIMtools V2.0 and provide a new web resource, T-CLIM, that reveals associations between transcript abundance variation and local environmental variation. CONCLUSION: We functionally validate two plant riboSNitches and, for the first time, demonstrate riboSNitch conditionality dependent on temperature, coining the term "conditional riboSNitch." We provide the first pan-genome-wide prediction of riboSNitches in plants. We expand our previous CLIMtools web resource with riboSNitch information and with 1868 additional Arabidopsis genomes and 269 additional climate conditions, which will greatly facilitate in silico studies of natural genetic variation, its phenotypic consequences, and its role in local adaptation.

    Identification and Characterization of Stress Responsive Genes in Soybean and Sunflower

    Stress responsive genes encode proteins involved in plants’ response to abiotic and biotic stresses. Among such stress responsive proteins, proteins encoded by resistance genes (R genes) or nucleotide binding site-leucine-rich repeats (NBS-LRRs) and mitogen-activated protein kinases (MAPKs) are the major groups of proteins regulating biotic and abiotic stresses, respectively. Previous studies in Nepal’s lab at SDSU identified and characterized coiled coil (CC)-NBS-LRRs (CNLs), resistance to powdery mildew8 (RPW8)-NBS-LRRs (RNLs), NBS-LRR (NLs), and MAPK proteins in soybean. This study focuses on R and MAPK genes in the recently sequenced genome of sunflower as well as the toll-interleukin-1 receptor-like nucleotide-binding site leucine-rich repeat (TNL) R genes of soybean. This study also uses greenhouse experiments and RNA sequencing (RNA-seq) data to characterize stress responsive genes involved in the interaction effects of soybean aphid (SBA) and soybean cyst nematode (SCN) on soybean. Thus the major objectives of this dissertation work were to 1) explore the TNL genes in soybean and R (CNL, TNL, RNL) genes in sunflower genomes to assess how they may have evolved and their possible role in resistance against pathogens using available transcriptomic data, 2) identify and characterize MAPK genes in sunflower, and 3) characterize induced susceptibility effects of the soybean-soybean aphid interaction and interaction effects of soybean aphid and soybean cyst nematode on soybean. In this dissertation, we used in silico approaches to report genome-wide identification and characterization of soybean TNL proteins as well as sunflower R and MAPK proteins. To achieve these objectives, numerous bioinformatics tools were utilized: hidden Markov model (HMM) profiling was performed, and protein domains were annotated.
Maximum Likelihood phylogenetic trees were constructed, and ratios of nonsynonymous substitutions per nonsynonymous site to synonymous substitutions per synonymous site (Ka/Ks) were calculated as a proxy for the selection pressure on R genes. In addition, chromosomal distribution, intron-exon architecture, synteny and gene expression patterns were assessed. In order to characterize stress responsive genes involved in defense responses, we used soybean aphid (Aphis glycines; SBA) and soybean cyst nematode (Heterodera glycines; SCN) to infest soybean cultivars. We conducted greenhouse experiments to characterize induced susceptibility effects of the soybean-SBA interaction, and three-way interactions among soybean, SBA, and SCN. We utilized both demographic and genetic (RNA-seq) datasets to characterize the genes involved in such interactions using biotype 1 and biotype 2 soybean aphids and HG type 0 SCN on soybean. The FastQC, Btrim, Trimmomatic, Salmon, iDEP and MapMan tools were used to assess the quality of, trim, map, assemble and visualize the RNA sequencing data against the host genome, and to analyze pathways and biological significance. We identified an inventory of 117 of 153 regular TNL genes in soybean, and 352 NBS-encoding genes (100 CNLs, 77 TNLs, 13 RNLs, and 162 NLs), 28 MPKs and eight MKKs in sunflower through in silico analyses. R genes in soybean and sunflower formed several gene clusters, suggesting their origin by tandem duplications. The selection pressure analysis revealed R genes experiencing purifying selection (Ka/Ks < 1) in both soybean and sunflower. Sunflower MAP kinases revealed within- and between-clade functional divergence, and MKK3 orthologues were highly conserved across species representing diverse taxonomic groups of the plant kingdom.
Demographic data obtained from greenhouse experiments showed induced susceptibility: initial feeding by virulent SBA (biotype 2) increased the population of subsequent avirulent SBA (biotype 1) in both susceptible and resistant cultivars. In the three-way interaction among soybean, SBA, and SCN, in the presence of SBA the number of SCN eggs was significantly greater on the susceptible cultivar, whereas there was no effect on the resistant cultivar. The SBA population density was negatively affected by SCN populations. RNA-seq analysis in both studies revealed differentially expressed genes (DEGs) and transcription factor (TF) binding motifs, which were enriched for various biological processes and pathways at different time points. DEGs that were either shared among or unique to the susceptible and resistant cultivars and treatments were enriched for various biological processes and pathways. These DEGs were also functionally related to known defense mechanisms previously reported in various host-aphid and host-nematode systems. The responses to aphid biotype 1 infestation in the presence or absence of an inducer population (biotype 2) at two time points (days 1 and 11 post inducer infestation) revealed significant differences in gene enrichment and regulation in SBA resistant and susceptible cultivars. For instance, enrichment analysis showed ‘response to chitin’, ‘lignin catabolic and metabolic process’, ‘asparagine metabolic process’ and ‘response to chemical’ to be unique to the treatment with no inducer population, whereas ‘response to reactive oxygen species’, ‘photosynthesis’ and ‘regulation of endopeptidase activity’ were unique to the treatment with an inducer population.
Likewise, the soybean-SBA-SCN interaction study showed enrichment of genes in the ‘Plant Pathogen Interaction’ and ‘cutin, suberine, and wax biosynthesis’ pathways at 5 days post SBA infestation (dpi), and in the ‘isoflavonoid biosynthesis’ and ‘one carbon pool by folate’ pathways at 30 dpi, in SCN resistant and susceptible cultivars. Overall, the results from this study have improved the current understanding of the diversity and evolution of MAPK and R genes in sunflower and soybean, and have reported for the first time a molecular characterization of induced susceptibility effects due to SBA on soybean and of soybean-SBA-SCN interactions, which has direct implications for disease and pest management.

    Genomic analyses provide insights into peach local adaptation and responses to climate change

    The environment has constantly shaped plant genomes, but the genetic bases underlying how plants adapt to environmental influences remain largely unknown. We constructed a high-density genomic variation map of 263 geographically representative peach landraces and wild relatives. A combination of whole-genome selection scans and genome-wide environmental association studies (GWEAS) was performed to reveal the genomic bases of peach adaptation to diverse climates. A total of 2092 selective sweeps that underlie local adaptation to both mild and extreme climates were identified, including 339 sweeps conferring a genomic pattern of adaptation to high altitudes. Using GWEAS, a total of 2755 genomic loci strongly associated with 51 specific environmental variables were detected. The molecular mechanisms underlying adaptive evolution of high drought, strong UVB, cold hardiness, sugar content, flesh color, and bloom date were revealed. Finally, based on 30 years of observation, a candidate gene associated with bloom date advance, representing peach responses to global warming, was identified. Collectively, our study provides insights into the molecular bases of how environments have shaped peach genomes by natural selection and adds candidate genes for future studies on evolutionary genetics, adaptation to climate changes, and breeding.

    PREDICTING COMPLEX PHENOTYPE-GENOTYPE RELATIONSHIPS IN GRASSES: A SYSTEMS GENETICS APPROACH

    It is becoming increasingly urgent to identify and understand the mechanisms underlying complex traits. Expected increases in the human population coupled with climate change make this especially urgent for grasses in the Poaceae family because these serve as major staples of the human and livestock diets worldwide. In particular, Oryza sativa (rice), Triticum spp. (wheat), Zea mays (maize), and Saccharum spp. (sugarcane) are among the top agricultural commodities. Molecular marker tools such as linkage-based Quantitative Trait Loci (QTL) mapping, Genome-Wide Association Studies (GWAS), Multiple Marker Assisted Selection (MMAS), and Genome Selection (GS) techniques offer promise for understanding the mechanisms behind complex traits and for improving breeding programs. These methods have shown some success. Often, however, they can identify neither the causal genes underlying traits nor the biological context in which those genes function. To improve our understanding of complex traits as well as improve breeding techniques, additional tools are needed to augment existing methods. This work proposes a knowledge-independent systems-genetic paradigm that integrates results from genetic studies such as QTL mapping, GWAS and mutational insertion lines such as Tos17 with gene co-expression networks for grasses, in particular for rice. The techniques described herein attempt to overcome the bias of limited human knowledge by relying solely on the underlying signals within the data to capture a holistic representation of gene interactions for a species. Through integration of gene co-expression networks with genetic signal, modules of genes can be identified with potential effect for a given trait, and the biological function of those interacting genes can be determined.