109 research outputs found
SGN Database: From QTLs to Genomes
Quantitative trait loci (QTL) analysis is used to dissect the genetic basis underlying polygenic traits. Several public databases store QTL data and make it available to research communities. To our knowledge, current QTL databases rely on manual curation, in which curators read the literature and extract relevant QTL information to store in databases. This approach is expensive in expert time and labor, and it limits the type of data that can be curated. At the Solanaceae Genomics Network (SGN) (http://sgn.cornell.edu), we have developed a database to store raw phenotype and genotype data from QTL studies, perform on-the-fly QTL analysis using the R/QTL statistical software (http://www.rqtl.org) and visualize QTLs on a genetic map. Users can identify peak and flanking markers for QTLs of traits of interest. The QTL database is integrated with other SGN databases (e.g. Marker, BACs, and Unigenes) and with analysis tools such as the Comparative Map Viewer. Using the Comparative Map Viewer, users can compare chromosomes with QTL regions to genetic maps of interest from the same or different Solanaceae species. As the tomato genome sequencing advances, users can also identify corresponding BAC sequences or locations on the tomato physical map, which can be suggestive of candidate genes for a trait of interest.

Furthermore, images, quantitative phenotype and genotype data, publications, and genetic maps generated by QTL studies are displayed at SGN and available for download. Currently, data from three F2 and two backcross population QTL studies on fruit morphology traits (18–46 traits per population) are available at the SGN website for viewing at population, accession, and trait levels. Traits are described using ontology terms. Phenotype data are presented in tabular and graphical formats, such as frequency distributions with basic descriptive statistics. Mapping data showing the location of parental alleles on individual accession genetic maps are also available.

SGN is a public database hosted at the Boyce Thompson Institute, Cornell University, and funded by USDA CSREES and NSF.
solQTL: a tool for QTL analysis, visualization and linking to genomes at SGN database
BACKGROUND: A common approach to understanding the genetic basis of complex traits is through identification of associated quantitative trait loci (QTL). Fine mapping QTLs requires several generations of backcrosses and analysis of large populations, which is a time-consuming and costly effort. Furthermore, as entire genomes are being sequenced and an increasing amount of genetic and expression data are being generated, a challenge remains: linking phenotypic variation to the underlying genomic variation. To identify candidate genes and understand the molecular basis underlying the phenotypic variation of traits, bioinformatic approaches are needed to exploit information such as genetic map, expression and whole genome sequence data of organisms in biological databases. DESCRIPTION: The Sol Genomics Network (SGN, http://solgenomics.net) is a primary repository for phenotypic, genetic, genomic, expression and metabolic data for the Solanaceae family and other related Asterids species and houses a variety of bioinformatics tools. SGN has implemented a new approach to QTL data organization, storage, analysis, and cross-links with other relevant data in internal and external databases. The new QTL module, solQTL (http://solgenomics.net/qtl/), employs a user-friendly web interface for uploading raw phenotype and genotype data to the database, R/QTL mapping software for on-the-fly QTL analysis, and algorithms for online visualization and cross-referencing of QTLs to relevant datasets and tools such as the SGN Comparative Map Viewer and Genome Browser. Here, we describe the development of the solQTL module and demonstrate its application. CONCLUSIONS: solQTL allows Solanaceae researchers to upload raw genotype and phenotype data to SGN, perform QTL analysis and dynamically cross-link to relevant genetic, expression and genome annotations. 
Exploration and synthesis of the relevant data is expected to facilitate identification of candidate genes underlying phenotypic variation and of markers more closely linked to QTLs. solQTL is freely available on SGN and can be used in private or public mode.
solGS: a web-based tool for genomic selection
Background: Genomic selection (GS) promises to improve accuracy in estimating breeding values and genetic gain for quantitative traits compared to traditional breeding methods. Its reliance on high-throughput genome-wide markers and its statistical complexity, however, pose a serious challenge in data management, analysis, and sharing. A bioinformatics infrastructure for data storage and access, and a user-friendly web-based tool for analysis and sharing output, are needed to make GS more practical for breeders.
Results: We have developed a web-based tool, called solGS, for predicting genomic estimated breeding values (GEBVs) of individuals, using a Ridge-Regression Best Linear Unbiased Predictor (RR-BLUP) model. It has an intuitive web-interface for selecting a training population for modeling and estimating genomic estimated breeding values of selection candidates. It estimates phenotypic correlation and heritability of traits and selection indices of individuals. Raw data is stored in a generic database schema, Chado Natural Diversity, co-developed by multiple database groups. Analysis output is graphically visualized and can be interactively explored online or downloaded in text format. An instance of its implementation can be accessed at the NEXTGEN Cassava breeding database, http://cassavabase.org/solgs.
Conclusions: solGS enables breeders to store raw data and estimate GEBVs of individuals online, in an intuitive and interactive workflow. It can be adapted to any breeding program.
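To illustrate the kind of model the solGS abstract names, here is a minimal RR-BLUP sketch in Python with NumPy. It is not solGS's implementation, which the abstract does not show. The function names `fit_rrblup` and `predict_gebv` are hypothetical, and the ridge penalty `lam` (the ratio of residual to marker-effect variance) is assumed to be known, whereas in practice it is usually estimated from the data.

```python
import numpy as np


def fit_rrblup(Z, y, lam):
    """Estimate marker effects by ridge regression (RR-BLUP form).

    Z   : (n, m) marker matrix for the training population, e.g. coded 0/1/2
    y   : (n,) phenotypes of the training population
    lam : ridge penalty, assumed known here (sigma_e^2 / sigma_u^2)

    Returns the intercept, the (m,) marker effects, and the column means
    used for centering (needed to center selection candidates the same way).
    """
    Z = np.asarray(Z, dtype=float)
    y = np.asarray(y, dtype=float)
    col_means = Z.mean(axis=0)
    Zc = Z - col_means          # column-centered markers
    mu = y.mean()               # with centered markers, the intercept is the mean
    n = Zc.shape[0]
    # Solve the n x n system, which is cheaper than m x m when markers >> individuals.
    a = np.linalg.solve(Zc @ Zc.T + lam * np.eye(n), y - mu)
    u = Zc.T @ a
    return mu, u, col_means


def predict_gebv(Z_new, u, col_means):
    """Genomic estimated breeding values: centered genotypes times marker effects."""
    return (np.asarray(Z_new, dtype=float) - col_means) @ u


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    Z_train = rng.integers(0, 3, size=(30, 200))   # simulated training genotypes
    y_train = rng.normal(size=30)                  # simulated phenotypes
    Z_cand = rng.integers(0, 3, size=(5, 200))     # simulated selection candidates
    mu, u, means = fit_rrblup(Z_train, y_train, lam=10.0)
    print(predict_gebv(Z_cand, u, means))
```

Solving the system in the space of individuals rather than markers gives the same estimates as the usual ridge solution and suits the typical GS setting of many more markers than phenotyped individuals.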
Co-ordinated regulation of flowering time, plant architecture and growth by FASCICULATE: the pepper orthologue of SELF PRUNING
Wild peppers (Capsicum spp.) are either annual or perennial in their native habitat and their shoot architecture is dictated by their sympodial growth habit. To study shoot architecture in pepper, sympodial development is described in wild type and in the classical recessive fasciculate (fa) mutation. The basic sympodial unit in wild-type pepper comprises two leaves and a single terminal flower. fasciculate plants are characterized by the formation of floral clusters separated by short internodes and miniature leaves and by early flowering. Developmental analysis of these clusters revealed shorter sympodial units and, often, precocious termination prior to sympodial leaf formation. fa was mapped to pepper chromosome 6, in a region corresponding to the tomato SELF-PRUNING (SP) locus, the homologue of TFL1 of Arabidopsis. Sequence comparison between wild-type and fa plants revealed a duplication of the second exon in the mutants' orthologue of SP, leading to the formation of a premature stop codon. Ectopic expression of FASCICULATE complemented the Arabidopsis tfl1 mutant plants and, as expected, stimulated late flowering. In agreement with the major effect of FASCICULATE imposed on sympodial development, the gene transcripts were localized to the centre of sympodial shoots but could not be detected in the primary shoot. The wide range of pleiotropic effects on plant architecture mediated by a single "flowering" gene suggests that it is used to co-ordinate many developmental events, and thus may underlie some of the widespread variation in the Solanaceae shoot architecture.
The Sol Genomics Network (solgenomics.net): growing tomatoes using Perl
The Sol Genomics Network (SGN; http://solgenomics.net/) is a clade-oriented database (COD) containing biological data for species in the Solanaceae and their close relatives, with data types ranging from chromosomes and genes to phenotypes and accessions. SGN hosts several genome maps and sequences, including a pre-release of the tomato (Solanum lycopersicum cv Heinz 1706) reference genome. A new transcriptome component has been added to store RNA-seq and microarray data. SGN is also an open source software project, continuously developing and improving a complex system for storing, integrating and analyzing data. All code and development work is publicly visible on GitHub (http://github.com). The database architecture combines SGN-specific schemas and the community-developed Chado schema (http://gmod.org/wiki/Chado) for compatibility with other genome databases. The SGN curation model is community-driven, allowing researchers to add and edit information using simple web tools. Currently, over a hundred community annotators help curate the database. SGN can be accessed at http://solgenomics.net/
The Sol Genomics Network (SGN)--from genotype to phenotype to breeding
The Sol Genomics Network (SGN, http://solgenomics.net) is a web portal with genomic and phenotypic data, and analysis tools for the Solanaceae family and close relatives. SGN hosts whole genome data for an increasing number of Solanaceae family members including tomato, potato, pepper, eggplant, tobacco and Nicotiana benthamiana. The database also stores loci and phenotype data, which researchers can upload and edit with user-friendly web interfaces. Tools such as BLAST, GBrowse and JBrowse for browsing genomes, expression and map data viewers, a locus community annotation system and QTL analysis tools are available. A new tool, the SGN VIGS tool, was recently implemented to improve Virus-Induced Gene Silencing (VIGS) constructs. With the growing genomic and phenotypic data in the database, SGN is now advancing to develop new web-based breeding tools and implement the code and database structure for other species or clade-specific databases.
TILLING - a shortcut in functional genomics
Recent advances in large-scale genome sequencing projects have opened up new possibilities for the application of conventional mutation techniques in not only forward but also reverse genetics strategies. TILLING (Targeting Induced Local Lesions IN Genomes) was developed a decade ago as an alternative to insertional mutagenesis. It takes advantage of classical mutagenesis, sequence availability and high-throughput screening for nucleotide polymorphisms in a targeted sequence. The main advantage of TILLING as a reverse genetics strategy is that it can be applied to any species, regardless of its genome size and ploidy level. The TILLING protocol provides a high frequency of point mutations distributed randomly in the genome. The great mutagenic potential of chemical agents to generate a high rate of nucleotide substitutions has been proven by the high density of mutations reported for TILLING populations in various plant species. For most of them, the analysis of several genes revealed 1 mutation/200–500 kb screened and much higher densities were observed for polyploid species, such as wheat. High-throughput TILLING permits the rapid and low-cost discovery of new alleles that are induced in plants. Several research centres have established a TILLING public service for various plant species. The recent trends in TILLING procedures rely on the diversification of bioinformatic tools, new methods of mutation detection, including mismatch-specific and sensitive endonucleases, but also various alternatives for LI-COR screening and single nucleotide polymorphism (SNP) discovery using next-generation sequencing technologies. The TILLING strategy has found numerous applications in functional genomics. Additionally, wide applications of this throughput method in basic and applied research have already been implemented through modifications of the original TILLING strategy, such as Ecotilling or Deletion TILLING.
An Induced Mutation in Tomato eIF4E Leads to Immunity to Two Potyviruses
BACKGROUND: The characterization of natural recessive resistance genes and Arabidopsis virus-resistant mutants has implicated translation initiation factors of the eIF4E and eIF4G families as susceptibility factors required for virus infection and resistance function. METHODOLOGY/PRINCIPAL FINDINGS: To investigate further the role of translation initiation factors in virus resistance we set up a TILLING platform in tomato, cloned genes encoding translation initiation factors eIF4E and eIF4G and screened for induced mutations that lead to virus resistance. A splicing mutant of the eukaryotic translation initiation factor, S.l_eIF4E1 G1485A, was identified and characterized with respect to cap binding activity and resistance spectrum. Molecular analysis of the transcript of the mutant form showed that both the second and the third exons were mis-spliced, leading to a truncated mRNA. The resulting truncated eIF4E1 protein is also impaired in cap-binding activity. The mutant line had no growth defect, likely because of functional redundancy with other eIF4E isoforms. When infected with different potyviruses, the mutant line was immune to two strains of Potato virus Y and Pepper mottle virus and susceptible to Tobacco etch virus. CONCLUSIONS/SIGNIFICANCE: Mutation analysis of translation initiation factors shows that translation initiation factors of the eIF4E family are determinants of plant susceptibility to RNA viruses and that viruses have adopted strategies to use different isoforms. This work also demonstrates the effectiveness of TILLING as a reverse genetics tool to improve crop species. We have also developed a complete tool that can be used for both forward and reverse genetics in tomato, for both basic science and crop improvement. By opening it to the community, we hope to fulfill the expectations of both crop breeders and scientists who are using tomato as their model of study.