SGN Database: From QTLs to Genomes
Quantitative trait loci (QTL) analysis is used to dissect the genetic basis underlying polygenic traits. Several public databases store QTL data and make them available to research communities. To our knowledge, current QTL databases rely on manual curation, in which curators read the literature and extract relevant QTL information to store in databases. This approach is expensive in terms of expert manpower and time, and it limits the type of data that can be curated. At the Solanaceae Genomics Network (SGN) ("http://sgn.cornell.edu":http://sgn.cornell.edu), we have developed a database to store raw phenotype and genotype data from QTL studies, perform on-the-fly QTL analysis using the R/QTL statistical software ("http://www.rqtl.org":http://www.rqtl.org), and visualize QTLs on a genetic map. Users can identify peak and flanking markers for QTLs of traits of interest. The QTL database is integrated with other SGN databases (e.g., Marker, BACs, and Unigenes) and with analysis tools such as the Comparative Map Viewer. Using the Comparative Map Viewer, users can compare chromosomes carrying QTL regions to genetic maps of interest from the same or different Solanaceae species. As the tomato genome sequencing advances, users can also identify corresponding BAC sequences or locations on the tomato physical map, which can be suggestive of candidate genes for a trait of interest.
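The identification of peak and flanking markers mentioned above can be illustrated with a minimal Python sketch. It is not SGN's implementation: it assumes that a LOD score per marker is already available (for example, from an R/QTL scan) and that the QTL region is delimited by a 1.5-LOD drop from the peak, a common convention that the abstract does not specify. The function name and marker names are hypothetical.

```python
from typing import List, Tuple

# One marker on a chromosome: (name, position in cM, LOD score).
Marker = Tuple[str, float, float]


def peak_and_flanking(markers: List[Marker], drop: float = 1.5) -> Tuple[str, str, str]:
    """Return (peak, left flank, right flank) marker names.

    `markers` must be sorted by position on a single chromosome.
    The flanking markers are the first markers on each side of the peak
    whose LOD falls below (peak LOD - drop), or the chromosome ends if
    the LOD never falls that far. The 1.5-LOD drop is an assumption.
    """
    if not markers:
        raise ValueError("no markers given")

    # Peak marker: the one with the highest LOD score.
    peak_idx = max(range(len(markers)), key=lambda i: markers[i][2])
    threshold = markers[peak_idx][2] - drop

    # Walk left until the LOD drops below the threshold or we hit the end.
    left = peak_idx
    while left > 0 and markers[left][2] >= threshold:
        left -= 1

    # Walk right in the same way.
    right = peak_idx
    while right < len(markers) - 1 and markers[right][2] >= threshold:
        right += 1

    return markers[peak_idx][0], markers[left][0], markers[right][0]


# Hypothetical LOD profile along one chromosome.
profile = [
    ("m1", 0.0, 0.5),
    ("m2", 10.0, 2.0),
    ("m3", 20.0, 4.2),
    ("m4", 30.0, 3.1),
    ("m5", 40.0, 1.0),
]
print(peak_and_flanking(profile))  # ('m3', 'm2', 'm5')
```

Here the peak is m3 (LOD 4.2), the threshold is 2.7, and the nearest markers below it on either side are m2 and m5.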

Furthermore, images, quantitative phenotype and genotype data, publications, and genetic maps generated by QTL studies are displayed at SGN and available for download. Currently, data from three F2 and two backcross population QTL studies on fruit morphology traits (18–46 traits per population) are available at the SGN website for viewing at the population, accession, and trait levels. Traits are described using ontology terms. Phenotype data are presented in tabular and graphical formats, such as frequency distributions with basic descriptive statistics. Mapping data showing the location of parental alleles on individual accession genetic maps are also available.
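As a rough illustration of the frequency distributions and descriptive statistics described above, the following Python sketch summarizes one trait across a population. The function name, the bin width, and the sample values are assumptions made for the example; the abstract does not say which statistics SGN reports.

```python
import statistics
from collections import Counter
from typing import Dict, List


def describe_trait(values: List[float], bin_width: float) -> Dict[str, object]:
    """Summarize one quantitative trait measured across a population.

    Returns basic descriptive statistics plus a frequency distribution,
    keyed by the lower edge of each bin of width `bin_width`.
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")

    # Assign each value to the bin whose lower edge is floor(v / width) * width.
    bins = Counter(int(v // bin_width) * bin_width for v in values)

    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
        "histogram": dict(sorted(bins.items())),
    }


# Hypothetical fruit-shape index values for five accessions.
summary = describe_trait([1.0, 1.5, 2.2, 2.8, 3.1], bin_width=1)
print(summary["histogram"])  # {1: 2, 2: 2, 3: 1}
```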

SGN is a public database hosted at the Boyce Thompson Institute, Cornell University, and funded by USDA CSREES and NSF.
Plant Metabolic Pathways in MetaCyc and SolCyc
MetaCyc is a metabolic encyclopedia of experimentally validated biochemical pathways curated from the scientific literature. It spans all organisms, with an emphasis on plants and microbes. Pathway Tools is a curation software suite that enables curation of reactions, construction of pathways, and annotation with one or more representative enzymes, including information such as substrate specificity, kinetic properties, activators, inhibitors, cofactor requirements, genes (if cloned), and links to external databases. In addition, curators are able to provide concise, review-level summaries and extensive literature citations. The present database release includes more than 1200 pathways from more than 1549 organisms, 7312 reactions, 5127 enzymes, 4748 genes, and 7234 chemical compounds, curated from 17916 citations. MetaCyc is the reference database against which PathoLogic predicts pathways from annotated genomes; the resulting databases are called Pathway/Genome Databases (PGDBs). The BioCyc database collection ("biocyc.org":http://biocyc.org) holds over 300 PGDBs. Each BioCyc database describes the genome and predicted metabolic pathways of a single organism, and these databases are then taken up by interested groups for curation. SolCyc is one such collection of PGDBs, developed for the clade-oriented Solanaceae Genomics Network (SGN) database. It holds predicted metabolic pathway databases for significant species belonging to the Solanaceae and includes Lycocyc (tomato), Solacyc (eggplant), Nicotianacyc (tobacco), Petuniacyc (Petunia), Capcyc (Capsicum), and Potatocyc (potato). An interactive web interface has been developed for the seamless flow of information between the SGN phenotype and locus databases and SolCyc. This allows researchers to search for the metabolic pathway information underlying genes and phenotypes that has been curated into the SolCyc database.
The SOL Genomics Network Model: Making Community Annotation Work
The concept of community annotation is a growing discipline for achieving participation of the research community in depositing up‐to‐date knowledge in biological databases.
The Solanaceae Genomics Network ("SGN":http://sgn.cornell.edu/) is a clade‐oriented database (COD) focusing on plants of the nightshade family, including tomato, potato, pepper, eggplant, and tobacco, and is one of the bioinformatics nodes of the international tomato genome sequencing project. One of our major efforts is linking Solanaceae phenotype information with the underlying genes, and subsequently the genome. As part of this goal, SGN has introduced a database for locus names and descriptors, and a database for phenotypes of natural and induced variation. These two databases have web interfaces that allow cross references, associations with tomato gene models, and in‐house curated information of sequences, literature, ontologies, gene networks, and the Solanaceae biochemical pathways database ("SolCyc":http://solcyc.sgn.cornell.edu). All of our curator tools are open for online community annotation, through specially assigned “submitter” accounts. 

Currently, the community database consists of 5,548 phenotyped accessions and 5,739 curated loci, of which more than 300 loci were contributed or annotated by 66 active submitters, creating a database that is truly community driven.
This framework is easily adaptable for projects working on other taxa (for example, see "http://chlamybase.org":http://chlamybase.org), greatly expanding the application of this user‐friendly online annotation system. Community participation is fostered by an active outreach program that includes contacting potential submitters via email and at meetings and conferences, and promoting featured user‐submitted annotations on the SGN homepage. The source code and database schema for all SGN functionalities are freely available. Please contact SGN at sgn‐feedback[at]sgn.cornell.edu for more information.
solQTL: a tool for QTL analysis, visualization and linking to genomes at SGN database
BACKGROUND: A common approach to understanding the genetic basis of complex traits is through identification of associated quantitative trait loci (QTL). Fine mapping QTLs requires several generations of backcrosses and analysis of large populations, which is a time-consuming and costly effort. Furthermore, as entire genomes are being sequenced and an increasing amount of genetic and expression data are being generated, a challenge remains: linking phenotypic variation to the underlying genomic variation. To identify candidate genes and understand the molecular basis underlying the phenotypic variation of traits, bioinformatic approaches are needed to exploit information such as genetic map, expression and whole genome sequence data of organisms in biological databases. DESCRIPTION: The Sol Genomics Network (SGN, http://solgenomics.net) is a primary repository for phenotypic, genetic, genomic, expression and metabolic data for the Solanaceae family and other related Asterids species and houses a variety of bioinformatics tools. SGN has implemented a new approach to QTL data organization, storage, analysis, and cross-links with other relevant data in internal and external databases. The new QTL module, solQTL (http://solgenomics.net/qtl/), employs a user-friendly web interface for uploading raw phenotype and genotype data to the database, R/QTL mapping software for on-the-fly QTL analysis, and algorithms for online visualization and cross-referencing of QTLs to relevant datasets and tools such as the SGN Comparative Map Viewer and Genome Browser. Here, we describe the development of the solQTL module and demonstrate its application. CONCLUSIONS: solQTL allows Solanaceae researchers to upload raw genotype and phenotype data to SGN, perform QTL analysis and dynamically cross-link to relevant genetic, expression and genome annotations. Exploration and synthesis of the relevant data is expected to help facilitate identification of candidate genes underlying phenotypic variation and markers more closely linked to QTLs. solQTL is freely available on SGN and can be used in private or public mode.
The Sol Genomics Network (SGN)--from genotype to phenotype to breeding
The Sol Genomics Network (SGN, http://solgenomics.net) is a web portal with genomic and phenotypic data, and analysis tools for the Solanaceae family and close relatives. SGN hosts whole genome data for an increasing number of Solanaceae family members including tomato, potato, pepper, eggplant, tobacco and Nicotiana benthamiana. The database also stores loci and phenotype data, which researchers can upload and edit with user-friendly web interfaces. Tools such as BLAST, GBrowse and JBrowse for browsing genomes, expression and map data viewers, a locus community annotation system and QTL analysis tools are available. A new tool, the SGN VIGS tool, was recently implemented to improve Virus-Induced Gene Silencing (VIGS) constructs. With the growing genomic and phenotypic data in the database, SGN is now advancing to develop new web-based breeding tools and implement the code and database structure for other species or clade-specific databases.
Gramene QTL database: development, content and applications
Gramene is a comparative information resource for plants that integrates data across diverse data domains. In this article,
we describe the development of a quantitative trait loci (QTL) database and illustrate how it can be used to facilitate
both forward and reverse genetics research. The QTL database contains the largest online collection of rice QTL data
in the world. Using flanking markers as anchors, QTLs originally reported on individual genetic maps have been systematically
aligned to the rice sequence where they can be searched as standard genomic features. Researchers can determine
whether a QTL co-localizes with other QTLs detected in independent experiments and can combine data from multiple
studies to improve the resolution of a QTL position. Candidate genes falling within a QTL interval can be identified and
their relationship to particular phenotypes can be inferred based on functional annotations provided by ontology terms.
Mutations identified in functional genomics populations and association mapping panels can be aligned with QTL regions
to facilitate fine mapping and validation of gene–phenotype associations. By assembling and integrating diverse types
of data and information across species and levels of biological complexity, the QTL database enhances the potential
to understand and utilize QTL information in biological research.
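Once QTLs are aligned to genome coordinates, the co-localization and candidate-gene queries described above reduce to interval arithmetic. The Python sketch below shows the idea under stated assumptions: it is not Gramene's implementation, the class and function names are hypothetical, and taking the plain intersection of intervals is only the simplest possible stand-in for combining data from multiple studies.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Interval:
    """A named feature (QTL or gene) on a genome sequence."""
    name: str
    chrom: str
    start: int  # bp, inclusive
    end: int    # bp, inclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two features share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def consensus(qtls: List[Interval]) -> Optional[Interval]:
    """Region shared by all given QTLs, or None if they do not all co-localize.

    Intersection is an assumed, deliberately naive way to narrow a QTL
    position using independent experiments.
    """
    if not qtls or any(q.chrom != qtls[0].chrom for q in qtls):
        return None
    start = max(q.start for q in qtls)
    end = min(q.end for q in qtls)
    if start > end:
        return None
    return Interval("consensus", qtls[0].chrom, start, end)


def candidate_genes(qtl: Interval, genes: List[Interval]) -> List[str]:
    """Names of genes that fall within, or overlap, the QTL interval."""
    return [g.name for g in genes if overlaps(qtl, g)]


# Hypothetical QTLs from two studies and hypothetical gene models.
qtl_a = Interval("qtlA", "1", 1000, 5000)
qtl_b = Interval("qtlB", "1", 3000, 8000)
genes = [
    Interval("g1", "1", 2000, 2500),
    Interval("g2", "1", 3500, 4000),
    Interval("g3", "1", 4900, 6000),
    Interval("g4", "2", 3500, 4000),
]

shared = consensus([qtl_a, qtl_b])
print(shared.start, shared.end)        # 3000 5000
print(candidate_genes(shared, genes))  # ['g2', 'g3']
```

Narrowing the region from qtlA alone (1000 to 5000) to the shared region (3000 to 5000) drops g1 from the candidate list, which is the practical benefit of combining independent studies.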
Gramene: a growing plant comparative genomics resource
Gramene (www.gramene.org) is a curated resource
for genetic, genomic and comparative genomics
data for the major crop species, including rice,
maize, wheat and many other plant (mainly grass)
species. Gramene is an open-source project.
All data and software are freely downloadable
through the ftp site (ftp.gramene.org/pub/gramene)
and available for use without restriction. Gramene’s
core data types include genome assembly and
annotations, other DNA/mRNA sequences, genetic
and physical maps/markers, genes, quantitative
trait loci (QTLs), proteins, ontologies, literature
and comparative mappings. Since our last NAR
publication 2 years ago, we have updated these data
types to include new datasets and new connections
among them. Completely new features include
rice pathways for functional annotation of rice
genes; genetic diversity data from rice, maize and
wheat to show genetic variations among different
germplasms; large-scale genome comparisons
among Oryza sativa and its wild relatives for
evolutionary studies; and the creation of orthologous
gene sets and phylogenetic trees among
rice, Arabidopsis thaliana, maize, poplar and several
animal species (for reference purpose). We have
significantly improved the web interface in order
to provide a more user-friendly browsing
experience, including a dropdown navigation
menu system, unified web page for markers,
genes, QTLs and proteins, and enhanced quick
search functions. This is the publisher’s final pdf. The published article is copyrighted by the author(s) and published by Oxford University Press. The published article can be found at: http://nar.oxfordjournals.org/
A Community-Based Annotation Framework for Linking Solanaceae Genomes with Phenomes
The amount of biological data available in the public domain is growing exponentially, and there is an increasing need for infrastructural and human resources to organize, store, and present the data in a proper context. Model organism databases (MODs) invest great efforts to functionally annotate genomes and phenomes by in-house curators. The SOL Genomics Network (SGN; http://www.sgn.cornell.edu) is a clade-oriented database (COD), which provides a more scalable and comparative framework for biological information. SGN has recently spearheaded a new approach by developing community annotation tools to expand its curational capacity. These tools effectively allow some curation to be delegated to qualified researchers while preserving the in-house curators' full editorial control. Here we describe the background, features, implementation, results, and development road map of SGN's community annotation tools for curating genotypes and phenotypes. Since the inception of this project in late 2006, interest and participation from the Solanaceae research community have been strong and growing continuously, to the extent that we plan to expand the framework to accommodate more plant taxa. All data, tools, and code developed at SGN are freely available to download and adapt.