18 research outputs found

    SynFind: Compiling Syntenic Regions across Any Set of Genomes on Demand

    Get PDF
    The identification of conserved syntenic regions enables discovery of predicted locations for orthologous and homeologous genes, evenwhennosuchgeneispresent.Thiscapabilitymeansthatsynteny-basedmethodsarefarmoreeffectivethansequencesimilaritybased methods in identifying true-negatives, a necessity forstudying gene loss and gene transposition. However, the identification of syntenicregionsrequirescomplexanalyseswhichmustberepeatedforpairwisecomparisonsbetweenanytwospecies.Therefore,as the number of published genomes increases, there is a growing demand for scalable, simple-to-use applications to perform comparative genomic analyses that cater to both gene family studies and genome-scale studies. We implemented SynFind, a web-based tool that addresses this need. Given one query genome, SynFind is capable of identifying conserved syntenic regions in any set of targetgenomes.SynFindiscapableofreportingper-geneinformation,usefulforresearchersstudyingspecificgenefamilies,aswellas genome-wide data sets of syntenic gene and predicted gene locations, critical for researchers focused on large-scale genomic analyses. Inference of syntenic homologs provides the basis for correlation of functional changes around genes of interests between related organisms. Deployed on the CoGe online platform, SynFind is connected to the genomic data from over 15,000 organisms from all domains of life as well as supporting multiple releases of the same organism. SynFind makes use of a powerful job execution framework that promises scalability and reproducibility. SynFind can be accessed at http://genomevolution.org/CoGe/SynFind.pl. A video tutorial of SynFind using Phytophthrora as an example is available at http://www.youtube.com/watch?v=2Agczny9Nyc

    SynFind: Compiling Syntenic Regions across Any Set of Genomes on Demand

    Get PDF
    The identification of conserved syntenic regions enables discovery of predicted locations for orthologous and homeologous genes, evenwhennosuchgeneispresent.Thiscapabilitymeansthatsynteny-basedmethodsarefarmoreeffectivethansequencesimilaritybased methods in identifying true-negatives, a necessity forstudying gene loss and gene transposition. However, the identification of syntenicregionsrequirescomplexanalyseswhichmustberepeatedforpairwisecomparisonsbetweenanytwospecies.Therefore,as the number of published genomes increases, there is a growing demand for scalable, simple-to-use applications to perform comparative genomic analyses that cater to both gene family studies and genome-scale studies. We implemented SynFind, a web-based tool that addresses this need. Given one query genome, SynFind is capable of identifying conserved syntenic regions in any set of targetgenomes.SynFindiscapableofreportingper-geneinformation,usefulforresearchersstudyingspecificgenefamilies,aswellas genome-wide data sets of syntenic gene and predicted gene locations, critical for researchers focused on large-scale genomic analyses. Inference of syntenic homologs provides the basis for correlation of functional changes around genes of interests between related organisms. Deployed on the CoGe online platform, SynFind is connected to the genomic data from over 15,000 organisms from all domains of life as well as supporting multiple releases of the same organism. SynFind makes use of a powerful job execution framework that promises scalability and reproducibility. SynFind can be accessed at http://genomevolution.org/CoGe/SynFind.pl. A video tutorial of SynFind using Phytophthrora as an example is available at http://www.youtube.com/watch?v=2Agczny9Nyc

    SyMAP v3.4: a turnkey synteny system with application to plant genomes

    Get PDF
    SyMAP (Synteny Mapping and Analysis Program) was originally developed to compute synteny blocks between a sequenced genome and a FPC map, and has been extended to support pairs of sequenced genomes. SyMAP uses MUMmer to compute the raw hits between the two genomes, which are then clustered and filtered using the optional gene annotation. The filtered hits are input to the synteny algorithm, which was designed to discover duplicated regions and form larger-scale synteny blocks, where intervening micro-rearrangements are allowed. SyMAP provides extensive interactive Java displays at all levels of resolution along with simultaneous displays of multiple aligned pairs. The synteny blocks from multiple chromosomes may be displayed in a high-level dot plot or three-dimensional view, and the user may then drill down to see the details of a region, including the alignments of the hits to the gene annotation. These capabilities are illustrated by showing their application to the study of genome duplication, differential gene loss and transitive homology between sorghum, maize and rice. The software may be used from a website or standalone for the best performance. A project manager is provided to organize and automate the analysis of multi-genome groups. The software is freely distributed at http://www.agcol.arizona.edu/software/symap

    Sequencing, Mapping, and Analysis of 27,455 Maize Full-Length cDNAs

    Get PDF
    Full-length cDNA (FLcDNA) sequencing establishes the precise primary structure of individual gene transcripts. From two libraries representing 27 B73 tissues and abiotic stress treatments, 27,455 high-quality FLcDNAs were sequenced. The average transcript length was 1.44 kb including 218 bases and 321 bases of 5′ and 3′ UTR, respectively, with 8.6% of the FLcDNAs encoding predicted proteins of fewer than 100 amino acids. Approximately 94% of the FLcDNAs were stringently mapped to the maize genome. Although nearly two-thirds of this genome is composed of transposable elements (TEs), only 5.6% of the FLcDNAs contained TE sequences in coding or UTR regions. Approximately 7.2% of the FLcDNAs are putative transcription factors, suggesting that rare transcripts are well-enriched in our FLcDNA set. Protein similarity searching identified 1,737 maize transcripts not present in rice, sorghum, Arabidopsis, or poplar annotated genes. A strict FLcDNA assembly generated 24,467 non-redundant sequences, of which 88% have non-maize protein matches. The FLcDNAs were also assembled with 41,759 FLcDNAs in GenBank from other projects, where semi-strict parameters were used to identify 13,368 potentially unique non-redundant sequences from this project. The libraries, ESTs, and FLcDNA sequences produced from this project are publicly available. The annotated EST and FLcDNA assemblies are available through the maize FLcDNA web resource (www.maizecdna.org)

    Balancing, Proportionality, and Constitutional Rights

    Get PDF
    In the theory and practice of constitutional adjudication, proportionality review plays a crucial role. At a theoretical level, it lies at core of the debate on rights adjudication; in judicial practice, it is a widespread decision-making model characterizing the action of constitutional, supra-national and international courts. Despite its circulation and centrality in contemporary legal discourse, proportionality in rights-adjudication is still extremely controversial. It raises normative questions—concerning its justification and limits—and descriptive questions—regarding its nature and distinctive features. The chapter addresses both orders of questions. Part I centres on the justification of proportionality review, the connection between proportionality, balancing and theories of rights and the critical aspects of this connection. Part II identifies and analyses the different forms of proportionality both in review, as a template for rights-adjudication, and of review, as a way of defining the scope and limits of adjudication

    fRNAkenseq: a fully powered-by-CyVerse cloud integrated RNA-sequencing analysis tool

    No full text
    Background: Decreasing costs make RNA sequencing technologies increasingly affordable for biologists. However, many researchers who can now afford sequencing lack access to resources necessary for downstream analysis. This means that even as algorithms to process RNA-Seq data improve, many biologists still struggle to manage the sheer volume of data produced by next generation sequencing (NGS) technologies. Scalable bioinformatics tools that exploit multiple platforms are needed to democratize bioinformatics resources in the sequencing era. This is essential for equipping many research groups in the life sciences with the tools to process the increasingly unwieldy datasets they produce. Methods: One strategy to address this challenge is to develop a modern generation of sequence analysis tools capable of seamless data sharing and communication. Such tools will provide interoperability through offerings of interlinked resources. Systems of interlinked, scalable resources, which often incorporate cloud data storage, are broadly referred to as cyberinfrastructure. Cyberinfrastructure integrated tools will help researchers to robustly analyze large scale datasets by efficiently sharing data burdens across a distributed architecture. Additionally, interoperability will allow emerging tools to cross-adapt features of existing tools. It is important that these tools are designed to be easy to use for biologists. Results: We introduce fRNAkenseq, a powered-by-CyVerse RNA sequencing analysis tool that exhibits interoperability with other resources and meets the needs of biologists for comprehensive, easy to use RNA sequencing analysis. fRNAkenseq leverages a complex set of Application Programming Interfaces (APIs) associated with the NSF-funded cyberinfrastructure project, CyVerse, to execute FASTQ-to-differential expression RNA-Seq analyses. Integrating across bioinformatics platforms, fRNAkenseq also exploits cloud integration and cross-talk with another CyVerse associated tool, CoGe. fRNAkenseq offers novel features for the biologist such as more robust and comprehensive pipelines for enrichment than those currently available by default in a single tool, whether they are cloud-based or local installation. Importantly, cross-talk with CoGe allows fRNAkenseq users to execute RNA-Seq pipelines on an inventory of 47,000 archived genomes stored in CoGe or upload their own draft genome.Open access journalThis item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at [email protected]

    PAVE: Program for assembling and viewing ESTs

    No full text
    Abstract Background New sequencing technologies are rapidly emerging. Many laboratories are simultaneously working with the traditional Sanger ESTs and experimenting with ESTs generated by the 454 Life Science sequencers. Though Sanger ESTs have been used to generate contigs for many years, no program takes full advantage of the 5' and 3' mate-pair information, hence, many tentative transcripts are assembled into two separate contigs. The new 454 technology has the benefit of high-throughput expression profiling, but introduces time and space problems for assembling large contigs. Results The PAVE (Program for Assembling and Viewing ESTs) assembler takes advantage of the 5' and 3' mate-pair information by requiring that the mate-pairs be assembled into the same contig and joined by n's if the two sub-contigs do not overlap. It handles the depth of 454 data sets by "burying" similar ESTs during assembly, which retains the expression level information while circumventing time and space problems. PAVE uses MegaBLAST for the clustering step and CAP3 for assembly, however it assembles incrementally to enforce the mate-pair constraint, bury ESTs, and reduce incorrect joins and splits. The PAVE data management system uses a MySQL database to store multiple libraries of ESTs along with their metadata; the management system allows multiple assemblies with variations on libraries and parameters. Analysis routines provide standard annotation for the contigs including a measure of differentially expressed genes across the libraries. A Java viewer program is provided for display and analysis of the results. Our results clearly show the benefit of using the PAVE assembler to explicitly use mate-pair information and bury ESTs for large contigs. Conclusion The PAVE assembler provides a software package for assembling Sanger and/or 454 ESTs. The assembly software, data management software, Java viewer and user's guide are freely available.</p
    corecore