267 research outputs found

    ORegAnno 3.0: A community-driven resource for curated regulatory annotation

    The Open Regulatory Annotation database (ORegAnno) is a resource for curated regulatory annotation. It contains information about regulatory regions, transcription factor binding sites, RNA binding sites, regulatory variants, haplotypes, and other regulatory elements. ORegAnno differentiates itself from other regulatory resources by facilitating crowd-sourced interpretation and annotation of regulatory observations from the literature and highly curated resources. It contains a comprehensive annotation scheme that aims to describe both the elements and outcomes of regulatory events. Moreover, ORegAnno assembles these disparate data sources and annotations into a single, high-quality catalogue of curated regulatory information. The current release is an update of the database previously featured in the NAR Database Issue, and now contains 1 948 307 records, across 18 species, with a combined coverage of 334 215 080 bp. Complete records, annotation, and other associated data are available for browsing and download at http://www.oreganno.org/

    Bacterial endosymbiont Cardinium cSfur genome sequence provides insights for understanding the symbiotic relationship in Sogatella furcifera host

    Background: Sogatella furcifera is a migratory pest that damages rice plants and causes severe economic losses. Due to its ability to annually migrate long distances, S. furcifera has emerged as a major pest of rice in several Asian countries. Symbiotic relationships of inherited bacteria with terrestrial arthropods have significant implications. The genus Cardinium is present in many types of arthropods, where it influences some host characteristics. We present a report of a newly identified strain of the bacterial endosymbiont Cardinium cSfur in S. furcifera. Result: From the whole genome of S. furcifera previously sequenced by our laboratory, we assembled the whole genome sequence of Cardinium cSfur. The sequence comprised 1,103,593 bp with a GC content of 39.2%. The phylogenetic tree of the Bacteroidetes phylum to which Cardinium cSfur belongs suggests that Cardinium cSfur is closely related to the other strains (Cardinium cBtQ1 and cEper1) that are members of the Amoebophilaceae family. Genome comparison between host-dependent endosymbionts, including Cardinium cSfur, and free-living bacteria revealed that the endosymbionts have a smaller genome size and lower GC content, and have lost some genes related to metabolism because of their special environment, which is similar to the genome pattern observed in other insect symbionts. Cardinium cSfur has limited metabolic capability, which makes it less contributive to metabolic and biosynthetic processes in its host. From our findings, we inferred that, to compensate for its limited metabolic capability, Cardinium cSfur harbors a relatively high proportion of transport proteins, which might act as the hub between it and its host. With its acquisition of the whole operon related to biotin synthesis and of glycolysis-related genes through horizontal gene transfer (HGT) events, Cardinium cSfur seems to be undergoing changes while establishing a symbiotic relationship with its host.
    Conclusion: A novel bacterial endosymbiont strain (Cardinium cSfur) has been discovered. A genomic analysis of the endosymbiont in S. furcifera suggests that its genome has undergone certain changes to facilitate its settlement in the host. The envisaged potential reproduction manipulative ability of the new endosymbiont strain in its S. furcifera host has vital implications in designing eco-friendly approaches to combat the insect pest.
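The GC content quoted in the abstract above (39.2% over 1,103,593 bp) is a simple base-composition statistic. The following is a minimal sketch of how such a figure is commonly computed; the function name `gc_content` and the toy sequence are illustrative assumptions and are not taken from the paper, whose actual pipeline is not described here.

```python
def gc_content(sequence: str) -> float:
    """Return the GC percentage of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted; anything else,
    such as N or gap characters, is ignored. This convention is an
    assumption of the sketch, not something stated in the abstract.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / total


# Hypothetical toy sequence: 1 G and 1 C out of 10 bases.
toy_sequence = "ATGCATTTAA"
print(f"GC content: {gc_content(toy_sequence):.1f}%")
```

For a real assembly, the same function could be applied to the concatenated contigs read from a FASTA file.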

    Rice TOGO Browser: A Platform to Retrieve Integrated Information on Rice Functional and Applied Genomics

    The Rice TOGO Browser is an online public resource designed to facilitate integration and visualization of mapping data of bacterial artificial chromosome (BAC)/P1-derived artificial chromosome (PAC) clones, genes, restriction fragment length polymorphism (RFLP)/simple sequence repeat (SSR) markers and phenotype data represented as quantitative trait loci (QTLs) onto the genome sequence, and to provide a platform for more efficient utilization of genome information from the point of view of applied genomics as well as functional genomics. Three search options, namely keyword search, region search and trait search, generate various types of data in a user-friendly interface with three distinct viewers, a chromosome viewer, an integrated map viewer and a sequence viewer, thereby providing the opportunity to view the position of genes and/or QTLs at the chromosomal level and to retrieve any sequence information in a user-defined genome region. Furthermore, the gene list, marker list and genome sequence in a specified region delineated by RFLP/SSR markers and any sequences designed as primers can be viewed and downloaded to support forward genetics approaches. An additional feature of this database is the graphical viewer for BLAST search to reveal information not only for regions with significant sequence similarity but also for regions adjacent to those with similarity but with no hits between sequences. An easy to use and intuitive user interface can help a wide range of users in retrieving integrated mapping information including agronomically important traits on the rice genome sequence. The database can be accessed at http://agri-trait.dna.affrc.go.jp/

    Tackling hypotheticals in helminth genomes

    Advancements in genome sequencing have led to the rapid accumulation of uncharacterized ‘hypothetical proteins’ in the public databases. Here we provide a community perspective and some best-practice approaches for the accurate functional annotation of uncharacterized genomic sequences.

    In Silico Identification of Short Nucleotide Sequences Associated with Gene Expression of Pollen Development in Rice

    Microarray analysis of tiny amounts of RNA extracted from plant section samples prepared by laser microdissection (LM) can provide high-quality information on gene expression in specified plant cells at various stages of development. Having joined the LM-microarray analysis project, we utilized such genome-wide gene expression data from developing rice pollen cells to identify candidates for cis-regulatory elements for specific gene expression in these cells. We first found a few clusters of gene expression patterns based on the data from LM-microarrays. On one gene cluster in which the members were specifically expressed at the bicellular and mature pollen mitotic stages, we identified gene cluster fingerprints (GCFs), each of which consists of a short nucleotide sequence representing the gene cluster. We expected that these GCFs would contain cis-regulatory elements for stage- and tissue-specific gene expression, and we further identified groups of GCFs with common core sequences. Some criteria, such as frequency of occurrence in the gene cluster in contrast to the total tested gene set, flanking sequence preference and distribution of combined GCF sets in the gene regions, allowed us to limit candidates for cis-regulatory sequences for specific gene expression in rice pollen cells to at least 20 sets of combined GCFs. This approach should provide a general-purpose algorithm for identifying short nucleotide sequences associated with specific gene expression.
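One criterion mentioned in the abstract above is how often a short sequence occurs in the gene cluster compared with the total tested gene set. The sketch below illustrates that general idea with a plain k-mer presence ratio. It is not the authors' GCF algorithm: the function names, the pseudocount, the `min_ratio` threshold and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter


def kmer_presence(sequences, k):
    """Count, for each k-mer, how many sequences contain it at least once."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # A set ensures each sequence contributes at most once per k-mer.
        seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        counts.update(seen)
    return counts


def enriched_kmers(cluster, background, k, min_ratio=2.0):
    """Return (k-mer, ratio) pairs over-represented in the cluster.

    ratio = fraction of cluster sequences containing the k-mer divided by
    the (pseudocount-smoothed) fraction of background sequences containing it.
    """
    cluster_counts = kmer_presence(cluster, k)
    background_counts = kmer_presence(background, k)
    results = []
    for kmer, n_cluster in cluster_counts.items():
        frac_cluster = n_cluster / len(cluster)
        # Pseudocount of 1 avoids division by zero for unseen k-mers.
        frac_background = (background_counts.get(kmer, 0) + 1) / (len(background) + 1)
        ratio = frac_cluster / frac_background
        if ratio >= min_ratio:
            results.append((kmer, ratio))
    # Highest ratio first; ties broken alphabetically for stable output.
    return sorted(results, key=lambda item: (-item[1], item[0]))


# Hypothetical toy data: two "cluster" upstream sequences and a background
# set that, like the "total tested gene set", also contains the cluster.
cluster_seqs = ["AAGTCA", "TTGTCA"]
background_seqs = cluster_seqs + ["AAAAAA", "CCCCCC", "TTTTTT", "ACACAC"]

for kmer, ratio in enriched_kmers(cluster_seqs, background_seqs, k=4):
    print(kmer, round(ratio, 2))
```

A real analysis would add a statistical test and the other criteria the abstract lists, such as flanking sequence preference and positional distribution, which this sketch does not attempt.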