44 research outputs found

    ABrowse - a customizable next-generation genome browser framework

    Background: With the rapid growth of genome sequencing projects, genome browsers are becoming indispensable, not only as visualization systems but also as interactive platforms that support open data access and collaborative work. A customizable genome browser framework with rich functions and flexible configuration is therefore needed to facilitate various genome research projects. Results: Based on next-generation web technologies, we have developed ABrowse, a general-purpose genome browser framework that provides an interactive browsing experience, open data access and collaborative work support. By supporting Google-map-like smooth navigation, ABrowse offers end users a highly interactive browsing experience. To facilitate further data analysis, multiple data access approaches are supported so that external platforms can retrieve data from ABrowse. To promote collaborative work, an online user space is provided for end users to create, store and share comments, annotations and landmarks. For data providers, ABrowse is highly customizable and configurable. The framework provides a set of utilities for importing annotation data conveniently. To build ABrowse on existing annotation databases, data providers can specify SQL statements according to the database schema. Customized pages for displaying detailed information about annotation entries can be easily plugged in. For developers, new drawing strategies can be integrated into ABrowse for new types of annotation data. In addition, a standard web service is provided for remote data retrieval, offering an underlying machine-oriented programming interface for open data access. Conclusions: The ABrowse framework is valuable for end users, data providers and developers, providing rich user functions and flexible customization approaches. The source code is published under the GNU Lesser General Public License v3.0 and is accessible at http://www.abrowse.org/. To demonstrate the features of ABrowse, a live demo for the Arabidopsis thaliana genome has been built at http://arabidopsis.cbi.edu.cn/.

    Expression pattern divergence of duplicated genes in rice

    Background: Genome-wide duplication is ubiquitous during diversification of the angiosperms, and gene duplication is one of the most important mechanisms for evolutionary novelty. As an indicator of functional evolution, the divergence of expression patterns following duplication events has drawn great attention in recent years. Using large-scale whole-genome microarray data, we systematically analyzed expression divergence patterns of rice genes from block, tandem and dispersed duplications. Results: We found a significant difference in expression divergence patterns among the three types of duplicated gene pairs. Expression correlation is significantly higher for gene pairs from block and tandem duplications than for those from dispersed duplications. Furthermore, a significant correlation was observed between expression divergence and the synonymous substitution rate, which is an approximate proxy for divergence time. Thus, both duplication type and divergence time influence the difference in expression divergence. Using a linear model, we investigated the influence of these two variables and found that the difference in expression divergence between block and dispersed duplicates is attributed largely to their different divergence times. In addition, the difference in expression divergence between tandem duplicates and the other two types is attributed to both divergence time and duplication type. Conclusion: Consistent with previous studies on Arabidopsis, our results revealed a significant difference in expression divergence between the types of duplicated genes and a significant correlation between expression divergence and synonymous substitution rate. We found that the attribution of duplication mode to the expression divergence implies a different evolutionary course of duplicated genes.

    BOAT: Basic Oligonucleotide Alignment Tool

    Background: Next-generation DNA sequencing technologies generate tens of millions of sequencing reads in one run. These technologies are now widely used in biology research such as in genome-wide identification of polymorphisms, transcription factor binding sites, methylation states, and transcript expression profiles. Mapping the sequencing reads to reference genomes efficiently and effectively is one of the most critical analysis tasks. Although several tools have been developed, their performance suffers when both multiple substitutions and insertions/deletions (indels) occur together. Results: We report a new algorithm, Basic Oligonucleotide Alignment Tool (BOAT), that can accurately and efficiently map sequencing reads back to the reference genome. BOAT can handle several substitutions and indels simultaneously, a useful feature for identifying SNPs and other genomic structural variations in functional genomic studies. For better handling of low-quality reads, BOAT supports a "3'-end Trimming Mode" to build local optimized alignment for sequencing reads, further improving sensitivity. BOAT calculates an E-value for each hit as a quality assessment and provides customizable post-mapping filters for further mapping quality control. Conclusion: Evaluations on both real and simulation datasets suggest that BOAT is capable of mapping large volumes of short reads to reference sequences with better sensitivity and lower memory requirement than other currently existing algorithms. The source code and pre-compiled binary packages of BOAT are publicly available for download at http://boat.cbi.pku.edu.cn under GNU Public License (GPL). BOAT can be a useful new tool for functional genomics studies.

    WebLab: a data-centric, knowledge-sharing bioinformatic platform

    With the rapid progress of biological research, there is great demand for integrative knowledge-sharing systems that efficiently support collaboration among biological researchers from various fields. To fulfill such requirements, we have developed WebLab, a data-centric knowledge-sharing platform for biologists to fetch, analyze, manipulate and share data under an intuitive web interface. Dedicated space is provided for users to store their input data and analysis results. Users can upload local data or fetch public data from remote databases, and then perform analysis using more than 260 integrated bioinformatic tools. These tools can be further organized into customized analysis workflows to accomplish complex tasks automatically. In addition to conventional biological data, WebLab also provides rich support for scientific literature, such as searching the full text of uploaded literature and exporting citations to well-known citation managers such as EndNote and BibTeX. To facilitate teamwork among colleagues, WebLab provides a powerful and flexible sharing mechanism, which allows users to share input data, analysis results, scientific literature and customized workflows with specified users or groups under sophisticated privilege settings. WebLab is publicly available at http://weblab.cbi.pku.edu.cn, with all source code released as Free Software.

    PlantTFDB 2.0: update and improvement of the comprehensive plant transcription factor database

    We updated the plant transcription factor (TF) database to version 2.0 (PlantTFDB 2.0, http://planttfdb.cbi.pku.edu.cn), which contains 53 319 putative TFs predicted from 49 species. These TFs are classified into 58 newly defined families using a computational approach and manual inspection. We provide detailed annotation for them, including general information, domain features, gene ontology, expression patterns and ortholog groups, as well as cross references to various databases and literature citations. Multiple sequence alignments and phylogenetic trees for each family can be shown as Weblogo pictures or downloaded as text files. We have redesigned the user interface in the new version. Users can search TFs with much more flexibility through the improved advanced search page, and the search results can be exported into various formats for further analysis. In addition, we now provide a web service for advanced users to access PlantTFDB 2.0 more efficiently.

    Geochemistry of soil gas in the seismic fault zone produced by the Wenchuan Ms 8.0 earthquake, southwestern China

    The spatio-temporal variations of soil gas in the seismic fault zone produced by the 12 May 2008 Wenchuan Ms 8.0 earthquake were investigated based on field measurements of soil gas concentrations after the main shock. Concentrations of He, H2, CO2, CH4, O2, N2, Rn, and Hg in soil gas were measured in the field at eight short profiles across the seismic rupture zone in June and December 2008 and July 2009. Soil-gas concentrations were obtained at more than 800 sampling sites. The data showed that the magnitudes of the He and H2 anomalies across the three surveys declined significantly as the strength of the aftershocks decreased with time. The maximum concentrations of He and H2 (40 and 279.4 ppm, respectively) were found in three replicates in the southern part of the rupture zone close to the epicenter. The spatio-temporal variations of CO2, Rn, and Hg concentrations differed markedly between the northern and southern parts of the fault zone. The maximum He and H2 concentrations in June 2008 occurred near the parts of the rupture zone where vertical displacements were larger. The anomalies of He, H2, CO2, Rn, and Hg concentrations could be related to the variation in the regional stress field and the aftershock activity.

    Rice-Map: a new-generation rice genome browser

    Background: The concurrent release of rice genome sequences for two subspecies (Oryza sativa L. ssp. japonica and Oryza sativa L. ssp. indica) facilitates rice studies at the whole genome level. Since the advent of high-throughput analysis, huge amounts of functional genomics data have been delivered rapidly, making an integrated online genome browser indispensable for scientists to visualize and analyze these data. Based on next-generation web technologies and high-throughput experimental data, we have developed Rice-Map, a novel genome browser for researchers to navigate, analyze and annotate the rice genome interactively. Description: More than one hundred annotation tracks (81 for japonica and 82 for indica) have been compiled and loaded into Rice-Map. These pre-computed annotations cover gene models, transcript evidence, expression profiling, epigenetic modifications, inter-species and intra-species homologies, genetic markers and other genomic features. In addition to these pre-computed tracks, registered users can interactively add comments and research notes to Rice-Map as User-Defined Annotation entries. By smoothly scrolling, dragging and zooming, users can browse various genomic features simultaneously at multiple scales. On-the-fly analysis of selected entries can be performed through dedicated bioinformatic analysis platforms such as WebLab and Galaxy. Furthermore, a BioMart-powered data warehouse, "Rice Mart", is offered for advanced users to fetch bulk datasets based on complex criteria. Conclusions: Rice-Map delivers abundant up-to-date japonica and indica annotations, providing a valuable resource for both computational and bench biologists. Rice-Map is publicly accessible at http://www.ricemap.org/, with all data available for free downloading.

    SPD—a web-based secreted protein database

    Using an improved secreted protein prediction approach and comprehensive data sources, including Swiss-Prot, TrEMBL, RefSeq, Ensembl and CBI-Gene, we have constructed secretomes of human, mouse and rat, with a total of 18 152 secreted proteins. All the entries are ranked according to prediction confidence. They were further annotated via a proteome annotation pipeline that we developed. We also set up a secreted protein classification pipeline and classified our predicted secreted proteins into different functional categories. To make the dataset more convincing and comprehensive, nine reference datasets are also integrated, such as the secreted proteins from the Gene Ontology Annotation (GOA) system at the European Bioinformatics Institute, and the vertebrate secreted proteins from Swiss-Prot. All these entries were grouped via a TribeMCL-based clustering pipeline. We have constructed a web-based secreted protein database, which has been publicly available a

    Indoor environmental quality and pollutant dispersion estimation inside a bus at the downtown areas of Dalian, China

    Among public transport modes, the urban bus, with its frequent starts and stops, has the most complex micro-environment. Indoor environmental quality, airflow patterns and related factors inside buses are not yet fully understood. In addition, during the COVID-19 pandemic, it was shown that the risk of aerosol transmission might be elevated inside buses. Carbon dioxide (CO2) is usually considered an index of ventilation effectiveness in enclosed environments, and airborne particles are viral carriers. Thus, accurate forecasting of these two key pollutants becomes important. The study analysed CO2 and airborne particle dispersion inside a bus in the downtown areas of Dalian, China, using field measurements in spring and autumn 2021. Temperature, relative humidity, CO2 and airborne particle concentrations were logged by sensors at the respective sampling points, and passengers onboard were counted manually. Correlation analysis was conducted, and two empirical models for evaluating CO2 and airborne particles were derived from the measurement data. Preliminary results show that the transient pollutant concentration is almost linearly correlated with the cumulative and instant numbers of passengers respectively, with Pearson correlation coefficients larger than 0.8336 for CO2 and 0.8424 for PM2.5. The purpose of the study is to characterize environmental quality inside the bus and to inform pollution control strategies in buses.