109 research outputs found

    Identification of responsive gene modules by network-based gene clustering and extending: application to inflammation and angiogenesis

    Background: Cell responses to environmental stimuli are usually organized at the molecular level as relatively separate responsive gene modules. Identifying responsive gene modules, rather than individual differentially expressed (DE) genes, provides important information about the underlying molecular mechanisms. Most current methods formulate module identification as an optimization problem: find the active sub-networks in the genome-wide gene network by maximizing an objective function that considers gene differential expression and/or gene-gene co-expression information. Here we present a new formulation of this task: a group of closely connected and co-expressed DE genes in the gene network is regarded as the signature of an underlying responsive gene module; the modules can be identified by finding the signatures and then recovering the "missing parts" by adding the intermediate genes that connect the DE genes in the gene network.

    Results: ClustEx, a two-step method based on the new formulation, was developed and applied to identify the responsive gene modules of human umbilical vein endothelial cells (HUVECs) in inflammation and angiogenesis models by integrating time-course microarray data and genome-wide PPI data. When tested on the reference responsive gene sets, it performs better than several available module identification tools. Gene set analysis of KEGG pathways, GO terms and microRNA (miRNA) target gene sets further supports the ClustEx predictions.

    Conclusion: Taking the closely connected and co-expressed DE genes in the condition-specific gene network as the signatures of the underlying responsive gene modules provides a new strategy for solving the module identification problem. The identified responsive gene modules of HUVECs and the corresponding enriched pathways/miRNAs provide useful resources for understanding the inflammatory and angiogenic responses of vascular systems.
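The "extending" idea described in the abstract above (recovering intermediate genes that connect DE genes in the network) can be illustrated with a small sketch. This is not the ClustEx implementation: the breadth-first search, the `max_hops` cutoff, the function names, and the toy network are all assumptions made for illustration, and the co-expression clustering step is omitted.

```python
from collections import deque
from itertools import combinations


def shortest_path(graph, source, target):
    """Breadth-first search; returns one shortest path as a list, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


def extend_module(graph, de_genes, max_hops=2):
    """Add intermediate genes lying on short paths between pairs of DE genes.

    `max_hops` is a hypothetical cutoff, not a parameter taken from the paper.
    """
    module = set(de_genes)
    for a, b in combinations(sorted(de_genes), 2):
        path = shortest_path(graph, a, b)
        if path is not None and len(path) - 1 <= max_hops:
            module.update(path)
    return module


# Hypothetical toy network; gene names are placeholders, not real data.
toy_ppi = {
    "A": ["X"], "X": ["A", "B"], "B": ["X", "Y"],
    "Y": ["B", "Z"], "Z": ["Y", "C"], "C": ["Z"],
}
module = extend_module(toy_ppi, {"A", "B", "C"}, max_hops=2)
print(sorted(module))  # ['A', 'B', 'C', 'X']
```

In this toy case only the intermediate gene `X` is recovered, because `A` and `B` are two hops apart while `C` is more than two hops from both.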

    dbRES: a web-oriented database for annotated RNA editing sites

    Although a large amount of experimentally derived information about RNA editing sites currently exists, this information has remained scattered across a variety of sources and in diverse data formats. Standard collections of high-quality experimental data will be of great help for the systematic study of RNA editing, especially for developing computational algorithms to predict RNA editing sites. dbRES is a public database of known RNA editing sites. All sites are manually curated from the literature and GenBank annotations. dbRES version 1.1 contains 5437 RNA editing sites in 251 transcripts, covering 96 organisms across plants, metazoans, protozoa, fungi and viruses. dbRES provides comprehensive annotations and data summaries, including (but not limited to) transcript sequences, RNA editing types, editing site locations, amino acid changes, organisms, subcellular organelles (if available) and cited references. A user-friendly web interface has been developed to facilitate both data retrieval and online display of RNA editing site information.

    Modularity-based credible prediction of disease genes and detection of disease subtypes on the phenotype-gene heterogeneous network

    Background: Protein-protein interaction networks and phenotype similarity information have been combined to discover novel disease-causing genes. Genetic or phenotypic similarities are manifested as certain modularity properties in a phenotype-gene heterogeneous network consisting of the phenotype-phenotype similarity network, the protein-protein interaction network and the gene-disease association network. However, the quantitative analysis of modularity in the heterogeneous network and its influence on disease-gene discovery remain unaddressed. Furthermore, the genetic correspondence of disease subtypes can be identified by marking the genes and phenotypes in the phenotype-gene network. We present a novel network inference method to measure network modularity and, in particular, to suggest disease subtypes based on the heterogeneous network.

    Results: Based on a measure introduced to evaluate the closeness between two nodes in the phenotype-gene heterogeneous network, we developed a hitting-time-based method, CIPHER-HIT, for assessing the modularity of disease gene predictions, credibly prioritizing disease-causing genes, and then identifying the genetic modules corresponding to potential subtypes of the queried phenotype. CIPHER-HIT does not rely on any preset parameters. We found that, when modularity levels are taken into account, CIPHER-HIT can significantly improve the performance of disease gene predictions, which demonstrates that modularity is one of the key features for credible inference of disease genes on the phenotype-gene heterogeneous network. By applying CIPHER-HIT to the subtype analysis of breast cancer, we found that the prioritized genes can be divided into two sub-modules: one contains members of the Fanconi anemia gene family, and the other contains a reported protein complex, MRE11/RAD50/NBN.

    Conclusions: The phenotype-gene heterogeneous network contains abundant information not only for disease gene discovery but also for disease subtype detection. The CIPHER-HIT method presented here is effective for network inference, particularly for credible prediction of disease genes and subtype analysis of diseases such as breast cancer. This method provides a promising way to analyze heterogeneous biological networks, both globally and locally.
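For readers unfamiliar with hitting times, the sketch below computes the textbook expected hitting time of a simple random walk on a small unweighted graph. It is only meant to convey the underlying notion of closeness between two nodes; the exact measure used by CIPHER-HIT is not given in the abstract, and the function name, the iterative solver, and the toy graph are assumptions.

```python
def hitting_times(graph, target, tol=1e-10, max_iter=100_000):
    """Expected number of steps for a simple random walk to first reach `target`.

    Solves h(target) = 0 and h(v) = 1 + mean(h(u) for u in neighbors(v))
    by Gauss-Seidel iteration. Assumes a connected, undirected graph given
    as {node: [neighbors]}.
    """
    h = {node: 0.0 for node in graph}
    for _ in range(max_iter):
        delta = 0.0
        for node, neighbors in graph.items():
            if node == target:
                continue
            new = 1.0 + sum(h[n] for n in neighbors) / len(neighbors)
            delta = max(delta, abs(new - h[node]))
            h[node] = new
        if delta < tol:
            break
    return h


# Hypothetical toy graph: a path 0 - 1 - 2, with node 2 as the target.
path_graph = {0: [1], 1: [0, 2], 2: [1]}
h = hitting_times(path_graph, target=2)
print(round(h[0], 6), round(h[1], 6))  # 4.0 3.0
```

Nodes with a smaller hitting time to the target are "closer" in the random-walk sense, which is the intuition a hitting-time-based prioritization builds on.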

    On the clustering property of the random intersection graphs

    A random intersection graph $\mathcal{G}_{V,W,p}$ is induced from a random bipartite graph $\mathcal{G}^{*}_{V,W,p}$ with vertex classes $V$ and $W$, in which each pair $v \in V$, $w \in W$ is joined by an edge with probability $p$. Two vertices in $V$ are considered to be connected if both are adjacent to some common vertex in $W$. The clustering properties of the random intersection graph are fully investigated in this article. Let the numbers of vertices be $N = |V|$ and $M = |W|$, with $M = N^{\alpha}$ and $p = N^{-\beta}$, where $\alpha > 0$ and $\beta > 0$. We derive exact expressions for the clustering coefficient $C_{v}$ of a vertex $v$ in $\mathcal{G}_{V,W,p}$. The results show that if $\alpha < 2\beta$ and $\alpha \neq \beta$, $C_{v}$ decreases as the graph size increases; if $\alpha = \beta$ or $\alpha \geq 2\beta$, the graph has a constant clustering coefficient; in addition, if $\alpha > 2\beta$, the graph is almost completely connected. We thus illustrate the phase transition of the clustering property in random intersection graphs and give the condition under which $\mathcal{G}_{V,W,p}$ is a highly clustered graph.
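The construction described in the abstract above can be simulated directly. The sketch below builds a random intersection graph from $N$, $M = N^{\alpha}$ and $p = N^{-\beta}$ and measures the average local clustering coefficient empirically. The function names, the rounding of $M$, the convention that vertices with fewer than two neighbors have coefficient 0, and the chosen parameter values are assumptions; the sketch does not reproduce the paper's exact expressions.

```python
import random
from itertools import combinations


def random_intersection_graph(n, m, p, rng):
    """Each of n vertices picks each of m attributes independently with
    probability p; two vertices are adjacent if they share an attribute."""
    attributes = [{w for w in range(m) if rng.random() < p} for _ in range(n)]
    adjacency = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if attributes[u] & attributes[v]:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency


def clustering_coefficient(adjacency, v):
    """Fraction of neighbor pairs of v that are themselves adjacent.

    Returns 0.0 for vertices with fewer than two neighbors (a convention).
    """
    neighbors = adjacency[v]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adjacency[a])
    return 2.0 * links / (k * (k - 1))


def mean_clustering(n, alpha, beta, seed=0):
    """Average local clustering coefficient for M = n**alpha, p = n**(-beta)."""
    rng = random.Random(seed)
    m = max(1, round(n ** alpha))
    p = n ** (-beta)
    adjacency = random_intersection_graph(n, m, p, rng)
    return sum(clustering_coefficient(adjacency, v) for v in adjacency) / n


# Sanity check on a triangle, then an empirical look at growing graph sizes.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
print(clustering_coefficient(triangle, 0))  # 1.0
for n in (50, 100, 200):
    print(n, mean_clustering(n, alpha=0.5, beta=0.5))
```

Varying `alpha` and `beta` in such a simulation gives a rough empirical view of the regimes named in the abstract, although small graphs and a single random seed are only indicative.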

    Identifications of conserved 7-mers in 3'-UTRs and microRNAs in Drosophila

    Background: MicroRNAs (miRNAs) are a class of endogenous regulatory small RNAs that play an important role in posttranscriptional regulation by targeting mRNAs for cleavage or translational repression. The base-pairing between the 5'-end of the miRNA and the 3'-UTR of the target mRNA is essential for miRNA:mRNA recognition. Recent studies show that many seed matches in 3'-UTRs, which are fully complementary to miRNA 5'-ends, are highly conserved. Based on these features, a two-stage strategy can be implemented to achieve the de novo identification of miRNAs by requiring complete base-pairing between the 5'-end of miRNA candidates and the potential seed matches in 3'-UTRs.

    Results: We present a new method that combines multiple pairwise conservation information to identify frequently occurring and conserved 7-mers in 3'-UTRs. A pairwise conservation score (PCS) was introduced to describe the conservation of all 7-mers in 3'-UTRs between any two Drosophila species. Using PCSs computed from 6 pairs of flies, we developed a support vector machine (SVM) classifier ensemble, named Cons-SVM, and identified 689 conserved 7-mers, including 63 seed matches covering 32 of the 38 known miRNA families in the reference dataset. In the second stage, we searched for 90 nt conserved stem-loop regions containing sequences complementary to the identified 7-mers and used previously published miRNA prediction software to analyze these stem-loops. We predicted 47 miRNA candidates in the genome-wide screen.

    Conclusion: Cons-SVM takes advantage of the independent evolutionary information from the 6 pairs of flies and shows high sensitivity in identifying seed matches in 3'-UTRs. By combining the multiple pairwise conservation information with a machine learning approach, we identified 47 miRNA candidates in D. melanogaster.
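The relationship between a miRNA 5'-end and its 7-mer seed match in a 3'-UTR can be made concrete with a short sketch. It assumes the common convention that the seed spans miRNA positions 2 to 8 (the abstract does not state the exact positions), and the miRNA and UTR strings, as well as the function names, are illustrative inputs rather than data from the paper.

```python
# Watson-Crick complement table for the RNA alphabet.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_match(mirna, start=2, length=7):
    """Return the 3'-UTR site (written 5'->3') that is fully complementary
    to the miRNA seed beginning at 1-based position `start`.

    The default positions 2-8 are an assumed convention.
    """
    seed = mirna[start - 1:start - 1 + length]
    return seed.translate(COMPLEMENT)[::-1]  # reverse complement


def find_sites(utr, site):
    """0-based start positions of every (possibly overlapping) occurrence."""
    return [i for i in range(len(utr) - len(site) + 1) if utr.startswith(site, i)]


# Illustrative inputs only.
mirna = "UGAGGUAGUAGGUUGUAUAGU"
utr = "AAACUACCUCGGGCUACCUCA"
site = seed_match(mirna)
print(site, find_sites(utr, site))  # CUACCUC [3, 13]
```

Counting such sites across aligned 3'-UTRs of several species is the kind of raw input from which a conservation score for each 7-mer could be computed.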

    Politeness Principles Difference in Appellations Between English and Chinese

    Appellations play a very important role in people's daily communication, and the choice of address forms must abide by certain principles of politeness. This paper studies how the politeness principle is reflected in appellations, the factors that influence the appropriate use of address forms, and the cultural differences between English and Chinese. The findings can offer guidance for conducting interpersonal and intercultural communication smoothly and for establishing good communicative relationships.

    EsATAC: an easy-to-use systematic pipeline for ATAC-seq data analysis

    Summary: ATAC-seq is rapidly emerging as one of the major experimental approaches to probe chromatin accessibility genome-wide. Here, we present 'esATAC', a highly integrated, easy-to-use R/Bioconductor package for systematic ATAC-seq data analysis. It covers the essential steps of a full analysis procedure, including raw data processing, quality control and downstream statistical analysis such as peak calling, enrichment analysis and transcription factor footprinting. esATAC supports one-command-line execution for preset pipelines and provides flexible interfaces for building customized pipelines.

    Availability and implementation: The esATAC package is open source under the GPL-3.0 license. It is implemented in R and C++. Source code and binaries for Linux, Mac OS X and Windows are available through Bioconductor (https://www.bioconductor.org/packages/release/bioc/html/esATAC.html).

    Supplementary information: Supplementary data are available at Bioinformatics online.

    Document type: Article