59 research outputs found

    ChIP-Array: combinatory analysis of ChIP-seq/chip and microarray gene expression data to discover direct/indirect targets of a transcription factor

    Chromatin immunoprecipitation (ChIP) coupled with high-throughput techniques (ChIP-X), such as next-generation sequencing (ChIP-Seq) and microarray (ChIP–chip), has been successfully used to map active transcription factor binding sites (TFBS) of a transcription factor (TF). The targeted genes can be activated or suppressed by the TF, or are unresponsive to the TF. Microarray technology has been used to measure the actual expression changes of thousands of genes under the perturbation of a TF, but is unable to determine if the affected genes are direct or indirect targets of the TF. Furthermore, both ChIP-X and microarray methods produce a large number of false positives. Combining microarray expression profiling and ChIP-X data allows more effective TFBS analysis for studying the function of a TF. However, current web servers only provide tools to analyze either ChIP-X or expression data, but not both. Here, we present ChIP-Array, a web server that integrates ChIP-X and expression data from human, mouse, yeast, fruit fly and Arabidopsis. This server will assist biologists to detect direct and indirect target genes regulated by a TF of interest and to aid in the functional characterization of the TF. ChIP-Array is available at http://jjwanglab.hku.hk/ChIP-Array, with free access to academic users.
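The idea of combining binding and expression evidence can be illustrated with a minimal sketch. This is one simple reading of the abstract, not ChIP-Array's actual algorithm: it assumes that each data type has already been reduced to a set of gene identifiers, and the function name, category labels and gene names (`g1` to `g5`) are hypothetical.

```python
def classify_targets(bound_genes, de_genes):
    """Split genes by binding evidence (ChIP-X) and expression change.

    bound_genes: genes with a TF binding site called from ChIP-X data.
    de_genes: genes differentially expressed when the TF is perturbed.
    Both inputs are assumed to be pre-filtered collections of gene IDs.
    """
    bound = set(bound_genes)
    de = set(de_genes)
    return {
        # Bound and responsive: candidate direct targets.
        "direct_candidates": bound & de,
        # Responsive but not bound: candidate indirect targets.
        "indirect_candidates": de - bound,
        # Bound but unresponsive to the TF perturbation.
        "bound_unresponsive": bound - de,
    }


# Hypothetical toy input.
result = classify_targets(
    bound_genes=["g1", "g2", "g3"],
    de_genes=["g2", "g3", "g4", "g5"],
)
```

Requiring both kinds of evidence for a direct-target call is also what makes the combination useful against false positives: a spurious peak or a spurious expression change alone is not enough.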

    GWASdb: a database for human genetic variants identified by genome-wide association studies

    Recent advances in genome-wide association studies (GWAS) have enabled us to identify thousands of genetic variants (GVs) that are associated with human diseases. As next-generation sequencing technologies become less expensive, more GVs will be discovered in the near future. Existing databases, such as NHGRI GWAS Catalog, collect GVs with only genome-wide level significance. However, many true disease susceptibility loci have relatively moderate P values and are not included in these databases. We have developed GWASdb that contains 20 times more data than the GWAS Catalog and includes less significant GVs (P < 1.0 × 10⁻³) manually curated from the literature. In addition, GWASdb provides comprehensive functional annotations for each GV, including genomic mapping information, regulatory effects (transcription factor binding sites, microRNA target sites and splicing sites), amino acid substitutions, evolution, gene expression and disease associations. Furthermore, GWASdb classifies these GVs according to diseases using Disease-Ontology Lite and Human Phenotype Ontology. It can conduct pathway enrichment and PPI network association analysis for these diseases. GWASdb provides an intuitive, multifunctional database for biologists and clinicians to explore GVs and their functional inferences. It is freely available at http://jjwanglab.org/gwasdb and will be updated frequently.

    Study of signaling and regulatory networks by computational methods

    To communicate with their extracellular and intracellular environments, cells must receive signals and pass them via signaling pathways to the nucleus, where the genome responds through cascades of gene regulatory networks (GRNs). Many signal receptors on the cell membrane consist of an extracellular part that binds particular molecules and an intracellular part that undergoes a conformational change or opens binding sites for other proteins after the extracellular binding. Consequently, some proteins/enzymes are activated or recruited, and signal transduction then proceeds via protein-protein interactions (PPIs). Once the signal reaches the nucleus and the related transcription factors (TFs) are activated, they in turn regulate gene transcription and change gene expression. However, systematic identification of signaling network components and construction of GRNs are difficult to achieve by experiments alone, which are costly and time-consuming. In this dissertation, we construct a workflow to study signaling and regulatory networks by applying computational methods to public PPI and omics data. We first explore the phenotypic profiles of genes using a cross-species phenotype network and compile them into a gene-phenotype association database; then a web server is proposed that uses omics data to construct GRNs; next, the phenotypic profiles, together with domain-domain interactions, phylogenetic profiles and expression profiles, are used as features to refine the PPI network with a random forest classifier. The random walk with restart algorithm is applied to the refined PPI network to identify the signaling network components. Finally, the GRNs centered on the predicted TFs are constructed with related omics data or putative binding information. With this workflow, we have successfully identified important TFs, including Pou5f1, Sox2 and Nanog, involved in the cell reprogramming mediated by vitamin C. For interaction partners and co-regulatory targets of core pluripotency factors, a large part of the results (154 out of 229 interactions) has been confirmed by the literature. This workflow can also be applied to decode the signaling and regulatory networks through which the genome responds to environmental stimuli (e.g. drug treatment). The results will not only help biologists and clinicians to better understand biological processes and diseases systematically, but also lead to further studies, such as drug repurposing. (Published or final version; Biochemistry; Doctoral; Doctor of Philosophy)
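The random walk with restart step mentioned in this abstract can be sketched as follows. This is a generic textbook formulation, not the dissertation's implementation: the undirected toy network, the node names `A` to `D`, the restart probability of 0.3 and the function name are all assumptions made for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score nodes by proximity to the seed nodes.

    adjacency: dict mapping each node to the set of its neighbours
               (every neighbour must also be a key of the dict).
    seeds: non-empty set of starting nodes.
    Iterates p <- (1 - restart) * W p + restart * p0, where W spreads
    each node's probability evenly over its neighbours.
    """
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: jump back to the seeds with probability `restart`.
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                # Isolated node: its probability mass stays in place.
                nxt[u] += (1 - restart) * p[u]
                continue
            share = (1 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical path-shaped network A - B - C - D, seeded at A.
toy_network = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C"},
}
scores = random_walk_with_restart(toy_network, seeds={"A"})
```

Nodes closer to the seed receive higher steady-state scores, which is why ranking by these scores is a common way to nominate candidate network components around known genes.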

    Style sheet for contributors to the Australian Review of Applied Linguistics

    Summary: Recent advances in high-throughput sequencing technologies have enabled us to sequence a large number of cancer samples to reveal novel insights into oncogenetic mechanisms. However, intratumoral heterogeneity, normal cell contamination and insufficient sequencing depth together pose a challenge for detecting somatic mutations. Here we propose a fast and accurate somatic single-nucleotide variation (SNV) detection program, FaSD-somatic. The performance of FaSD-somatic is extensively assessed on various types of cancer against several state-of-the-art somatic SNV detection programs. Benchmarked by somatic SNVs from either existing databases or de novo higher-depth sequencing data, FaSD-somatic has the best overall performance. Furthermore, FaSD-somatic is efficient: it finishes somatic SNV calling within 14 h on 50X whole genome sequencing data in paired samples. © The Author 2014. Published by Oxford University Press. All rights reserved.