Editorial: Enhanced biological mechanism study, drug discovery and individualized medicine with single-cell multiomics data and integrative analysis
ChIP-Array: combinatory analysis of ChIP-seq/chip and microarray gene expression data to discover direct/indirect targets of a transcription factor
Chromatin immunoprecipitation (ChIP) coupled with high-throughput techniques (ChIP-X), such as next generation sequencing (ChIP-Seq) and microarray (ChIP-chip), has been successfully used to map active transcription factor binding sites (TFBS) of a transcription factor (TF). The targeted genes can be activated or suppressed by the TF, or are unresponsive to the TF. Microarray technology has been used to measure the actual expression changes of thousands of genes under the perturbation of a TF, but is unable to determine if the affected genes are direct or indirect targets of the TF. Furthermore, both ChIP-X and microarray methods produce a large number of false positives. Combining microarray expression profiling and ChIP-X data allows more effective TFBS analysis for studying the function of a TF. However, current web servers only provide tools to analyze either ChIP-X or expression data, but not both. Here, we present ChIP-Array, a web server that integrates ChIP-X and expression data from human, mouse, yeast, fruit fly and Arabidopsis. This server will assist biologists to detect direct and indirect target genes regulated by a TF of interest and to aid in the functional characterization of the TF. ChIP-Array is available at http://jjwanglab.hku.hk/ChIP-Array, with free access to academic users.
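The idea of combining binding and expression evidence can be illustrated with a minimal sketch. It is not the ChIP-Array implementation: it assumes only that a gene is a direct-target candidate when it is both bound by the TF and differentially expressed, and an indirect-target candidate when it is differentially expressed without binding evidence. The function name `classify_targets` and the gene identifiers are hypothetical.

```python
def classify_targets(bound_genes, de_genes):
    """Split genes by ChIP-X binding and expression-change evidence.

    bound_genes: genes with a TF binding site near them (from ChIP-X).
    de_genes: genes differentially expressed after perturbing the TF.
    Both inputs are assumed to be iterables of gene identifiers.
    """
    bound = set(bound_genes)
    de = set(de_genes)
    return {
        # Bound and responsive: candidates for direct regulation.
        "direct_candidates": bound & de,
        # Responsive but not bound: candidates for indirect regulation.
        "indirect_candidates": de - bound,
        # Bound but unresponsive to the TF perturbation.
        "bound_unresponsive": bound - de,
    }


# Hypothetical gene identifiers, for illustration only.
result = classify_targets(["g1", "g2", "g3"], ["g2", "g3", "g4"])
print(sorted(result["direct_candidates"]))
print(sorted(result["indirect_candidates"]))
print(sorted(result["bound_unresponsive"]))
```

Requiring both kinds of evidence for a direct candidate is one simple way to reduce the false positives that each data type produces on its own.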
GWASdb: a database for human genetic variants identified by genome-wide association studies
Recent advances in genome-wide association studies (GWAS) have enabled us to identify thousands of genetic variants (GVs) that are associated with human diseases. As next-generation sequencing technologies become less expensive, more GVs will be discovered in the near future. Existing databases, such as NHGRI GWAS Catalog, collect GVs with only genome-wide level significance. However, many true disease susceptibility loci have relatively moderate P values and are not included in these databases. We have developed GWASdb that contains 20 times more data than the GWAS Catalog and includes less significant GVs (P < 1.0 × 10⁻³) manually curated from the literature. In addition, GWASdb provides comprehensive functional annotations for each GV, including genomic mapping information, regulatory effects (transcription factor binding sites, microRNA target sites and splicing sites), amino acid substitutions, evolution, gene expression and disease associations. Furthermore, GWASdb classifies these GVs according to diseases using Disease-Ontology Lite and Human Phenotype Ontology. It can conduct pathway enrichment and PPI network association analysis for these diseases. GWASdb provides an intuitive, multifunctional database for biologists and clinicians to explore GVs and their functional inferences. It is freely available at http://jjwanglab.org/gwasdb and will be updated frequently.
Exploring genetic associations with ceRNA regulation in the human genome
Competing endogenous RNAs (ceRNAs) are RNA molecules that sequester shared microRNAs (miRNAs), thereby affecting the expression of other targets of the miRNAs. Whether genetic variants in ceRNA can affect its biological function and disease development is still an open question. Here we identified a large number of genetic variants that are associated with ceRNA's function using Geuvadis RNA-seq data for 462 individuals from the 1000 Genomes Project. We call these loci competing endogenous RNA expression quantitative trait loci or "cerQTL", and found that a large number of them were unexplored in conventional eQTL mapping. We identified many cerQTLs that have undergone recent positive selection in different human populations, and showed that single nucleotide polymorphisms in gene 3′UTRs at the miRNA seed binding regions can simultaneously regulate gene expression changes in both cis and trans by the ceRNA mechanism. We also discovered that cerQTLs are significantly enriched in traits/diseases associated variants reported from genome-wide association studies in the miRNA binding sites, suggesting that disease susceptibilities could be attributed to ceRNA regulation. Further in vitro functional experiments demonstrated that a cerQTL rs11540855 can regulate ceRNA function. These results provide a comprehensive catalog of functional non-coding regulatory variants that may be responsible for ceRNA crosstalk at the post-transcriptional level.
Study of signaling and regulatory networks by computational methods
In order to communicate with extracellular and intracellular environments, cells have to be able to receive signals and pass them via signaling pathways to the nucleus, where the genome responds through cascades of gene regulatory networks (GRNs). Many signal receptors on the cell membrane consist of an extracellular part that can bind particular molecules, and an intracellular part that undergoes a conformational change or opens binding sites for other proteins after the extracellular binding. Consequently, some proteins/enzymes are activated or recruited, and signal transduction then begins via protein-protein interactions (PPIs). Once the signal reaches the nucleus and the related transcription factors (TFs) are activated, they in turn regulate gene transcription and change gene expression. However, systematic identification of signaling network components and construction of GRNs are difficult to achieve by experiments alone, which are costly and time-consuming.
In this dissertation, we construct a workflow to study the signaling and regulatory networks by applying computational methods to public PPI and omics data. We first explore the phenotypic profiles of genes using a cross-species phenotype network and compile them into a gene-phenotype association database; then a webserver is proposed that uses omics data to construct GRNs; next the phenotypic profiles, together with domain-domain interactions, phylogenetic profiles and expression profiles, are used as features to refine the PPI network using a random forest classifier. The random walk with restart algorithm is applied to the refined PPI network to identify the signaling network components. The GRNs centered on the predicted TFs are finally constructed with related omics data or putative binding information. With this workflow, we have successfully identified important TFs, including Pou5f1, Sox2 and Nanog, involved in the cell reprogramming mediated by vitamin C. For interaction partners and co-regulatory targets of core pluripotency factors, large parts of the results (154 out of 229 interactions) have been confirmed by literature. This workflow can also be applied to decode the signaling and regulatory networks via which the genome responds to environmental stimuli (e.g. drug treatment). The results will not only help biologists and clinicians to better understand biological processes and diseases systematically, but also lead to further studies, such as drug repurposing.
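The random walk with restart step named in the workflow can be sketched as follows. This is a generic textbook formulation on an unweighted, undirected network, not the dissertation's code; the function name, the restart probability of 0.3, the convergence tolerance and the toy network are all assumptions made for illustration. Every node that appears as a neighbour is assumed to also appear as a key of the adjacency dictionary.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score nodes by proximity to the seed nodes.

    adjacency: dict mapping each node to a list of its neighbours.
    seeds: nodes where the walk starts and to which it restarts.
    restart: probability of jumping back to a seed at each step (assumed value).
    """
    nodes = list(adjacency)
    seed_set = set(seeds)
    # Initial distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: a fixed share of the mass returns to the seeds.
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                # Isolated node: send its walking mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1.0 - restart) * p[u] * p0[n]
                continue
            # Walking term: split the remaining mass evenly among neighbours.
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Toy path network A - B - C - D with A as the only seed (hypothetical nodes).
toy = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = random_walk_with_restart(toy, seeds=["A"])
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 3))
```

Nodes with high steady-state scores are those most tightly connected to the seeds, which is why this kind of ranking can be used to propose additional components of a signaling network around known members.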
REAL-TIME MONITORING TECHNOLOGY OF INLET DISTORTION SIGNAL
In flight testing, the compatibility of the fighter inlet with the engine is key to
evaluating engine performance. Especially at high angles of attack and during
post-stall maneuvers, the characteristics of the inlet are very important to the engine.
The traditional test scheme cannot telemeter the inlet distortion signal in real time. This
paper addresses the problem by designing an embedded airborne real-time processing unit
that calculates and telemeters the inlet distortion signal in real time. The results are then
displayed at the ground monitoring station as a cloud image. We can thus evaluate the
matching performance of the inlet and engine during flight and anticipate
the risk of inlet distortion.
International Foundation for Telemetering. Proceedings from the International Telemetering Conference are made available by the International Foundation for Telemetering and the University of Arizona Libraries. Visit http://www.telemetry.org/index.php/contact-us if you have questions about items in this collection.
Style sheet for contributors to the Australian Review of Applied Linguistics
Summary: Recent advances in high-throughput sequencing technologies have enabled us to sequence large numbers of cancer samples to reveal novel insights into oncogenetic mechanisms. However, the presence of intratumoral heterogeneity, normal cell contamination and insufficient sequencing depth together pose a challenge for detecting somatic mutations. Here we propose a fast and accurate somatic single-nucleotide variation (SNV) detection program, FaSD-somatic. The performance of FaSD-somatic is extensively assessed on various types of cancer against several state-of-the-art somatic SNV detection programs. Benchmarked by somatic SNVs from either existing databases or de novo higher-depth sequencing data, FaSD-somatic has the best overall performance. Furthermore, FaSD-somatic is efficient: it finishes somatic SNV calling within 14 h on 50X whole genome sequencing data in paired samples. © The Author 2014. Published by Oxford University Press. All rights reserved.
EpiRegNet: Constructing epigenetic regulatory network from high throughput gene expression data for humans