66 research outputs found

    A data integration approach to mapping OCT4 gene regulatory networks operative in embryonic stem cells and embryonal carcinoma cells

    It is essential to understand the network of transcription factors controlling self-renewal of human embryonic stem cells (ESCs) and human embryonal carcinoma cells (ECs) if we are to exploit these cells in regenerative medicine regimes. Correlating gene expression levels after RNAi-based ablation of OCT4 function with its downstream targets enables a better prediction of motif-specific driven expression modules pertinent for self-renewal and differentiation of embryonic stem cells and induced pluripotent stem cells. We initially identified putative direct downstream targets of OCT4 by employing ChIP-on-chip analysis. A comparison of three peak analysis programs revealed a refined list of OCT4 targets in the human EC cell line NCCIT; this list was then compared to previously published OCT4 ChIP-on-chip datasets derived from both ES and EC cells. We have verified an enriched POU-motif, discovered by a de novo approach, thus enabling us to define six distinct modules of OCT4 binding and regulation of its target genes. A selection of these targets has been validated, including NANOG, which harbours the evolutionarily conserved OCT4-SOX2 binding motif within its proximal promoter. Other validated targets, which do not harbour the classical HMG motif, are USP44 and GADD45G, a key regulator of the cell cycle. Over-expression of GADD45G in NCCIT cells resulted in an enrichment and up-regulation of genes associated with the cell cycle (CDKN1B, CDKN1C, CDK6 and MAPK4) and developmental processes (BMP4, HAND1, EOMES, ID2, GATA4, GATA5, ISL1 and MSX1). A comparison of positively regulated OCT4 targets common to EC and ES cells identified genes such as NANOG, PHC1, USP44, SOX2, PHF17 and OCT4, thus further confirming their universal role in maintaining self-renewal in both cell types.
Finally, we have created a user-friendly database (http://biit.cs.ut.ee/escd/), integrating all OCT4 and stem cell related datasets in both human and mouse ES and EC cells. In the current era of systems biology driven research, we envisage that our integrated embryonic stem cell database will prove beneficial to the booming field of ES, iPS and cancer research.

    Gene expression signatures defining fundamental biological processes in pluripotent, early, and late differentiated embryonic stem cells

    Investigating the molecular mechanisms controlling the in vivo developmental program postembryogenesis is challenging and time consuming. However, the developmental program can be partly recapitulated in vitro by the use of cultured embryonic stem cells (ESCs). Similar to the totipotent cells of the inner cell mass, gene expression and morphological changes in cultured ESCs occur hierarchically during their differentiation, with epiblast cells developing first, followed by germ layers and finally somatic cells. Combination of high throughput -omics technologies with murine ESCs offers an alternative approach for studying developmental processes toward organ-specific cell phenotypes. We have made an attempt to understand differentiation networks controlling embryogenesis in vivo using a time course, by identifying molecules defining fundamental biological processes in the pluripotent state as well as in the early and late differentiation stages of ESCs. Our microarray data of the differentiation of the ESCs clearly demonstrate that the most critical early differentiation processes occur at days 2 and 3 of differentiation. Besides monitoring well-annotated markers pertinent to both self-renewal and potency (capacity to differentiate to different cell lineages), we have identified candidate molecules for relevant signaling pathways. These molecules can be further investigated in gain- and loss-of-function studies to elucidate their role in pluripotency and differentiation. As an example, siRNA knockdown of MageB16, a gene highly expressed in the pluripotent state, showed that repressing its function induces differentiation.

    DNA methylation changes in endometrium and correlation with gene expression during the transition from pre-receptive to receptive phase

    The inner uterine lining (endometrium) is a unique tissue going through remarkable changes each menstrual cycle. Endometrium has its characteristic DNA methylation profile, although not much is known about the endometrial methylome changes throughout the menstrual cycle. The impact of methylome changes on gene expression and thereby on the function of the tissue, including establishing receptivity to an implanting embryo, is also unclear. Therefore, this study used genome-wide technologies to characterize the methylome and the correlation between DNA methylation and gene expression in endometrial biopsies collected from 17 healthy fertile-aged women in the pre-receptive and receptive phases within one menstrual cycle. Our study showed that the overall methylome remains relatively stable during this stage of the menstrual cycle, with small-scale changes affecting 5% of the studied CpG sites (22,272 out of the 437,022 CpGs studied, FDR < 0.05). Of differentially methylated CpG sites with the largest absolute changes in methylation level, approximately 30% correlated with gene expression measured by RNA sequencing, with negative correlations being more common in the 5' UTR and positive correlations in the gene 'Body' region. According to our results, extracellular matrix organization and immune response are the pathways most affected by methylation changes during the transition from pre-receptive to receptive phase.

    HENA, heterogeneous network-based data set for Alzheimer's disease.

    Alzheimer's disease and other types of dementia are the top cause of disability in later life, and various types of experiments have been performed to understand the underlying mechanisms of the disease with the aim of identifying potential drug targets. These experiments have been carried out by scientists working in different domains such as proteomics, molecular biology, clinical diagnostics and genomics. The results of such experiments are stored in databases designed for collecting data of similar types. However, in order to get a systematic view of the disease from these independent but complementary data sets, it is necessary to combine them. In this study we describe a heterogeneous network-based data set for Alzheimer's disease (HENA). Additionally, we demonstrate the application of state-of-the-art graph convolutional networks, i.e. deep learning methods, for the analysis of such large heterogeneous biological data sets. We expect HENA to allow scientists to explore and analyze their own results in the broader context of Alzheimer's disease research.

    G = MAT: Linking Transcription Factor Expression and DNA Binding Data

    Transcription factors are proteins that bind to motifs on the DNA and thus affect gene expression regulation. The qualitative description of the corresponding processes is therefore important for a better understanding of essential biological mechanisms. However, wet lab experiments targeted at the discovery of the regulatory interplay between transcription factors and binding sites are expensive. We propose a new, purely computational method for finding putative associations between transcription factors and motifs. This method is based on a linear model that combines sequence information with expression data. We present various methods for model parameter estimation and show, via experiments on simulated data, that these methods are reliable. Finally, we examine the performance of this model on biological data and conclude that it can indeed be used to discover meaningful associations. The developed software is available as a web tool and Scilab source code at http://biit.cs.ut.ee/gmat/
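The linear model named in the title can be illustrated with a small sketch. It assumes the reading G ≈ M·A·T, where G is a genes × arrays expression matrix, M a genes × motifs matrix, T a transcription-factor × arrays expression matrix and A the unknown motifs × factors association matrix; this interpretation of the symbols is an assumption, as the abstract does not define them. Plain least squares via pseudoinverses stands in for the estimation methods mentioned in the abstract, which are not specified there. All function names are hypothetical.

```python
def transpose(X):
    return [list(row) for row in zip(*X)]


def matmul(X, Y):
    Yt = transpose(Y)
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]


def inverse(X):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(X)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(X)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def estimate_A(G, M, T):
    """Least-squares estimate of A in G ≈ M A T.

    Assumes M has full column rank and T has full row rank,
    so both pseudoinverses below exist.
    """
    Mt, Tt = transpose(M), transpose(T)
    left = matmul(inverse(matmul(Mt, M)), Mt)    # pseudoinverse of M
    right = matmul(Tt, inverse(matmul(T, Tt)))   # pseudoinverse of T
    return matmul(matmul(left, G), right)


# Toy data (made up for illustration): 4 genes, 2 motifs, 2 factors, 3 arrays.
M = [[1, 0], [0, 1], [1, 1], [2, 0]]
T = [[1, 0, 2], [0, 1, 1]]
A_true = [[0.5, -1.0], [2.0, 0.0]]
G = matmul(matmul(M, A_true), T)
A_hat = estimate_A(G, M, T)
```

On noiseless toy data such as this, `A_hat` should match `A_true` up to floating-point error; with real expression data the fit would only be approximate and regularisation would likely be needed.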

    VisHiC—hierarchical functional enrichment analysis of microarray data

    Measuring gene expression levels with microarrays is one of the key technologies of modern genomics. Clustering of microarray data is an important application, as genes with similar expression profiles may be regulated by common pathways and involved in related functions. Gene Ontology (GO) analysis and visualization allows researchers to study the biological context of discovered clusters and characterize genes with previously unknown functions. We present VisHiC (Visualization of Hierarchical Clustering), a web server for clustering and compact visualization of gene expression data combined with automated function enrichment analysis. The main output of the analysis is a dendrogram and visual heatmap of the expression matrix that highlights biologically relevant clusters based on enriched GO terms, pathways and regulatory motifs. Clusters with most significant enrichments are contracted in the final visualization, while less relevant parts are hidden altogether. Such a dense representation of microarray data gives a quick global overview of thousands of transcripts in many conditions and provides a good starting point for further analysis. VisHiC is freely available at http://biit.cs.ut.ee/vishic
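As an illustration of what "enriched GO terms" can mean for a cluster, the sketch below computes a one-sided hypergeometric p-value, a common choice for this kind of test. Whether VisHiC uses exactly this statistic is an assumption; the abstract does not say. The function and parameter names are hypothetical.

```python
from math import comb


def enrichment_pvalue(population, annotated, cluster_size, hits):
    """P(X >= hits) for a hypergeometric draw.

    population   -- number of genes on the array
    annotated    -- genes in the population carrying the GO term
    cluster_size -- genes in the cluster being tested
    hits         -- genes in the cluster carrying the GO term
    """
    total = comb(population, cluster_size)
    upper = min(annotated, cluster_size)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, cluster_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers: 10 genes, 4 annotated, a cluster of 3 that are all annotated.
p = enrichment_pvalue(10, 4, 3, 3)
```

In practice such p-values are computed for many terms per cluster, so a multiple-testing correction would be applied before highlighting a cluster as enriched.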

    GraphWeb: mining heterogeneous biological networks for gene modules with functional significance

    Deciphering heterogeneous cellular networks with embedded modules is a great challenge of current systems biology. Experimental and computational studies construct complex networks of molecules that describe various aspects of the cell such as transcriptional regulation, protein interactions and metabolism. Groups of interacting genes and proteins reflect network modules that potentially share regulatory mechanisms and relate to common function. Here, we present GraphWeb, a public web server for biological network analysis and module discovery. GraphWeb provides methods to: (i) integrate heterogeneous and multispecies data for constructing directed and undirected, weighted and unweighted networks; (ii) discover network modules using a variety of algorithms and topological filters; and (iii) interpret modules using functional knowledge of the Gene Ontology and pathways, as well as regulatory features such as binding motifs and microRNA targets. GraphWeb is designed to analyse individual or multiple merged networks, search for conserved features across multiple species, mine large biological networks for smaller modules, discover novel candidates and connections for known pathways and compare results of high-throughput datasets. GraphWeb is available at http://biit.cs.ut.ee/graphweb/

    An exploratory phenome wide association study linking asthma and liver disease genetic variants to electronic health records from the Estonian Biobank

    The Estonian Biobank, governed by the Institute of Genomics at the University of Tartu (Biobank), has stored genetic material/DNA and continuously collected data since 2002 on a total of 52,274 individuals, representing ~5% of the Estonian adult population, and this number is increasing. To explore the utility of data available in the Biobank, we conducted a phenome-wide association study (PheWAS) in two areas of interest to healthcare researchers: asthma and liver disease. We used 11 asthma and 13 liver disease-associated single nucleotide polymorphisms (SNPs), identified from published genome-wide association studies, to test our ability to detect established associations. We confirmed 2 asthma and 5 liver disease associated variants at nominal significance and directionally consistent with published results. We found 2 associations that were opposite to what was published before (rs4374383:AA increases risk of NASH/NAFLD, rs11597086 increases ALT level). Three SNP-diagnosis pairs passed the phenome-wide significance threshold: rs9273349 and E06 (thyroiditis, p = 5.50 × 10⁻⁸); rs9273349 and E10 (type-1 diabetes, p = 2.60 × 10⁻⁷); and rs2281135 and K76 (non-alcoholic liver diseases, including NAFLD, p = 4.10 × 10⁻⁷). We have validated our approach and confirmed the quality of the data for these conditions. Importantly, we demonstrate that the extensive amount of genetic and medical information from the Estonian Biobank can be successfully utilized for scientific research.

    Local Renyi entropic profiles of DNA sequences

    Background: In a recent report the authors presented a new measure of continuous entropy for DNA sequences, which allows the estimation of their randomness level. The definition therein explored was based on the Rényi entropy of probability density estimation (pdf) using the Parzen's window method and applied to Chaos Game Representation/Universal Sequence Maps (CGR/USM). Subsequent work proposed a fractal pdf kernel as a more exact solution for the iterated map representation. This report extends the concepts of continuous entropy by defining DNA sequence entropic profiles using the new pdf estimations to refine the density estimation of motifs. Results: The new methodology enables two results. On the one hand it shows that the entropic profiles are directly related with the statistical significance of motifs, allowing the study of under and over-representation of segments. On the other hand, by spanning the parameters of the kernel function it is possible to extract important information about the scale of each conserved DNA region. The computational applications, developed in Matlab m-code, the corresponding binary executables and additional material and examples are made publicly available at http://kdbio.inesc-id.pt/~svinga/ep/. Conclusion: The ability to detect local conservation from a scale-independent representation of symbolic sequences is particularly relevant for biological applications where conserved motifs occur in multiple, overlapping scales, with significant future applications in the recognition of foreign genomic material and inference of motif structures.
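The background described above can be sketched in a few lines: map a DNA sequence to Chaos Game Representation points, then estimate the order-2 Rényi entropy of a Gaussian Parzen-window density over those points. This is a minimal sketch of the global quantity only, not the local entropic profiles or the fractal kernel of the paper. The corner assignment, the Gaussian kernel, the default `sigma` and all function names are assumptions made for illustration.

```python
import math

# One common corner convention for the CGR unit square (an assumption here).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Iterated map: each base moves the point halfway to that base's corner."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def renyi2_entropy(points, sigma=0.1):
    """Order-2 Rényi entropy of a Gaussian Parzen estimate.

    H2 = -ln( (1/N^2) * sum_ij G(x_i - x_j; 2*sigma^2) ),
    where G is an isotropic 2-D Gaussian density.
    """
    n = len(points)
    var = 2 * sigma ** 2
    norm = 1 / (2 * math.pi * var)  # 2-D Gaussian normalisation constant
    total = 0.0
    for x1, y1 in points:
        for x2, y2 in points:
            d2 = (x1 - x2) ** 2 + (y1 - y2) ** 2
            total += norm * math.exp(-d2 / (2 * var))
    return -math.log(total / n ** 2)


repetitive = renyi2_entropy(cgr_points("A" * 20))
mixed = renyi2_entropy(cgr_points("ACGT" * 5))
```

A homopolymer collapses towards a single corner of the square, so `repetitive` should come out lower than `mixed`, in line with the idea that the entropy reflects a sequence's randomness level. The double loop is quadratic in sequence length, which is fine for a toy example but not for whole genomes.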