49 research outputs found

    Cell-cell interactome of the hematopoietic niche and its changes in acute myeloid leukemia.

    The bone marrow (BM) is a complex microenvironment, coordinating the production of billions of blood cells every day. Despite its essential role and its relevance to hematopoietic diseases, this environment remains poorly characterized. Here we present a high-resolution characterization of the niche in health and acute myeloid leukemia (AML) by establishing a single-cell gene expression database of 339,381 BM cells. We found significant changes in cell type proportions and gene expression in AML, indicating that the entire niche is disrupted. We then predicted interactions between hematopoietic stem and progenitor cells (HSPCs) and other BM cell types, revealing a remarkable expansion of predicted interactions in AML that promote HSPC-cell adhesion, immunosuppression, and cytokine signaling. In particular, predicted interactions involving transforming growth factor β1 (TGFB1) become widespread, and we show that this can drive AML cell quiescence in vitro. Our results highlight potential mechanisms of enhanced AML-HSPC competitiveness and a skewed microenvironment, fostering AML growth.

    Allele-specific transcriptional elongation regulates monoallelic expression of the IGF2BP1 gene

    Background: Random monoallelic expression contributes to phenotypic variation of cells and organisms. However, the epigenetic mechanisms by which individual alleles are randomly selected for expression are not known. Taking cues from chromatin signatures at imprinted gene loci such as the insulin-like growth factor 2 gene (IGF2), we evaluated the contribution of CTCF, a zinc finger protein required for parent-of-origin-specific expression of the IGF2 gene, as well as a role for allele-specific association with DNA methylation, histone modification and RNA polymerase II. Results: Using array-based chromatin immunoprecipitation, we identified 293 genomic loci that are associated with both CTCF and histone H3 trimethylated at lysine 9 (H3K9me3). A comparison of their genomic positions with those of previously published monoallelically expressed genes revealed no significant overlap between allele-specifically expressed genes and colocalized CTCF/H3K9me3. To analyze the contributions of CTCF and H3K9me3 to gene regulation in more detail, we focused on the monoallelically expressed IGF2BP1 gene. In vitro binding assays using the CTCF target motif at the IGF2BP1 gene, as well as allele-specific analysis of cytosine methylation and CTCF binding, revealed that CTCF does not regulate mono- or biallelic IGF2BP1 expression. Surprisingly, we found that RNA polymerase II is detected on both the maternal and paternal alleles in B lymphoblasts that express IGF2BP1 primarily from one allele. Thus, allele-specific control of RNA polymerase II elongation regulates the allelic bias of IGF2BP1 gene expression. Conclusions: Colocalization of CTCF and H3K9me3 does not represent a reliable chromatin signature indicative of monoallelic expression. Moreover, association of individual alleles with both active (H3K4me3) and silent (H3K27me3) chromatin modifications (allelic bivalent chromatin) or with RNA polymerase II also fails to identify monoallelically expressed gene loci. The selection of individual alleles for expression occurs in part during transcription elongation.

    Altered Expression of ACOX2 In Non-Small Cell Lung Cancer

    Peroxisomes are organelles that play essential roles in many metabolic processes, and also contribute to innate immunity, signal transduction, aging and cancer. One of the main functions of peroxisomes is the processing of very-long-chain fatty acids into metabolites that can be directed to the mitochondria. One key family of enzymes in this process is the peroxisomal acyl-CoA oxidases (ACOX1, ACOX2 and ACOX3), the expression of which has been shown to be dysregulated in some cancers. However, very little is known about the expression of this family of oxidases in non-small cell lung cancer (NSCLC). ACOX2 has been suggested to be elevated at the mRNA level in over 10% of NSCLC, and in the present study, using both standard and bioinformatics approaches, we show that expression of ACOX2 is significantly altered in NSCLC. ACOX2 mRNA expression is linked to a number of mutated genes, and associations between ACOX2 expression and tumour mutational burden and immune cell infiltration were explored. Links between ACOX2 expression and candidate therapies for oncogenic driver mutations such as KRAS were also identified. Furthermore, levels of acyl-CoA oxidases and other associated peroxisomal genes were explored to identify further links between the peroxisomal pathway and NSCLC. The results of this biomarker-driven study suggest that ACOX2 may have potential clinical utility in the diagnosis, prognosis and stratification of patients into various therapeutically targetable options.

    Machine learning and high-performance computing: Infrastructure and algorithms for the genome-scale study of genetic and epigenetic regulatory mechanisms with applications in neuroscience

    The advent of next-generation sequencing (NGS) has fundamentally changed modern genomics research. These sequencers generate terabytes of data and necessitate the use not only of high-performance compute (HPC) clusters for data processing and storage, but also of intelligent, scalable algorithms for pattern discovery and data mining. This thesis details the development of infrastructure and algorithms which automate much of this data analysis process, allowing bench biologists to remain focused on the scientific questions that drive them rather than the informatics challenges associated with these new platforms. We describe WASP, one of the first end-to-end systems to handle all aspects of NGS data generation, including sample submission, laboratory information management system (LIMS) functionality, and assay-specific processing pipelines. Furthermore, we present two machine learning algorithms for the secondary analysis of ChIP-seq data: the first is based on the use of self-organising maps (SOMs) for improved de novo motif discovery, and the second uses genetic algorithms (GAs) to automatically cluster transcription factor binding motifs. Finally, we present an application of this infrastructure and these techniques to the study of the role of the TBX1 transcription factor in 22q11.2 Deletion Syndrome, examining its putative role in neural development, adult neurogenesis, autism spectrum disorder (ASD), and schizophrenia.

    Alignment-free clustering of transcription factor binding motifs using a genetic-k-medoids approach

    Background: Familial binding profiles (FBPs) represent the average binding specificity for a group of structurally related DNA-binding proteins. The construction of such profiles allows the classification of novel motifs based on similarity to known families, can help to reduce redundancy in motif databases and de novo prediction algorithms, and can provide valuable insights into the evolution of binding sites. Many current approaches to automated motif clustering rely on progressive tree-based techniques, and can suffer from so-called frozen sub-alignments, where motifs which are clustered early on in the process remain 'locked' in place despite the potential for better placement at a later stage. In order to avoid this scenario, we have developed a genetic-k-medoids approach which allows motifs to move freely between clusters at any point in the clustering process. Results: We demonstrate the performance of our algorithm, GMACS, on multiple benchmark motif datasets, comparing results obtained with current leading approaches. The first dataset includes 355 position weight matrices from the TRANSFAC database and indicates that the k-mer frequency vector approach used in GMACS outperforms other motif comparison techniques. We then cluster a set of 79 motifs from the JASPAR database previously used in several motif clustering studies and demonstrate that GMACS can produce a higher number of structurally homogeneous clusters than other methods without the need for a large number of singletons. Finally, we show the robustness of our algorithm to noise on multiple synthetic datasets consisting of known motifs convolved with varying degrees of noise. Conclusions: Our proposed algorithm is generally applicable to any DNA or protein motifs, can produce highly stable and biologically meaningful clusters, and, by avoiding the problem of frozen sub-alignments, can provide improved results when compared with existing techniques on benchmark datasets. Funding: Science Foundation Ireland (Grant Number 05/RFP/CMS0001). Peer-reviewed journal article.
