An experimentally validated network of nine haematopoietic transcription factors reveals mechanisms of cell state stability.
Transcription factor (TF) networks determine cell-type identity by establishing and maintaining lineage-specific expression profiles, yet reconstruction of mammalian regulatory network models has been hampered by a lack of comprehensive functional validation of regulatory interactions. Here, we report comprehensive ChIP-Seq, transgenic and reporter gene experimental data that have allowed us to construct an experimentally validated regulatory network model for haematopoietic stem/progenitor cells (HSPCs). Model simulation coupled with subsequent experimental validation using single cell expression profiling revealed potential mechanisms for cell state stabilisation, and also how a leukaemogenic TF fusion protein perturbs key HSPC regulators. The approach presented here should help to improve our understanding of both normal physiological and disease processes. Research in the authors’ laboratories was supported by Bloodwise, The Wellcome Trust, Cancer Research UK, the Biotechnology and Biological Sciences Research Council, the National Institute of Health Research, the Medical Research Council, the MRC Molecular Haematology Unit (Oxford) core award, a Weizmann-UK “Making Connections” grant (Oxford) and core support grants by the Wellcome Trust to the Cambridge Institute for Medical Research (100140) and Wellcome Trust–MRC Cambridge Stem Cell Institute (097922). This is the final version of the article. It first appeared from eLife via http://dx.doi.org/10.7554/eLife.1146
A GWAS sequence variant for platelet volume marks an alternative DNM3 promoter in megakaryocytes near a MEIS1 binding site
We recently identified 68 genomic loci where common sequence variants are associated with platelet count and volume. Platelets are formed in the bone marrow by megakaryocytes, which are derived from hematopoietic stem cells by a process mainly controlled by transcription factors. The homeobox transcription factor MEIS1 is uniquely transcribed in megakaryocytes and not in the other lineage-committed blood cells. By ChIP-seq, we show that 5 of the 68 loci pinpoint a MEIS1 binding event within a group of 252 MK-overexpressed genes. In one such locus in DNM3, regulating platelet volume, the MEIS1 binding site falls within a region acting as an alternative promoter that is solely used in megakaryocytes, where allelic variation dictates different levels of a shorter transcript. The importance of dynamin activity to the latter stages of thrombopoiesis was confirmed by the observation that the inhibitor Dynasore reduced murine proplatelet formation in vitro.
Transcriptional diversity during lineage commitment of human blood progenitors.
Blood cells derive from hematopoietic stem cells through stepwise fating events. To characterize gene expression programs driving lineage choice, we sequenced RNA from eight primary human hematopoietic progenitor populations representing the major myeloid commitment stages and the main lymphoid stage. We identified extensive cell type-specific expression changes: 6711 genes and 10,724 transcripts, enriched in non-protein-coding elements at early stages of differentiation. In addition, we found 7881 novel splice junctions and 2301 differentially used alternative splicing events, enriched in genes involved in regulatory processes. We demonstrated experimentally cell-specific isoform usage, identifying nuclear factor I/B (NFIB) as a regulator of maturation of the megakaryocyte, the platelet precursor. Our data highlight the complexity of fating events in closely related progenitor populations, the understanding of which is essential for the advancement of transplantation and regenerative medicine. The work described in this article was primarily supported by the European Commission Seventh Framework Program through the BLUEPRINT grant with code HEALTH-F5-2011-282510 (D.H., F.B., G.C., J.H.A.M., K.D., L.C., M.F., S.C., S.F., and S.P.G.). Research in the Ouwehand laboratory is further supported by program grants from the National Institute for Health Research (NIHR, www.nihr.ac.uk; to A.A., M.K., P.P., S.B.G.J., S.N., and W.H.O.) and the British Heart Foundation under nos. RP-PG-0310-1002 and RG/09/12/28096 (www.bhf.org.uk; to A.R. and W.J.A.). K.F. and M.K. were supported by Marie Curie funding from the NETSIM FP7 program funded by the European Commission. The laboratory receives funding from the NHS Blood and Transplant for facilities. The Cambridge BioResource (www.cambridgebioresource.org.uk), the Cell Phenotyping Hub, and the Cambridge Translational GenOmics laboratory (www.catgo.org.uk) are supported by an NIHR grant to the Cambridge NIHR Biomedical Research Centre (BRC).
The BRIDGE-Bleeding and Platelet Disorders Consortium is supported by the NIHR BioResource – Rare Diseases (http://bioresource.nihr.ac.uk/; to E.T., N.F., and Whole Exome Sequencing effort). Research in the Soranzo laboratory (L.V., N.S., and S. Watt) is further supported by the Wellcome Trust (Grant Codes WT098051 and WT091310) and the EU FP7 EPIGENESYS initiative (Grant Code 257082). Research in the Cvejic laboratory (A. Cvejic and C.L.) is funded by the Cancer Research UK under grant no. C45041/A14953. S.J.S. is funded by NIHR. M.E.F. is supported by a British Heart Foundation Clinical Research Training Fellowship, no. FS/12/27/29405. E.B.-M. is supported by a Wellcome Trust grant, no. 084183/Z/07/Z. Research in the Laffan laboratory is supported by Imperial College BRC. F.A.C., C.L., and S. Westbury are supported by Medical Research Council Clinical Training Fellowships, and T.B. by a British Society of Haematology/NHS Blood and Transplant grant. R.J.R. is a Principal Research Fellow of the Wellcome Trust, grant no. 082961/Z/07/Z. Research in the Flicek laboratory is also supported by the Wellcome Trust (grant no. 095908) and EMBL. Research in the Bertone laboratory is supported by EMBL. K.F. and C.v.G. are supported by FWO-Vlaanderen through grant G.0B17.13N. P.F. is a compensated member of the Omicia Inc. Scientific Advisory Board. This study made use of data generated by the UK10K Consortium, derived from samples from the Cohorts arm of the project. This is the author’s version of the work. It is posted here by permission of the AAAS for personal use, not for redistribution. The definitive version was published in Science on 26/9/14 in volume 345, number 6204, DOI: 10.1126/science.1251033. This version will be under embargo until the 26th of March 2015.
A Detailed Catalogue of Multi-Omics Methodologies for Identification of Putative Biomarkers and Causal Molecular Networks in Translational Cancer Research
Recent advances in sequencing and biotechnological methodologies have led to the generation of large volumes of molecular data of different omics layers, such as genomics, transcriptomics, proteomics and metabolomics. Integration of these data with clinical information provides new opportunities to discover how perturbations in biological processes lead to disease. Using data-driven approaches for the integration and interpretation of multi-omics data could stably identify links between structural and functional information and propose causal molecular networks with potential impact on cancer pathophysiology. This knowledge can then be used to improve disease diagnosis, prognosis, prevention, and therapy. This review will summarize and categorize the most current computational methodologies and tools for integration of distinct molecular layers in the context of translational cancer research and personalized therapy. Additionally, the bioinformatics tools Multi-Omics Factor Analysis (MOFA) and netDX will be tested using omics data from public cancer resources, to assess their overall robustness, provide reproducible workflows for gaining biological knowledge from multi-omics data, and to comprehensively understand the significantly perturbed biological entities in distinct cancer types. We show that the performed supervised and unsupervised analyses result in meaningful and novel findings
Mapping Cancer Registry Data to the Episode Domain of the Observational Medical Outcomes Partnership Model (OMOP)
A great challenge in the use of standardized cancer registry data is deriving reliable, evidence-based results from large amounts of data. A solution could be its mapping to a common data model such as OMOP, which represents knowledge in a unified semantic base, enabling decentralized analysis. The recently released Episode Domain of the OMOP CDM allows episodic modelling of a patient’s disease and treatment phases. In this study, we mapped oncology registry data to the Episode Domain. A total of 184,718 Episodes could be implemented, with the Concept of Cancer Drug Treatment being the most frequent. Additionally, source data were mapped to new terminologies as part of the release. It was possible to map ≈ 73.8% of the source data to the respective OMOP standard. Best mapping was achieved in the Procedure Domain with 98.7%. To evaluate the implementation, the survival probabilities of the CDM and source system were calculated (n = 2756/2902, median OAS = 82.2/91.1 months, 95% CI = 77.4–89.5/84.4–100.9). In conclusion, the new release of the CDM increased its applicability, especially in observational cancer research. Regarding the mapping, a higher score could be achieved if terminologies which are frequently used in Europe are included in the Standardized Vocabulary Metadata Repository.
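The survival comparison described above can be illustrated with a small sketch. It assumes a Kaplan-Meier estimator, which the abstract does not name, and uses made-up follow-up times; the function names `kaplan_meier` and `median_survival` are hypothetical and are not part of OMOP or of the study's tooling.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  -- follow-up in months (illustrative values only)
    events -- 1 if death was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical cohort: five patients, two of them censored.
curve = kaplan_meier([5, 10, 10, 20, 30], [1, 1, 0, 1, 0])
print(median_survival(curve))
```

Running the same routine on records drawn from the source system and from the CDM would give two medians to compare, which is the kind of check the abstract reports.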
Use of Process Modelling for Optimization of Molecular Tumor Boards
In Molecular Tumor Boards, a team of experts discusses the individual therapy options of a cancer patient based on their individual molecular profile. The process (from recommendation request, through molecular diagnosis, to a personalized therapy recommendation) is complex and time-consuming. Therefore, process optimization is needed to decrease the workload of physicians and to standardize the process. For this purpose, we modeled the current workflow of the Molecular Tumor Board at the University Hospital Hamburg-Eppendorf on Service-Oriented Architecture using Business Process Modeling and Notation to highlight areas for improvement. This identified many manual tasks and an extensive workload for the physician. We then created a novel, simplified, more efficient workflow in which the physician is supported by additional software. In summary, we show that the use of Service-Oriented Architecture using Business Process Modeling and Notation for Molecular Tumor Board processes promotes rapid adaptability, standardization, interoperability, and quality assurance, and facilitates collaboration.
Untargeted stable isotope-resolved metabolomics to assess the effect of PI3Kβ inhibition on metabolic pathway activities in a PTEN null breast cancer cell line
The combination of high-resolution LC-MS untargeted metabolomics with stable isotope-resolved tracing is a promising approach for the global exploration of metabolic pathway activities. In our established workflow we combine targeted isotopologue feature extraction with the non-targeted X¹³CMS routine. Metabolites detected by X¹³CMS as differentially labeled between two biological conditions are subsequently integrated into the original targeted library. This strategy enables monitoring of changes in known pathways as well as the discovery of hitherto unknown metabolic alterations. Here, we demonstrate this workflow in a PTEN (phosphatase and tensin homolog) null breast cancer cell line (MDA-MB-468) exploring metabolic pathway activities in the absence and presence of the selective PI3Kβ inhibitor AZD8186. Cells were fed with [U-¹³C]glucose and treated for 1, 3, 6, and 24 h with 0.5 µM AZD8186 or vehicle, extracted by an optimized sample preparation protocol and analyzed by LC-QTOF-MS. Untargeted differential tracing of labels revealed 286 isotope-enriched features that were significantly altered between control and treatment conditions, of which 19 features could be attributed to known compounds from targeted pathways. A further 11 features were unambiguously identified based on data-dependent MS/MS spectra and reference substances. Notably, only a minority of the significantly altered features (11 and 16, respectively) were identified when preprocessing of the same data set (treatment vs. control in 24 h unlabeled samples) was performed with tools commonly used for label-free (i.e. without isotopic tracer) non-targeted metabolomics experiments (Profinder's batch recursive feature extraction and XCMS). The structurally identified metabolites were integrated into the existing targeted isotopologue feature extraction workflow to enable natural abundance correction, evaluation of assay performance and assessment of drug-induced changes in pathway activities.
Label incorporation was highly reproducible for the majority of isotopologues in technical replicates, with an RSD below 10%. Furthermore, inter-day repeatability of a second label experiment showed strong correlation (Pearson R² > 0.99) between tracer incorporation on different days. Finally, we could identify prominent pathway activity alterations upon PI3Kβ inhibition. Besides pathways in central metabolism that are known to be changed, our workflow revealed additional pathways, such as pyrimidine metabolism and the hexosamine pathway. All pathways identified represent key metabolic processes associated with cancer metabolism and therapy.
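The two reproducibility figures quoted above (relative standard deviation across technical replicates and squared Pearson correlation between days) can be computed as in the following minimal sketch. The enrichment values are invented for illustration, and the helper names `rsd_percent` and `pearson_r2` are assumptions, not part of the published workflow.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation: sample SD divided by the mean, in percent."""
    return 100.0 * stdev(values) / mean(values)


def pearson_r2(x, y):
    """Squared Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical fractional 13C enrichment of one isotopologue in three replicates.
replicates = [0.412, 0.405, 0.420]

# Hypothetical enrichment of five isotopologues measured on two different days.
day1 = [0.10, 0.25, 0.41, 0.63, 0.88]
day2 = [0.11, 0.24, 0.42, 0.61, 0.90]

print(f"RSD = {rsd_percent(replicates):.1f}%")
print(f"R^2 = {pearson_r2(day1, day2):.4f}")
```

An isotopologue would pass the criteria stated in the abstract if its replicate RSD is below 10% and the inter-day R² exceeds 0.99.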
Characterization of TCF21 Downstream Target Regions Identifies a Transcriptional Network Linking Multiple Independent Coronary Artery Disease Loci.
To functionally link coronary artery disease (CAD) causal genes identified by genome wide association studies (GWAS), and to investigate the cellular and molecular mechanisms of atherosclerosis, we have used chromatin immunoprecipitation sequencing (ChIP-Seq) with the CAD associated transcription factor TCF21 in human coronary artery smooth muscle cells (HCASMC). Analysis of identified TCF21 target genes for enrichment of molecular and cellular annotation terms identified processes relevant to CAD pathophysiology, including "growth factor binding," "matrix interaction," and "smooth muscle contraction." We characterized the canonical binding sequence for TCF21 as CAGCTG, identified AP-1 binding sites in TCF21 peaks, and by conducting ChIP-Seq for JUN and JUND in HCASMC confirmed that there is significant overlap between TCF21 and AP-1 binding loci in this cell type. Expression quantitative trait variation mapped to target genes of TCF21 was significantly enriched among variants with low P-values in the GWAS analyses, suggesting a possible functional interaction between TCF21 binding and causal variants in other CAD disease loci. Separate enrichment analyses found over-representation of TCF21 target genes among CAD associated genes, and linkage disequilibrium between TCF21 peak variation and that found in GWAS loci, consistent with the hypothesis that TCF21 may affect disease risk through interaction with other disease associated loci. Interestingly, enrichment for TCF21 target genes was also found among other genome wide association phenotypes, including height and inflammatory bowel disease, suggesting a functional profile important for basic cellular processes in non-vascular tissues. Thus, data and analyses presented here suggest that study of GWAS transcription factors may be a highly useful approach to identifying disease gene interactions and thus pathways that may be relevant to complex disease etiology
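As a rough illustration of what locating the canonical TCF21 binding sequence CAGCTG in a peak region involves, here is a minimal sketch of an exact-match scan. It is not the motif-discovery method used in the study, the example sequence is invented, and the function names are hypothetical. Because CAGCTG equals its own reverse complement, scanning the forward strand alone finds sites on both strands.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_motif(seq, motif="CAGCTG"):
    """Return 0-based start positions of exact motif matches, overlaps allowed."""
    seq = seq.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


# Invented sequence standing in for a ChIP-Seq peak region.
peak = "TTCAGCTGAACAGCTG"
print(find_motif(peak))
print(reverse_complement("CAGCTG") == "CAGCTG")
```

Real analyses typically score position weight matrices rather than exact strings, so degenerate sites would be missed by this simple scan.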