8 research outputs found

    Method development for comparative cancer genomics

    With the cost of sequencing continuously dropping and the increased availability of sequencing technologies, the coming years will bring a wealth of sequencing data that will be tremendously interesting and challenging to analyse and interpret. It is becoming increasingly clear that both the algorithmic approaches and the handling of the data itself will prove challenging, and that some of the legacy approaches and file formats developed in the wake of the first large-scale sequencing projects (e.g. the human genome sequencing project) will prove unsuitable, or at least inconvenient to use, once projects that aim to analyse sequencing data from thousands of samples become the norm rather than the exception. In this thesis I will discuss my work in the field of sequencing analysis during my PhD studies. I will touch upon some of the challenges and problems researchers are facing in the field today and will present the approaches and solutions I have developed to deal with those issues. To this end I will present my work in methods development for sequencing analysis in general and specifically in the context of cancer genomics, and give examples of the application of those methods in projects that I have been involved in. My two main contributions to available methods for sequencing analysis are the HTSeq Python library and the h5vc R/Bioconductor package (Anders et al., 2014; Pyl et al., 2014). I co-developed the former with Simon Anders and am the lead developer of the latter. Both pieces of software are available through public repositories and are well documented and maintained. The projects in which those methods have found application are the HeLa Kyoto sequencing project (Landry et al., 2013) and a set of three cancer genomics projects involving cohorts of up to 18 whole genome sequencing (WGS) samples and up to 21 whole exome sequencing (WES) samples, respectively. I will discuss my methodological contributions to these projects as well as relevant biological results in Section 5.

    Proteogenomics decodes the evolution of human ipsilateral breast cancer

    Ipsilateral breast tumor recurrence (IBTR) is a clinically important event: an isolated in-breast recurrence is potentially curable but is associated with an increased risk of distant metastasis and breast cancer death. It remains unclear if IBTRs are associated with molecular changes that can be explored as a resource for precision medicine strategies. Here, we employed proteogenomics to analyze a cohort of 27 primary breast cancers and their matched IBTRs to define proteogenomic determinants of molecular tumor evolution. Our analyses revealed a relationship between hormone receptor status and proliferation levels resulting in the gain of somatic mutations and copy number. This in turn re-programmed the transcriptome and proteome towards highly replicating and genomically unstable IBTRs, possibly enhanced by APOBEC3B. In order to investigate the origins of IBTRs, a second analysis that included primaries with no recurrence pinpointed proliferation and immune infiltration as predictive of IBTR. In conclusion, our study shows that breast tumors evolve into different IBTRs depending on hormonal status and proliferation, and that immune cell infiltration and Ki-67 are significantly elevated in primary tumors that develop IBTR. These results can serve as a starting point to explore markers to predict IBTR formation and stratify patients for adjuvant therapy.

    Analysing high-throughput sequencing data in Python with HTSeq 2.0

    HTSeq 2.0 provides a more extensive application programming interface, including a new representation for sparse genomic data, enhancements for htseq-count to suit single-cell omics, a new script for data using cell and molecular barcodes, improved documentation, testing and deployment, bug fixes and Python 3 support.

    Cerebrospinal fluid proteome maps detect pathogen-specific host response patterns in meningitis

    Meningitis is a potentially life-threatening infection characterized by the inflammation of the leptomeningeal membranes. Many different viral and bacterial pathogens can cause meningitis, with differences in mortality rates, risk of developing neurological sequelae and treatment options. Here we constructed a compendium of digital cerebrospinal fluid (CSF) proteome maps to define pathogen-specific host response patterns in meningitis. The results revealed a drastic and pathogen-type-specific influx of tissue, cell and plasma proteins in the CSF, where in particular a large increase of neutrophil-derived proteins in the CSF correlated with acute bacterial meningitis. Additionally, both acute bacterial and viral meningitis result in a marked reduction of brain-enriched proteins. Generation of a multi-protein LASSO regression model resulted in an 18-protein panel of cell- and tissue-associated proteins capable of classifying acute bacterial meningitis and viral meningitis. The same protein panel also enabled classification of tick-borne encephalitis, a subgroup of viral meningitis, with high sensitivity and specificity. The work provides insights into pathogen-specific host response patterns in CSF from different disease etiologies to support future classification of pathogen type based on host response patterns in meningitis.

    Proteogenomic Workflow Reveals Molecular Phenotypes Related to Breast Cancer Mammographic Appearance

    Proteogenomic approaches have enabled the generation of novel information levels when compared to single omics studies although burdened by extensive experimental efforts. Here, we improved a data-independent acquisition mass spectrometry proteogenomic workflow to reveal distinct molecular features related to mammographic appearances in breast cancer. Our results reveal splicing processes detectable at the protein level and highlight quantitation and pathway complementarity between RNA and protein data. Furthermore, we confirm previously detected enrichments of molecular pathways associated with estrogen receptor-dependent activity and provide novel evidence of epithelial-to-mesenchymal activity in mammography-detected spiculated tumors. Several transcript-protein pairs displayed radically different abundances depending on the overall clinical properties of the tumor. These results demonstrate that there are differentially regulated protein networks in clinically relevant tumor subgroups, which in turn alter both cancer biology and the abundance of biomarker candidates and drug targets.

    Analysis of nonleukemic cellular subcompartments reconstructs clonal evolution of acute myeloid leukemia and identifies therapy-resistant preleukemic clones

    To acquire a better understanding of clonal evolution of acute myeloid leukemia (AML) and to identify the clone(s) responsible for disease recurrence, we have comparatively studied leukemia-specific mutations by whole-exome-sequencing (WES) of both the leukemia and the nonleukemia compartments derived from the bone marrow of AML patients. The T-lymphocytes, B-lymphocytes and the functionally normal hematopoietic stem cells (HSC), that is, CD34+/CD38−/ALDH+ cells for AML with rare-ALDH+ blasts (<1.9% ALDH+ cells) were defined as the nonleukemia compartments. WES identified 62 point-mutations in the leukemia compartment derived from 12 AML-patients at the time of diagnosis and 73 mutations in 3 matched relapse cases. Most patients (8/12) showed 4 to 6 point-mutations per sample at diagnosis. Other than the mutations in the recurrently mutated genes such as DNMT3A, NRAS and KIT, we were able to identify novel point-mutations that have not yet been described in AML. Some leukemia-specific mutations and cytogenetic abnormalities including DNMT3A(R882H), EZH2(I146T) and inversion(16) were also detectable in the respective T-lymphocytes, B-lymphocytes and HSC in 5/12 patients, suggesting that preleukemia HSC might represent the source of leukemogenesis for these cases. The leukemic evolution was reconstructed for five cases with detectable preleukemia clones, which were tracked in follow-up and relapse samples. Four of the five patients with detectable preleukemic mutations developed relapse. The presence of leukemia-specific mutations in these nonleukemia compartments, especially after chemotherapy or after allogeneic stem cell transplantation, is highly relevant, as these could be responsible for relapse. This discovery may facilitate the identification of novel targets for long-term cure