17 research outputs found

    Distance-based methods for the analysis of Next-Generation sequencing data

    The analysis of NGS data is a central aspect of modern genomic research. However, extracting data from the two most frequently used source organisms poses a variety of problems. The first chapter presents a novel approach that determines a distance between cancer cell line cultures based on their small genomic variants in order to identify the cultures. A whole-exome-sequenced culture is identified through pairwise comparisons with reference datasets if a measured distance is smaller than would be expected for unrelated cultures. The effectiveness of the method was verified, but limitations remain because only the whole-exome sequencing format is supported. The second chapter therefore presents a published modification of the approach that adds support for the widely used bulk RNA and panel sequencing. However, broadening the technology base amplifies confounding effects, which lead to violations of the mathematical conditions of a distance metric. The resulting violations are therefore first quantified by statistical methods and then successfully compensated for by dynamic threshold adjustments. The third chapter presents a novel data augmentation method that enables the training of machine learning models in the absence of neoplastic training data. An abstract distance measure between neoplastic entities and entities of healthy origin is established by means of a transcriptomic deconvolution. The output of the deconvolution then allows the effective prediction of clinical characteristics of rare yet biologically diverse cancer types, with the predictive power of the method matching that of the established gold standard.
    The analysis of NGS data is a central aspect of modern Molecular Genetics and Oncology. The first scientific contribution is the development of a method which identifies whole-exome-sequenced cancer cell lines (CCL) via the quantification of a distance between their sets of small genomic variants. A distinguishing aspect of the method is that it was designed for the computer-based identification of NGS-sequenced CCL. An unknown CCL is identified when its abstract distance to a known CCL is smaller than would be expected by chance. The method performed favorably in benchmarks but only supported the whole-exome sequencing technology. The second contribution therefore extended the identification method by additionally supporting the bulk mRNA sequencing technology and the panel sequencing format. However, the technological extension incurred predictive biases which detrimentally affected the quantification of abstract distances. Hence, statistical methods were introduced to quantify and compensate for confounding factors. The extended method showed a heterogeneity-robust benchmark performance at the cost of a slightly reduced sensitivity compared to the whole-exome sequencing method. The third contribution is a method which trains machine learning models for rare and diverse cancer types. Machine learning models are subsequently trained on these distances to predict clinically relevant characteristics. The performance of such-trained models was comparable to that of models trained on both the substituted neoplastic data and the gold-standard biomarker Ki-67. No proliferation rate-indicative features were utilized to predict clinical characteristics, which is why the method can complement the proliferation rate-oriented pathological assessment of biopsies. The thesis revealed that the quantification of an abstract distance can address sources of erroneous NGS data analysis.
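The identification idea in this abstract (an unknown cell line is matched to a known one when the distance between their variant sets falls below what unrelated lines would show) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's method: the Jaccard distance is a stand-in for the unspecified distance measure, the fixed `threshold` replaces the statistically derived cutoff, and the names `jaccard_distance`, `identify`, `LINE_A` and `LINE_B` are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

# A small variant is represented as (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def jaccard_distance(a: Set[Variant], b: Set[Variant]) -> float:
    """Return 1 - |a ∩ b| / |a ∪ b|; two empty sets have distance 0."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def identify(query: Set[Variant],
             references: Dict[str, Set[Variant]],
             threshold: float) -> Optional[str]:
    """Return the closest reference whose distance is below `threshold`.

    `threshold` is an assumed fixed cutoff standing in for the distance
    expected between unrelated cell lines.
    """
    best_name, best_dist = None, threshold
    for name, variants in references.items():
        dist = jaccard_distance(query, variants)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name


# Hypothetical toy data: the query shares two of its three variants with LINE_A.
query = {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C"), ("chr3", 300, "C", "A")}
references = {
    "LINE_A": {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C"), ("chr4", 400, "T", "G")},
    "LINE_B": {("chr5", 500, "G", "A")},
}
match = identify(query, references, threshold=0.6)
```

With these toy sets the query and `LINE_A` share two of four distinct variants, so their distance is 0.5 and the query is matched only when the cutoff is above that value.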

    CancerResource - updated database of cancer-relevant proteins, mutations and interacting drugs

    Here, we present an updated version of CancerResource, freely available without registration at http://bioinformatics.charite.de/care. With upcoming information on target expression and mutations in patients’ tumors, the need for systems supporting decisions on individual therapy is growing. This knowledge is based on numerous experimentally validated drug-target interactions and supporting analyses, such as measuring changes in gene expression using microarrays and HTS efforts on cell lines. To provide a better overview of similar drug-target data and supporting information, a series of novel information connections has been established and made available. CancerResource contains about 91 000 drug-target relations, more than 2000 cancer cell lines and drug sensitivity data for about 50 000 drugs. CancerResource allows users to upload external expression and mutation data and compare them to the database's cell lines. Target genes and compounds are projected onto cancer-related pathways to give a better overview of how drug-target interactions benefit the treatment of cancer. Features like cellular fingerprints comprising mutations, expression values and drug-sensitivity data can promote the understanding of genotype to drug sensitivity associations. Ultimately, these profiles can also be used to determine the most effective drug treatment for a cancer cell line most similar to a patient's tumor cells.

    Transcriptomic Deconvolution of Neuroendocrine Neoplasms Predicts Clinically Relevant Characteristics

    Pancreatic neuroendocrine neoplasms (panNENs) are a rare yet diverse type of neoplasia whose precise clinical–pathological classification is frequently challenging. Since incorrect classifications can affect treatment decisions, additional tools which support the diagnosis, such as machine learning (ML) techniques, are critically needed but generally unavailable due to the scarcity of suitable ML training data for rare panNENs. Here, we demonstrate that a multi-step ML framework predicts clinically relevant panNEN characteristics while being exclusively trained on widely available data of a healthy origin. The approach classifies panNENs by deconvolving their transcriptomes into cell type proportions based on shared gene expression profiles with healthy pancreatic cell types. The deconvolution results were found to provide a prognostic value with respect to the prediction of the overall patient survival time, neoplastic grading, and carcinoma versus tumor subclassification. The performance with which a proliferation rate agnostic deconvolution ML model could predict the clinical characteristics was found to be comparable to that of a comparative baseline model trained on the proliferation rate-informed MKI67 levels. The approach is novel in that it complements established proliferation rate-oriented classification schemes whose results can be reproduced and further refined by differentiating between identically graded subgroups. By including non-endocrine cell types, the deconvolution approach furthermore provides an in silico quantification of panNEN dedifferentiation, optimizing it for challenging clinical classification tasks in more aggressive panNEN subtypes. Peer Reviewed
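The deconvolution step described in this abstract (expressing a bulk transcriptome as proportions of reference cell types) can be illustrated with a generic non-negative least-squares fit. This is a sketch under stated assumptions and not the paper's framework: it uses plain projected gradient descent, and the function name `deconvolve`, the two cell-type signatures and the three-gene mixture are all hypothetical toy values.

```python
from typing import List


def deconvolve(signatures: List[List[float]],
               mixture: List[float],
               steps: int = 5000,
               lr: float = 0.01) -> List[float]:
    """Estimate non-negative cell type proportions for a bulk profile.

    signatures[k][g] is the expression of gene g in reference cell type k.
    Minimises 0.5 * ||sum_k w_k * signatures[k] - mixture||^2 subject to
    w_k >= 0 by projected gradient descent, then rescales the weights
    so that they sum to 1.
    """
    n_types = len(signatures)
    n_genes = len(mixture)
    weights = [1.0 / n_types] * n_types
    for _ in range(steps):
        # Per-gene difference between the reconstructed and observed profile.
        residual = [
            sum(weights[k] * signatures[k][g] for k in range(n_types)) - mixture[g]
            for g in range(n_genes)
        ]
        # Gradient step for every cell type, clipped at zero (the projection).
        weights = [
            max(0.0, weights[k] - lr * sum(residual[g] * signatures[k][g]
                                           for g in range(n_genes)))
            for k in range(n_types)
        ]
    total = sum(weights)
    return [w / total for w in weights] if total > 0 else weights


# Hypothetical signatures for two cell types over three genes.
signatures = [
    [1.0, 0.0, 0.5],  # cell type A
    [0.0, 1.0, 0.5],  # cell type B
]
# Bulk profile constructed as 70% type A and 30% type B.
mixture = [0.7, 0.3, 0.5]
proportions = deconvolve(signatures, mixture)
```

On this noise-free toy mixture the estimated proportions converge to roughly 0.7 and 0.3; a downstream classifier would then take such proportion vectors as its input features.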

    Discovery and Validation of Novel Biomarkers for Detection of Epithelial Ovarian Cancer

    Detection of epithelial ovarian cancer (EOC) poses a critical medical challenge. However, novel biomarkers for diagnosis remain to be discovered. Therefore, innovative approaches are of the utmost importance for patient outcome. Here, we present a concept for blood-based biomarker discovery, investigating both epithelial and specifically stromal compartments, which have been neglected in the search for novel candidates. We queried gene expression profiles of EOC including microdissected epithelium and adjacent stroma from benign and malignant tumours. Genes significantly differentially expressed within either the epithelial or the stromal compartments were retrieved. The expression of genes whose products are secreted yet absent in the blood of healthy donors was validated in tissue and blood from patients with pelvic mass by NanoString analysis. Results were confirmed by the comprehensive gene expression database, CSIOVDB (Ovarian cancer database of Cancer Science Institute Singapore). The top 25% of candidate genes were explored for their biomarker potential, and twelve were able to discriminate between benign and malignant tumours on transcript levels (p < 0.05). Among them were T-cell differentiation protein myelin and lymphocyte (MAL), aurora kinase A (AURKA), and the stroma-derived candidates versican (VCAN) and syndecan-3 (SDC), which performed significantly better than the recently reported biomarker fibroblast growth factor 18 (FGF18) in discerning malignant from benign conditions. Furthermore, elevated MAL and AURKA expression levels correlated significantly with a poor prognosis. We identified promising novel candidates and found the stroma of EOC to be a suitable compartment for biomarker discovery.

    Gene Set Enrichment Analysis Reveals Individual Variability in Host Responses in Tuberculosis Patients

    Group-aggregated responses to tuberculosis (TB) have been well characterized on a molecular level. However, human beings differ and individual responses to infection vary. We have combined a novel approach to individual gene set analysis (GSA) with the clustering of transcriptomic profiles of TB patients from seven datasets in order to identify individual molecular endotypes of transcriptomic responses to TB. We found that TB patients differ with respect to the intensity of their hallmark interferon (IFN) responses, but they also show variability in their complement system, metabolic responses and multiple other pathways. This variability cannot be sufficiently explained with covariates such as gender or age, and the molecular endotypes are found across studies and populations. Using datasets from a Cynomolgus macaque model of TB, we revealed that transcriptional signatures of different molecular TB endotypes did not depend on TB progression post-infection. Moreover, we provide evidence that patients with molecular endotypes characterized by high levels of IFN responses (IFN-rich) suffered from more severe lung pathology than those with lower levels of IFN responses (IFN-low). Harnessing machine learning (ML) models, we derived gene signatures classifying IFN-rich and IFN-low TB endotypes and revealed that the IFN-low signature allowed slightly more reliable overall classification of TB patients from non-TB patients than the IFN-rich one. Using the paradigm of molecular endotypes and the ML-based predictions allows more precisely tailored treatment regimens, predicting treatment outcome with higher accuracy and therefore bridging the gap between conventional treatment and precision medicine. Peer Reviewed

    Evaluation einer elektronisch unterstützten pflegerischen Überleitung zwischen Krankenhaus und Pflegeheim unter Nutzung einer Test-Telematikinfrastruktur: eine Fallanalyse

    Background: Improper information transmission can lead to compromised patient safety and quality of life when patients are transferred from one setting to another. Electronic instruments may improve this situation; however, they are rarely used. Objective: The aim of this study therefore was to investigate the technical and organizational feasibility, usability, usefulness and completeness of an electronic instrument that is based on the German HL7 CDA standard for eNursing Summaries. Materials and methods: To this end, a test health telematics infrastructure, which included the German electronic health card, was established and a nursing summary application was developed that allowed summary documents to be communicated between a hospital and a nursing home. The users were asked to evaluate the usability of the nursing summary application as well as to compare the usefulness and completeness of electronically and paper transmitted information. Results: This study demonstrated the feasibility of implementing an electronic nursing summary application that was based on the German HL7 CDA standard eNursing Summary and that was integrated in a test health telematics infrastructure. It could also be shown that the users rated this application as usable and that electronically supported patient transfers were superior to paper-based ones. The use of the German electronic health card was regarded as a barrier by the users. Discussion: This study emphasizes the feasibility, relevance and barriers of electronically supported transfers of patients with nursing needs. Nurses working in hospitals and long-term care can integrate an application based on the HL7 CDA standard ePflegebericht into their working processes and get better and more complete information. To ensure continuity of care in a sustainable manner in the future, the German HL7 CDA based eNursing Summary standard should become part of the German telematics infrastructure.

    Evaluating a proof-of-concept approach of the German health telematics infrastructure in the context of discharge management

    Although national eHealth strategies have existed for more than a decade in many countries, they have been implemented with varying success. In Germany, the eHealth strategy has so far resulted in a roll-out of electronic health cards for all citizens in the statutory health insurance, but in no clinically meaningful IT applications. The aim of this study was to test the technical and organisational feasibility, usability, and utility of an eDischarge application embedded into a laboratory Health Telematics Infrastructure (TI). The tests embraced the exchange of eDischarge summaries based on the multiprofessional HL7 eNursing Summary standard between a municipal hospital and a nursing home. In total, 36 transmissions of electronic discharge documents took place. They demonstrated the technical and organisational feasibility and resulted in moderate usability ratings. A comparison between eDischarge and paper-based summaries hinted at higher ratings of utility and information completeness for eDischarges. Despite problems with handling the electronic health card, the proof-of-concept for the first clinically meaningful IT application in the German Health TI could be regarded as successful.

    Data from: Evaluation of electronically supported nursing transfers between hospital and nursing home based on a test health telematics infrastructure: a case analysis

    Background: Improper information transmission can lead to compromised patient safety and quality of life when patients are transferred from one setting to another. Electronic instruments may improve this situation; however, they are rarely used. Objective: The aim of this study therefore was to investigate the technical and organizational feasibility, usability, usefulness and completeness of an electronic instrument that is based on the German HL7 CDA standard for eNursing Summaries. Materials and methods: To this end, a test health telematics infrastructure, which included the German electronic health card, was established and a nursing summary application was developed that allowed summary documents to be communicated between a hospital and a nursing home. The users were asked to evaluate the usability of the nursing summary application as well as to compare the usefulness and completeness of electronically and paper transmitted information. Results: This study demonstrated the feasibility of implementing an electronic nursing summary application that was based on the German HL7 CDA standard eNursing Summary and that was integrated in a test health telematics infrastructure. It could also be shown that the users rated this application as usable and that electronically supported patient transfers were superior to paper-based ones. The use of the German electronic health card was regarded as a barrier by the users. Discussion: This study emphasizes the feasibility, relevance and barriers of electronically supported transfers of patients with nursing needs. Nurses working in hospitals and long-term care can integrate an application based on the HL7 CDA standard ePflegebericht into their working processes and get better and more complete information. To ensure continuity of care in a sustainable manner in the future, the German HL7 CDA based eNursing Summary standard should become part of the German telematics infrastructure.