2 research outputs found

    Computational Methods towards Personalized Cancer Vaccines and their Application through a Web-based Platform

    Cancer immunotherapy is a treatment option that uses components of a patient’s immune system. It is on its way to becoming an integral part of treatment plans alongside chemotherapy, surgery, and radiotherapy. Epitope-based vaccines (EVs) are one strategy that can be truly personalized. Each patient possesses a distinct immune system, and each tumor is unique, rendering the design of a potent vaccine challenging and dependent on both the patient and the tumor. The potency of a vaccine relies on the ability of its constituent epitopes (short, immunogenic antigen fragments) to trigger an immune response. Assessing this ability requires accounting for the individuality of the immune system, which is conditioned, among other factors, by the variability of the human leukocyte antigen (HLA) gene cluster. Determining the HLA genotype with traditional experimental techniques can be time- and cost-intensive. We proposed a novel HLA genotyping algorithm based on integer linear programming that does not depend on data generated solely for the purpose of HLA typing. On publicly available next-generation sequencing (NGS) data, our method outperformed previously published approaches.

    HLA binding is a prerequisite for T-cell recognition, and precise prediction algorithms exist. This information alone, however, is not sufficient to assess the immunogenic potential of a peptide: to induce an immune response, reactive T-cell clones with receptors specific for the peptide-HLA complex have to be present. We proposed a method for the prediction of immunogenicity that incorporates models of peripheral tolerance, based on gut microbiome data, in addition to models of central tolerance, which had previously been shown to increase performance. Comparison with a previously published method suggests that incorporating gut microbiome data and HLA-binding stability estimates does not enhance prediction performance.

    High-throughput sequencing provides the basis for the design of personalized EVs. Through genome and transcriptome sequencing of tumor and matched non-malignant tissue samples, cancer-specific mutations can be identified and further validated with other technologies such as mass spectrometry (MS). Multi-omics approaches can yield several hundred gigabytes of data, whose handling and analysis usually require data management solutions and high-performance computing (HPC) infrastructures. We developed the web-based platform qPortal for data-driven biomedical research, which allows users to manage and analyze quantitative biological data intuitively. To highlight the advantages of our data-driven approach with an integrated workflow system, we compared it to Galaxy. Building on qPortal, we implemented the web-based platform iVacPortal for the design of personalized EVs, facilitating data management and analysis in such projects. Finally, we applied the implemented methods through iVacPortal in two studies of distinct cancer entities, indicating the added value of our platform for the assessment of personalized EV candidates and alternative targets for cancer immunotherapy.
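
    The HLA genotyping approach mentioned above casts typing as an integer linear program over candidate alleles and the sequencing reads they could explain. The following is a minimal sketch of such a formulation, assuming a precomputed read-to-allele compatibility map and the PuLP solver; the data, variable names, and constraints are illustrative assumptions rather than the implementation described in the thesis.

```python
# Minimal ILP sketch of HLA genotype selection (illustrative assumptions only).
from pulp import LpProblem, LpMaximize, LpVariable, LpBinary, lpSum

# Toy input: which candidate HLA-A alleles each sequencing read is compatible with
# (in practice derived from aligning reads against an allele reference).
compat = {
    "read1": {"A*01:01", "A*02:01"},
    "read2": {"A*02:01"},
    "read3": {"A*01:01"},
    "read4": {"A*01:01"},
    "read5": {"A*03:01"},
    "read6": {"A*02:01"},
}
alleles = sorted(set().union(*compat.values()))

def vname(s):
    # PuLP variable names should avoid characters such as '*' and ':'
    return s.replace("*", "_").replace(":", "_")

prob = LpProblem("hla_typing", LpMaximize)
pick = {a: LpVariable("pick_" + vname(a), cat=LpBinary) for a in alleles}  # allele selected?
expl = {r: LpVariable("expl_" + r, cat=LpBinary) for r in compat}          # read explained?

# Objective: explain as many reads as possible with the chosen genotype.
prob += lpSum(expl.values())

# A read counts as explained only if at least one compatible allele is picked.
for r, comp in compat.items():
    prob += expl[r] <= lpSum(pick[a] for a in comp)

# Diploid constraint: at most two alleles for the locus (here, HLA-A only).
prob += lpSum(pick.values()) <= 2

prob.solve()
print("Selected alleles:", [a for a in alleles if pick[a].value() == 1])
```

    A full formulation would additionally weight reads by alignment quality and handle several loci at once; the sketch only conveys the principle of choosing at most two alleles per locus that jointly explain the observed reads.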

    Computational Methods for Interactive and Explorative Study Design and Integration of High-throughput Biological Data

    The increasing use of high-throughput methods to gain insights into biological systems has come with new challenges. Genomics, transcriptomics, proteomics, and metabolomics produce massive amounts of data and metadata. While this wealth of information has resulted in many scientific discoveries, new strategies are needed to cope with the ever-growing variety and volume of metadata. Despite efforts to standardize the collection of study metadata, many experiments cannot be reproduced or replicated. One reason is the difficulty of providing the necessary metadata: the large sample sizes that modern omics experiments enable make it increasingly complicated for scientists to keep track of every sample and the required annotations, and the many data transformations needed to normalize and analyze omics data additionally require that all parameters and tools involved be recorded. A second possible cause is missing knowledge about the statistical design of studies, both with respect to study factors and to the sample size required to make significant discoveries.

    In this thesis, we develop a multi-tier model for experimental design and a portlet for interactive web-based study design. By entering experimental factors and the number of replicates, users can easily create large, factorial experimental designs. Changes or additional metadata can be quickly uploaded via user-defined spreadsheets that include sample identifiers. To comply with existing standards and give users a quick way to import existing studies, we provide full interoperability with the ISA-Tab format. We show that both the data model and the portlet are easily extensible to create additional tiers of samples annotated with technology-specific metadata.

    We tackle the problem of unwieldy experimental designs by creating an aggregation graph. Based on our multi-tier experimental design model, similar samples, their sources, and analytes are summarized into an interactive summary graph that focuses on study factors and replicates, giving researchers a quick overview of sample sizes and the aims of different studies. This graph can be included in our portlets or used as a stand-alone application and is compatible with the ISA-Tab format. We show that this approach can be used to explore the quality of publicly available experimental designs and metadata annotation.

    The third part of this thesis contributes to more statistically sound planning of differential gene expression experiments. We integrate two tools for the prediction of statistical power and the estimation of sample size into our portal. This integration enables the use of existing data to arrive at more accurate estimates of sample variability. Additionally, the statistical power of existing experimental designs with given sample sizes can be analyzed. All results and parameters are stored and can be used for later comparison.

    Even perfectly planned and annotated experiments cannot eliminate human error. Based on our model, we develop an automated workflow for microarray quality control, enabling users to inspect the quality of normalization and to cluster samples by study factor levels. We import a publicly available microarray dataset to assess our contributions to reproducibility and to explore alternative analysis methods based on statistical power analysis.
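
    To illustrate the kind of power and sample-size reasoning integrated in the third part, the sketch below approximates a differential expression comparison with a two-sample t-test power calculation using statsmodels; the pilot data, fold-change threshold, and significance settings are hypothetical and stand in for the dedicated tools actually integrated into the portal.

```python
# Hedged power/sample-size sketch for a two-group expression comparison
# (illustrative assumptions only; not the tools integrated into the portal).
import numpy as np
from statsmodels.stats.power import TTestIndPower

# Variability estimated from existing (pilot) log2 expression values of one gene.
pilot_log2_expr = np.array([7.2, 8.9, 8.1, 6.8, 8.5, 7.6])
sd = pilot_log2_expr.std(ddof=1)

target_log2_fc = 1.0               # smallest fold change considered worth detecting
effect_size = target_log2_fc / sd  # standardized effect size (Cohen's d)

analysis = TTestIndPower()

# Sample size per group needed to reach 80% power at alpha = 0.05.
n_per_group = analysis.solve_power(effect_size=effect_size, alpha=0.05,
                                   power=0.8, alternative="two-sided")
print(f"required samples per group: ~{np.ceil(n_per_group):.0f}")

# Conversely, the achievable power of an existing design can be inspected.
power = analysis.solve_power(effect_size=effect_size, nobs1=5,
                             alpha=0.05, alternative="two-sided")
print(f"power with 5 samples per group: {power:.2f}")
```

    Dedicated RNA-seq power calculators model count data rather than a simple t-test and will give different numbers; the example only shows how variability estimated from existing data feeds into sample-size and power estimates for a planned or existing design.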