54 research outputs found

    Accurate prediction of major histocompatibility complex class II epitopes by sparse representation via ℓ1-minimization

    Background: The major histocompatibility complex (MHC) is responsible for presenting antigens (epitopes) on the surface of antigen-presenting cells (APCs). When pathogen-derived epitopes are presented by MHC class II on an APC surface, T cells may be able to trigger a specific immune response. Prediction of MHC-II epitopes is particularly challenging because the open binding cleft of the MHC-II molecule allows epitopes to bind beyond the peptide binding groove; therefore, the molecule is capable of accommodating peptides of variable length. Among the methods proposed to predict MHC-II epitopes, artificial neural networks (ANNs) and support vector machines (SVMs) are the most effective. We propose a novel classification algorithm to predict MHC-II epitopes called sparse representation via ℓ1-minimization. Results: We obtained a collection of experimentally confirmed MHC-II epitopes from the Immune Epitope Database and Analysis Resource (IEDB) and applied our ℓ1-minimization algorithm. To benchmark the performance of our proposed algorithm, we compared our predictions against an SVM classifier. We measured sensitivity, specificity, and accuracy; then we used Receiver Operating Characteristic (ROC) analysis to evaluate the performance of our method. The prediction performance of the ℓ1-minimization algorithm for MHC-II epitopes was generally comparable and, in some cases, superior to the standard SVM classification method, and it overcame the lack of robustness of other methods with respect to outliers. While our method consistently favoured DPPS encoding with the alleles tested, SVM showed slightly better accuracy when “11-factor” encoding was used. Conclusions: ℓ1-minimization has accuracy similar to that of SVM and has additional advantages, such as overcoming the lack of robustness with respect to outliers. With ℓ1-minimization, no model selection dependency is involved.
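
The abstract above reports sensitivity, specificity, and accuracy for a binary binder/non-binder classifier. As a minimal sketch of how these three quantities are conventionally computed from predicted and true labels (the function names and the toy labels are illustrative assumptions, not taken from the paper):

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = epitope/binder, 0 = non-binder)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def binary_metrics(y_true, y_pred):
    """Return sensitivity, specificity, and accuracy as a dict."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    return {
        # Fraction of true binders that were predicted as binders.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Fraction of true non-binders that were predicted as non-binders.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        # Fraction of all peptides that were classified correctly.
        "accuracy": (tp + tn) / total if total else float("nan"),
    }


if __name__ == "__main__":
    # Hypothetical labels for eight peptides, for illustration only.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
    print(binary_metrics(y_true, y_pred))
```

These definitions are the standard ones; the paper's exact evaluation protocol (for example, its cross-validation scheme) is not described in the abstract and is not reproduced here.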

    Selected Works in Bioinformatics

    This book consists of nine chapters covering a variety of bioinformatics subjects, ranging from database resources for protein allergens, unravelling genetic determinants of complex disorders, characterization and prediction of regulatory motifs, computational methods for identifying the best classifiers and key disease genes in large-scale transcriptomic and proteomic experiments, functional characterization of inherently unfolded proteins/regions, protein interaction networks and flexible protein-protein docking. The computational algorithms are in general presented in a way that is accessible to advanced undergraduate students, graduate students and researchers in molecular biology and genetics. The book should also serve as a stepping stone for mathematicians, biostatisticians, and computational scientists to cross their academic boundaries into the dynamic and ever-expanding field of bioinformatics.

    Computer-based Design of β-sheet Containing Proteins

    Protein design is an excellent test of the minimal determinants of protein structure. Although 70% of naturally occurring proteins contain β-sheets, most previous design efforts have been limited to α-helix bundle proteins or the redesign of naturally occurring proteins. Here, we test and develop computer-based methods for designing proteins rich in β-strands. The molecular modeling program Rosetta was used for three separate design tasks: (1) the design of α/β and α+β proteins with a new method called SEWING, which builds proteins from pieces of naturally occurring proteins, (2) the stabilization of β-sheet proteins via the redesign of surface-facing residues, and (3) the de novo design of β-sandwich proteins. This research showed that it is possible to extend the SEWING method to non-α-helix proteins, allowing the incorporation of structural features found in nature, and that it is possible to dramatically boost protein thermal stability (> 25 °C) by redesigning β-sheet surfaces. However, we also found that the de novo design of β-sandwich proteins remains an elusive goal. Doctor of Philosophy

    Computational approaches for improving treatment and prevention of viral infections

    The treatment of infections with HIV or HCV is challenging. Thus, novel drugs and new computational approaches that support the selection of therapies are required. This work presents methods that support therapy selection as well as methods that advance novel antiviral treatments. geno2pheno[ngs-freq] identifies drug resistance from HIV-1 or HCV samples that were subjected to next-generation sequencing by interpreting their sequences either via support vector machines or a rules-based approach. geno2pheno[coreceptor-hiv2] determines the coreceptor that is used for viral cell entry by analyzing a segment of the HIV-2 surface protein with a support vector machine. openPrimeR is capable of finding optimal combinations of primers for multiplex polymerase chain reaction by solving a set cover problem and accessing a new logistic regression model for determining amplification events arising from polymerase chain reaction. geno2pheno[ngs-freq] and geno2pheno[coreceptor-hiv2] enable the personalization of antiviral treatments and support clinical decision making. The application of openPrimeR to human immunoglobulin sequences has resulted in novel primer sets that improve the isolation of broadly neutralizing antibodies against HIV-1. The methods that were developed in this work thus constitute important contributions towards improving the prevention and treatment of viral infectious diseases.
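
The abstract above states that openPrimeR selects primer combinations by solving a set cover problem. As a rough illustration of that problem class only, the sketch below uses the classic greedy approximation, which is a swapped-in textbook heuristic and not necessarily what openPrimeR does; the function name, primer names, and template names are hypothetical.

```python
def greedy_set_cover(universe, candidates):
    """Greedy approximation to set cover.

    universe   -- iterable of elements to cover (e.g. template sequences)
    candidates -- dict mapping a candidate name (e.g. a primer) to the set of
                  elements it covers (e.g. templates it is predicted to amplify)
    Returns (chosen, uncovered): the chosen candidate names in selection order
    and any elements that no candidate can cover.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Pick the candidate that covers the most still-uncovered elements.
        best = max(candidates, key=lambda name: len(candidates[name] & uncovered))
        gained = candidates[best] & uncovered
        if not gained:
            break  # remaining elements cannot be covered by any candidate
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered


if __name__ == "__main__":
    # Hypothetical coverage data, for illustration only.
    templates = {"t1", "t2", "t3", "t4", "t5"}
    primers = {
        "p1": {"t1", "t2", "t3"},
        "p2": {"t3", "t4"},
        "p3": {"t4", "t5"},
        "p4": {"t1"},
    }
    chosen, uncovered = greedy_set_cover(templates, primers)
    print(chosen, uncovered)
```

The greedy rule does not guarantee a minimum-size cover; an exact solution would typically be obtained with an integer linear program instead.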

    Interpretable methods in cancer diagnostics

    Cancer is a hard problem. It is hard for the patients, for the doctors and nurses, and for the researchers working on understanding the disease and finding better treatments for it. The challenges faced by a pathologist diagnosing the disease for a patient are not necessarily the same as those faced by cell biologists working on experimental treatments and on understanding the fundamentals of cancer. In this thesis we work on different challenges faced by both of the above teams. This thesis first presents methods to improve the analysis of the flow cytometry data used frequently in the diagnosis process, specifically for the two subtypes of non-Hodgkin lymphoma which are our focus: follicular lymphoma and diffuse large B-cell lymphoma. With a combination of concepts from graph theory, dynamic programming, and machine learning, we present methods to improve the diagnosis process and the analysis of the above-mentioned data. The interpretability of the method helps a pathologist to better understand a patient’s disease, which in turn improves their choice of treatment. In the second part, we focus on the analysis of DNA-methylation and gene expression data, both of which present the challenge of being very high-dimensional while having comparatively few samples. We present an ensemble model which adapts to the different patterns seen in each given data set, in order to adapt to noise and batch effects. At the same time, the interpretability of our model helps a pathologist to better find and tune the treatment for the patient: a step further towards personalized medicine.

    Seventh Biennial Report : June 2003 - March 2005


    Following the trail of cellular signatures : computational methods for the analysis of molecular high-throughput profiles

    Over the last three decades, high-throughput techniques, such as next-generation sequencing, microarrays, or mass spectrometry, have revolutionized biomedical research by enabling scientists to generate detailed molecular profiles of biological samples on a large scale. These profiles are usually complex, high-dimensional, and often prone to technical noise, which makes a manual inspection practically impossible. Hence, powerful computational methods are required that enable the analysis and exploration of these data sets and thereby help researchers to gain novel insights into the underlying biology. In this thesis, we present a comprehensive collection of algorithms, tools, and databases for the integrative analysis of molecular high-throughput profiles. We developed these tools with two primary goals in mind: the detection of deregulated biological processes in complex diseases, like cancer, and the identification of driving factors within those processes. Our first contribution in this context consists of several major extensions of the GeneTrail web service that make it one of the most comprehensive toolboxes for the analysis of deregulated biological processes and signaling pathways. GeneTrail offers a collection of powerful enrichment and network analysis algorithms that can be used to examine genomic, epigenomic, transcriptomic, miRNomic, and proteomic data sets. In addition to approaches for the analysis of individual -omics types, our framework also provides functionality for the integrative analysis of multi-omics data sets, the investigation of time-resolved expression profiles, and the exploration of single-cell experiments. Besides the analysis of deregulated biological processes, we also focus on the identification of driving factors within those processes, in particular, miRNAs and transcriptional regulators. For miRNAs, we created the miRNA pathway dictionary database miRPathDB, which compiles links between miRNAs, target genes, and target pathways. Furthermore, it provides a variety of tools that help to study associations between them. For the analysis of transcriptional regulators, we developed REGGAE, a novel algorithm for the identification of key regulators that have a significant impact on deregulated genes, e.g., genes that show large expression differences in a comparison between disease and control samples. To analyze the influence of transcriptional regulators on deregulated biological processes, we also created the RegulatorTrail web service. In addition to REGGAE, this tool suite compiles a range of powerful algorithms that can be used to identify key regulators in transcriptomic, proteomic, and epigenomic data sets. Moreover, we evaluate the capabilities of our tool suite through several case studies that highlight the versatility and potential of our framework. In particular, we used our tools to conduct a detailed analysis of a Wilms' tumor data set. Here, we could identify a circuitry of regulatory mechanisms, including new potential biomarkers, that might contribute to the blastemal subtype's increased malignancy, which could potentially lead to new therapeutic strategies for Wilms' tumors. In summary, we present and evaluate a comprehensive framework of powerful algorithms, tools, and databases to analyze molecular high-throughput profiles. The provided methods are of broad interest to the scientific community and can help to elucidate complex pathogenic mechanisms.
    Nowadays, molecular high-throughput measurement techniques, such as high-throughput sequencing, microarrays, or mass spectrometry, are routinely applied to characterize cells on a large scale and at different molecular levels. The data sets generated in this way are usually high-dimensional and often noisy. Powerful computational applications are therefore needed to enable their analysis. In this thesis, we present a range of effective algorithms, programs, and databases for the analysis of molecular high-throughput data sets. These approaches were developed to investigate deregulated biological processes and to identify important key molecules within them. In addition, a number of analyses were carried out to evaluate the different methods. For this purpose, we conducted in particular a Wilms' tumor study in which we were able to identify various regulatory mechanisms and associated biomarkers that could be responsible for the increased malignancy of Wilms' tumors of the blastemal-rich subtype. These findings could lead to improved treatment of these tumors in the future. These results demonstrate that our approaches are able to evaluate different molecular high-throughput measurements and can help to elucidate pathogenic mechanisms associated with cancer or other complex diseases.

    New Advances in Stem Cell Transplantation

    This book documents the growing body of stem cell-related research, clinical applications, and views for the future. The book covers a wide range of issues in cell-based therapy and regenerative medicine, and includes clinical and preclinical chapters from respected authors involved with stem cell studies and research from around the world. It complements and extends the basics of stem cell physiology, hematopoietic stem cells, issues related to clinical problems, tissue typing, cryopreservation, dendritic cells, mesenchymal cells, neuroscience, endovascular cells and other tissues. In addition, tissue engineering that employs novel methods with stem cells is explored. Clearly, the continued use of biomedical engineering will depend heavily on stem cells, and this book is well positioned to provide comprehensive coverage of these developments.

    Extracellular Vesicles: Biology and Potentials in Cancer Therapeutics

    Extracellular vesicles (EVs) are particles wrapped in a lipid bilayer membrane and are naturally released from cells. These cargo vessels are nanostructures that mainly transfer lipids, proteins, various nucleic acid fragments, and metabolic components to neighboring cells or distant parts of the body through the circulatory system. EVs are of great significance to the communication mechanism between cells. This book collects feature articles to enhance our understanding of the biological characteristics of EVs and their potential applications.