11 research outputs found

    IgIDivA : immunoglobulin intraclonal diversification analysis

    Intraclonal diversification (ID) within the immunoglobulin (IG) genes expressed by B cell clones arises from ongoing somatic hypermutation (SHM) in the context of continuous interactions with antigen(s). Defining the nature and order of appearance of SHMs in the IG genes can improve understanding of the ID process, shedding light on the ontogeny and evolution of B cell clones in health and disease. This endeavor has been made possible by the introduction of high-throughput sequencing in the study of IG gene repertoires. However, few existing tools allow the identification, quantification and characterization of SHMs related to ID, and all of them have limitations in their analysis, highlighting the need for a purpose-built tool for the comprehensive analysis of the ID process. In this work, we present the immunoglobulin intraclonal diversification analysis (IgIDivA) tool, a novel methodology for the in-depth qualitative and quantitative analysis of the ID process from high-throughput sequencing data. IgIDivA identifies and characterizes SHMs that occur within the variable domain of the rearranged IG genes and studies in detail the connections between identified SHMs, establishing mutational pathways. Moreover, it combines established and new graph-based metrics for the objective determination of ID level with statistical analysis for the comparison of ID level features between different groups of samples. Importantly, IgIDivA also provides detailed visualizations of ID through the generation of purpose-built graph networks. Beyond the method design, IgIDivA has also been implemented as an R Shiny web application. IgIDivA is freely available at https://bio.tools/igidiv
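
    The idea of mutational pathways in the abstract above can be pictured as a directed graph. What follows is a minimal Python sketch under stated assumptions, not IgIDivA's actual implementation (the tool itself is an R Shiny application). Each variant of a clone is represented as a set of mutations relative to a reference, and an edge links a variant to any variant that carries exactly one additional mutation. The variant names, the mutation labels, and the `longest_pathway` measure are hypothetical illustrations and are not the graph metrics used by the tool.

```python
from itertools import combinations


def build_mutation_graph(variants):
    """Link each variant to variants that add exactly one mutation.

    variants: dict mapping a (hypothetical) variant name to a frozenset
    of mutation labels, e.g. frozenset({"A12G"}).
    Returns a dict mapping each parent name to a list of child names.
    """
    edges = {name: [] for name in variants}
    for a, b in combinations(variants, 2):
        muts_a, muts_b = variants[a], variants[b]
        # Proper subset plus a size difference of one means "one extra SHM".
        if muts_a < muts_b and len(muts_b) - len(muts_a) == 1:
            edges[a].append(b)
        elif muts_b < muts_a and len(muts_a) - len(muts_b) == 1:
            edges[b].append(a)
    return edges


def longest_pathway(edges, start):
    """Count the edges in the longest chain of single-mutation steps.

    The graph is acyclic because every edge adds one mutation,
    so plain recursion terminates.
    """
    children = edges[start]
    if not children:
        return 0
    return 1 + max(longest_pathway(edges, child) for child in children)


# Toy clone: all names and mutations are invented for illustration.
variants = {
    "germline_like": frozenset(),
    "v1": frozenset({"A12G"}),
    "v2": frozenset({"A12G", "C40T"}),
    "v3": frozenset({"G77A"}),
    "v4": frozenset({"A12G", "C40T", "T95C"}),
}
graph = build_mutation_graph(variants)
depth = longest_pathway(graph, "germline_like")
```

    In this toy clone, `germline_like` leads to `v1` and `v3`, and the chain through `v1`, `v2` and `v4` is the longest pathway.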

    Machine Learning techniques in bioinformatics : From data integration to the development of application-oriented tools

    Recent developments in Machine Learning aim at automating available methods, rendering them universal while requiring as little expert knowledge as possible. In this thesis, we will take a step back. We will focus on the data, their specific needs and how to extract meaningful information from them. This will be done through the presentation of different works highlighting various aspects to consider when developing Machine Learning techniques in bioinformatics. There cannot be any model without appropriate consideration of the data. Therefore, in the first part, we will put the models aside and focus on data integration. In more detail, we will present an algorithm for the normalization of gene-expression microarray data across different platforms. Microarray data are widely available in public repositories and such methods enable their subsequent downstream analysis. In the next part, we will consider peptide sequence data and present a tool for the extraction of patterns in such sets. The model, based on convolutional neural networks, is open-source and can be used for peptide MHC class II binding prediction, among other applications. The last part will be dedicated to the analysis of clinical data. We will present a retrospective cohort study on pancreatic cancer. For this study, a tool for the prediction of clinically relevant outcomes has been developed. From data integration to the development of application-oriented tools, the three parts forming this thesis will be self-contained and will each address different challenges in the realm of data-driven approaches in bioinformatics.
    Universitat Autònoma de Barcelona. Doctoral Programme in Bioinformatics

    CuBlock : a cross-platform normalization method for gene-expression microarrays

    Cross-(multi)platform normalization of gene-expression microarray data remains an unresolved issue. Several algorithms exist, but they are constrained either by the need to normalize all samples of all platforms together, which compromises scalability and reuse, by adherence to the platforms of a specific provider, or simply by poor performance. In addition, many of the methods presented in the literature have not been specifically tested against multi-platform data and/or other methods applicable in this context. Thus, we set out to develop a normalization algorithm appropriate for gene-expression studies based on multiple, potentially large microarray sets collected across multiple platforms and at different times, applicable in systematic studies aimed at extracting knowledge from the wealth of microarray data available in public repositories; for example, for the extraction of Real-World Data to complement data from Randomized Controlled Trials. Our main criterion for performance was the capacity of the algorithm to properly separate samples from different biological groups. We present CuBlock, an algorithm addressing this objective, together with a strategy to validate cross-platform normalization methods. To validate the algorithm and benchmark it against existing methods, we used two distinct datasets, one specifically generated for testing and standardization purposes and one from an actual experimental study. Using these datasets, we benchmarked CuBlock against ComBat, UPC, YuGene, DBNorm, Shambhala and a simple log transform as reference. We note that many other popular normalization methods are not applicable in this context. CuBlock was the only algorithm in this group that could always and clearly differentiate the underlying biological groups after mixing the data, from up to six different platforms in this study. CuBlock is available for download. Supplementary data are available at Bioinformatics online

    A decision support system based on artificial intelligence and systems biology for the simulation of pancreatic cancer patient status

    Electronic publication date: 31-03-2023
    Oncology treatments require continuous individual adjustment based on the measurement of multiple clinical parameters. Prediction tools exploiting the patterns present in the clinical data could be used to assist decision making and ease the burden associated with the interpretation of all these parameters. The goal of this study was to predict the evolution of patients with pancreatic cancer at their next visit using information routinely recorded in health records, providing a decision-support system for clinicians. We selected hematological variables as the visit's clinical outcomes, under the assumption that they can be predictive of the evolution of the patient. Multivariate models based on regression trees were generated to predict next-visit values for each of the clinical outcomes selected, based on the longitudinal clinical data as well as on molecular data sets stemming from in silico simulations of individual patient status at each visit. The models predict, with a mean prediction score (balanced accuracy) of 0.79, the evolution trends of eosinophils, leukocytes, monocytes, and platelets. Time span between visits and neutropenia were among the most common factors contributing to the predicted evolution. The inclusion of molecular variables from the systems-biology in silico simulations provided a molecular background for the observed variations in the selected outcome variables, mostly in relation to the regulation of hematopoiesis. In spite of its limitations, this study serves as a proof of concept for the application of next-visit prediction tools in real-world settings, even when available data sets are small.
    V.J. is part of a project (COSMIC) that has received funding from the European Union's Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No. 765158. P.M.F. receives funding from the European Union's Horizon 2020 research and innovation program under grant agreement No. 860303. J.M.G.I. receives funding from the European Union's Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No. 859962
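
    The score reported in the abstract above, balanced accuracy, is the mean of the per-class recalls, so a model cannot score well simply by always predicting the most frequent trend. The short Python sketch below shows the calculation on invented data. The "up"/"down" trend labels and the example predictions are assumptions made for illustration and are not taken from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    classes = sorted(set(y_true))
    recalls = []
    for cls in classes:
        # Positions where this class is the true label.
        positions = [i for i, label in enumerate(y_true) if label == cls]
        hits = sum(1 for i in positions if y_pred[i] == cls)
        recalls.append(hits / len(positions))
    return sum(recalls) / len(recalls)


# Hypothetical next-visit trends for one blood-count variable.
y_true = ["up", "up", "up", "up", "down", "down"]
y_pred = ["up", "up", "up", "down", "down", "up"]

score = balanced_accuracy(y_true, y_pred)          # (3/4 + 1/2) / 2
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

    Here the recall is 0.75 for "up" and 0.5 for "down", giving a balanced accuracy of 0.625, which is lower than the plain accuracy of 4/6 because the minority class is predicted less well.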

    Methods to develop an in silico clinical trial: computational head-to-head comparison of lisdexamfetamine and methylphenidate

    Regulatory agencies encourage computer modeling and simulation to reduce the time and cost of clinical trials. Although still not classified in formal guidelines, systems biology-based models represent a powerful tool for generating hypotheses with great molecular detail. Herein, we have applied a mechanistic head-to-head in silico clinical trial (ISCT) between two treatments for attention-deficit/hyperactivity disorder, namely lisdexamfetamine (LDX) and methylphenidate (MPH). The ISCT was generated through three phases comprising (i) the molecular characterization of drugs and pathologies, (ii) the generation of adult and children virtual populations (vPOPs) totaling 2,600 individuals and the creation of physiologically based pharmacokinetic (PBPK) and quantitative systems pharmacology (QSP) models, and (iii) data analysis with artificial intelligence methods. The characteristics of our vPOPs were in close agreement with real reference populations extracted from clinical trials, as were our PBPK models with in vivo parameters. The mechanisms of action of LDX and MPH were obtained from QSP models combining PBPK modeling of dosing schemes and systems biology-based modeling technology, i.e., therapeutic performance mapping system. The step-by-step process described here to undertake a head-to-head ISCT would allow obtaining mechanistic conclusions that could be extrapolated or used for predictions to a certain extent at the clinical level. Altogether, these computational techniques prove to be an excellent tool for hypothesis generation and would help achieve personalized medicine.
    This study was funded by Takeda. Public funders provided support for some of the authors' salaries: GJ has received funding from the European Union's Horizon 2020 research and innovation program under the Marie Skłodowska-Curie Grant Agreement No. 765912. VJ is part of a project (COSMIC; www.cosmic-h2020.eu) that has received funding from the European Union's Horizon 2020 research and innovation program under the Marie Skłodowska-Curie Grant Agreement No. 765158

    In-silico simulated prototype-patients using TPMS technology to study a potential adverse effect of sacubitril and valsartan

    Unveiling the mechanism of action of a drug is key to understanding the benefits and adverse reactions of a medication in an organism. However, in complex diseases such as heart diseases there is not a unique mechanism of action but a wide range of different responses depending on the patient. Exploring this collection of mechanisms is one of the keys to future personalized medicine. The Therapeutic Performance Mapping System (TPMS) is a Systems Biology approach that generates multiple models of the mechanism of action of a drug. Each molecular mechanism generated could be associated with particular individuals, here defined as prototype-patients; hence, the generation of models using TPMS technology may be used for detecting adverse effects in specific patients. TPMS operates by (1) modelling the responses in humans with an accurate description of a protein network and (2) applying a Multilayer Perceptron-like and sampling strategy to find all plausible solutions. In the present study, TPMS is applied to explore the diversity of mechanisms of action of the drug combination sacubitril/valsartan. We use TPMS to generate a wide range of models explaining the relationship between sacubitril/valsartan and heart failure (the indication), as well as evaluating their association with macular degeneration (a potential adverse effect). Among the models generated, we identify a set of mechanisms of action associated with a better response in terms of heart failure treatment, which could also be associated with macular degeneration development. Finally, a set of 30 potential biomarkers is proposed to identify mechanisms (or prototype-patients) more prone to macular degeneration when presenting good heart failure response. All prototype-patient models generated are completely theoretical and therefore they do not necessarily involve clinical effects in real patients. Data and access to software are available at http://sbi.upf.edu/data/tpms/.
    Public funders provided support for authors' salaries: JAP, NFF and BO received support from the Spanish Ministry of Economy (MINECO) [BIO2017-85329-R] [RYC-2015-17519]; “Unidad de Excelencia María de Maeztu”, funded by the Spanish Ministry of Economy [ref: MDM-2014-0370]. The Research Programme on Biomedical Informatics (GRIB) is a member of the Spanish National Bioinformatics Institute (INB), PRB2-ISCIII and is supported by grant PT13/0001/0023, of the PE I+D+i 2013-2016, funded by ISCIII and FEDER. GJ has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 765912. VJ is part of a project (COSMIC; www.cosmic-h2020.eu) that has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 765158. Funding for publication is from Agència de Gestió D'ajuts Universitaris i de Recerca Generalitat de Catalunya [2017SGR00519]. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript

    A quantitative systems pharmacology model for certolizumab pegol treatment in moderate-to-severe psoriasis

    Background: Psoriasis is a chronic immune-mediated inflammatory systemic disease with skin manifestations characterized by erythematous, scaly, itchy and/or painful plaques resulting from hyperproliferation of keratinocytes. Certolizumab pegol [CZP], a PEGylated antigen binding fragment of a humanized monoclonal antibody against TNF-alpha, is approved for the treatment of moderate-to-severe plaque psoriasis. Patients with psoriasis present clinical and molecular variability, affecting response to treatment. Herein, we utilized an in silico approach to model the effects of CZP in a virtual population (vPop) with moderate-to-severe psoriasis. Our proof-of-concept study aims to assess the performance of our model in generating a vPop and defining CZP response variability based on patient profiles. Methods: We built a quantitative systems pharmacology (QSP) model of a clinical trial-like vPop with moderate-to-severe psoriasis treated with two dosing schemes of CZP (200 mg and 400 mg, both every two weeks for 16 weeks, starting with a loading dose of CZP 400 mg at weeks 0, 2, and 4). We applied different modelling approaches: (i) an algorithm to generate vPop according to reference population values and comorbidity frequencies in real-world populations; (ii) physiologically based pharmacokinetic (PBPK) models of CZP dosing schemes in each virtual patient; and (iii) systems biology-based models of the mechanism of action (MoA) of the drug. Results: The combination of our different modelling approaches yielded a vPop distribution and a PBPK model that aligned with existing literature. Our systems biology and QSP models reproduced known biological and clinical activity, presenting outcomes correlating with clinical efficacy measures. We identified distinct clusters of virtual patients based on their psoriasis-related protein predicted activity when treated with CZP, which could help unravel differences in drug efficacy in diverse subpopulations. Moreover, our models revealed clusters of MoA solutions irrespective of the dosing regimen employed. Conclusion: Our study provided patient-specific QSP models that reproduced clinical and molecular efficacy features, supporting the use of computational methods as a modelling strategy to explore drug response variability. This might shed light on the differences in drug efficacy in diverse subpopulations, especially useful in complex diseases such as psoriasis, through the generation of mechanistically based hypotheses.
    The study was funded by UCB Pharma and Anaxomics Biotech. Copy-editing was funded by UCB Pharma. Article processing fees were provided by UCB Pharma. Public funders provided support for some of the authors’ salaries: VJ has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 765158 (COSMIC; www.cosmic-h2020.eu); GJ has received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie Grant Agreement No. 765912; FG has received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie Grant Agreement No. 813545. The funder UCB Biopharma was not involved in the study design, collection, analysis, interpretation of data, the writing of this article, or the decision to submit it for publication
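
    The two dosing schemes in the Methods above can be written out explicitly. The Python sketch below lists the doses for each arm and adds a deliberately simple one-compartment, first-order elimination curve to show how a schedule feeds a drug-amount calculation. It is not the PBPK or QSP model of the study. The reading that the last dose falls at week 14 (eight doses inside a 16-week window), the function names, and the 2-week half-life are all assumptions chosen for illustration.

```python
def czp_schedule(maintenance_mg, loading_mg=400, loading_weeks=(0, 2, 4),
                 interval_weeks=2, duration_weeks=16):
    """Return (week, dose_mg) pairs for one dosing arm.

    Assumption: doses are given at weeks 0, 2, ..., duration_weeks - 2,
    with the loading dose replacing the maintenance dose at loading_weeks.
    """
    schedule = []
    for week in range(0, duration_weeks, interval_weeks):
        dose = loading_mg if week in loading_weeks else maintenance_mg
        schedule.append((week, dose))
    return schedule


def amount_remaining(schedule, week, half_life_weeks=2.0):
    """Superpose exponentially decaying doses (one-compartment toy model).

    half_life_weeks is a hypothetical parameter, not a fitted value.
    """
    total = 0.0
    for dose_week, dose in schedule:
        if dose_week <= week:
            total += dose * 0.5 ** ((week - dose_week) / half_life_weeks)
    return total


arm_200 = czp_schedule(200)
arm_400 = czp_schedule(400)
cumulative_200 = sum(dose for _, dose in arm_200)
cumulative_400 = sum(dose for _, dose in arm_400)
```

    Under these assumptions both arms share the same first three doses, and the arms diverge only from week 6 onward, which is why any difference between them in such a toy model appears after the loading phase.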

    Table_1_A quantitative systems pharmacology model for certolizumab pegol treatment in moderate-to-severe psoriasis.xlsx

    Background: Psoriasis is a chronic immune-mediated inflammatory systemic disease with skin manifestations characterized by erythematous, scaly, itchy and/or painful plaques resulting from hyperproliferation of keratinocytes. Certolizumab pegol [CZP], a PEGylated antigen binding fragment of a humanized monoclonal antibody against TNF-alpha, is approved for the treatment of moderate-to-severe plaque psoriasis. Patients with psoriasis present clinical and molecular variability, affecting response to treatment. Herein, we utilized an in silico approach to model the effects of CZP in a virtual population (vPop) with moderate-to-severe psoriasis. Our proof-of-concept study aims to assess the performance of our model in generating a vPop and defining CZP response variability based on patient profiles. Methods: We built a quantitative systems pharmacology (QSP) model of a clinical trial-like vPop with moderate-to-severe psoriasis treated with two dosing schemes of CZP (200 mg and 400 mg, both every two weeks for 16 weeks, starting with a loading dose of CZP 400 mg at weeks 0, 2, and 4). We applied different modelling approaches: (i) an algorithm to generate vPop according to reference population values and comorbidity frequencies in real-world populations; (ii) physiologically based pharmacokinetic (PBPK) models of CZP dosing schemes in each virtual patient; and (iii) systems biology-based models of the mechanism of action (MoA) of the drug. Results: The combination of our different modelling approaches yielded a vPop distribution and a PBPK model that aligned with existing literature. Our systems biology and QSP models reproduced known biological and clinical activity, presenting outcomes correlating with clinical efficacy measures. We identified distinct clusters of virtual patients based on their psoriasis-related protein predicted activity when treated with CZP, which could help unravel differences in drug efficacy in diverse subpopulations. Moreover, our models revealed clusters of MoA solutions irrespective of the dosing regimen employed. Conclusion: Our study provided patient-specific QSP models that reproduced clinical and molecular efficacy features, supporting the use of computational methods as a modelling strategy to explore drug response variability. This might shed light on the differences in drug efficacy in diverse subpopulations, especially useful in complex diseases such as psoriasis, through the generation of mechanistically based hypotheses.

    DataSheet_1_A quantitative systems pharmacology model for certolizumab pegol treatment in moderate-to-severe psoriasis.pdf

    Background: Psoriasis is a chronic immune-mediated inflammatory systemic disease with skin manifestations characterized by erythematous, scaly, itchy and/or painful plaques resulting from hyperproliferation of keratinocytes. Certolizumab pegol [CZP], a PEGylated antigen binding fragment of a humanized monoclonal antibody against TNF-alpha, is approved for the treatment of moderate-to-severe plaque psoriasis. Patients with psoriasis present clinical and molecular variability, affecting response to treatment. Herein, we utilized an in silico approach to model the effects of CZP in a virtual population (vPop) with moderate-to-severe psoriasis. Our proof-of-concept study aims to assess the performance of our model in generating a vPop and defining CZP response variability based on patient profiles. Methods: We built a quantitative systems pharmacology (QSP) model of a clinical trial-like vPop with moderate-to-severe psoriasis treated with two dosing schemes of CZP (200 mg and 400 mg, both every two weeks for 16 weeks, starting with a loading dose of CZP 400 mg at weeks 0, 2, and 4). We applied different modelling approaches: (i) an algorithm to generate vPop according to reference population values and comorbidity frequencies in real-world populations; (ii) physiologically based pharmacokinetic (PBPK) models of CZP dosing schemes in each virtual patient; and (iii) systems biology-based models of the mechanism of action (MoA) of the drug. Results: The combination of our different modelling approaches yielded a vPop distribution and a PBPK model that aligned with existing literature. Our systems biology and QSP models reproduced known biological and clinical activity, presenting outcomes correlating with clinical efficacy measures. We identified distinct clusters of virtual patients based on their psoriasis-related protein predicted activity when treated with CZP, which could help unravel differences in drug efficacy in diverse subpopulations. Moreover, our models revealed clusters of MoA solutions irrespective of the dosing regimen employed. Conclusion: Our study provided patient-specific QSP models that reproduced clinical and molecular efficacy features, supporting the use of computational methods as a modelling strategy to explore drug response variability. This might shed light on the differences in drug efficacy in diverse subpopulations, especially useful in complex diseases such as psoriasis, through the generation of mechanistically based hypotheses.

    Pathologische Anatomie und Histologie der membranösen (Paries chorioideus) und der nervösen Wände (Ependym) der Hirnventrikel
