72 research outputs found

    A Data Mining Approach to Building a Predictive Model of Low-Cost Carriers\u27 Presence in the U.S. Domestic Routes

    Get PDF
    The purpose of the study was to build the predictive model of the presence of U.S. low-cost carriers (LCCs) in the domestic network structure. SEMMA (Sample, Explore, Modify, Model, and Assess) schematic in data mining was followed and employed as the primary methodological procedure. Data in the period of 1Q2016-1Q2018 were extracted from the Bureau of Transportation Statistics (DB1B database) and reconstructed to form predictors. Stepwise logistic regression showed a significant predictive performance compared to decision tree technique in terms of fitting measures, which was then used as the concluding model. Significant predictors included: (1) Market concentration positively related with the presence of LCCs, (2) nonstop route associated with the presence of LCCs, (3) market airfare factors negatively related with the presence of LCCs, and (4) origin and destination (O&D) airports being hubs, especially medium hubs, associated with the presence of LCCs. The findings may practically aid network planners in airlines and airports in decision making associated with the presence of LCCs, which ultimately leads to building their more robust and efficient route map

    Large-Scale Structure-Based Prediction of Stable Peptide Binding to Class I HLAs Using Random Forests

    Get PDF
    Prediction of stable peptide binding to Class I HLAs is an important component for designing immunotherapies. While the best performing predictors are based on machine learning algorithms trained on peptide-HLA (pHLA) sequences, the use of structure for training predictors deserves further exploration. Given enough pHLA structures, a predictor based on the residue-residue interactions found in these structures has the potential to generalize for alleles with little or no experimental data. We have previously developed APE-Gen, a modeling approach able to produce pHLA structures in a scalable manner. In this work we use APE-Gen to model over 150,000 pHLA structures, the largest dataset of its kind, which were used to train a structure-based pan-allele model. We extract simple, homogenous features based on residue-residue distances between peptide and HLA, and build a random forest model for predicting stable pHLA binding. Our model achieves competitive AUROC values on leave-one-allele-out validation tests using significantly less data when compared to popular sequence-based methods. Additionally, our model offers an interpretation analysis that can reveal how the model composes the features to arrive at any given prediction. This interpretation analysis can be used to check if the model is in line with chemical intuition, and we showcase particular examples. Our work is a significant step toward using structure to achieve generalizable and more interpretable prediction for stable pHLA binding

    Biodesulphurized subbituminous coal by different fungi and bacteria studied by reductive pyrolysis. Part 1: Initial coal

    Get PDF
    One of the perspective methods for clean solid fuels production is biodesulphurization. In order to increase the effect of this approach it is necessary to apply the advantages of more informative analytical techniques. Atmospheric pressure temperature programming reduction (AP-TPR) coupled with different detection systems gave us ground to attain more satisfactory explanation of the effects of biodesulphurization on the treated solid products. Subbituminous high sulphur coal from ‘‘Pirin” basin (Bulgaria) was selected as a high sulphur containing sample. Different types of microorganisms were chosen and maximal desulphurization of 26% was registered. Biodesulphurization treatments were performed with three types of fungi: ‘‘Trametes Versicolor” – ATCC No. 200801, ‘‘Phanerochaeta Chrysosporium” – ME446, Pleurotus Sajor-Caju and one Mixed Culture of bacteria – ATCC No. 39327. A high degree of inorganic sulphur removal (79%) with Mixed Culture of bacteria and consecutive reduction by 13% for organic sulphur (Sorg) decrease with ‘‘Phanerochaeta Chrysosporium” and ‘‘Trametes Versicolor” were achieved. To follow the Sorg changes a set of different detection systems i.e. AP-TPR coupled ‘‘on-line” with mass spectrometry (AP-TPR/MS), on-line with potentiometry (AP-TPR/pot) and by the ‘‘off-line” AP-TPR/GC/MS analysis was used. The need of applying different atmospheres in pyrolysis experiments was proved and their effects were discussed. In order to reach more precise total sulphur balance, oxygen bomb combustion followed by ion chromatography was used

    3pHLA-score improves structure-based peptide-HLA binding affinity prediction

    Get PDF
    Binding of peptides to Human Leukocyte Antigen (HLA) receptors is a prerequisite for triggering immune response. Estimating peptide-HLA (pHLA) binding is crucial for peptide vaccine target identification and epitope discovery pipelines. Computational methods for binding affinity prediction can accelerate these pipelines. Currently, most of those computational methods rely exclusively on sequence-based data, which leads to inherent limitations. Recent studies have shown that structure-based data can address some of these limitations. In this work we propose a novel machine learning (ML) structure-based protocol to predict binding affinity of peptides to HLA receptors. For that, we engineer the input features for ML models by decoupling energy contributions at different residue positions in peptides, which leads to our novel per-peptide-position protocol. Using Rosetta’s ref2015 scoring function as a baseline we use this protocol to develop 3pHLA-score. Our per-peptide-position protocol outperforms the standard training protocol and leads to an increase from 0.82 to 0.99 of the area under the precision-recall curve. 3pHLA-score outperforms widely used scoring functions (AutoDock4, Vina, Dope, Vinardo, FoldX, GradDock) in a structural virtual screening task. Overall, this work brings structure-based methods one step closer to epitope discovery pipelines and could help advance the development of cancer and viral vaccines

    Revealing unknown protein structures using computational conformational sampling guided by experimental hydrogen-exchange data

    Get PDF
    Both experimental and computational methods are available to gather information about a protein’s conformational space and interpret changes in protein structure. However, experimentally observing and computationally modeling large proteins remain critical challenges for structural biology. Our work aims at addressing these challenges by combining computational and experimental techniques relying on each other to overcome their respective limitations. Indeed, despite its advantages, an experimental technique such as hydrogen-exchange monitoring cannot produce structural models because of its low resolution. Additionally, the computational methods that can generate such models suffer from the curse of dimensionality when applied to large proteins. Adopting a common solution to this issue, we have recently proposed a framework in which our computational method for protein conformational sampling is biased by experimental hydrogen-exchange data. In this paper, we present our latest application of this computational framework: generating an atomic-resolution structural model for an unknown protein state. For that, starting from an available protein structure, we explore the conformational space of this protein, using hydrogen-exchange data on this unknown state as a guide. We have successfully used our computational framework to generate models for three proteins of increasing size, the biggest one undergoing large-scale conformational changes

    PepSim: T-Cell Cross-Reactivity Prediction via Comparison of Peptide Sequence and Peptide-Hla Structure

    Get PDF
    INTRODUCTION: Peptide-HLA class I (pHLA) complexes on the surface of tumor cells can be targeted by cytotoxic T-cells to eliminate tumors, and this is one of the bases for T-cell-based immunotherapies. However, there exist cases where therapeutic T-cells directed towards tumor pHLA complexes may also recognize pHLAs from healthy normal cells. The process where the same T-cell clone recognizes more than one pHLA is referred to as T-cell cross-reactivity and this process is driven mainly by features that make pHLAs similar to each other. T-cell cross-reactivity prediction is critical for designing T-cell-based cancer immunotherapies that are both effective and safe. METHODS: Here we present PepSim, a novel score to predict T-cell cross-reactivity based on the structural and biochemical similarity of pHLAs. RESULTS AND DISCUSSION: We show our method can accurately separate cross-reactive from non-crossreactive pHLAs in a diverse set of datasets including cancer, viral, and self-peptides. PepSim can be generalized to work on any dataset of class I peptide-HLAs and is freely available as a web server at pepsim.kavrakilab.org

    EnGens:a computational framework for generation and analysis of representative protein conformational ensembles

    Get PDF
    Proteins are dynamic macromolecules that perform vital functions in cells. A protein structure determines its function, but this structure is not static, as proteins change their conformation to achieve various functions. Understanding the conformational landscapes of proteins is essential to understand their mechanism of action. Sets of carefully chosen conformations can summarize such complex landscapes and provide better insights into protein function than single conformations. We refer to these sets as representative conformational ensembles. Recent advances in computational methods have led to an increase in the number of available structural datasets spanning conformational landscapes. However, extracting representative conformational ensembles from such datasets is not an easy task and many methods have been developed to tackle it. Our new approach, EnGens (short for ensemble generation), collects these methods into a unified framework for generating and analyzing representative protein conformational ensembles. In this work, we: (1) provide an overview of existing methods and tools for representative protein structural ensemble generation and analysis; (2) unify existing approaches in an open-source Python package, and a portable Docker image, providing interactive visualizations within a Jupyter Notebook pipeline; (3) test our pipeline on a few canonical examples from the literature. Representative ensembles produced by EnGens can be used for many downstream tasks such as protein–ligand ensemble docking, Markov state modeling of protein dynamics and analysis of the effect of single-point mutations.</p

    EnGens: a computational framework for generation and analysis of representative protein conformational ensembles

    Get PDF
    Proteins are dynamic macromolecules that perform vital functions in cells. A protein structure determines its function, but this structure is not static, as proteins change their conformation to achieve various functions. Understanding the conformational landscapes of proteins is essential to understand their mechanism of action. Sets of carefully chosen conformations can summarize such complex landscapes and provide better insights into protein function than single conformations. We refer to these sets as representative conformational ensembles. Recent advances in computational methods have led to an increase in the number of available structural datasets spanning conformational landscapes. However, extracting representative conformational ensembles from such datasets is not an easy task and many methods have been developed to tackle it. Our new approach, EnGens (short for ensemble generation), collects these methods into a unified framework for generating and analyzing representative protein conformational ensembles. In this work, we: (1) provide an overview of existing methods and tools for representative protein structural ensemble generation and analysis; (2) unify existing approaches in an open-source Python package, and a portable Docker image, providing interactive visualizations within a Jupyter Notebook pipeline; (3) test our pipeline on a few canonical examples from the literature. Representative ensembles produced by EnGens can be used for many downstream tasks such as protein–ligand ensemble docking, Markov state modeling of protein dynamics and analysis of the effect of single-point mutations

    Interpreting T-Cell Cross-reactivity through Structure: Implications for TCR-Based Cancer Immunotherapy

    Get PDF
    Immunotherapy has become one of the most promising avenues for cancer treatment, making use of the patient\u27s own immune system to eliminate cancer cells. Clinical trials with T-cell-based immunotherapies have shown dramatic tumor regressions, being effective in multiple cancer types and for many different patients. Unfortunately, this progress was tempered by reports of serious (even fatal) side effects. Such therapies rely on the use of cytotoxic T-cell lymphocytes, an essential part of the adaptive immune system. Cytotoxic T-cells are regularly involved in surveillance and are capable of both eliminating diseased cells and generating protective immunological memory. The specificity of a given T-cell is determined through the structural interaction between the T-cell receptor (TCR) and a peptide-loaded major histocompatibility complex (MHC); i.e., an intracellular peptide-ligand displayed at the cell surface by an MHC molecule. However, a given TCR can recognize different peptide-MHC (pMHC) complexes, which can sometimes trigger an unwanted response that is referred to as T-cell cross-reactivity. This has become a major safety issue in TCR-based immunotherapies, following reports of melanoma-specific T-cells causing cytotoxic damage to healthy tissues (e.g., heart and nervous system). T-cell cross-reactivity has been extensively studied in the context of viral immunology and tissue transplantation. Growing evidence suggests that it is largely driven by structural similarities of seemingly unrelated pMHC complexes. Here, we review recent reports about the existence of pMHC hot-spots for cross-reactivity and propose the existence of a TCR interaction profile (i.e., a refinement of a more general TCR footprint in which some amino acid residues are more important than others in triggering T-cell cross-reactivity). We also make use of available structural data and pMHC models to interpret previously reported cross-reactivity patterns among virus-derived peptides. Our study provides further evidence that structural analyses of pMHC complexes can be used to assess the intrinsic likelihood of cross-reactivity among peptide-targets. Furthermore, we hypothesize that some apparent inconsistencies in reported cross-reactivities, such as a preferential directionality, might also be driven by particular structural features of the targeted pMHC complex. Finally, we explain why TCR-based immunotherapy provides a special context in which meaningful T-cell cross-reactivity predictions can be made

    DINC-COVID : a webserver for ensemble docking with flexible SARS-CoV-2 proteins

    Get PDF
    An unprecedented research effort has been undertaken in response to the ongoing COVID-19 pandemic. This has included the determination of hundreds of crystallographic structures of SARS-CoV-2 proteins, and numerous virtual screening projects searching large compound libraries for potential drug inhibitors. Unfortunately, these initiatives have had very limited success in producing effective inhibitors against SARS-CoV-2 proteins. A reason might be an often overlooked factor in these computational efforts: receptor flexibility. To address this issue we have implemented a computational tool for ensemble docking with SARS-CoV-2 proteins. We have extracted representative ensembles of protein conformations from the Protein Data Bank and from in silico molecular dynamics simulations. Twelve pre-computed ensembles of SARS-CoV-2 protein conformations have now been made available for ensemble docking via a user-friendly webserver called DINC-COVID (dinc-covid.kavrakilab.org). We have validated DINC-COVID using data on tested inhibitors of two SARS-CoV-2 proteins, obtaining good correlations between docking-derived binding energies and experimentally-determined binding affinities. Some of the best results have been obtained on a dataset of large ligands resolved via room temperature crystallography, and therefore capturing alternative receptor conformations. In addition, we have shown that the ensembles available in DINC-COVID capture different ranges of receptor flexibility, and that this diversity is useful in finding alternative binding modes of ligands. Overall, our work highlights the importance of accounting for receptor flexibility in docking studies, and provides a platform for the identification of new inhibitors against SARS-CoV-2 proteins
    • 

    corecore