46 research outputs found

    Enhancing protein interaction prediction using deep learning and protein language models

    Proteins are large macromolecules that play critical roles in many cellular activities in living organisms. These include catalyzing metabolic reactions, mediating signal transduction, replicating DNA, responding to stimuli, and transporting molecules, to name a few. Proteins perform their functions by interacting with other proteins and molecules. As a result, determining the nature of such interactions is critically important in many areas of biology and medicine. The primary structure of a protein refers to its specific sequence of amino acids, while the tertiary structure refers to its unique 3D shape, and the quaternary structure refers to the interaction of multiple protein subunits to form a larger, more complex structure. While the number of experimentally determined tertiary and quaternary structures is limited, databases of protein sequences continue to grow at an unprecedented rate, providing a wealth of information for training and improving sequence-based models. Recent developments in sequence-based models using machine learning and deep learning have shown significant progress toward solving protein-related problems. Specifically, attention-based transformer models, a recent breakthrough in Natural Language Processing (NLP), have shown that large models trained on unlabeled data can learn powerful representations of protein sequences and can lead to significant improvements in understanding protein folding, function, and interactions, as well as in drug discovery and protein engineering. The research in this thesis pursues two objectives using sequence-based modeling. The first is to use deep learning techniques based on NLP to address an important problem in cellular immune system studies, namely, predicting Major Histocompatibility Complex (MHC)-peptide binding. The second is to improve the performance of the ClusPro docking server, a well-known protein-protein docking tool, in three ways: (i) integrating ClusPro with AlphaFold2, a well-known accurate protein structure predictor, for enhanced protein model docking, (ii) predicting distance maps to improve docking accuracy, and (iii) using regression techniques to rank protein clusters for better results.

    Methods of Liver Stem Cell Therapy in Rodents as Models of Human Liver Regeneration in Hepatic Failure

    Cell therapy is a promising intervention for treating liver diseases and liver failure. Different animal models of human liver cell therapy have been developed in recent years. Rats and mice are the most commonly used liver failure models. In fact, rodent models of hepatic failure have shown significant improvement in liver function after cell infusion. With the advent of stem-cell technologies, it is now possible to re-programme adult somatic cells such as skin or hair-follicle cells from individual patients to stem-like cells and differentiate them into liver cells. Such regenerative stem cells are highly promising in the personalization of cell therapy. The present review article will summarize current approaches to liver stem cell therapy with rodent models. In addition, we discuss common cell tracking techniques and how tracking data help to direct liver cell therapy research in animal models of hepatic failure

    Impact of Initial Population Density of the Dubas Bug, Ommatissus lybicus (Hemiptera: Tropiduchidae), on Oviposition Behaviour, Chlorophyll, Biomass and Nutritional Response of Date Palm (Phoenix dactylifera)

    The Dubas bug (Ommatissus lybicus) is an economically significant pest of date palms. This study investigated the effect of the population density of O. lybicus on chlorophyll content, measured with a soil plant analysis development (SPAD) chlorophyll meter, on palm biomass, and on the nutritional composition of date palms. A further objective was to determine significant relationships between the population density of O. lybicus, the number of honeydew droplets, and the number of oviposited eggs. Infestations exceeding 300 nymphs per palm seedling reduced chlorophyll content by up to 8–11% and plant biomass by up to 29–34%. Increasing the population density of O. lybicus to 600 insects per palm decreased oviposition by females, suggesting intraspecific competition for resources. There was a significant relationship between honeydew droplets produced by the pest population and chlorophyll content in the rachis, suggesting that treatment can be triggered at 3–6 nymphs/leaflet. Eggs were preferentially oviposited on the rachis. Ca, Mg, K, and P were the main nutrients affected by the activity of the pest. Mg content was associated with reduced chlorophyll content under increasing pest density, suggesting that supplemental nutrition could be used to sustain chlorophyll and increase palm tolerance to pest infestation.

    Tris(4,4′-bi-1,3-thiazole-κ²N,N′)iron(II) tetrabromidoferrate(III) bromide

    In the [Fe(4,4′-bit)₃]²⁺ (4,4′-bit is 4,4′-bi-1,3-thiazole) cation of the title compound, [Fe(C₆H₄N₂S₂)₃][FeBr₄]Br, the Fe(II) atom (3 symmetry) is six-coordinated in a distorted octahedral geometry by six N atoms from three 4,4′-bit ligands. In the [FeBr₄]⁻ anion, the Fe(III) atom (3 symmetry) is four-coordinated in a distorted tetrahedral geometry. In the crystal, intermolecular C—H⋯Br hydrogen bonds and Br⋯π interactions [Br⋯centroid distances = 3.562 (3) and 3.765 (2) Å] link the cations and anions, stabilizing the structure.

    Association between heavy metals and colon cancer: an ecological study based on geographical information systems in North-Eastern Iran

    Background: Colorectal cancer has increased in Middle Eastern countries, and exposure to environmental pollutants such as heavy metals has been implicated. However, data linking them to this disease are generally lacking. This study aimed to explore the spatial pattern of the age-standardized incidence rate (ASR) of colon cancer and its potential association with the level of heavy metals in rice produced in north-eastern Iran. Methods: Cancer data were drawn from the Iranian population-based cancer registry of Golestan Province, north-eastern Iran. Samples from 69 rice milling factories were analysed for the concentrations of cadmium, nickel, cobalt, copper, selenium, lead and zinc. The inverse distance weighting (IDW) algorithm was used to interpolate the concentrations of these heavy metals across the surface of the study area. Exploratory regression analysis was conducted to build ordinary least squares (OLS) models including every possible combination of the candidate explanatory variables and to choose the most useful ones to show the association between heavy metals and the ASR of colon cancer. Results: The highest concentrations of heavy metals were found in the central part of the province, and counties with higher amounts of cobalt in particular were shown to be associated with a higher ASR of colon cancer in men. In contrast, selenium concentrations were higher in areas with a lower ASR of colon cancer in men. A significant regression equation for colon cancer in men was found (F(4,137) = 38.304, P < .000) with an adjusted R² of 0.77. The predicted ASR of colon cancer in men was −58.36, with coefficients of 120.33 for cobalt, 80.60 for cadmium, −6.07 for selenium, −3.09 for nickel, and −0.41 for zinc. The association of copper and lead with colon cancer in men was not significant. We did not find a significant outcome for colon cancer in women. Conclusion: Increased amounts of heavy metals in consumed rice may impact colon cancer incidence, both positively and negatively. While there were indications of an association between high cobalt concentrations and an increased risk of colon cancer, we found that high selenium concentrations might instead decrease the risk. Further investigations are needed to clarify whether there are ecological or other reasons for these discrepancies. Regular monitoring of the amount of heavy metals in consumed rice is recommended. This study was supported by Golestan University of Medical Sciences (grant number of 90–10–1-30209).

    Spatial-time analysis of cardiovascular emergency medical requests: enlightening policy and practice

    Background: Response time to cardiovascular emergency medical requests is an important indicator in reducing cardiovascular disease (CVD)-related mortality. This study aimed to visualize the spatial-time distribution of response time, scene time, and call-to-hospital time of these emergency requests. We also identified patterns of clusters of CVD-related calls. Methods: This cross-sectional study was conducted in Mashhad, north-eastern Iran, between August 2017 and December 2019. The response time to every CVD-related emergency medical request call was computed using spatial and classical statistical analyses. Anselin Local Moran's I analysis was performed to identify potential clusters in the patterns of CVD-related calls, response time, call-to-hospital arrival time, and scene-to-hospital arrival time at the small-area (neighborhood) level in Mashhad, Iran. Results: There were 84,239 CVD-related emergency request calls, 61.64% of which resulted in the transport of patients to clinical centers by EMS, while 2.62% of callers (a total of 2218 persons) died before EMS arrival. The number of CVD-related emergency calls increased by almost 7% between 2017 and 2018, and by 19% between 2017 and 2019. The peak time for calls was between 9 p.m. and 1 a.m., and the lowest number of calls was recorded between 3 a.m. and 9 a.m. Saturday was the busiest day of the week in terms of call volume. There were statistically significant clusters in the pattern of CVD-related calls in the south-eastern region of Mashhad. Further, we found a large spatial variation in scene-to-hospital arrival time and call-to-hospital arrival time in the area under study. Conclusion: The use of geographical information systems and spatial analyses in modelling and quantifying EMS response time provides a new vein of knowledge for decision makers in emergency services management. Spatial as well as temporal clustering of EMS calls was present in the study area. The reasons for the clustering of unfavorable time indices for EMS response require further exploration. This approach enables policymakers to design tailored interventions to improve response time and reduce CVD-related mortality. This study was financially sponsored by Mashhad University of Medical Sciences (Project grant: 980861).

    Improved prediction of MHC-peptide binding using protein language models

    Major histocompatibility complex Class I (MHC-I) molecules bind to peptides derived from intracellular antigens and present them on the surface of cells, allowing the immune system (T cells) to detect them. Elucidating the process of this presentation is essential for regulation and potential manipulation of the cellular immune system. Predicting whether a given peptide binds to an MHC molecule is an important step in this process and has motivated the introduction of many computational approaches to the problem. NetMHCpan, a pan-specific model for predicting binding of peptides to any MHC molecule, is one of the most widely used methods; it addresses this binary classification problem using shallow neural networks. The recent success of Deep Learning (DL) methods, especially Natural Language Processing (NLP)-based pretrained models, in various applications, including protein structure determination, motivated us to explore their use in this problem. Specifically, we consider the application of deep learning models pretrained on large datasets of protein sequences to predict MHC Class I-peptide binding. Using the standard performance metrics in this area, and the same training and test sets, we show that our models outperform NetMHCpan4.1, currently considered the state of the art.

    Prediction of protein assemblies, the next frontier: The CASP14-CAPRI experiment

    We present the results for CAPRI Round 50, the fourth joint CASP-CAPRI protein assembly prediction challenge. The Round comprised a total of twelve targets, including six dimers, three trimers, and three higher-order oligomers. Four of these were easy targets, for which good structural templates were available either for the full assembly, or for the main interfaces (of the higher-order oligomers). Eight were difficult targets for which only distantly related templates were found for the individual subunits. Twenty-five CAPRI groups including eight automatic servers submitted ~1250 models per target. Twenty groups including six servers participated in the CAPRI scoring challenge and submitted ~190 models per target. The accuracy of the predicted models was evaluated using the classical CAPRI criteria. The prediction performance was measured by a weighted scoring scheme that takes into account the number of models of acceptable quality or higher submitted by each group as part of their five top-ranking models. Compared to the previous CASP-CAPRI challenge, top performing groups submitted such models for a larger fraction (70–75%) of the targets in this Round, but fewer of these models were of high accuracy. Scorer groups achieved stronger performance, with more groups submitting correct models for 70–80% of the targets or achieving high accuracy predictions. Servers performed less well in general, except for the MDOCKPP and LZERD servers, which performed on par with human groups. In addition to these results, major advances in methodology are discussed, providing an informative overview of where the prediction of protein assemblies currently stands. Cancer Research UK, Grant/Award Number: FC001003; Changzhou Science and Technology Bureau, Grant/Award Number: CE20200503; Department of Energy and Climate Change, Grant/Award Numbers: DE-AR001213, DE-SC0020400, DE-SC0021303; H2020 European Institute of Innovation and Technology, Grant/Award Numbers: 675728, 777536, 823830; Institut national de recherche en informatique et en automatique (INRIA), Grant/Award Number: Cordi-S; Lietuvos Mokslo Taryba, Grant/Award Numbers: S-MIP-17-60, S-MIP-21-35; Medical Research Council, Grant/Award Number: FC001003; Japan Society for the Promotion of Science KAKENHI, Grant/Award Number: JP19J00950; Ministerio de Ciencia e Innovación, Grant/Award Number: PID2019-110167RB-I00; Narodowe Centrum Nauki, Grant/Award Numbers: UMO-2017/25/B/ST4/01026, UMO-2017/26/M/ST4/00044, UMO-2017/27/B/ST4/00926; National Institute of General Medical Sciences, Grant/Award Numbers: R21GM127952, R35GM118078, RM1135136, T32GM132024; National Institutes of Health, Grant/Award Numbers: R01GM074255, R01GM078221, R01GM093123, R01GM109980, R01GM133840, R01GN123055, R01HL142301, R35GM124952, R35GM136409; National Natural Science Foundation of China, Grant/Award Number: 81603152; National Science Foundation, Grant/Award Numbers: AF1645512, CCF1943008, CMMI1825941, DBI1759277, DBI1759934, DBI1917263, DBI20036350, IIS1763246, MCB1925643; NWO, Grant/Award Number: TOP-PUNT 718.015.001; Wellcome Trust, Grant/Award Number: FC00100

    An investigation of the use of the sustainable drainage for groundwater recharge: a case study

    There is a considerable amount of evidence to suggest that the use of groundwater for water supplies in urban areas and developing cities is increasing as surface water supplies become polluted or fully exploited. Other factors include the increase in water demand occurring in response to population growth, with increasing use per capita. However, groundwater resources are only sustainable if the aquifers that provide them can be recharged, and in many localities natural recharge is thought to be potentially adversely affected by urbanisation, which covers large areas with impermeable surfaces such as roads and buildings that divert much-needed water into surface water courses or artificial drainage. The aim of this research is therefore to investigate the use of sustainable drainage for groundwater recharge, using a case study of the area around Leighton Buzzard, in Bedfordshire, England. The detailed objectives of the research on which this thesis is based include conducting comprehensive reviews of the geology, aquifers, groundwater pollution and statutory policies that relate to the study area. Within these generic studies, particular emphasis has been given to soil properties, infiltration design structure and the impact of urbanisation on groundwater recharge. Sustainability of groundwater resources has been considered a primary objective for the authorities, and groundwater sustainability and protection go in parallel with conservation. Recharge of groundwater resources through the Sustainable urban drainage system (SuDS) therefore achieves the sustainability objective of the environment, and SuDS is the most suitable and effective approach to recharge groundwater resources, to minimise environmental risk and to deliver future environmental benefits.