21 research outputs found

    Crystal structure of thebaine 6-O-demethylase from the morphine biosynthesis pathway

    Thebaine 6-O-demethylase (T6ODM) from Papaver somniferum (opium poppy) is a key enzyme in the morphine biosynthesis pathway that belongs to the non-heme 2-oxoglutarate/Fe(II)-dependent dioxygenases (ODD) family. Initially, T6ODM was characterized as an enzyme catalyzing O-demethylation of thebaine to neopinone and oripavine to morphinone; however, the substrate range of T6ODM was recently expanded to a number of benzylisoquinoline alkaloids. Here, we present crystal structures of T6ODM in complexes with 2-oxoglutarate (T6ODM:2OG, PDB: 5O9W) and succinate (T6ODM:SIN, PDB: 5O7Y). The arrangement of T6ODM's active site is typical for proteins from the ODD family, but the enzyme is characterized by a large substrate binding cavity, whose volume can partially explain T6ODM's promiscuity. Moreover, the size of the cavity allows for binding of multiple molecules at once, posing a question about substrate-driven specificity of the enzyme.

    The United States COVID-19 Forecast Hub dataset

    Academic researchers, government agencies, industry groups, and individuals have produced forecasts at an unprecedented scale during the COVID-19 pandemic. To leverage these forecasts, the United States Centers for Disease Control and Prevention (CDC) partnered with an academic research lab at the University of Massachusetts Amherst to create the US COVID-19 Forecast Hub. Launched in April 2020, the Forecast Hub is a dataset with point and probabilistic forecasts of incident cases, incident hospitalizations, incident deaths, and cumulative deaths due to COVID-19 at county, state, and national levels in the United States. Included forecasts represent a variety of modeling approaches, data sources, and assumptions regarding the spread of COVID-19. The goal of this dataset is to establish a standardized and comparable set of short-term forecasts from modeling teams. These data can be used to develop ensemble models, communicate forecasts to the public, create visualizations, compare models, and inform policies regarding COVID-19 mitigation. These open-source data are available via download from GitHub, through an online API, and through R packages.

    Projected resurgence of COVID-19 in the United States in July–December 2021 resulting from the increased transmissibility of the Delta variant and faltering vaccination

    In Spring 2021, the highly transmissible SARS-CoV-2 Delta variant began to cause increases in cases, hospitalizations, and deaths in parts of the United States. At the time, with slowed vaccination uptake, this novel variant was expected to increase the risk of pandemic resurgence in the US in summer and fall 2021. As part of the COVID-19 Scenario Modeling Hub, an ensemble of nine mechanistic models produced 6-month scenario projections for July–December 2021 for the United States. These projections estimated substantial resurgences of COVID-19 across the US resulting from the more transmissible Delta variant, projected to occur across most of the US, coinciding with school and business reopening. The scenarios revealed that reaching higher vaccine coverage in July–December 2021 reduced the size and duration of the projected resurgence substantially, with the expected impacts largely concentrated in a subset of states with lower vaccination coverage. Despite accurate projection of the occurrence and timing of COVID-19 surges, the magnitude was substantially underestimated by the models compared with the reported cases, hospitalizations, and deaths occurring during July–December 2021, highlighting the continued challenges of predicting the evolving COVID-19 pandemic. Vaccination uptake remains critical to limiting transmission and disease, particularly in states with lower vaccination coverage. Higher vaccination goals at the onset of the surge of the new variant were estimated to avert over 1.5 million cases and 21,000 deaths, although they may have had even greater impacts, considering the underestimated resurgence magnitude from the model.

    CheckMyBlob evaluation data set (TAMC)

    A data set of ligands used to evaluate the CheckMyBlob method, described in the Kowiel et al. paper "Automatic recognition of ligands in electron density by machine learning methods". This data set attempts to repeat the experimental setup from Terwilliger et al. described in "Ligand identification using electron-density map correlations". It consists of ligands from X-ray diffraction experiments with 6–150 non-H atoms. Connected PDB ligands were labeled as single alphabetically ordered strings of hetero-compound codes, whereas unknown species, water molecules, standard amino acids, and nucleotides were excluded. Finally, the data set was limited to the 200 most popular ligands. The resulting data set consisted of 161,758 examples with individual ligand counts ranging from 36,535 examples for GOL (glycerol) to 114 for LMG (1,2-distearoyl-monogalactosyl-diglyceride). For machine learning (classification) purposes, the target attribute is: res_name

    CheckMyBlob evaluation data set (CL)

    A data set of ligands used to evaluate the CheckMyBlob method, described in the Kowiel et al. paper "Automatic recognition of ligands in electron density by machine learning methods". This data set repeats the setup used in the study of Carolan & Lamzin titled "Automated identification of crystallographic ligands using sparse-density representations". It consists of ligands from X-ray diffraction experiments with 1.0–2.5 Å resolution. Adjacent PDB ligands were not connected. Ligands were labeled according to the PDB naming convention. The data set was limited to the 82 ligand types listed by Carolan & Lamzin. The resulting data set consists of 121,360 examples with ligand counts ranging from 42,622 examples for SO4 to 16 for SPO (spheroidene). For machine learning (classification) purposes, the target attribute is: res_name

    Phase-Informed Bayesian Ensemble Models Improve Performance of COVID-19 Forecasts

    Despite hundreds of methods published in the literature, forecasting epidemic dynamics remains challenging yet important. The challenges stem from multiple sources, including the need for timely data, co-evolution of epidemic dynamics with behavioral and immunological adaptations, and the evolution of new pathogen strains. The ongoing COVID-19 pandemic highlighted these challenges; in an important article, Reich et al. presented a comprehensive analysis of many of them. In this paper, we take another step in critically evaluating existing epidemic forecasting methods. Our methods are based on a simple yet crucial observation: epidemic dynamics go through a number of phases (waves). Armed with this understanding, we propose a modification to our deployed Bayesian ensembling case time series forecasting framework. We show that ensembling methods employing the phase information and using different weighting schemes for each phase can produce improved forecasts. We evaluate our proposed method against both the currently deployed model and the COVID-19 Forecast Hub models. The overall performance of the proposed model is consistent across the pandemic but, more importantly, it is ranked third and first during two critical rapid growth phases in cases, regimes where the performance of most models from the CDC forecasting hub dropped significantly.
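
The idea of using a different weighting scheme for each epidemic phase can be illustrated with a minimal sketch. This is not the authors' Bayesian ensembling framework: a plain weighted average is swapped in for illustration, and the phase names, model names, and the function `phase_weighted_ensemble` are all hypothetical.

```python
from typing import Dict


def phase_weighted_ensemble(
    forecasts: Dict[str, float],
    weights_by_phase: Dict[str, Dict[str, float]],
    phase: str,
) -> float:
    """Combine point forecasts using the weights assigned to the current phase.

    Assumption: a normalized weighted average stands in for the Bayesian
    ensembling described in the abstract.
    """
    weights = weights_by_phase[phase]
    total = sum(weights[model] for model in forecasts)
    if total <= 0:
        raise ValueError("weights for the selected phase must sum to a positive value")
    return sum(weights[model] * value for model, value in forecasts.items()) / total


# Hypothetical point forecasts of weekly cases from two models.
forecasts = {"model_a": 100.0, "model_b": 200.0}

# Hypothetical weights: model_b is trusted more during growth, model_a during decline.
weights_by_phase = {
    "growth": {"model_a": 1.0, "model_b": 3.0},
    "decline": {"model_a": 3.0, "model_b": 1.0},
}

growth_forecast = phase_weighted_ensemble(forecasts, weights_by_phase, "growth")
decline_forecast = phase_weighted_ensemble(forecasts, weights_by_phase, "decline")
```

With these illustrative numbers the ensemble gives 175.0 in the growth phase and 125.0 in the decline phase, showing how the same member forecasts yield different combined forecasts once the phase is known.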

    CheckMyBlob ligand data set (CMB)

    Ligand data set prepared for the CheckMyBlob study, described in "Automatic recognition of ligands in electron density by machine learning methods" by Kowiel, M. et al. It contains only structures from X-ray diffraction experiments determined to at least 4.0 Å resolution. Entries with R factor above 0.3 or ligands below 0.3 occupancy (according to wwPDB validation reports) were rejected. Only ligands with at least 2 non-H atoms were considered and structures with low ligand map correlation coefficients (RSCC < 0.6, RSZO <= 1, RSZD > 6.0) were removed. Apart from taking into account quality factors, we removed from the experimental data set all moieties that are not considered proper ligands. These included: unknown species, water molecules, standard amino acids, and selected nucleotides. Moreover, connected ligands (as per the naming convention in the PDB) were labeled as alphabetically ordered strings of hetero-compound codes (e.g., NAG-NAG-NAG-NAG). Finally, the data set was limited to the 200 most popular ligands. The resulting data set consisted of 219,986 examples with individual ligand counts ranging from 48,490 examples for SO4 (sulfate ion) to 106 for A2G (n-acetyl-2-deoxy-2-amino-galactose). More details concerning data selection can be found in the paper of Kowiel et al. For machine learning (classification) purposes, the target attribute is: res_name.
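
The selection criteria in the CMB entry above can be sketched as a simple filter. This is an illustrative sketch only, not the CheckMyBlob code: the `LigandRecord` type, its field names, and the helper functions are hypothetical, and it assumes that failing any one of the three map-quality thresholds is enough to remove an entry, which the abstract does not state explicitly.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class LigandRecord:
    """Hypothetical record; field names are illustrative, not from the data set."""
    codes: List[str]      # hetero-compound codes of the (possibly connected) ligand
    resolution: float     # in angstroms
    r_factor: float
    occupancy: float
    non_h_atoms: int
    rscc: float
    rszo: float
    rszd: float


def passes_quality_filters(rec: LigandRecord) -> bool:
    """Apply the thresholds quoted in the abstract."""
    if rec.resolution > 4.0:          # keep structures determined to at least 4.0 A
        return False
    if rec.r_factor > 0.3:            # reject entries with R factor above 0.3
        return False
    if rec.occupancy < 0.3:           # reject ligands below 0.3 occupancy
        return False
    if rec.non_h_atoms < 2:           # require at least 2 non-H atoms
        return False
    # Assumption: any single failing map-quality criterion removes the entry.
    if rec.rscc < 0.6 or rec.rszo <= 1 or rec.rszd > 6.0:
        return False
    return True


def ligand_label(codes: Iterable[str]) -> str:
    """Label a connected ligand as an alphabetically ordered, dash-joined string."""
    return "-".join(sorted(codes))


def most_popular_labels(labels: Iterable[str], n: int = 200) -> Set[str]:
    """Return the n most frequent labels (the abstract keeps the top 200)."""
    return {label for label, _ in Counter(labels).most_common(n)}
```

For example, `ligand_label(["NAG", "BMA", "NAG"])` gives `"BMA-NAG-NAG"`, matching the dash-joined style of the `NAG-NAG-NAG-NAG` example in the abstract.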

    Structural and immunologic characterization of bovine, horse, and rabbit serum albumins

    Serum albumin (SA) is the most abundant plasma protein in mammals. SA is a multifunctional protein with extraordinary ligand binding capacity, making it a transporter molecule for a diverse range of metabolites, drugs, nutrients, metals and other molecules. Due to their ligand binding properties, albumins have wide clinical, pharmaceutical, and biochemical applications. Albumins are also allergenic, and exhibit a high degree of cross-reactivity due to significant sequence and structure similarity of SAs from different organisms. Here we present crystal structures of albumins from cattle (BSA), horse (ESA) and rabbit (RSA) sera. The structural data are correlated with the results of immunological studies of SAs. We also analyze the conservation or divergence of structures and sequences of SAs in the context of their potential allergenicity and cross-reactivity. In addition, we identified a previously uncharacterized ligand binding site in the structure of RSA, and calcium binding sites in the structure of BSA, which is the first serum albumin structure to contain metal ions.