48 research outputs found
Taxonomy and the Production of Semantic Phenotypes
Preprint of chapter appearing in "Studies on the Semantic Web: Volume 33: Application of Semantic Technology in Biodiversity Science".
Taxonomists produce a myriad of phenotypic descriptions. Traditionally these are provided in terse (telegraphic) natural language. As in other fields of biology, researchers are exploring ways to formalize parts of the taxonomic process so that aspects of it become more computational in nature. We review the data formalizations, mechanisms for persisting data, applications, and computing approaches currently used in the production of semantic descriptions (phenotypes); these, and their adopters, are limited in number. In order to move forward we step back and characterize taxonomists with respect to their typical workflow and tendencies. We then use these characteristics as a basis for exploring how we might create software that taxonomists will find intuitive within their current workflows, providing interface examples as thought experiments.
NSF - DBI-1356381
NSF 0956049
https://deepblue.lib.umich.edu/bitstream/2027.42/148811/1/yoder_proof.pdf
Description of yoder_proof.pdf : Proof of book chapte
A Functional Screen Provides Evidence for a Conserved, Regulatory, Juxtamembrane Phosphorylation Site in Guanylyl Cyclase A and B
Kinase homology domain (KHD) phosphorylation is required for activation of guanylyl cyclase (GC)-A and -B. Phosphopeptide mapping identified multiple phosphorylation sites in GC-A and GC-B, but these approaches have difficulty identifying sites in poorly detected peptides. Here, a functional screen was conducted to identify novel sites. Conserved serines or threonines in the KHDs of phosphorylated receptor GCs were mutated to alanine and tested for reduced hormone to detergent activity ratios. Mutation of Ser-489 in GC-B to alanine but not glutamate reduced the activity ratio to 60% of wild type (WT) levels. Similar results were observed with Ser-473, the homologous site in GC-A. Receptors containing glutamates for previously identified phosphorylation sites (GC-A-6E and GC-B-6E) were activated to ∼20% of WT levels but the additional glutamate substitution for S473 or S489 increased activity to near WT levels. Substrate-velocity assays indicated that GC-B-WT-S489E and GC-B-6E-S489E had lower Km values and that WT-GC-B-S489A, GC-B-6E and GC-B-6E-S489A had higher Km values than WT-GC-B. Homologous desensitization was enhanced when GC-A contained the S473E substitution, and GC-B-6E-S489E was resistant to inhibition by a calcium elevating treatment or protein kinase C activation – processes that dephosphorylate GC-B. Mass spectrometric detection of a synthetic phospho-Ser-473 containing peptide was 200–1300-fold less sensitive than other phosphorylated peptides and neither mass spectrometric nor 32PO4 co-migration studies detected phospho-Ser-473 or phospho-Ser-489 in cells. We conclude that Ser-473 and Ser-489 are Km-regulating phosphorylation sites that are difficult to detect using current methods
Deciphering the Code for Retroviral Integration Target Site Selection
Upon cell invasion, retroviruses generate a DNA copy of their RNA genome and integrate retroviral cDNA within host chromosomal DNA. Integration occurs throughout the host cell genome, but target site selection is not random. Each subgroup of retrovirus is distinguished from the others by attraction to particular features on chromosomes. Despite extensive efforts to identify host factors that interact with retrovirion components or chromosome features predictive of integration, little is known about how integration sites are selected. We attempted to identify markers predictive of retroviral integration by exploiting Precision-Recall methods for extracting information from highly skewed datasets to derive robust and discriminating measures of association. ChIPSeq datasets for more than 60 factors were compared with 14 retroviral integration datasets. When compared with MLV, PERV or XMRV integration sites, strong association was observed with STAT1, acetylation of H3 and H4 at several positions, and methylation of H2AZ, H3K4, and K9. By combining peaks from ChIPSeq datasets, a supermarker was identified that localized within 2 kB of 75% of MLV proviruses and detected differences in integration preferences among different cell types. The supermarker predicted the likelihood of integration within specific chromosomal regions in a cell-type specific manner, yielding probabilities for integration into proto-oncogene LMO2 identical to experimentally determined values. The supermarker thus identifies chromosomal features highly favored for retroviral integration, provides clues to the mechanism by which retrovirus integration sites are selected, and offers a tool for predicting cell-type specific proto-oncogene activation by retroviruses
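To illustrate why precision and recall suit highly skewed datasets such as this one, the following is a minimal sketch on invented toy data. The function name, labels, scores, and threshold are all hypothetical and are not taken from the study; the abstract does not describe the exact procedure used.

```python
def precision_recall(labels, scores, threshold):
    """Precision and recall of a marker score at a given threshold.

    labels: 1 for a true integration site, 0 for a control site (toy data).
    scores: hypothetical marker scores, higher meaning stronger association.
    """
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if s >= threshold and y == 1)
    fp = sum(1 for y, s in pairs if s >= threshold and y == 0)
    fn = sum(1 for y, s in pairs if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Invented, skewed example: only 2 positives among 10 sites.
labels = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.1, 0.2, 0.3, 0.7, 0.1, 0.05, 0.4, 0.2]

p, r = precision_recall(labels, scores, threshold=0.5)

# A classifier that always predicts "no integration" finds no true sites,
# yet its plain accuracy still looks high on skewed data.
trivial_accuracy = labels.count(0) / len(labels)
```

On data this skewed, the trivial classifier scores 80% accuracy while recovering nothing, which is the kind of distortion that precision and recall avoid.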
Evaluation of individual and ensemble probabilistic forecasts of COVID-19 mortality in the United States
Short-term probabilistic forecasts of the trajectory of the COVID-19 pandemic in the United States have served as a visible and important communication channel between the scientific modeling community and both the general public and decision-makers. Forecasting models provide specific, quantitative, and evaluable predictions that inform short-term decisions such as healthcare staffing needs, school closures, and allocation of medical supplies. Starting in April 2020, the US COVID-19 Forecast Hub (https://covid19forecasthub.org/) collected, disseminated, and synthesized tens of millions of specific predictions from more than 90 different academic, industry, and independent research groups. A multimodel ensemble forecast that combined predictions from dozens of groups every week provided the most consistently accurate probabilistic forecasts of incident deaths due to COVID-19 at the state and national level from April 2020 through October 2021. The performance of 27 individual models that submitted complete forecasts of COVID-19 deaths consistently throughout this year showed high variability in forecast skill across time, geospatial units, and forecast horizons. Two-thirds of the models evaluated showed better accuracy than a naïve baseline model. Forecast accuracy degraded as models made predictions further into the future, with probabilistic error at a 20-wk horizon three to five times larger than when predicting at a 1-wk horizon. This project underscores the role that collaboration and active coordination between governmental public-health agencies, academic modeling teams, and industry partners can play in developing modern modeling capabilities to support local, state, and federal response to outbreaks
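As a rough illustration of how quantile forecasts from several models might be combined and scored, the sketch below builds a median ensemble and evaluates it with the pinball (quantile) loss. Both choices are assumptions made for illustration: the abstract does not state which combination rule or scoring rule was used, and the model values and observed count are invented.

```python
from statistics import median


def pinball_loss(observed, predicted, level):
    """Pinball (quantile) loss for one predicted quantile at the given level."""
    if observed >= predicted:
        return (observed - predicted) * level
    return (predicted - observed) * (1 - level)


def median_ensemble(forecasts):
    """Combine forecasts quantile by quantile using the median.

    forecasts: list of dicts mapping quantile level -> predicted value.
    All dicts are assumed to share the same quantile levels.
    """
    levels = sorted(forecasts[0])
    return {level: median(f[level] for f in forecasts) for level in levels}


# Hypothetical weekly incident-death forecasts from three models.
models = [
    {0.1: 80, 0.5: 100, 0.9: 130},
    {0.1: 90, 0.5: 120, 0.9: 160},
    {0.1: 60, 0.5: 90, 0.9: 110},
]

ensemble = median_ensemble(models)

observed = 105  # invented observed value
mean_loss = sum(
    pinball_loss(observed, value, level) for level, value in ensemble.items()
) / len(ensemble)
```

Lower average loss indicates a better probabilistic forecast, so the same function could be applied to a baseline model to obtain a relative comparison.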
The United States COVID-19 Forecast Hub dataset
Academic researchers, government agencies, industry groups, and individuals have produced forecasts at an unprecedented scale during the COVID-19 pandemic. To leverage these forecasts, the United States Centers for Disease Control and Prevention (CDC) partnered with an academic research lab at the University of Massachusetts Amherst to create the US COVID-19 Forecast Hub. Launched in April 2020, the Forecast Hub is a dataset with point and probabilistic forecasts of incident cases, incident hospitalizations, incident deaths, and cumulative deaths due to COVID-19 at the county, state, and national levels in the United States. Included forecasts represent a variety of modeling approaches, data sources, and assumptions regarding the spread of COVID-19. The goal of this dataset is to establish a standardized and comparable set of short-term forecasts from modeling teams. These data can be used to develop ensemble models, communicate forecasts to the public, create visualizations, compare models, and inform policies regarding COVID-19 mitigation. These open-source data are available via download from GitHub, through an online API, and through R packages