110 research outputs found
Microbiome preterm birth DREAM challenge: Crowdsourcing machine learning approaches to advance preterm birth research
This research was carried out within the framework of the DREAM Community of Premature Births, of which UDC researchers Diego Fernández-Edreira and Carlos Fernández-Lozano, who have collaborated in the research, are members. Supplementary research data are available at https://www.cell.com/cms/10.1016/j.xcrm.2023.101350/attachment/e44bcada-f500-4f17-bc33-0ee5d39b3c4b/mmc1.pdf. [Abstract]: Every year, 11% of infants are born preterm with significant health consequences, with the vaginal microbiome a risk factor for preterm birth. We crowdsource models to predict (1) preterm birth (PTB; <37 weeks) or (2) early preterm birth (ePTB; <32 weeks) from 9 vaginal microbiome studies representing 3,578 samples from 1,268 pregnant individuals, aggregated from public raw data via phylogenetic harmonization. The predictive models are validated on two independent unpublished datasets representing 331 samples from 148 pregnant individuals. The top-performing models (among 148 and 121 submissions from 318 teams) achieve area under the receiver operator characteristic (AUROC) curve scores of 0.69 and 0.87 predicting PTB and ePTB, respectively. Alpha diversity, VALENCIA community state types, and composition are important features in the top-performing models, most of which are tree-based methods. This work is a model for translation of microbiome data into clinically relevant predictive models and to better understand preterm birth. We thank members of the Sirota Lab, University of California, San Francisco, for useful discussion. This study was supported by the March of Dimes (J.L.G., T.T.O., A.R., A.S.T., V.C., C.W.Y.H., R.J.W., K.J.F., G.A., I.K., J.B., A.N., J.G., Z.W., P.N., A.K., I.B., E.K., S.J., S.N., Y.S.L., P.R.B., D.A.M., S.V.L., J.A., D.K.S., N.Aghaeepour, J.C.C., M.S.) and R35GM138353 (N.Aghaeepour), 1R01HL139844 (N.Aghaeepour), 3P30AG066515 (N.Aghaeepour), 1R61NS114926 (N.Aghaeepour), 1R01AG058417 (N.Aghaeepour), R01HD105256 (N.Aghaeepour, M.S.), P01HD106414 (N.Aghaeepour), R01GM140464 (J.G., Z.W., G.C., Z.-Z.T.), NSF DMS-2054346 (J.G., Z.W., G.C., Z.-Z.T.); the Burroughs Wellcome Fund (N.Aghaeepour); the Alfred E. Mann Foundation (N.Aghaeepour); and the Robertson Foundation (N.Aghaeepour). A.P.-L. and P.D.-G. are receiving honoraria from the IVI Foundation. United States. National Institute of General Medical Sciences; R35GM138353. United States. National Institutes of Health; 1R01HL139844. United States. National Institutes of Health; 3P30AG066515. United States. National Institutes of Health; 1R61NS114926. United States. National Institute on Aging; 1R01AG058417. United States. National Institute of Child Health and Human Development; R01HD105256. United States. National Institute of Child Health and Human Development; P01HD106414. United States. National Institutes of Health; R01GM140464. United States. National Science Foundation; DMS-205434
Microbiome preterm birth DREAM challenge: Crowdsourcing machine learning approaches to advance preterm birth research
Every year, 11% of infants are born preterm with significant health consequences, with the vaginal microbiome a risk factor for preterm birth. We crowdsource models to predict (1) preterm birth (PTB; <37 weeks) or (2) early preterm birth (ePTB; <32 weeks) from 9 vaginal microbiome studies representing 3,578 samples from 1,268 pregnant individuals, aggregated from public raw data via phylogenetic harmonization. The predictive models are validated on two independent unpublished datasets representing 331 samples from 148 pregnant individuals. The top-performing models (among 148 and 121 submissions from 318 teams) achieve area under the receiver operator characteristic (AUROC) curve scores of 0.69 and 0.87 predicting PTB and ePTB, respectively. Alpha diversity, VALENCIA community state types, and composition are important features in the top-performing models, most of which are tree-based methods. This work is a model for translation of microbiome data into clinically relevant predictive models and to better understand preterm birth.
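As an illustration of the AUROC metric quoted in the abstract above, the following minimal sketch computes it as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `auroc` and the toy labels and scores are hypothetical and are not taken from the challenge data or its evaluation code.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 outcomes (1 = positive class, e.g. preterm).
    scores: iterable of model scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.8]
toy_auroc = auroc(toy_labels, toy_scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported 0.69 and 0.87 should be read.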
Recording behaviour of indoor-housed farm animals automatically using machine vision technology: a systematic review
Large-scale phenotyping of animal behaviour traits is time consuming and has led to increased demand for technologies that can automate these procedures. Automated tracking of animals has been successful in controlled laboratory settings, but recording from animals in large groups in highly variable farm settings presents challenges. The aim of this review is to provide a systematic overview of the advances that have occurred in automated, high throughput image detection of farm animal behavioural traits with welfare and production implications. Peer-reviewed publications written in English were reviewed systematically following Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. After identification, screening, and assessment for eligibility, 108 publications met these specifications and were included for qualitative synthesis. Data collected from the papers included camera specifications, housing conditions, group size, algorithm details, procedures, and results. Most studies utilized standard digital colour video cameras for data collection, with increasing use of 3D cameras in papers published after 2013. Papers including pigs (across production stages) were the most common (n = 63). The most common behaviours recorded included activity level, area occupancy, aggression, gait scores, resource use, and posture. Our review revealed many overlaps in methods applied to analysing behaviour, and most studies started from scratch instead of building upon previous work. Training and validation sample sizes were generally small (mean ± s.d. groups = 3.8 ± 5.8), and data collection and testing took place in relatively controlled environments. To advance our ability to automatically phenotype behaviour, future research should build upon existing knowledge and validate technology under commercial settings, and publications should explicitly describe recording conditions in detail to allow studies to be reproduced.
Expanding the diversity of mycobacteriophages: insights into genome architecture and evolution.
Mycobacteriophages are viruses that infect mycobacterial hosts such as Mycobacterium smegmatis and Mycobacterium tuberculosis. All mycobacteriophages characterized to date are dsDNA tailed phages, and have either siphoviral or myoviral morphotypes. However, their genetic diversity is considerable, and although sixty-two genomes have been sequenced and comparatively analyzed, these likely represent only a small portion of the diversity of the mycobacteriophage population at large. Here we report the isolation, sequencing and comparative genomic analysis of 18 new mycobacteriophages isolated from geographically distinct locations within the United States. Although no clear correlation between location and genome type can be discerned, these genomes expand our knowledge of mycobacteriophage diversity and enhance our understanding of the roles of mobile elements in viral evolution. Expansion of the number of mycobacteriophages grouped within Cluster A provides insights into the basis of immune specificity in these temperate phages, and we also describe a novel example of apparent immunity theft. The isolation and genomic analysis of bacteriophages by freshman college students provides an example of an authentic research experience for novice scientists
The United States COVID-19 Forecast Hub dataset
Academic researchers, government agencies, industry groups, and individuals have produced forecasts at an unprecedented scale during the COVID-19 pandemic. To leverage these forecasts, the United States Centers for Disease Control and Prevention (CDC) partnered with an academic research lab at the University of Massachusetts Amherst to create the US COVID-19 Forecast Hub. Launched in April 2020, the Forecast Hub is a dataset with point and probabilistic forecasts of incident cases, incident hospitalizations, incident deaths, and cumulative deaths due to COVID-19 at county, state, and national levels in the United States. Included forecasts represent a variety of modeling approaches, data sources, and assumptions regarding the spread of COVID-19. The goal of this dataset is to establish a standardized and comparable set of short-term forecasts from modeling teams. These data can be used to develop ensemble models, communicate forecasts to the public, create visualizations, compare models, and inform policies regarding COVID-19 mitigation. These open-source data are available via download from GitHub, through an online API, and through R packages.
Robust estimation of bacterial cell count from optical density
Optical density (OD) is widely used to estimate the density of cells in liquid culture, but cannot be compared between instruments without a standardized calibration protocol and is challenging to relate to actual cell count. We address this with an interlaboratory study comparing three simple, low-cost, and highly accessible OD calibration protocols across 244 laboratories, applied to eight strains of constitutive GFP-expressing E. coli. Based on our results, we recommend calibrating OD to estimated cell count using serial dilution of silica microspheres, which produces highly precise calibration (95.5% of residuals <1.2-fold), is easily assessed for quality control, also assesses instrument effective linear range, and can be combined with fluorescence calibration to obtain units of Molecules of Equivalent Fluorescein (MEFL) per cell, allowing direct comparison and data fusion with flow cytometry measurements: in our study, fluorescence per cell measurements showed only a 1.07-fold mean difference between plate reader and flow cytometry data
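The calibration idea in the abstract above (relating OD readings to particle counts through a serial dilution of microspheres) can be sketched as a proportional fit. This is a minimal sketch under stated assumptions, not the study's protocol: it assumes blank-subtracted OD readings within the instrument's linear range, a model of the form count ≈ k × OD, and entirely made-up dilution values; the function names are hypothetical.

```python
def fit_particles_per_od(particle_counts, od_readings):
    """Least-squares slope through the origin for count ~= k * OD.

    particle_counts: known particle counts per well in the dilution series.
    od_readings: blank-subtracted OD readings for the same wells.
    """
    num = sum(c * od for c, od in zip(particle_counts, od_readings))
    den = sum(od * od for od in od_readings)
    if den == 0:
        raise ValueError("OD readings must not all be zero")
    return num / den


def fold_error(predicted, actual):
    """Symmetric fold difference between two positive values (always >= 1)."""
    return max(predicted / actual, actual / predicted)


# Hypothetical 2-fold serial dilution, for illustration only.
counts = [8e8, 4e8, 2e8, 1e8]
ods = [0.8, 0.4, 0.2, 0.1]

k = fit_particles_per_od(counts, ods)        # particles per OD unit
estimated_count = k * 0.25                   # estimate for a sample at OD 0.25
worst_fold = max(fold_error(k * od, c) for c, od in zip(counts, ods))
```

Reporting residuals as fold errors mirrors how the abstract summarizes precision ("residuals <1.2-fold"), since errors in a dilution series are more naturally compared on a multiplicative scale.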
Harnessing the NEON data revolution to advance open environmental science with a diverse and data-capable community
It is a critical time to reflect on the National Ecological Observatory Network (NEON) science to date as well as envision what research can be done right now with NEON (and other) data and what training is needed to enable a diverse user community. NEON became fully operational in May 2019 and has pivoted from planning and construction to operation and maintenance. In this overview, the history of and foundational thinking around NEON are discussed. A framework of open science is described with a discussion of how NEON can be situated as part of a larger data constellation—across existing networks and different suites of ecological measurements and sensors. Next, a synthesis of early NEON science, based on >100 existing publications, funded proposal efforts, and emergent science at the very first NEON Science Summit (hosted by Earth Lab at the University of Colorado Boulder in October 2019) is provided. Key questions that the ecology community will address with NEON data in the next 10 yr are outlined, from understanding drivers of biodiversity across spatial and temporal scales to defining complex feedback mechanisms in human–environmental systems. Last, the essential elements needed to engage and support a diverse and inclusive NEON user community are highlighted: training resources and tools that are openly available, funding for broad community engagement initiatives, and a mechanism to share and advertise those opportunities. NEON users require both the skills to work with NEON data and the ecological or environmental science domain knowledge to understand and interpret them. This paper synthesizes early directions in the community’s use of NEON data, and opportunities for the next 10 yr of NEON operations in emergent science themes, open science best practices, education and training, and community building
Ten simple rules for creating a scientific web application
The use of scientific web applications (SWApps) across biological and environmental sciences has grown exponentially over the past decade or so. Although quantitative evidence for such increased use in practice is scant, collectively, we have observed that these tools have become more commonplace in teaching, outreach, and in science coproduction (e.g., as decision support tools). Despite the increased popularity of SWApps, researchers often receive little or no training in creating such tools. Although rolling out SWApps can be a relatively simple and quick process using modern, popular platforms like R shiny apps or Tableau dashboards, making them useful, usable, and sustainable is not. These 10 simple rules for creating a SWApp provide a foundation upon which researchers with little to no experience in web application design and development can consider, plan, and carry out SWApp projects.
Relationships between GAT1 and PTSD, Depression, and Substance Use Disorder
Post-traumatic stress disorder (PTSD), Major Depressive Disorder (MDD), and Substance Use Disorder (SUD) have large public health impacts. Therefore, researchers have attempted to identify those at greatest risk for these phenotypes. PTSD, MDD, and SUD are in part genetically influenced. Additionally, genes in the glutamate and gamma-aminobutyric acid (GABA) system are implicated in the encoding of emotional and fear memories, and thus may impact these phenotypes. The current study examined the associations of single nucleotide polymorphisms in GAT1 individually, and at the gene level, using a principal components (PC) approach, with PTSD, PTSD comorbid with MDD, and PTSD comorbid with SUD in 486 combat-exposed veterans. Findings indicate that several GAT1 SNPs, as well as one of the GAT1 PCs, were associated with PTSD, with and without MDD and SUD comorbidity. The present study findings provide initial insights into one pathway by which shared genetic risk influences PTSD-MDD and PTSD-SUD comorbidities, and thus identify a high-risk group (based on genotype) on whom prevention and intervention efforts should be focused.