The metastasis-promoting protein S100A4 regulates mammary branching morphogenesis
High levels of the S100 calcium binding protein S100A4, also called fibroblast specific protein 1 (FSP1), have been established as an inducer of metastasis and indicator of poor prognosis in breast cancer. The mechanism by which S100A4 leads to increased cancer aggressiveness has yet to be established; moreover, the function of this protein in normal mammary gland biology has not been investigated. To address the role of S100A4 in normal mammary gland, its spatial and temporal expression patterns and possible function in branching morphogenesis were investigated. We show that the protein is expressed mainly in cells of the stromal compartment of adult humans, and during active ductal development, in pregnancy and in involution of mouse mammary gland. In 3D culture models, topical addition of S100A4 induced a significant increase in the TGFα mediated branching phenotype and a concomitant increase in expression of a previously identified branching morphogen, metalloproteinase-3 (MMP-3). These events were found to be dependent on MEK activation. Downregulation of S100A4 using shRNA significantly reduced TGFα induced branching and altered E-cadherin localization. These findings provide evidence that S100A4 is developmentally regulated and that it plays a functional role in mammary gland development, in concert with TGFα by activating MMP-3, and increasing invasion into the fat pad during branching. We suggest that S100A4-mediated effects during branching morphogenesis provide a plausible mechanism for how it may function in breast cancer progression.
Climatic and geographic predictors of life history variation in Eastern Massasauga (Sistrurus catenatus): A range-wide synthesis
Elucidating how life history traits vary geographically is important to understanding variation in population dynamics. Because many aspects of ectotherm life history are climate-dependent, geographic variation in climate is expected to have a large impact on population dynamics through effects on annual survival, body size, growth rate, age at first reproduction, size-fecundity relationship, and reproductive frequency. The Eastern Massasauga (Sistrurus catenatus) is a small, imperiled North American rattlesnake with a distribution centered on the Great Lakes region, where lake effects strongly influence local conditions. To address Eastern Massasauga life history data gaps, we compiled data from 47 study sites representing 38 counties across the range. We used multimodel inference and general linear models with geographic coordinates and annual climate normals as explanatory variables to clarify patterns of variation in life history traits. We found strong evidence for geographic variation in six of nine life history variables. Adult female snout-vent length and neonate mass increased with increasing mean annual precipitation. Litter size decreased with increasing mean temperature, and the size-fecundity relationship and growth prior to first hibernation both increased with increasing latitude. The proportion of gravid females also increased with increasing latitude, but this relationship may be the result of geographically varying detection bias. Our results provide insights into ectotherm life history variation and fill critical data gaps, which will inform Eastern Massasauga conservation efforts by improving biological realism for models of population viability and climate change.
Finishing the euchromatic sequence of the human genome
The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.
Synergies between centralized and federated approaches to data quality: a report from the national COVID cohort collaborative
Objective
In response to COVID-19, the informatics community united to aggregate as much clinical data as possible to characterize this new disease and reduce its impact through collaborative analytics. The National COVID Cohort Collaborative (N3C) is now the largest publicly available HIPAA limited dataset in US history with over 6.4 million patients and is a testament to a partnership of over 100 organizations.
Materials and Methods
We developed a pipeline for ingesting, harmonizing, and centralizing data from 56 contributing data partners using 4 federated Common Data Models. N3C data quality (DQ) review involves both automated and manual procedures. In the process, several DQ heuristics were discovered in our centralized context, both within the pipeline and during downstream project-based analysis. Feedback to the sites led to many local and centralized DQ improvements.
Results
Beyond well-recognized DQ findings, we discovered 15 heuristics relating to source Common Data Model conformance, demographics, COVID tests, conditions, encounters, measurements, observations, coding completeness, and fitness for use. Of 56 sites, 37 sites (66%) demonstrated issues through these heuristics. These 37 sites demonstrated improvement after receiving feedback.
Discussion
We encountered site-to-site differences in DQ which would have been challenging to discover using federated checks alone. We have demonstrated that centralized DQ benchmarking reveals unique opportunities for DQ improvement that will support improved research analytics locally and in aggregate.
Conclusion
By combining rapid, continual assessment of DQ with a large volume of multisite data, it is possible to support more nuanced scientific questions with the scale and rigor that they require.
Isoforms and splice variant of transforming growth factor β–binding protein in rat hepatic stellate cells
The National COVID Cohort Collaborative: Clinical Characterization and Early Severity Prediction
The majority of U.S. reports of COVID-19 clinical characteristics, disease course, and treatments are from single health systems or focused on one domain. Here we report the creation of the National COVID Cohort Collaborative (N3C), a centralized, harmonized, high-granularity electronic health record repository that is the largest, most representative U.S. cohort of COVID-19 cases and controls to date. This multi-center dataset supports robust evidence-based development of predictive and diagnostic tools and informs critical care and policy.
In a retrospective cohort study of 1,926,526 patients from 34 medical centers nationwide, we stratified patients using a World Health Organization COVID-19 severity scale and demographics; we then evaluated differences between groups over time using multivariable logistic regression. We established vital signs and laboratory values among COVID-19 patients with different severities, providing the foundation for predictive analytics. The cohort included 174,568 adults with severe acute respiratory syndrome associated with SARS-CoV-2 (PCR >99% or antigen <1%) as well as 1,133,848 adult patients that served as lab-negative controls. Among 32,472 hospitalized patients, mortality was 11.6% overall and decreased from 16.4% in March/April 2020 to 8.6% in September/October 2020 (p = 0.002 monthly trend). In a multivariable logistic regression model, age, male sex, liver disease, dementia, African-American and Asian race, and obesity were independently associated with higher clinical severity. To demonstrate the utility of the N3C cohort for analytics, we used machine learning (ML) to predict clinical severity and risk factors over time. Using 64 inputs available on the first hospital day, we predicted a severe clinical course (death, discharge to hospice, invasive ventilation, or extracorporeal membrane oxygenation) using random forest and XGBoost models (AUROC 0.86 and 0.87 respectively) that were stable over time. The most powerful predictors in these models are patient age and widely available vital sign and laboratory values. The established expected trajectories for many vital signs and laboratory values among patients with different clinical severities validate observations from smaller studies and provide comprehensive insight into COVID-19 characterization in U.S. patients.
This is the first description of an ongoing longitudinal observational study of patients seen in diverse clinical settings and geographical regions and is the largest COVID-19 cohort in the United States. Such data are the foundation for ML models that can be the basis for generalizable clinical decision support tools. The N3C Data Enclave is unique in providing transparent, reproducible, easily shared, versioned, and fully auditable data and analytic provenance for national-scale patient-level EHR data. The N3C is built for intensive ML analyses by academic, industry, and citizen scientists internationally. Many observational correlations can inform trial designs and care guidelines for this new disease.
Relationship between latitude (untransformed) and age-zero annual growth as explained by the top-ranked model using AIC<sub>c</sub> (Table 3).
<p>The shaded area represents the smoothed 95% CI using t-based approximations. County and district abbreviations are as in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0172011#pone.0172011.g001" target="_blank">Fig 1</a>.</p>
Relationship between latitude (untransformed) and size–fecundity (natural log back-transformed) as explained by the top-ranked model using AIC<sub>c</sub> (Table 3).
<p>Female size was held constant at 55.2 cm SVL based on the average size of adult females in Cass County, Michigan. The shaded area represents the smoothed 95% CI using t-based approximations. County and district abbreviations are as in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0172011#pone.0172011.g001" target="_blank">Fig 1</a>. The image of dam and offspring was taken within minutes of parturition in Cass County, Michigan (Photograph credit, E. T. Hileman).</p>
Locations of Eastern Massasauga study sites (counties/districts shaded black) and the approximate historic range of the Eastern Massasauga (gray shading, from http://www.iucnredlist.org/).
<p>County and district codes: IA = Bremer, IA; IL.1 = Clinton, IL; IL.2 = DuPage, IL; IL.3 = Cook/ Lake, IL; IL.4 = Piatt, IL; IL.5 = Warren, IL; IL.6 = Will, IL; IN.1 = Hendricks, IN; IN.2 = LaGrange, IN; IN.3 = Marshall, IN; MI.1 = Barry, MI; MI.2 = Cass, MI; MI.3 = Kalkaska, MI; MI.4 = Lenawee, MI; MI.5 = Oakland, MI; MI.6 = Van Buren, MI; MI.7 = Washtenaw, MI; NY.1 = Genesee, NY; NY.2 = Onondaga, NY; OH.1 = Ashtabula, OH; OH.2 = Champaign, OH; OH.3 = Clark, OH; OH.4 = Greene, OH; OH.5 = Greene/ Warren, OH; OH.6 = Hardin, OH; OH.7 = Trumball, OH; OH.8 = Wyandot, OH; ONT.1 = Bruce, ONT; ONT.2 = Essex, ONT; ONT.3 = Muskoka, ONT; ONT.4 = Beausoliel Island, ONT; ONT.5 = Parry Sound District (1995–1996), ONT; ONT.6 = Parry Sound District (1992–2009), ONT; ONT.7 = Regional Municipality of Niagara, ONT; PA = Butler/ Venango, PA; WI.1 = Buffalo, WI; WI.2 = Juneau/ Monroe, WI. Reprinted and modified from [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0172011#pone.0172011.ref150" target="_blank">150</a>] under a CC BY license, with permission from [Collin P. Jaeger], original copyright [2016] (See <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0172011#pone.0172011.s003" target="_blank">S3 File</a>).</p>